Updates from BestWorld’s AI Botmaster Jeremy Lichtman of Oct. 16, 2024

Guide to Bestworldbot’s Reports

Median from the bestworldbots: We collect the output values from the LLMs in an array. We then take the median value. This is between 0 and 1 (i.e. 0.15 = 15%).
Base rate: Currently we take OpenAI’s value as the base rate. We are working on a better value using a different prompt that takes into account process. The base rate helps to determine if the median makes sense.
SD: The standard deviation between the median and the base rate.
Confidence: We query each of the LLMs on how confident they are of their predictions (between 0 and 10) and take a median. Because the LLMs tend to be overconfident, we take anything lower than 6 as being low confidence. This factors into the overall model.
Conf Mode: Based on the confidence value. >=9 is high confidence. Below 6 is low confidence (this is also triggered by an exceptionally high SD).
Mellers: This refers to Barbara Mellers, specifically a paper she wrote that includes a formula for moving values towards an extreme (i.e. 0 or 1).
Reverse Mellers: This uses the formula from above, but with a sub-1 coefficient to move the values closer to 50%.
Theory of Mind: We ask the LLMs what they think other LLMs would predict. We hope that this makes them consider the questions more deeply.
Beta Distribution: Currently unused, but possibly of interest. This is based on the median, the base rate and the SD.
Close Type: We noticed that the appropriate base case for some questions is closer to the extremes, while others are closer to 50%. When we have a low confidence value, this helps us to determine whether to extremize or de-extremize the value. ‘A’ implies closer to zero. ‘B’ implies closer to 50%. ‘C’ implies closer to 100%.
# LLM Responses: The count of LLMs that responded. We query 5 currently, but they have a relatively high API failure rate.
Model value: For normal confidence mode, this is the median. For high confidence, this is the Mellers value. For low confidence, this is either the Mellers or Reverse Mellers (depending on the Close Type).
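The rules in this guide can be sketched in a few lines of Python. This is a minimal illustration, not the bot's actual code. It assumes the Mellers formula is the power transform p^a / (p^a + (1 - p)^a), which the guide does not spell out. The function names, the exponents 1.75 and 0.5, and the SD cutoff of 0.15 are hypothetical values chosen for illustration. The SD here is computed as the population standard deviation of the LLM predictions, which is also an assumption.

```python
from statistics import median, pstdev


def power_transform(p: float, a: float) -> float:
    """Push p toward 0 or 1 when a > 1, and toward 0.5 when a < 1."""
    return p**a / (p**a + (1 - p) ** a)


def conf_mode(confidence: float, sd: float, sd_limit: float) -> str:
    """Map the median confidence (0 to 10) and the SD to a confidence mode."""
    # Assumption: a high SD forces Low even when confidence is 9 or more.
    if confidence < 6 or sd > sd_limit:
        return "Low"
    if confidence >= 9:
        return "High"
    return "Normal"


def model_value(predictions, confidences, close_type,
                extremize_a=1.75, moderate_a=0.5, sd_limit=0.15):
    """Combine LLM outputs into one value following the guide's rules.

    extremize_a, moderate_a and sd_limit are hypothetical parameters.
    """
    med = median(predictions)
    sd = pstdev(predictions)  # assumed: population SD of the predictions
    mode = conf_mode(median(confidences), sd, sd_limit)
    mellers = power_transform(med, extremize_a)   # moves toward 0 or 1
    reverse = power_transform(med, moderate_a)    # moves toward 0.5
    if mode == "High":
        value = mellers
    elif mode == "Low":
        # Close Type B means closer to 50%; A and C mean closer to an extreme.
        value = reverse if close_type == "B" else mellers
    else:
        value = med
    return {"median": med, "sd": sd, "mode": mode,
            "mellers": mellers, "reverse_mellers": reverse, "value": value}


# Inputs taken from the two sample reports on this page.
steel = model_value([0.6, 0.15, 0.3, 0.2, 0.15], [6, 6, 6, 6, 5], "A")
dock = model_value([0.2, 0.15, 0.25, 0.4, 0.15], [6, 6, 7, 5, 6], "A")
print(steel["mode"], round(steel["value"], 2))
print(dock["mode"], round(dock["value"], 2))
```

With these assumed parameters, the sketch gives Low and 0.08 for the steel merger question and Normal and 0.2 for the dock strike question, in line with the two reports on this page. The actual coefficients and SD cutoff used by the bot are not stated here, so treat the numbers as one set of values that happens to fit.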

What’s the probability of the US Steel/Nippon Steel merger being officially announced before January 21, 2025?

===============================
Welcome to the Multi-AI Oracle.
===============================
What is your question? What’s the probability of the US Steel/Nippon Steel merger being officially announced before January 21, 2025?

Do you have any other information you wish to pass along to assist in answering? (Hit Enter to skip)
ID: N/A

Obtaining outside newsfeed. 
Summarizing outside newsfeed:

“The potential merger between US Steel and Nippon Steel is facing significant hurdles, largely due to regulatory and political challenges. Nippon Steel has proposed a $14.9 billion bid to acquire US Steel, but the proposal is under scrutiny by the Biden administration and has encountered resistance from labor unions concerned about the deal’s impact on jobs and national security. The Committee on Foreign Investment in the United States (CFIUS) has advised against the merger, citing possible implications for US steel tariffs and national security. The United Steelworkers union has also voiced opposition, adding further complexity to the merger’s potential approval. The uncertainty around the merger has led to fluctuations in US Steel’s stock price, with a reported near-20% drop amid fears of regulatory impediments. A final decision on the merger is anticipated after the US presidential election in November 2024, which may influence the timeline for any official announcements.”

Querying Perplexity (AI predicts: 0.6 – confidence: 6)
Querying Claude (AI predicts: 0.15 – confidence: 6)
Querying Mistral (AI predicts: 0.3 – confidence: 6)
Querying Cohere (AI predicts: 0.2 – confidence: 6)
Querying OpenAI (AI predicts: 0.15 – confidence: 5)

Median from LLMs: 0.2
Base rate: 0.15 (from OpenAI)

SD: 0.17
Confidence: 6
Conf Mode: Low
Mellers: 0.08
Reverse Mellers: 0.33
Theory of Mind: 0.3 (What did the LLMs think other LLMs predicted?)
Beta Distribution: 0.28
Close Type: A (B = cautious # closer to 50%; A/C = closer to extremes)
# LLM responses: 5
Model value (not submitted): 0.08.

We queried Perplexity, Claude, Mistral, Cohere, OpenAI. A summary of what the responding AIs considered is as follows:

The proposed merger between US Steel and Nippon Steel faces numerous regulatory and political challenges, including significant scrutiny from the Biden administration, opposition from labor unions, and an advisory against the merger by the Committee on Foreign Investment in the United States (CFIUS) due to national security and economic concerns. These hurdles have created uncertainty, reflected in the near-20% drop in US Steel’s stock price. The timeline for any final decision is expected to extend beyond the US presidential election in November 2024, suggesting that an official announcement may not occur before January 21, 2025. While the companies have expressed confidence in the merger, unforeseen political or economic shifts post-election could accelerate the decision-making process if the regulatory concerns are addressed more quickly than anticipated or if the political climate becomes more favorable.
Runtime: 173 seconds.

Bestworldbot News

This week bestworldbot continues to make practice forecasts on the Q4 AI Forecasting Benchmark Tournament. Click here to see them.

The Q3 AI Forecasting Benchmark Competition has concluded.

Bestworldbot was #53 out of 55 competitors. A likely explanation for bestworldbot’s huge fall on the leaderboard from #17 on Sept. 10 is that we tried extremizing its forecasts. With most human forecasters, extremizing is usually effective. However, we now believe that extremizing made bestworldbot worse. Oh well, that’s research: some things work, some don’t.

Given the agreement of the Dock Workers to salary increases, what’s the probability of a strike in Q1 2025?

===============================
Welcome to the Multi-AI Oracle.
===============================
What is your question? Given the agreement of the Dock Workers to salary increases, both the union and the port will return to the bargaining table on Jan. 15, 2025 to discuss automation and other issues. As the dock workers have only walked out once since 1977, what’s the probability of a strike in Q1 2025?

Do you have any other information you wish to pass along to assist in answering? (Hit Enter to skip)
ID: N/A

Obtaining outside newsfeed. Summarizing outside newsfeed:

“The International Longshoremen’s Association (ILA) has reached a tentative agreement that pauses a strike, allowing dock workers to return to work while negotiations persist until January 15, 2025. These negotiations will focus on critical issues such as wages and the increased automation of tasks, which raises job security concerns among workers. Historically, dock worker strikes have been infrequent, with the last walkout occurring in 1977, which might suggest a moderated likelihood of another strike in the near term, regardless of ongoing grievances. There are warnings from experts that any potential future strike could significantly impact the economy, particularly affecting prices and supply chains, which would be especially concerning during the holiday season. However, the current agreement may mitigate immediate disruption risks.”

Querying Perplexity (AI predicts: 0.2 – confidence: 6)
Querying Claude (AI predicts: 0.15 – confidence: 6)
Querying Mistral (AI predicts: 0.25 – confidence: 7)
Querying Cohere (AI predicts: 0.4 – confidence: 5)
Querying OpenAI (AI predicts: 0.15 – confidence: 6)

Median from LLMs: 0.2
Base rate: 0.15 (from OpenAI)

SD: 0.09
Confidence: 6
Conf Mode: Normal
Mellers: 0.08
Reverse Mellers: 0.33
Theory of Mind: 0.3 (What did the LLMs think other LLMs predicted?)
Beta Distribution: 0.25
Close Type: A (B = cautious # closer to 50%; A/C = closer to extremes)
# LLM responses: 5
Model value (not submitted): 0.2

We queried Perplexity, Claude, Mistral, Cohere, OpenAI. A summary of what the responding AIs considered is as follows:

The prediction regarding the likelihood of a dock worker strike is mainly based on the historical rarity of strikes, with the last one occurring in 1977, and the recent tentative agreement among workers, which pauses potential strike actions until January 15, 2025. While the agreement suggests a cooperative approach to resolving issues such as wages and the contentious topic of automation, underlying tensions remain. Economic pressures, potential breakdowns in negotiations, and changes in labor conditions could still lead to a strike. Although the immediate disruption risk is mitigated by the current agreement and ongoing negotiations, there is a concern that unresolved issues, especially related to job security from increased automation, could escalate tensions. Experts caution that any future strike could have significant economic impacts, particularly affecting prices and supply chains.
Runtime: 112 seconds.

Bots vs Staff Forecasts

BestWorld’s staff of Brian LaBatte and Carolyn Meinel are forecasting the US Steel/Nippon Steel Merger in competition with Jeremy’s Multi-AI Oracle. Their version has two sub-questions:

(1) Will the US Steel / Nippon Steel merger be approved by the White House before Jan. 21, 2025? Brian’s latest forecast is 80%; Carolyn’s is 70%. Michal’s is 20%.

(2) If yes, when will US Steel investor relations announce the final outcome of the acquisition with Nippon Steel?

a. Before November 5, 2024
b. Between November 6 and December 31, 2024
c. January 1 to 20, 2025
d. After January 21, 2025
Brian’s latest forecasts are 0%, 20%, 20%, and 60%.
Carolyn’s are 0%, 20%, 20%, and 60%.
Below are their forecasting spreadsheets.

If yes, when will US Steel’s investor relations announce the results?


Previous bots vs human forecasts:

Oct. 15, 2024
Oct. 14, 2024
Oct. 11, 2024
Oct. 10, 2024
Oct. 9, 2024
Oct. 8, 2024
Oct. 7, 2024
Oct. 3, 2024
Oct. 2, 2024
Oct. 1, 2024
Sept. 30, 2024
Sept. 27, 2024
Sept. 26, 2024
Sept. 25, 2024
Sept. 24, 2024
Sept. 23, 2024
Sept. 18, 2024
Sept. 17, 2024
Sept. 16, 2024
Sept. 13, 2024
Sept. 12, 2024
Sept. 11, 2024
Sept. 10, 2024
Sept. 9, 2024
Sept. 6, 2024
Sept. 5, 2024
Sept. 4, 2024
Sept. 3, 2024
Sept. 2, 2024
Aug. 30, 2024
Aug. 29, 2024
Aug. 28, 2024
Aug. 27, 2024
Aug. 26, 2024
Aug. 23, 2024
Aug. 22, 2024
Aug. 21, 2024
Aug. 20, 2024
Aug. 19, 2024
Aug. 16, 2024
Aug. 15, 2024
Aug. 14, 2024
Aug. 13, 2024
Aug. 12, 2024
Aug. 9, 2024
Aug. 8, 2024
Aug. 7, 2024
Aug. 6, 2024
Aug. 5, 2024
Aug. 2, 2024
Aug. 1, 2024
July 31, 2024
July 30, 2024
July 29, 2024
July 26, 2024
July 25, 2024

More about our previous staff vs bot competition here.

Our bots vs humans experiments

Our concluded FED rate increase competition.

All our current bot projects.
