Jeremy Lichtman’s Multi-AI Oracle, comprising Claude, Mistral, and OpenAI
Will there be a ceasefire declared between Israel and Hamas in the month of August 2025? On Aug. 27, his Multi-AI Oracle predicted 35%
Details here —>
How many state-based conflict deaths in Sudan will be reported by ACLED for 2025?
On Aug. 25, 2025, his Multi-AI Oracle predicted:
Less than 1,000: 1%
Between 1,000 and 3,000: 5%
Between 3,000 and 5,000: 16%
Between 5,000 and 8,000: 26%
Between 8,000 and 12,000: 36%
More than 12,000: 16%
Details here —>
How many state-based conflict deaths (total of all civilian and combat deaths, including both Ukrainian and Russian combatants) will be reported by ACLED in Ukraine in August 2025? On Aug. 21, 2025, his Multi-AI Oracle predicted:
Less than 500: 10%
Between 500 and 1,000: 20%
Between 1,000 and 1,500: 30%
Between 1,500 and 2,000: 25%
Greater than 2,000: 15%
Details here —>
Will there be a ceasefire declared between Israel and Hamas in the month of August 2025? On Aug. 20, 2025, his Multi-AI Oracle predicted 35%
Details here —>
Will hostilities between Pakistan and India result in at least 100 total uniformed casualties (with at least one death) between 2 June 2025 and 30 September 2025? On Aug. 19, 2025, his Multi-AI Oracle predicted 20%
Details here —>
How many state-based conflict deaths (total of all civilian and combat deaths, including both Ukrainian and Russian combatants) will be reported by ACLED in Ukraine in August 2025? On Aug. 14, 2025, his Multi-AI Oracle predicted:
Less than 500: 3%
Between 500 and 1,000: 10%
Between 1,000 and 1,500: 23%
Between 1,500 and 2,000: 30%
Greater than 2,000: 34%
Details here —>
Will hostilities between Pakistan and India result in at least 100 total uniformed casualties (with at least one death) between 2 June 2025 and 30 September 2025? On Aug. 12, 2025, his Multi-AI Oracle predicted 25%
Details here —>
How many state-based conflict deaths in Sudan will be reported by ACLED for 2025? His Multi-AI Oracle predicted on Aug. 11, 2025:
Less than 1,000: 2%
Between 1,000 and 3,000: 8%
Between 3,000 and 5,000: 16%
Between 5,000 and 8,000: 26%
Between 8,000 and 12,000: 32%
More than 12,000: 15%
Details here —>
How many state-based conflict deaths (total of all civilian and combat deaths, including both Ukrainian and Russian combatants) will be reported by ACLED in Ukraine in August 2025? His Multi-AI Oracle predicted on Aug. 7, 2025:
Less than 500: 3%
Between 500 and 1,000: 8%
Between 1,000 and 1,500: 15%
Between 1,500 and 2,000: 24%
Greater than 2,000: 50%
Details here —>
Below you can see the final Metaculus AI Benchmark Tournament Q2 Leaderboard. Phillip Godzin’s pgodzinai, highlighted in light orange, is #3! The other colors mark previous holders of first place, except for yellow, which marks jbot, belonging to Jeremy Lichtman.
Phillip Godzin’s pgodzinai bot, comprising Perplexity, Grok, AskNews Deep Search, GPT, Anthropic, and Gemini
Will there be a ceasefire declared between Israel and Hamas in the month of August 2025? On Aug. 27, his pgodzinai predicted 3%
Details here —>
How many state-based conflict deaths (total of all civilian and combat deaths, including both Ukrainian and Russian combatants) will be reported by ACLED in Ukraine in August 2025? On Aug. 26, 2025, his pgodzinai predicted:
Less than 500: 2%
Between 500 and 1,000: 15%
Between 1,000 and 1,500: 55%
Between 1,500 and 2,000: 20%
Greater than 2,000: 8%
Details here —>
Will hostilities between Pakistan and India result in at least 100 total uniformed casualties (with at least one death) between 2 June 2025 and 30 September 2025?
His pgodzinai bot predicts 20%
Details here —>
How many state-based conflict deaths in Sudan will be reported by ACLED in 2025?
On Aug. 22, his pgodzinai bot predicted:
Less than 1,000: 1%
Between 1,000 and 3,000: 1%
Between 3,000 and 5,000: 8%
Between 5,000 and 8,000: 25%
Between 8,000 and 12,000: 35%
More than 12,000: 30%
Details here —>
How many state-based conflict deaths in Syria will be reported by ACLED for the month of August 2025? On Aug. 21, 2025, his pgodzinai bot predicted:
Less than 100: 8%
Between 100 and 250: 52%
Between 250 and 500: 28%
Between 500 and 1,000: 10%
Greater than 1,000: 2%
Details here —>
Will there be a ceasefire declared between Israel and Hamas in the month of August 2025? On Aug. 20, 2025, his pgodzinai bot predicted: 38%
Details here —>
How many state-based conflict deaths (total of all civilian and combat deaths, including both Ukrainian and Russian combatants) will be reported by ACLED in Ukraine in August, 2025? On Aug. 19, 2025, his pgodzinai bot predicted:
Less than 500: 2%
Between 500 and 1,000: 5%
Between 1,000 and 1,500: 10%
Between 1,500 and 2,000: 18%
Greater than 2,000: 65%
Details here —>
Will hostilities between Pakistan and India result in at least 100 total uniformed casualties (with at least one death) between 2 June 2025 and 30 September 2025? On Aug. 18, 2025, his pgodzinai bot predicted 20%
Details here —>
How many state-based conflict deaths in Sudan will be reported by ACLED in 2025?
On Aug. 15, 2025, his pgodzinai bot predicted:
Less than 1,000: 1%
Between 1,000 and 3,000: 4%
Between 3,000 and 5,000: 15%
Between 5,000 and 8,000: 35%
Between 8,000 and 12,000: 32%
More than 12,000: 13%
Details here —>
How many state-based conflict deaths in Syria will be reported by ACLED for the month of August, 2025? On Aug. 14, 2025, his pgodzinai bot predicted:
Less than 100: 2%
Between 100 and 250: 8%
Between 250 and 500: 22%
Between 500 and 1,000: 45%
Greater than 1,000: 23%
Details here —>
See all his bot’s past forecasts here —>
Metaculus’s Q2 AI Forecasting Benchmark Tournament Is Over. Three Bots Beat “Community” Humans, But Pro Humans Beat the Bots.
Phil Godzin has joined Jeremy in our side competition with the VIEWS competition. He also ran his pgodzinai during the entire Metaculus AI Benchmark Competition. In this final quarter, Phil’s bot won third place, beating the ordinary human forecasters of the Metaculus Community. Overall, according to Carolyn’s analysis, Phil’s pgodzinai was the bot that won the most points combined across the entire competition. See the full Q2 leaderboard over time at the foot of this page, or just the final leaderboard here —>
Results Analyzed and Reported from Metaculus’ Q1 AI Benchmark Tournament
Human pro forecasters beat the bots again, but two bots beat the comparatively ordinary human forecasters of the associated Metaculus Community. Final leaderboard here —>
List of bots that comprised the team for the Metaculus Q1 comparison here —>:
metac-o1
metac-o1-preview
metac-Gemini-Exp-1206
acm_bot
twsummerbot
manticAI
metac-perplexity
GreeneiBot2
cookics_bot_TEST
pgodzinai
Metaculus Q4 Results: Pgodzinai Beat All Bots and Human Community Forecasters; Human Pros Barely Edged Win
Recently, Metaculus completed its Q4 analysis and found that human superforecasters beat the bots! But only just barely. Phil’s pgodzinai was the champion bot! Click here to see his overwhelming Q4 win last year.
Question retired:
“Given the agreement of the US International Longshoremen’s Association (ILA) to salary increases, both the union and the port returned to the bargaining table on Jan. 15, 2025 to discuss automation and other issues. What’s the probability of a strike in Q1 2025?” Result: No strike, with the parties making a final agreement. Botmaster Jeremy and Carolyn Meinel both kept on saying the Multi-AI Oracle’s forecast was too high. So we humans won.
Another bot retired: The Multi-AI Panel bot that Jeremy fielded in our first Bots vs Humans Competition. He began with four generative AIs, later expanded to five: Perplexity, Claude, Mistral, Cohere, and OpenAI. These bots forecasted just one question through September 16, 2024: “What is the probability that the US Federal Reserve Board will cut interest rates in September 2024?”
We humans beat the bot!
See all our forecasts here —>
More on Bestworldbot’s fate:
Our next step with bestworldbot has been to use its Metaculus data, along with all the rest of the Metaculus AI Benchmark Tournament data through the end of June 2025, to further examine our hypothesis that measurements of integrative complexity can distinguish between GenAI bots and humans.
Click here for our preliminary results.
We also have integrative complexity results on forecasting rationales written by a team of college graduates (Amazon MTurk prime workers) in the 2019 Hybrid Forecasting Competition. These results substantiated our hypothesis that they used true reasoning in the rationales they wrote for that competition.
We also have results run by AutoIC based on the National Security Estimates written by participants in US National Security Council meetings in 1960 to 1961. These show strong results in all measures of integrative complexity. However, the participants were poor at aggregating probabilities, as shown by their resulting Bay of Pigs debacle.
Retired: At the end of Q3 of 2024, Jeremy’s bestworldbot finished #53 out of 55 competitors. That was down from #17 on Sept. 10, and it had earlier held #2 for twelve days. A likely explanation for bestworldbot’s collapse on the leaderboard is that in early September we began extremizing its forecasts, meaning that we pushed probabilities below 50% lower and probabilities above 50% higher. This was according to a formula (Mellers) shown to work well on humans. Well, we discovered that bestworldbot isn’t like an average human because extremizing made it worse.
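For readers unfamiliar with extremizing, here is a minimal sketch in Python. The exact formula and settings used for bestworldbot are not given on this page, so this uses a transform commonly seen in the forecasting literature, p^a / (p^a + (1 - p)^a), as an assumption. The function names `extremize` and `brier`, and the exponent `a = 2.0`, are hypothetical and chosen only for illustration.

```python
def extremize(p: float, a: float = 2.0) -> float:
    """Push a probability away from 0.5.

    Assumed transform: p^a / (p^a + (1 - p)^a).
    With a > 1, values below 0.5 move lower and values above 0.5 move higher.
    The exponent a = 2.0 is an arbitrary illustrative choice, not the
    setting used for bestworldbot.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    num = p ** a
    return num / (num + (1.0 - p) ** a)


def brier(p: float, outcome: int) -> float:
    """Squared error of a binary forecast; lower is better."""
    return (p - outcome) ** 2


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.7):
        print(f"{p:.2f} -> {extremize(p):.3f}")

    # A 70% forecast on a question that resolves "no":
    print("raw score:       ", round(brier(0.7, 0), 3))
    print("extremized score:", round(brier(extremize(0.7), 0), 3))
```

Under these assumptions, extremizing helps when the raw forecast leans the right way and hurts when it leans the wrong way, which is one way a forecaster that is already overconfident could end up scoring worse after the adjustment.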
