Humans vs Bots

Jeremy Lichtman’s Multi-AI Oracle

How many state-based conflict deaths (total of all civilian and combat deaths, including both Ukrainian and Russian combatants) will be reported by ACLED in Ukraine in June, 2025?
Here’s what Jeremy’s Multi-AI Oracle predicted:
* Less than 500: 1%
* Between 500 and 1000: 60%
* Between 1000 and 1500: 20%
* Between 1500 and 2000: 15%
* Greater than 2000: 4%

Details here —>

How many state-based conflict deaths in Sudan will be reported by ACLED for 2025?
On June 13, 2025, Jeremy Lichtman’s Multi-AI Oracle predicted:

* Less than 1000: 1%
* Between 1000 and 3000: 8%
* Between 3000 and 5000: 19%
* Between 5000 and 8000: 37%
* Between 8000 and 12000: 26%
* More than 12000: 9%

Details here —>

How many state-based conflict deaths will be reported by ACLED in Ukraine in June, 2025? On June 10, 2025, Botmaster Jeremy Lichtman’s Multi-AI Oracle forecasted:

* Less than 500: 18%
* Between 500 and 1000: 29%
* Between 1000 and 1500: 29%
* Between 1500 and 2000: 18%
* Greater than 2000: 6%

Details here —>

See all his bot’s past forecasts here—>

At the bottom of this page, you can see the Metaculus AI Benchmark Tournament Q4 Leaderboard as of June 5, 2025. Highlight colors mark previous holders of first place. The exception is jbot, highlighted in yellow, which belongs to Jeremy Lichtman and is competing for the first time. Phil’s pgodzinai is highlighted in orange.

Phillip Godzin’s pgodzinai bot

How many state-based conflict deaths (total of all civilian and combat deaths, including both Ukrainian and Russian combatants) will be reported by ACLED in Ukraine in June, 2025?
Here’s what Phil’s pgodzinai predicted: 
Less than 500: 2%
Between 500 and 1000: 18%
Between 1000 and 1500: 45%
Between 1500 and 2000: 25%
Greater than 2000: 10%

Details here —>

Will hostilities between Pakistan and India result in at least 100 total uniformed casualties (with at least one death) between 2 June 2025 and 30 September 2025?
Here’s what Phil Godzin’s pgodzinai bot predicted on June 16: 8%
Details here —>


How many state-based conflict deaths in Sudan will be reported by ACLED in 2025?
On June 13, 2025, Phillip Godzin’s pgodzinai predicted:

Less than 1000: 2%
Between 1000 and 3000: 12%
Between 3000 and 5000: 22%
Between 5000 and 8000: 38%
Between 8000 and 12000: 20%
More than 12000: 6%

Details here —>


How many state-based conflict deaths in Syria will be reported by ACLED for the month of May 2025?
Here’s what Phillip Godzin’s pgodzinai predicted on June 12, 2025:

Less than 100: 45%
Between 100 and 250: 3%
Between 250 and 500: 15%
Between 500 and 1000: 8%
Greater than 1000: 2%


See all his bot’s past forecasts here —>

Metaculus’s Q2 AI Forecasting Benchmark Tournament Has Launched

You still can join. Have fun, maybe win a chunk of its $40,000 prize pot. Metaculus offers instructions on how to build your own bot. We humans also may compete. It’s fun either way!

Our Botmaster Jeremy Lichtman is running a new bot on this competition, jlbot. Phil Godzin, who has joined Jeremy in our side competition with the VIEWS competition, also has a bot in this Q2 competition, pgodzinai. See the latest leaderboard at the foot of this page.

The Metaculus 2024 Q4 tournament has ended. From its website: “This is the 3rd tournament in our $120,000 series designed to benchmark AI forecasting capabilities against top human forecasters on complex, real-world questions.” Recently, Metaculus completed its Q4 analysis and found that human superforecasters beat the bots, but only just barely. Phil’s pgodzinai was the champion bot! We don’t yet have humans vs bots results from its recently completed Q1 competition.

The Q2 competition began April 21, 2025.
It still isn’t too late to compete. Thanks to the technical help Metaculus offers, anybody can build a bot and play. Have fun, win money! The history of the Q2 leaderboard at the foot of this page shows that new competitors have been joining every few days. If you join, they will start you in the middle of the leaderboard.

Question retired: “Given the agreement of the US International Longshoremen’s Association (ILA) to salary increases, both the union and the port returned to the bargaining table on Jan. 15, 2025 to discuss automation and other issues. What’s the probability of a strike in Q1 2025?” Result: No strike, with the parties reaching a final agreement. Botmaster Jeremy and Carolyn Meinel both kept saying the Multi-AI Oracle’s forecast was too high. So we humans won.

Another bot retired: The Multi-AI Panel bot that Jeremy fielded in our first Bots vs Humans Competition. He began with four generative AIs, later expanded to five: Perplexity, Claude, Mistral, Cohere, and OpenAI. These bots forecasted just one question through September 16, 2024: “What is the probability that the US Federal Reserve Board will cut interest rates in September 2024?” 

We humans beat the bot! See all our forecasts here —>

More on Bestworldbot’s fate:

Our next step with bestworldbot has been using its Metaculus data, along with all the rest of the Metaculus AI Benchmark Tournament data through the end of June, 2025, to further examine our hypothesis that measurements of integrative complexity can distinguish between GenAI bots and humans.

Click here for our preliminary results
We also have integrative complexity results on forecasting rationales written by a team of college graduates (Amazon Mturk prime workers) in the 2019 Hybrid Forecasting Competition. These results substantiated our hypothesis that they used true reasoning in the rationales they wrote for that competition.

We also have results run by AutoIC based on the National Security Estimates written by participants in US National Security Council meetings in 1960 to 1961. These show strong results in all measures of integrative complexity. However, the participants were poor at aggregating probabilities, as shown by the resulting Bay of Pigs debacle.

Retired: At the end of Q3 of 2024, Jeremy’s bestworldbot finished #53 out of 55 competitors. That was down from #17 on Sept. 10, and from #2, which it had held for twelve days. A likely explanation for bestworldbot’s collapse on the leaderboard is that in early September we began extremizing its forecasts: probabilities below 50% were pushed lower, and probabilities above 50% were pushed higher. This followed a formula (Mellers) proven to work well on humans. We discovered that bestworldbot isn’t like an average human, because extremizing made it worse.
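
For readers curious what extremizing looks like in practice, here is a minimal sketch. The exact formula and parameters we used are not given on this page, so this assumes the power-law transform commonly seen in the forecast aggregation literature. The function name `extremize` and the exponent value `a = 2.0` are illustrative assumptions, not bestworldbot's actual settings.

```python
def extremize(p: float, a: float = 2.0) -> float:
    """Push a probability away from 0.5.

    Assumed transform (not confirmed by the source):
        p' = p**a / (p**a + (1 - p)**a)
    With a > 1, values below 0.5 move toward 0 and values above 0.5
    move toward 1. With a == 1 the forecast is unchanged.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if a <= 0:
        raise ValueError("a must be positive")
    numerator = p ** a
    return numerator / (numerator + (1.0 - p) ** a)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the direction of the shift.
    for raw in (0.08, 0.30, 0.50, 0.70, 0.92):
        print(f"raw={raw:.2f} -> extremized={extremize(raw):.3f}")
```

A transform like this only helps when the raw forecasts are underconfident. If a bot's forecasts are already confident enough, or are confidently wrong, pushing them further from 50% increases the penalty on every miss, which is consistent with what we saw on the leaderboard.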
