The first public benchmark for AI model accuracy on real startup validations — based on 90-day outcomes from real founders.
Which model verdicts proved correct after 90 days.
Which assumptions all 8 models consistently get wrong.
AI debate scores vs. actual success rates by industry.
Team USA vs. Team China: accuracy breakdown.
Get notified when the Research Report is released.