MassGen v0.0.3: IMO 2025 AI Winners

This case study, run on MassGen v0.0.3, shows agents reaching unanimous consensus through strategic vote switching: on a current-events query, agents recognize and reward superior detail and structure in another agent’s response.

Command:

massgen --config @examples/basic/multi/gemini_4o_claude "Which AI won IMO 2025?"

Prompt: Which AI won IMO 2025?

Agents:

Duration: 141.0s, 735 chunks, 15 events

The Collaborative Process

Initial Response Diversity

Each agent approached the recent AI achievement query with different levels of detail and research depth:

The Strategic Vote Switch

The voting pattern revealed a sophisticated recognition of quality and comprehensiveness:

Initial Position:

The Decisive Shift:

This resulted in a unanimous 3-0 consensus for Agent 2.
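The consensus check described above can be illustrated with a minimal Python sketch. It is not MassGen's actual implementation: the `tally` helper and the `agent1`/`agent2`/`agent3` labels are hypothetical names chosen for illustration, and the only data used is the final 3-0 result reported here.

```python
from collections import Counter


def tally(votes: dict[str, str]) -> tuple[str, bool]:
    """Return (leading candidate, whether every voter chose it).

    `votes` maps each voter's label to the label of the answer it voted for.
    Hypothetical helper for illustration; not part of MassGen.
    """
    counts = Counter(votes.values())
    winner, top = counts.most_common(1)[0]
    return winner, top == len(votes)


# Final round as reported in this case study: all three agents back Agent 2.
final_votes = {"agent1": "agent2", "agent2": "agent2", "agent3": "agent2"}
winner, unanimous = tally(final_votes)
print(winner, unanimous)  # agent2 True
```

In this sketch a round counts as unanimous only when the top vote count equals the number of voters, which matches the 3-0 outcome above. Any split vote would return `False` for the second value.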

Quality Recognition Factors

The agents specifically recognized Agent 2’s superior qualities:

  1. Specific Performance Metrics: “Five out of six problems” rather than just “gold medal scores”
  2. Clear Structural Organization: Bullet-pointed comparison format with distinct sections for each company
  3. Balanced Coverage: Equal treatment of both Google’s and OpenAI’s achievements
  4. Concise Presentation: Streamlined information delivery without sacrificing accuracy
  5. Contextual Details: Inclusion of verification methods and participation status

The Final Answer

Agent 2 presented the final response, featuring:

Conclusion

This case study shows MassGen recognizing and promoting the higher-quality response through strategic vote switching. Rather than simply defending their own answers, the agents assessed each other’s work: Agent 1 specifically acknowledged that Agent 2’s additional detail about performance metrics made it “more comprehensive.” The unanimous consensus emerged from agents recognizing concrete improvements in structure, specificity, and presentation. On current-events queries like this one, accuracy is the baseline, while comprehensiveness and clarity determine the winning response. The final answer balanced factual accuracy with presentation quality, making it both informative and well organized.