🏆 #1 on MLE-Bench Leaderboard

ML-Master 2.0

Next-generation AI-for-AI agent achieving breakthrough performance through deep integration of exploration and reasoning

Massive Performance Gains

ML-Master 2.0 improves significantly on ML-Master 1.0 at every complexity level, with an overall relative performance increase of 92.7%

Complexity	ML-Master 1.0	ML-Master 2.0	Improvement
Overall (All %)	29.33%	56.44%	+92.7%
Low	48.48%	75.76%	+56.2%
Medium	20.18%	50.88%	+152.2%
High	24.44%	42.22%	+72.8%
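The improvement figures above are relative gains over ML-Master 1.0, not percentage-point differences. The minimal sketch below recomputes both from the two-decimal scores quoted on this page. The helper name `relative_improvement` is hypothetical, and because the inputs are rounded, the recomputed values can differ from the quoted figures by a few tenths of a percent.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Relative gain of `new` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline * 100.0


# (ML-Master 1.0, ML-Master 2.0) scores in percent, as quoted on this page.
scores = {
    "Overall": (29.33, 56.44),
    "Low": (48.48, 75.76),
    "Medium": (20.18, 50.88),
    "High": (24.44, 42.22),
}

gains = {}
for level, (v1, v2) in scores.items():
    gains[level] = {
        "points": v2 - v1,                          # absolute gain in percentage points
        "relative": relative_improvement(v1, v2),   # gain relative to the 1.0 score
    }
    print(
        f"{level}: +{gains[level]['points']:.2f} points, "
        f"+{gains[level]['relative']:.1f}% relative"
    )
```

Reporting both numbers side by side avoids ambiguity: a low baseline (as on medium-complexity tasks) inflates the relative gain even when the absolute gain in points is similar to that of other levels.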

State-of-the-Art Performance

Detailed performance comparison between ML-Master 2.0 and other agents across all task complexity levels

[Figure: Performance Comparison: Top Methods on MLE-Bench. Chart of performance (%) by task complexity level (Low, Medium, High, Overall) for ML-Master 1.0, AIRA-dojo (Meta), R&D-Agent (Microsoft), CAIR MLE-STAR-Pro-1.5 (Google), and ML-Master 2.0 (Ours).]

✨ ML-Master 2.0 achieves significant improvements across all complexity levels

🎯 The largest relative improvement is on medium-complexity tasks, at 152.2%

⚡ Overall performance increased from 29.33% to 56.44%, a 92.7% relative improvement

🏆 MLE-Bench Leaderboard

Rank & Agent	LLM	Low (Lite) (%)	Medium (%)	High (%)	All (%)	Runtime (h)
1 ML-Master 2.0	DeepSeek-V3.2-Speciale	75.76 ± 1.24	50.88 ± 2.86	42.22 ± 1.81	56.44 ± 2.02	24
2 Leeroo	Gemini-3-Pro-Preview	68.18 ± 2.62	44.74 ± 1.52	40.00 ± 0.00	50.67 ± 1.33	24
3 Thesis	gpt-5-codex	65.15 ± 1.52	45.61 ± 7.18	31.11 ± 2.22	48.44 ± 3.64	24
4 CAIR MLE-STAR-Pro-1.5	Gemini-2.5-Pro	68.18 ± 2.62	34.21 ± 1.52	33.33 ± 0.00	44.00 ± 1.33	24
5 FM Agent	Gemini-2.5-Pro	62.12 ± 1.52	36.84 ± 1.52	33.33 ± 0.00	43.56 ± 0.89	24
6 Operand ensemble	gpt-5	63.64 ± 0.00	33.33 ± 0.88	20.00 ± 0.00	39.56 ± 0.44	24
7 CAIR MLE-STAR-Pro-1.0	Gemini-2.5-Pro	66.67 ± 1.52	25.44 ± 0.88	31.11 ± 2.22	38.67 ± 0.77	12
8 InternAgent	deepseek-r1	62.12 ± 3.03	26.32 ± 2.63	24.44 ± 2.22	36.44 ± 1.18	12
9 R&D-Agent	gpt-5	68.18 ± 2.62	21.05 ± 1.52	22.22 ± 2.22	35.11 ± 0.44	12
10 Neo multi-agent	undisclosed	48.48 ± 1.52	29.82 ± 2.32	24.44 ± 2.22	34.22 ± 0.89	36
11 AIRA-dojo	o3	55.00 ± 1.47	21.97 ± 1.17	21.67 ± 1.07	31.60 ± 0.82	24
12 R&D-Agent	o3 + GPT-4.1	51.52 ± 4.01	19.30 ± 3.16	26.67 ± 0.00	30.22 ± 0.89	24
13 ML-Master 1.0	deepseek-r1	48.48 ± 1.52	20.18 ± 2.32	24.44 ± 2.22	29.33 ± 0.77	12
14 R&D-Agent	o1-preview	48.18 ± 1.11	8.95 ± 1.05	18.67 ± 1.33	22.40 ± 0.50	24
15 AIDE	o1-preview	35.91 ± 1.86	8.45 ± 0.43	11.67 ± 1.27	17.12 ± 0.61	24
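As a sanity check on the table, the "All (%)" column can be approximated as a competition-weighted average of the three complexity columns. The sketch below assumes the benchmark's 75 competitions split into 22 low, 38 medium, and 15 high complexity tasks. These split sizes are not stated on this page, so treat them as an assumption; the function name `overall_score` is likewise hypothetical.

```python
# Assumed number of competitions per complexity level (not stated on this page).
SPLIT_SIZES = {"low": 22, "medium": 38, "high": 15}


def overall_score(per_level: dict, sizes: dict = SPLIT_SIZES) -> float:
    """Competition-weighted mean of per-level scores, in percent."""
    total = sum(sizes.values())
    return sum(per_level[level] * sizes[level] for level in sizes) / total


# Mean scores copied from the leaderboard rows above (standard deviations omitted).
ml_master_2 = {"low": 75.76, "medium": 50.88, "high": 42.22}
ml_master_1 = {"low": 48.48, "medium": 20.18, "high": 24.44}

for name, row in [("ML-Master 2.0", ml_master_2), ("ML-Master 1.0", ml_master_1)]:
    print(f"{name}: overall ~ {overall_score(row):.2f}%")
```

Because the inputs are rounded to two decimals, the recomputed value may differ from the reported "All (%)" entry in the last digit. The medium split carries about half of the total weight under this assumption, which is why gains on medium-complexity tasks move the overall score the most.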
