ML-Master 2.0
Next-generation AI-for-AI agent achieving breakthrough performance through deep integration of exploration and reasoning
Massive Performance Gains
ML-Master 2.0 achieves significant improvements across all complexity levels, with an overall performance increase of 92.7%
[Chart: relative performance gains for Overall (All %), Low, Medium, and High complexity tasks]
State-of-the-Art Performance
Detailed performance comparison between ML-Master 2.0 and other agents across all task complexity levels
Performance Comparison: Top Methods on MLE-Bench
✨ ML-Master 2.0 ranks first on MLE-Bench, improving on every complexity level
🎯 The largest gain is on medium-complexity tasks: 152.2%
⚡ Overall performance increased from 29.33% to 56.44%, a 92.7% improvement (recomputed in the sketch below)
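These headline gains follow from the ML-Master 1.0 and ML-Master 2.0 rows of the leaderboard below. A minimal sketch of the calculation, recomputing from the rounded table entries (which can land a few tenths of a point away from the headline figures, presumably derived from unrounded scores):

```python
# Recompute the relative gains of ML-Master 2.0 over ML-Master 1.0
# from the (rounded) per-complexity scores in the leaderboard below.

def relative_gain(old: float, new: float) -> float:
    """Percentage improvement of `new` over `old`."""
    return (new - old) / old * 100

# (ML-Master 1.0 score, ML-Master 2.0 score), in %
scores = {
    "Low":    (48.48, 75.76),
    "Medium": (20.18, 50.88),
    "High":   (24.44, 42.22),
    "All":    (29.33, 56.44),
}

for level, (v1, v2) in scores.items():
    print(f"{level:>6}: {v1:.2f}% -> {v2:.2f}%  (+{relative_gain(v1, v2):.1f}%)")
```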
🏆 MLE-Bench Leaderboard
| Rank | Agent | LLM | Low (Lite) (%) | Medium (%) | High (%) | All (%) | Runtime (h) |
|---|---|---|---|---|---|---|---|
| 1 | ML-Master 2.0 | DeepSeek-V3.2-Speciale | 75.76 ± 1.24 | 50.88 ± 2.86 | 42.22 ± 1.81 | 56.44 ± 2.02 | 24 |
| 2 | Leeroo | Gemini-3-Pro-Preview | 68.18 ± 2.62 | 44.74 ± 1.52 | 40.00 ± 0.00 | 50.67 ± 1.33 | 24 |
| 3 | Thesis | gpt-5-codex | 65.15 ± 1.52 | 45.61 ± 7.18 | 31.11 ± 2.22 | 48.44 ± 3.64 | 24 |
| 4 | CAIR MLE-STAR-Pro-1.5 | Gemini-2.5-Pro | 68.18 ± 2.62 | 34.21 ± 1.52 | 33.33 ± 0.00 | 44.00 ± 1.33 | 24 |
| 5 | FM Agent | Gemini-2.5-Pro | 62.12 ± 1.52 | 36.84 ± 1.52 | 33.33 ± 0.00 | 43.56 ± 0.89 | 24 |
| 6 | Operand ensemble | gpt-5 | 63.64 ± 0.00 | 33.33 ± 0.88 | 20.00 ± 0.00 | 39.56 ± 0.44 | 24 |
| 7 | CAIR MLE-STAR-Pro-1.0 | Gemini-2.5-Pro | 66.67 ± 1.52 | 25.44 ± 0.88 | 31.11 ± 2.22 | 38.67 ± 0.77 | 12 |
| 8 | InternAgent | deepseek-r1 | 62.12 ± 3.03 | 26.32 ± 2.63 | 24.44 ± 2.22 | 36.44 ± 1.18 | 12 |
| 9 | R&D-Agent | gpt-5 | 68.18 ± 2.62 | 21.05 ± 1.52 | 22.22 ± 2.22 | 35.11 ± 0.44 | 12 |
| 10 | Neo multi-agent | undisclosed | 48.48 ± 1.52 | 29.82 ± 2.32 | 24.44 ± 2.22 | 34.22 ± 0.89 | 36 |
| 11 | AIRA-dojo | o3 | 55.00 ± 1.47 | 21.97 ± 1.17 | 21.67 ± 1.07 | 31.60 ± 0.82 | 24 |
| 12 | R&D-Agent | o3 + GPT-4.1 | 51.52 ± 4.01 | 19.30 ± 3.16 | 26.67 ± 0.00 | 30.22 ± 0.89 | 24 |
| 13 | ML-Master 1.0 | deepseek-r1 | 48.48 ± 1.52 | 20.18 ± 2.32 | 24.44 ± 2.22 | 29.33 ± 0.77 | 12 |
| 14 | R&D-Agent | o1-preview | 48.18 ± 1.11 | 8.95 ± 1.05 | 18.67 ± 1.33 | 22.40 ± 0.50 | 24 |
| 15 | AIDE | o1-preview | 35.91 ± 1.86 | 8.45 ± 0.43 | 11.67 ± 1.27 | 17.12 ± 0.61 | 24 |
Generated Code
Explore the code generated by ML-Master 2.0 for various competitions