ML-Master
Towards AI-for-AI via Integration of Exploration and Reasoning
ML-Master is an AI-for-AI agent that integrates exploration and reasoning through an adaptive memory mechanism to automate machine learning development. It ranks #1 on MLE-Bench.
Revolutionary AI4AI Agent
ML-Master seamlessly integrates exploration and reasoning through adaptive memory mechanisms
Multi-trajectory Exploration
MCTS-inspired parallel exploration that efficiently navigates solution spaces while maintaining optimal balance between exploitation and exploration.
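To make the exploitation/exploration balance concrete, here is a minimal sketch of the standard UCT selection rule used in classic Monte Carlo Tree Search. This page does not state which selection rule ML-Master uses, so the `Node` class, the `uct_score` and `select_child` helpers, and the exploration constant `c` are all illustrative assumptions rather than the project's implementation.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One candidate solution in the search tree (hypothetical structure)."""
    name: str
    visits: int = 0
    total_reward: float = 0.0
    children: List["Node"] = field(default_factory=list)


def uct_score(child: Node, parent_visits: int, c: float = 1.414) -> float:
    """Standard UCT: mean reward (exploitation) plus a visit-count bonus (exploration)."""
    if child.visits == 0:
        # Unvisited children are always tried first.
        return math.inf
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(parent: Node, c: float = 1.414) -> Node:
    """Pick the child with the highest UCT score."""
    return max(parent.children, key=lambda ch: uct_score(ch, parent.visits, c))


# Two children with the same mean reward; the less-visited one gets the larger bonus.
root = Node(
    "root",
    visits=10,
    children=[
        Node("a", visits=8, total_reward=4.0),
        Node("b", visits=2, total_reward=1.0),
    ],
)
chosen = select_child(root)
```

With equal mean rewards, the rarely visited child `b` is selected because its exploration bonus is larger. A larger `c` pushes the search toward under-explored branches, and a smaller `c` favors branches that already score well.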
Steerable Reasoning
Enhanced reasoning capabilities with adaptive memory integration, reducing hallucinations and improving reliability through contextual grounding.
Adaptive Memory
Selectively captures and summarizes insights from exploration trajectories, enabling continuous learning without overwhelming the reasoning process.
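The idea of keeping only a few summarized insights in front of the reasoning model can be sketched as a small bounded memory. This is an assumption-laden illustration, not ML-Master's mechanism: the `Insight` and `BoundedMemory` names are hypothetical, ranking by a single numeric score is a simplification, and truncation stands in for real summarization.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Insight:
    """A finding from one exploration trajectory (hypothetical structure)."""
    text: str
    score: float  # e.g. the validation metric of the trajectory that produced it


class BoundedMemory:
    """Keeps only the top-scoring insights so the reasoning context stays small."""

    def __init__(self, capacity: int = 3, max_chars: int = 60) -> None:
        self.capacity = capacity
        self.max_chars = max_chars
        self.items: List[Insight] = []

    def add(self, insight: Insight) -> None:
        # Selective capture: insert, rank by score, drop everything past capacity.
        self.items.append(insight)
        self.items.sort(key=lambda i: i.score, reverse=True)
        del self.items[self.capacity:]

    def _summarize(self, text: str) -> str:
        # Stand-in for a real summarizer: plain truncation with an ellipsis.
        if len(text) <= self.max_chars:
            return text
        return text[: self.max_chars - 3] + "..."

    def as_context(self) -> str:
        # Render the retained insights as a short block for the next reasoning step.
        return "\n".join(
            f"- {self._summarize(i.text)} (score={i.score:.2f})" for i in self.items
        )


memory = BoundedMemory(capacity=2)
memory.add(Insight("Target encoding helped on high-cardinality columns", 0.81))
memory.add(Insight("Deeper trees overfit on the small training split", 0.64))
memory.add(Insight("5-fold stacking of GBDT and linear models improved CV", 0.88))
context = memory.as_context()
```

Capping the memory trades recall for focus: low-scoring findings are dropped so the prompt passed to the reasoning step does not grow with every explored branch.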
State-of-the-Art Performance
ML-Master achieves superior results across all difficulty levels on MLE-Bench
Medal Rate Comparison Across Task Complexities (%)
✨ ML-Master excels across all complexity levels
🎯 Particularly strong on medium-complexity (20.2%) and high-complexity (24.4%) tasks
⚡ More than doubles the previous best result on medium-complexity tasks
Detailed Results on MLE-Bench
| Agent | Model | Valid Submission (%) | Above Median (%) | Bronze (%) | Silver (%) | Gold (%) | Any Medal (%) |
|---|---|---|---|---|---|---|---|
| MLAB | gpt-4o-2024-08-06 | 44.3 ± 2.6 | 1.9 ± 0.7 | 0.0 ± 0.0 | 0.0 ± 0.0 | 0.8 ± 0.5 | 0.8 ± 0.5 |
| OpenHands | gpt-4o-2024-08-06 | 52.0 ± 3.3 | 7.1 ± 1.7 | 0.4 ± 0.4 | 1.3 ± 0.8 | 2.7 ± 1.1 | 4.4 ± 1.4 |
| AIDE | gpt-4o-2024-08-06 | 54.9 ± 1.0 | 14.4 ± 0.7 | 1.6 ± 0.2 | 2.2 ± 0.3 | 5.0 ± 0.4 | 8.7 ± 0.5 |
| AIDE | o1-preview | 82.8 ± 1.1 | 29.4 ± 1.3 | 3.4 ± 0.5 | 4.1 ± 0.6 | 9.4 ± 0.8 | 16.9 ± 1.1 |
| AIDE | Deepseek-R1* | 78.6 ± 0.0 | 34.6 ± 0.0 | 2.7 ± 0.0 | 4.0 ± 0.0 | 8.0 ± 0.0 | 14.7 ± 0.0 |
| R&D-Agent | o1-preview | 86.1 ± 1.1 | 32.8 ± 1.2 | 3.5 ± 0.5 | 4.5 ± 0.5 | 14.4 ± 0.5 | 22.4 ± 0.5 |
| ML-Master | Deepseek-R1 | 93.3 ± 1.3 | 44.9 ± 1.2 | 4.4 ± 0.9 | 7.6 ± 0.4 | 17.3 ± 0.8 | 29.3 ± 0.8 |
* Single run due to resource constraints
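As a quick reading aid, the Any Medal column can be checked against the three medal columns. The sketch below assumes each competition earns at most one medal, so that Any Medal should equal Bronze + Silver + Gold up to rounding. That assumption and the helper name `medal_sum_gap` are ours; the numbers are the mean values copied from the table.

```python
# (bronze, silver, gold, any_medal) mean percentages copied from the table above.
rows = {
    "MLAB / gpt-4o-2024-08-06": (0.0, 0.0, 0.8, 0.8),
    "OpenHands / gpt-4o-2024-08-06": (0.4, 1.3, 2.7, 4.4),
    "AIDE / gpt-4o-2024-08-06": (1.6, 2.2, 5.0, 8.7),
    "AIDE / o1-preview": (3.4, 4.1, 9.4, 16.9),
    "AIDE / Deepseek-R1": (2.7, 4.0, 8.0, 14.7),
    "R&D-Agent / o1-preview": (3.5, 4.5, 14.4, 22.4),
    "ML-Master / Deepseek-R1": (4.4, 7.6, 17.3, 29.3),
}


def medal_sum_gap(bronze: float, silver: float, gold: float, any_medal: float) -> float:
    """Absolute difference between the summed medal rates and the reported total."""
    return abs(bronze + silver + gold - any_medal)


# Round to one decimal place, the precision used in the table.
gaps = {name: round(medal_sum_gap(*values), 1) for name, values in rows.items()}

# Relative gain of ML-Master over the next-best Any Medal rate (R&D-Agent).
relative_gain = rows["ML-Master / Deepseek-R1"][3] / rows["R&D-Agent / o1-preview"][3] - 1.0

for name, gap in gaps.items():
    print(f"{name}: gap = {gap}")
print(f"Relative gain over R&D-Agent: {relative_gain:.1%}")
```

Every row agrees to within 0.1 percentage points, which is consistent with each column being rounded independently.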