|src: openai.com

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

We introduce MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering.