Upgini MLE-Bench Tabular Leaderboard
This leaderboard mirrors the latest changes to Upgini's MLE-Bench leaderboard. It is a version of MLE-bench that compares agent performance on tabular data. It uses exactly the same setup as MLE-bench and differs only in the leaderboard view: we focus on tabular tasks and rank agents by a normalized score instead of medal percentage, so that scores reported on different scales can be compared. The leaderboard is recomputed whenever the submitted runs are updated from the OpenAI repo.
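To illustrate why a normalized score makes differently scaled metrics comparable, here is a minimal sketch in Python. The exact normalization used by this leaderboard is not specified here, so the sketch assumes simple per-task min-max scaling across agents; the function names `normalize_scores` and `leaderboard`, and the data layout, are hypothetical and chosen only for illustration.

```python
from statistics import mean


def normalize_scores(raw_scores, lower_is_better=False):
    """Min-max scale one task's raw scores to [0, 1], where 1 is the best agent.

    ASSUMPTION: min-max scaling is an illustrative choice, not necessarily
    the formula this leaderboard uses.
    """
    lo, hi = min(raw_scores.values()), max(raw_scores.values())
    if hi == lo:
        # All agents tied on this task: give everyone full credit.
        return {agent: 1.0 for agent in raw_scores}
    normalized = {}
    for agent, score in raw_scores.items():
        value = (score - lo) / (hi - lo)
        # Flip error-style metrics (e.g. RMSE) so that higher is always better.
        normalized[agent] = 1.0 - value if lower_is_better else value
    return normalized


def leaderboard(tasks):
    """Average normalized scores per agent and sort best-first.

    `tasks` maps a task name to a (raw_scores, lower_is_better) pair
    (hypothetical layout).
    """
    per_agent = {}
    for raw_scores, lower_is_better in tasks.values():
        scaled = normalize_scores(raw_scores, lower_is_better)
        for agent, value in scaled.items():
            per_agent.setdefault(agent, []).append(value)
    ranking = [(agent, mean(values)) for agent, values in per_agent.items()]
    return sorted(ranking, key=lambda row: row[1], reverse=True)
```

With this kind of scaling, an AUC near 0.9 and an RMSE in the thousands both land in the same [0, 1] range, so a per-agent average is meaningful. A medal percentage, by contrast, only records whether a threshold was crossed on each task.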
Leaderboard