Alex Maraval
About the Presenter
Alex Maraval is a Senior ML Engineer at Huawei Noah's Ark Lab in London. He graduated from EPFL (Lausanne, Switzerland) in Mathematics and completed a Master's degree in Machine Learning at Imperial College London. He joined the Decision Making and Reasoning Team at the London Research Centre in 2020, where he started working on Variational Inference and Reinforcement Learning, with a focus on Gaussian Processes and Bayesian Optimization (BO). Alex has contributed to many projects, including research on High-Dimensional BO on structured spaces and BO on Graphs, among others. He has contributed to several publications at top-tier conferences, including the state-of-the-art algorithm HEBO, and is the first author of Meta-Learning for BO with Transformer Neural Processes, published at NeurIPS 2023. More recently, Alex has been focusing on LLM-related projects. His research directions include building specialized Agents, extending RAG techniques, researching more performant optimizers, and improving fine-tuning.
Abstract
Data science has long been essential, driving ongoing efforts to create agents capable of tackling complex tasks autonomously. While many such agents exist, they are often limited in scope, lacking end-to-end automation or falling short in performance. In this talk, we'll explore Agent K, a new, fully autonomous agent that achieves both complete end-to-end automation and high-level performance. We created a benchmark based on real Kaggle competitions to evaluate its capabilities. Agent K achieved a 92.5% automation success rate in testing, verified through unit tests, and demonstrated its ability to handle multimodal tasks across diverse domains. Agent K consistently ranked in the top 38% among human competitors in these same competitions. Additionally, Agent K earned two bronze medals in featured competitions, making it an (unofficial) Kaggle expert. Furthermore, it achieved results comparable to six gold, three silver, and seven bronze medals across all competition types, an unofficial record similar to Grandmaster-level medal performance.