FrontierMath is a new benchmark, created by Epoch AI, for assessing the mathematical reasoning capabilities of AI models. It comprises a collection of exceptionally challenging problems, crafted by expert mathematicians, that require genuinely complex mathematical reasoning to solve. The results to date are stark: leading AI systems successfully solve less than 2% of the problems. This highlights a significant gap in AI’s advanced mathematical reasoning abilities and suggests that substantial progress remains to be made in this area.