AI updates
2024-12-22 01:08:36 Pacific

OpenAI o3 Achieves Breakthrough on ARC-AGI - 18h
Read more: www.techmeme.com

OpenAI's new o3 model has achieved a significant breakthrough on the ARC-AGI benchmark, demonstrating advanced reasoning through a 'private chain of thought' mechanism. In this approach the model searches over natural language programs to solve each task, and its score rises sharply with compute: 75.7% on the Semi-Private Evaluation set within the $10k compute limit, and 87.5% in a high-compute configuration. Rather than relying on basic next-token prediction alone, o3 uses deep learning to guide program search. Its ability to recombine knowledge at test time through program execution marks a major step toward more general AI capabilities.
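To make the idea of guided program search concrete, here is a toy sketch in Python. It is not o3's actual mechanism, which has not been published. Everything in it is a hypothetical stand-in: the four list primitives, the mismatch-counting `heuristic` (in place of a learned model), and the best-first loop in `guided_search`. It shows only the general shape: propose candidate programs, execute them on the examples, and let a scoring function decide which candidates to expand next.

```python
import heapq
from itertools import count

# Hypothetical primitives; a real system would search a far richer space.
PRIMITIVES = {
    "reverse": lambda xs: xs[::-1],
    "double": lambda xs: [2 * x for x in xs],
    "increment": lambda xs: [x + 1 for x in xs],
    "sort": lambda xs: sorted(xs),
}


def run_program(program, xs):
    """Execute a program (a tuple of primitive names) on an input list."""
    for name in program:
        xs = PRIMITIVES[name](xs)
    return xs


def heuristic(program, examples):
    """Stand-in for a learned guide: count mismatches (0 means solved)."""
    total = 0
    for inp, expected in examples:
        got = run_program(program, inp)
        total += sum(a != b for a, b in zip(got, expected))
        total += abs(len(got) - len(expected))
    return total


def guided_search(examples, max_depth=3):
    """Best-first search: always expand the lowest-scoring candidate."""
    tie = count()  # tie-breaker so the heap never compares programs directly
    frontier = [(heuristic((), examples), next(tie), ())]
    while frontier:
        score, _, program = heapq.heappop(frontier)
        if score == 0:
            return program
        if len(program) < max_depth:
            for name in PRIMITIVES:
                child = program + (name,)
                heapq.heappush(
                    frontier, (heuristic(child, examples), next(tie), child)
                )
    return None


examples = [([3, 1, 2], [2, 4, 6]), ([5, 4], [8, 10])]
solution = guided_search(examples)
print(solution)
```

More than one program can fit the examples (sorting then doubling gives the same result as doubling then sorting), so the sketch returns whichever consistent program the heuristic reaches first.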

The o3 model's architecture and performance point to a form of deep learning-guided program search, in which the model explores many paths through program space. This process, which can involve tens of millions of tokens and cost thousands of dollars for a single task, is guided by a base LLM. While o3 appears to be more than just next-token prediction, the core mechanisms of this process remain a matter of speculation. The result highlights how increases in compute can drastically improve performance, and it marks a substantial leap in AI capabilities, far beyond the performance of previous GPT models. Testing also showed that running o3 in "high efficiency" mode against the 400 public ARC-AGI puzzles cost around $6,677, for a score of 82.8%.
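For a sense of scale, the reported "high efficiency" total can be converted into an average cost per puzzle. The short Python sketch below uses only the two figures quoted above ($6,677 and 400 puzzles). The helper name `cost_per_task` is chosen here for illustration, and the result is a simple average that assumes nothing about how cost varied from puzzle to puzzle.

```python
def cost_per_task(total_cost_usd: float, num_tasks: int) -> float:
    """Average cost per task; a plain mean, not a per-task measurement."""
    if num_tasks <= 0:
        raise ValueError("num_tasks must be positive")
    return total_cost_usd / num_tasks


# Figures as reported for the 400 public ARC-AGI puzzles.
average = cost_per_task(6677, 400)
print(f"${average:.2f} per puzzle")
```

That works out to roughly $17 per puzzle on average in the high efficiency configuration.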