looking at the arc agi leaderboard is pretty wild. you see models like gpt 5.2 hitting scores in the 60s while gemini 3 flash preview gets there for way less cost. the cost versus performance tradeoff is starting to matter as much as the raw score.
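one way to make the cost versus performance comparison concrete is to look at which models sit on the pareto frontier, meaning no other model is both cheaper and higher scoring. here's a rough python sketch. the model names, scores, and per-task costs are made-up placeholders, not real leaderboard numbers, and `Entry` and `pareto_frontier` are just names i picked for illustration.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    score: float          # benchmark score, higher is better
    cost_per_task: float  # dollars per task, lower is better


def pareto_frontier(entries):
    """Keep entries that no other entry beats on both score and cost."""
    frontier = []
    best_score = float("-inf")
    # walk from cheapest to most expensive; on a cost tie, higher score first
    for e in sorted(entries, key=lambda e: (e.cost_per_task, -e.score)):
        # an entry survives only if it scores higher than everything cheaper
        if e.score > best_score:
            frontier.append(e)
            best_score = e.score
    return frontier


# placeholder data, NOT real leaderboard results
entries = [
    Entry("model_a", 60.0, 2.00),
    Entry("model_b", 58.0, 0.25),
    Entry("model_c", 40.0, 0.50),
    Entry("model_d", 30.0, 0.05),
]

for e in pareto_frontier(entries):
    print(f"{e.name}: score {e.score}, ${e.cost_per_task:.2f}/task")
```

with these placeholder numbers, model_c drops out because model_b is both cheaper and higher scoring, while model_a stays on the frontier even though it costs far more, since nothing cheaper matches its score.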
what stands out is how fast things are moving. models that were cutting edge last year are now mid-tier. and the efficiency gains are real. some of these newer models do more with less compute, which matters when you think about scaling.
where does this go? hard to say exactly, but the trend is clear: better reasoning, lower costs, and models that can handle more complex tasks. the gap between what humans can do and what ai can do keeps shrinking.
the real question is what happens when these models get good enough at general reasoning. not just pattern matching but actual understanding. that's when things change. we might see ai that can genuinely help with research, problem solving, and creative work in ways that feel more like collaboration than automation.
but who knows. maybe in a year we'll look back at these scores and laugh at how primitive they seem. that's usually how it goes with this stuff.