What’s the first thing you do when you land in a new destination? If you’re like me, you immediately take your phone off of airplane mode and try to connect to data so you can shoot off a ‘just landed ...
Now, Claude Sonnet 4.5 has lapped that last model, outperforming it on the SWE-bench Verified evaluation, a human-filtered subset of the SWE-bench. Claude Sonnet 4.5 also outperformed leading models ...