WWRTW: Opus 4.6 Breaks the Benchmarks, Are You In the Arena, or Out. Claude Code's Creator Reveals All and Closing Thoughts...
What We Read this Week: Top picks to help you grow your biz from our serial founders to you!
We’re back with your founder-friendly news for the week 👋
This week’s reading list circles one big question from a few different angles: how do you know you’re being disrupted by AI… and what do you do about it as a North Carolina founder?
From METR benchmarks getting shattered, to Claude Opus 4.6 triggering “this might be AGI for code” whispers, to Claire Vo’s warning that most teams are still in denial… the signal is getting harder to ignore.
If you’re building in NC and wondering whether this is hype, a cycle, or a genuine phase shift, this one’s for you. But before we dive in, this series is made possible by our sponsors, so let’s give them a quick shout-out 👇
Loving our content? It wouldn’t be possible without our amazing sponsors and our paid subscribers, so THANK YOU!
Featured Gold Sponsor
Featured Silver Sponsor
Opus 4.6 broke the &@*%{^ benchmark! 🚀
METR provides AI benchmarks that are designed to track agents specifically: their run time, the complexity of the tasks they can complete, etc.
In late November 2025, Anthropic released Opus 4.5, and every engineer I know started moving en masse to Claude Code. Something had changed. Some would even whisper, “I think we are at AGI for code.”
Then Claude Opus 4.6 came out in February 2026, and the whispers became fact for the folks on the bleeding edge.
The fact this step-up happened in 3 months was 🤯
METR just released their update, and it gives us third-party, measurable proof that these two models, along with ChatGPT 5.2 (via Codex), are very, very big events in our world that many people are not really aware of.
We are 100% on an exponential curve; plot it on a log scale and you can see it. You can FEEL it, and now everything is going to change, and it’s going to accelerate. It’s a disruption tsunami the likes of which we have never seen or experienced. We are in uncharted territory.
It’s scary on one level, but I believe that as founders and startups, we are uniquely positioned to get in front of the tsunami, hop on our metaphorical surfboards, and ride this beast to unprecedented growth and success 🏄
Opus 4.6 broke the benchmark. Read these comments from METR about the result.
Rhetorical question: If humans can’t devise a test hard enough to challenge an agent, what do we call that?!
Must read for every founder/CEO… and then re-read it.
Claire Vo (the “How I AI” podcast host we’ve featured several times) is out with a post we highly recommend every founder/CEO read and re-read after a day to digest it.
If AI adoption had 7 stages of grief, almost all of you would be in denial.
No matter how many AI memos your CEO sends, the amount of Claude that’s being Coded, the chatbots in app and the evals in data--I’m here to tell you: you’re not competing. In fact, you probably can’t anymore.
And you won’t notice until it’s too late.
Are you building a 1-day company or a ‘let’s see how the quarter shapes up’ company? If not the first, you may not make it. Time to make drastic changes.
What happens after coding is solved?
Speaking of Claude, this is probably the best interview on a learnings/minute and signal-to-noise basis I’ve seen in the last year. Boris Cherny works at Anthropic and is the creator of Claude Code and Claude Cowork.
Lenny had a great summary of learnings from the call and I think this pairs perfectly with the Claire Vo piece.
Programming Isn’t “Business as Usual” Anymore
Another spot-on post, this time by Karpathy: coding agents basically didn’t work… and then in December, they suddenly did. Not an incremental improvement, but a phase shift.
He described handing an agent a full weekend infrastructure project: SSH setup, model deployment, benchmarking, server endpoint, UI, systemd, debugging, documentation, and just letting it run. Thirty minutes later, done.
Check out the full article here 👇
There’s also an interesting data point: Stripe’s Patrick Collison is seeing 2026 startups operating at dramatically higher productivity than those from two years ago. When the internet’s money pipes notice a shift, it’s worth paying attention.
Maybe it’s not the “singularity,” but it certainly feels like a switch flipped in Q1.
Final thoughts
If there’s a pattern across all of this, it’s acceleration. Benchmarks aren’t inching up; they’re breaking under the weight of exponential growth. Startups aren’t iterating slightly faster; they’re operating on a different slope. Coding isn’t being “assisted”; it’s being orchestrated while agents do the work.
You won’t feel disrupted at first. Everything will seem mostly normal… until someone ships in a week what used to take you a quarter.
For NC founders, the opportunity is the same as it’s always been: move early. Adapt faster. Build like the curve is real, because it is. This isn’t about panic. It’s about posture.
Grab a surfboard! 🏄