Reports of AI's death have been greatly exaggerated
The day-to-day reality of parallel coding with GPT-5, Claude, and Gemini
In 1897 Mark Twain quipped: "Reports of my death have been greatly exaggerated." At the time he was 61, and a rumour was spreading that he'd been gravely ill and had died. The record got set straight when he showed up alive and well (he'd go on to live to 74).
Something similar seems to be happening with AI right now. The botched launch of GPT-5, along with remarks from Sam Altman, has been interpreted to mean we're in the midst of an AI bubble that is about to pop. Adding fuel to the fire, a recent MIT report claims 95% of generative AI projects have delivered little to no revenue growth.
But… you've got to be careful how you interpret this information. What Sam Altman actually said was: "Are we in a phase where investors as a whole are overexcited about AI? My opinion is yes. Is AI the most important thing to happen in a very long time? My opinion is also yes."
The MIT report used a very narrow definition of success - only projects that produced measurable P&L growth within six months of a pilot counted. So measurement failures were treated as AI failures. Other benefits - employee satisfaction, customer engagement, efficiency - didn't count. And the report mentions, but then largely ignores, shadow AI adoption, where employees use their personal accounts to help with work (according to the report, a striking 90% of employees used personal AI tools for work).
Scepticism is required. It is worth asking the age-old question: what angle are these authors pushing?
The new world of coding
Part of the reason these stories jar is that they are so different from my day-to-day experience. In the past year coding has changed radically; a year ago AI meant autocomplete suggestions in VS Code. At the time those suggestions seemed amazing - AI appeared to be able to read my mind. But today I've got agentic flows that can write designs and then go and implement them.
Right now I’ve got five desktops in play on my Win 11 machine. One is using Codex (OpenAI’s terminal AI coding agent). Another Claude Code (Anthropic’s coding agent). Another Gemini CLI (Google’s coding agent). Another has Claude Desktop with Desktop Commander. And the last has GPT-5 with Cursor.
I'm building a markdown viewer in Rust, updating my kloc counter to add support for new languages, building a customised MCP server to integrate GPT-5 with Claude, and updating my AI front-end to add features I'd like in the chat interface. Oh, and I'm training a custom model in the background.
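The kloc counter is representative of the small tools in this list. A minimal sketch of the idea - not the actual tool; the extension table and function names here are invented for illustration - might look like this in Rust, where adding a new language is just another arm in a match:

```rust
use std::collections::HashMap;
use std::fs;
use std::path::Path;

/// Map a file extension to a language name. Supporting a new
/// language means adding one more arm to this match.
fn language_for(ext: &str) -> Option<&'static str> {
    match ext {
        "rs" => Some("Rust"),
        "py" => Some("Python"),
        "ts" => Some("TypeScript"),
        _ => None,
    }
}

/// Count non-blank lines in a chunk of source text.
fn count_source_lines(text: &str) -> usize {
    text.lines().filter(|l| !l.trim().is_empty()).count()
}

fn main() -> std::io::Result<()> {
    // Tally lines of code per language for files in the current directory.
    let mut totals: HashMap<&'static str, usize> = HashMap::new();
    for entry in fs::read_dir(Path::new("."))? {
        let path = entry?.path();
        if let Some(lang) = path
            .extension()
            .and_then(|e| e.to_str())
            .and_then(language_for)
        {
            if let Ok(text) = fs::read_to_string(&path) {
                *totals.entry(lang).or_insert(0) += count_source_lines(&text);
            }
        }
    }
    for (lang, loc) in &totals {
        println!("{lang}: {:.1} kloc", *loc as f64 / 1000.0);
    }
    Ok(())
}
```

A real version would walk subdirectories and skip generated files, but the shape of the task is the point: it's exactly the sort of small, self-contained tool an agent can one-shot.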
Yesterday we created over 2kloc of finished code. I've not reviewed it all - I couldn't review that much code in a day - but what I've seen looks decent. Plus it seems to work well. I've got test scripts that check the outputs, I use Rust to take advantage of the compiler's safeguards, and I use experience to dig into the corners where AI might struggle. Occasionally the tools need steering - recommendations to go and search docs, snippets from older projects to guide them, or advice on trying different approaches. But mostly they just get on and create.
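Those output-checking scripts are nothing exotic. The pattern is simply: run the tool, capture stdout, compare against a known-good answer. A sketch of the idea (with `echo` standing in for a freshly built binary - in practice you'd invoke the project's own executable against a fixture input):

```rust
use std::process::Command;

/// Run a command and capture its stdout as a String.
fn run(cmd: &str, args: &[&str]) -> String {
    let out = Command::new(cmd)
        .args(args)
        .output()
        .expect("failed to launch command");
    String::from_utf8_lossy(&out.stdout).into_owned()
}

fn main() {
    // "echo" is a placeholder for the AI-written tool under test;
    // compare its output against a known-good answer.
    let got = run("echo", &["hello"]);
    assert_eq!(got.trim(), "hello");
    println!("output check passed");
}
```

Checks like this don't prove the code is well designed, but they catch regressions cheaply - which is what matters when an agent is churning out changes faster than you can read them.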
It’s an amazing experience. My job is to be the tech lead. I switch between desktops and check on progress. Occasionally Claude will pop up a new app on my desktop to signal that it has completed a set of enhancements or fixes. Other tools will 'chime' when they have completed a task and are ready for more work.
Which is not to say it's perfect. Things regularly go wrong. I frequently run out of Claude tokens and have to wait a few hours for the limit to reset. Gemini went mad yesterday and burned through my entire Pro allowance in the space of 90 minutes.
Then there's Claude, which likes to trumpet its own magnificence (while frequently being wrong). And Codex sometimes just gets confused.
There are many sharp corners. But these tools are so much better than anything we’ve ever had access to before.
If I pause to think about it, it messes with my mind. In the space of 12 months coding has completely changed. And the models keep getting better. GPT-5 (Codex) is a noticeably stronger model than Claude or Gemini: it almost invariably one-shots code correctly, and it needs far less guidance or help than the others. Claude often needs prompting to go and check docs or look for sample code; GPT-5 works this out for itself.
At the start of this year I could get Claude to write basic Rust apps for me. But it was painful - many iterations were needed to fix compile errors, I had to manually copy and paste code, and I frequently had to step in and fix things by hand. Despite that it felt magical because, even then, it was much quicker than writing the code myself.
And today? GPT-5 will one-shot the same code. Codex will create the repository for me, write the code, compile it, write and run tests, fix bugs and make a chime sound when it has completed a working executable. Earlier this year apps up to about 500loc were feasible with some effort. Now 1.5kloc just works.
But is the code any good?
Sure, there are situations where it is essential that code is well structured and thoroughly reviewed. Design for maintenance is something I'm passionate about. I'd still pause before letting AI loose on production code unsupervised, and I'd want to review any and all code carefully before shipping it; if nothing else, I know I won't enjoy being woken at 2am to sort out an AI-induced outage.
But there’s a lot of code where this matters less. And for that code AI is fantastic. The one-off tools you need to help you investigate codebases. Refactor code. Support your development environment. The tools you thought might be useful but couldn’t justify creating because they’d take too long. The apps you build so you can learn and experiment. Suddenly all these things become possible.
Back in the 1880s Thomas Edison started out by selling light - not electricity. Electric light was simply a better version of the existing gas lamps. But over time we found many uses for electricity we'd never dreamed of.
And so?
We're seeing something similar with AI right now. AI autocomplete started out by helping us write code the way we'd always done, just faster. But it's been a year of rapid change, and we're starting to see how the future might shake out.
Firstly, tools like VS Code, Cursor and Windsurf look set to revert to being editors. Cursor was cool because it integrated agentic workflows, but now I can add Claude Code - or Codex, or Gemini - as an extension, and they are better. So Cursor becomes just an editor. Cursor and Windsurf should be worried - after all, they are just forks of VS Code, and the CLI tools are set to steal their lunch.
Secondly, the tools will get better at creating larger amounts of code. Conservatively, we've gone from 500loc to 1.5kloc in the past six months. What will the next six months bring?
Thirdly, the tools will become more reliable. This matters in the enterprise. We're some way off being able to use AI to generate production code unsupervised, but as the quality improves, so does trust. These days we trust compilers to turn our code into machine code; gone are the days when we manually checked the emitted assembler. Will we reach the same level of trust in AI's ability to turn designs into code?
Finally, we'll keep learning how to use the capability we've already got. Sure, better models have driven much of the improvement we're seeing in coding, but improved infrastructure has also played a big part - tools like Claude Code are a world apart from simple chat interfaces.
Personally I love the days I get to spend with these tools. I'm able to do so much more than I ever could before. If I'm honest it's addictive - exploring the new tools, finding out what they can do, dreaming up new projects to experiment with. From where I sit, it seems like reports of AI's death have been greatly exaggerated.




