Code bloat in the age of AI
Will AI reduce code bloat & complexity?
Self-storage is booming. According to the UK Self-Storage Association’s annual report, two square miles are dedicated to self-storage units. The market grew by 8% last year. And it’s profitable too - the cost of hiring a telephone-box-sized space is £260 per year.
Why do we suddenly need so much storage? I think we all know. We don’t like throwing things out. We’ve invested time acquiring our possessions. Even if we don’t need them now, they might be useful in future. Plus, it requires mental energy to permanently dispose of them. It’s easier just to pay to store them.
Software is no different. Any large codebase is riddled with vestigial code. Once useful but now obsolete code kept in digital storage units “just in case”. Code to support old operating systems, unused features, forgotten programmable APIs.
Time pressure doesn’t help. It discourages careful design – the vendor who gets to market first invariably “wins”. Why invest time in building small, tight code when you can knock something rough and ready together and get to market sooner?
Then there is open source. Don’t get me wrong. Open source is fantastic. You’d be mad to use anything other than OpenSSL if you need a secure socket library. It avoids reinventing wheels. It can be a massive productivity boost. But any open source library will inevitably solve a larger problem than the one you have. You get extra functionality – and code – you don’t need. I’ve seen projects where indiscriminate use of open source has pulled in tens of thousands of lines of unnecessary code. It’s a bit like adding a porch to your house, only to discover that the porch is attached to a massive castle. Your codebase can get very large very quickly.
And then we’ve got the human element. No engineer likes deleting code they’ve spent months building. It’s part of them. Deleting code isn’t quite as painful as chopping off an arm, but it’s close.
Why does this matter?
It’s insecure. The bigger your codebase, the bigger the attack surface. It’s no longer enough to lock the doors and windows of your house. Now you have to do the same for the castle as well. Less code is more secure.
It’s hard to maintain. All code needs maintenance. Bugs are uncovered. Security holes open up. APIs and platforms change. It’s obvious – the more code you have, the more maintenance is required. But this law of physics inevitably gets lost in the rush to ship products.
It’s hard to extend. Adding new features requires modifying existing code. You’ve got to understand how to integrate the new code. Add it to your test framework. Add it to your documentation. The bigger the codebase the harder it gets. And the greater the risk you do it wrong and duplicate code. You’re adding a porch. It’s obvious where it goes if you only have one front door. But what if you’ve got five front doors? Which ones get a porch? What if some of the doors are rarely used? Do they get a porch?
Your code gets slower. The more code you have the slower it will run. There’s more code to load. More code to execute. It’s harder to optimize. Word 2000 is 25 years old. It’s also two orders of magnitude smaller than current versions of Word. It runs like lightning on current hardware. And, for what it’s worth, it does pretty much everything the current version of Word does. Progress, eh?
Thirty years ago, you could rely on hardware getting faster to offset ever larger and slower code. That’s not the case now, so software gradually slows down. I’ve got a spreadsheet which tracks my electricity usage over the past 10 years. It takes 5 seconds to save with the latest Excel. The same spreadsheet saves instantaneously with Excel 2000. Progress, eh?
Developing software is hard. Writing code is hard. We don’t want to throw away the things we’ve previously built. Instead, we slowly tweak them. Recombine them in new ways. They grow and grow. And so does complexity. The Linux kernel now has ~30 million lines of code (depending on how you count). In 2000 it was a tenth of the size. The upper limit for a developer understanding a codebase is about 500 thousand lines. No one can understand the whole codebase. Duplication thrives. Complexity creates more complexity.
AI provides an opportunity to fix this. We can use AI to build right-sized software. Software which does what we require and nothing else. Software which is lean and fast. Software that has a minimal attack surface. That is cheap to maintain. Easy to fix.
But AI also makes it very easy to write lots and lots and lots of code. It’s easy to get o1 or Claude to create 500 lines of code per minute. They never get tired. They don’t need to sleep. They don’t argue, they just do as you ask. The Linux kernel is ~30 million lines. At that rate, that’s 60,000 minutes, or about six weeks of round-the-clock generation. OK, so there’s more to software development than just writing code, but you get the idea.
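As a rough sketch of that back-of-envelope sum, here it is in Python. It assumes a steady 500 lines per minute, the rate quoted above, and the helper name generation_minutes is made up purely for illustration. Real throughput will vary, and none of this accounts for design, review or testing.

```python
def generation_minutes(lines_of_code: int, loc_per_minute: int = 500) -> float:
    """Minutes needed to emit `lines_of_code` at a steady `loc_per_minute`.

    Hypothetical helper for illustration; the 500 loc/min default is the
    rate assumed in the article, not a measured figure.
    """
    if loc_per_minute <= 0:
        raise ValueError("loc_per_minute must be positive")
    return lines_of_code / loc_per_minute


# A 140,000-line codebase at 500 lines per minute.
small_minutes = generation_minutes(140_000)
print(small_minutes)                      # 280.0 minutes, a little under 5 hours

# ~30 million lines, the kernel size quoted earlier in this article.
kernel_minutes = generation_minutes(30_000_000)
print(kernel_minutes)                     # 60000.0 minutes
print(round(kernel_minutes / 60 / 24, 1))  # 41.7 days of non-stop generation
```

Either way, the point stands: raw code volume stops being the bottleneck.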
But what will happen?
Security could drive a push for right-sized solutions with reduced attack surfaces. But we know security comes second to commercial pressures. If no one is buying your product because you were late to market, then it doesn’t matter if it’s secure. The same goes for maintenance. Time to market remains critical. And if time to market remains critical, will the old ways persist? Are we doomed to large, bloated codebases?
In the short term, yes. In a world where generating code is essentially free, it doesn’t matter whether you produce a lot of code or a little. What matters is design. And while that’s done by humans, we will continue to trade design time for time to market. Market success requires this trade.
But in the long run I’m more optimistic. AI will take over design. We’ve already seen the start of this with tools like Devin & Pythagora. And when design takes negligible time there is no penalty for producing good designs. If a design that improves security and reduces maintenance takes no longer to produce than one that doesn’t, then the AI will choose the former.
AI will make it possible to build right-sized solutions and remove time-to-market as a key differentiator. That leaves a space for security, maintenance and performance to become the differentiators.
In this world it’s not just new products that will benefit. Once AI can reason over a whole codebase, it becomes easier to remove vestigial code. Unlike us, it doesn’t have any emotional ties to the code it wrote. We’ll be able to get it to build a new porch to replace the one attached to the castle. To remove the unneeded front doors.
If we’re brave, it’ll become significantly cheaper to replace existing legacy products with new right-sized products that are fast and secure. Klarna are leading the charge here. They’ve recently severed ties with two of their SaaS providers and intend to replace them with AI-generated in-house solutions. They are the first, but they won’t be the last.
And then maybe, just maybe, I’ll be able to save my spreadsheet in less than a second 😊.

