No comments allowed
Why code comments are becoming a net negative
Many years ago I visited our head office. Not having my own desk, I borrowed a colleague's desk. A colleague who had a complex system of post-it notes carefully arranged around their keyboard.
That afternoon I was horrified when I absent-mindedly brushed my arm over the desk only to discover my arm was now covered in post-it notes. Disaster! I quickly put them back as best I could, hoping I hadn’t caused some terrible calamity to their project…
Making note
All of us take notes to some degree. They’re shorthand for things we’ve worked out at length. And we need them because working everything out from first principles every time is way too time-consuming and complex. Sure, there’s a risk that the note isn’t quite right - but it’s the best we can do.
Code is like this too. Working out the logic in a particular function is hard. It takes time. So we use comments. They’re an easy way to describe in words what a function, or a routine, or a module does.
It’s a great idea. In theory at least. Reality is often a bit different. Many codebases have patchy commenting. And even when they do exist they can be less than helpful:
// Add 5 to x
x += 5;

Hmm; I can work that out myself. More interesting: why 5? And why x?
But it can get worse:
// Add 5 to x
x += 6;

Now we have a puzzle. Is the comment wrong? Or the code? Or both? Maybe this is an innocent typo. Or an indication of confusion. Bad merge, anyone?
Those are trivial examples. It can (and does) get much worse. I’ve lost hours of my life wading through code where the block comment describing a module was subtly wrong. The time the locking hierarchy was documented backwards. Or the comment that described an earlier version of the logic. Or the inconsistent comments between modules, leaving an unsettling uncertainty about what to believe.
Comments are, at best, a mixed bag.
Enter AI
You might hope that AI makes this better. AI models would write better comments. And be more resilient to bad/misleading/wrong comments.
But that isn’t how it plays out. It seems the models have learnt too well from us humans; they too favour sparse comments. A particular favourite is not updating comments when the underlying code changes. Refactor a function and Claude will give you new code. And old comments.
This causes confusion down the line. My emulator codebase has undergone a lot of change recently (~400 commits over three days). The comments have fallen out of sync with the code. Claude.md is a work of historical fiction.
The result?
Claude is struggling. It regularly tells me about things I know not to be true. It’s like going through a weird time-warp. It confidently tells me about bugs that I know are fixed.
Why?
Claude, like humans, takes the shortcut that comments offer. It looks for the quickest way to answer a question. And it doesn’t bother checking against the code.
But LLMs aren’t humans. There isn't much time difference between an LLM reading a comment and reading the code itself. For a human, comments save minutes - hours - of painstaking analysis. For an LLM, they save seconds - if that. Which is nothing compared with the cost of the confusion they cause when they're wrong. A human resolving that confusion? That takes real time. At best you spend minutes untangling it. At worst you don't notice until something breaks.
My mistake
At this point I made a mistake. I got Claude to make multiple sweeps through the codebase to update the comments.
Which, on reflection, was an error. Because there’s a fundamental problem with comments.
They aren’t testable.
Code that’s wrong breaks. Claude can find it - and fix it. But wrong comments? They just lurk, waiting for the day they can trip someone up.
And then I had a realisation.
Comments are no longer useful.
Claude is more than capable of working out what the code does from first principles. Sure, it needs a high-level picture of the architecture, key flows, data sources & sinks. It needs to understand the purpose of the product. Who the users are. What the quality goals are.
But it doesn’t need blow-by-blow comments in the code. Those will rot. And lead to confusion. In a world where humans no longer read the code, we don’t need them either.
There is a nuance to this:
"What" comments are harmful for LLMs. It’s better - safer - for Claude just to read the code. These have had their day; they have no place in AI generated and maintained codebases.
"Why" comments are more interesting. Explaining the philosophy behind a design is arguably more useful. Anthropic found this with Claude’s soul document - which focuses on the why - why should Claude behave in particular way? They are more important than ever before.
And so?
Software engineering is changing. Things we held important in the past will not matter in the future. Human code review is one. Detailed comments another.
Software engineering has been optimised around the limitations of humans. Comments exist because understanding code from first principles is expensive. For humans, reading a function, tracing the logic, understanding the intent - it’s tough.
LLMs have different limitations. Comments set a trap for LLMs. They don’t help; they actively hinder. We need to remove that trap.
Instead we need to spend time explaining the bigger picture. Why does this product exist? Who uses it? What are the quality goals? What’s the architecture?
That’s readme.md. Or claude.md. Or agents.md. We still need documentation - but we need to capture the intent and purpose. It’s a different way of thinking. It’s thinking more about the forest and less about the trees.
Turns out my colleague's post-it notes and code comments have more in common than I'd thought. Both work brilliantly - right up until they don't.

I'm not convinced. It feels like you are arguing that humans don't need to understand codebases anymore. I don't think we are at that point yet. I've seen Claude go round in circles needing someone to point it in the right direction too many times.
"Comments are a code smell" has been a widely held opinion for a couple of decades.
I think it's directionally correct, though sometimes taken too far.
Expressive, self-describing code is almost always preferable to inexpressive code with comments.
I *think* that's going to continue to be true in the Agentic coding era.