Over the summer we’ve been replacing our kitchen. The moment when the old, but functional, kitchen is removed and replaced with piles of dust, damaged walls and grotty floors is sobering. As we sat eating a takeaway, surveying the mess, it was hard to avoid thinking "what have we done?"
It’s a good example of the premise that to go forwards sometimes you must start by going backwards. In the case of our kitchen, quite a long way back.
Despite the chaos I’ve managed to spend some time using Claude and MCP for coding. And it’s rapidly become apparent that Claude only considers going forward. And forward rapidly. But is that a good thing?
Hidden messages
The project I’ve been exploring is an audio steganography app. Steganography is the art of hiding data inside other media - in this case, audio files. You can do it subtly (e.g. using the least significant bit of each audio sample to encode a message). Or you can do it more overtly - as I did - by encoding text which shows up in a spectrogram.
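For the subtle version, the idea is simply to overwrite the lowest bit of each sample with one bit of the message - inaudible to a listener, but easy to read back. A minimal sketch in Rust (purely illustrative; the names and details are mine, not the app's):

```rust
// Minimal LSB sketch: hide one message bit in the least significant bit of
// each 16-bit PCM sample. A real tool would also store the message length
// and check there are enough samples.
fn embed_lsb(samples: &mut [i16], message: &[u8]) {
    let bits = message
        .iter()
        .flat_map(|&byte| (0..8u8).map(move |i| (byte >> i) & 1));
    for (sample, bit) in samples.iter_mut().zip(bits) {
        *sample = (*sample & !1) | i16::from(bit); // clear the LSB, then set it
    }
}

fn extract_lsb(samples: &[i16], message_len: usize) -> Vec<u8> {
    (0..message_len)
        .map(|i| {
            (0..8).fold(0u8, |byte, b| {
                byte | (((samples[i * 8 + b] & 1) as u8) << b)
            })
        })
        .collect()
}
```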
Spectrograms convert audio into strips of frequency information that are then scrolled across the screen - like the image below.
When I was at uni I built my own spectrogram app - getting it to work on Windows 3.1 on a 486 was fun - there wasn’t much CPU left after doing two real-time FFTs and then bitblt’ing to the screen. At the time my plan was to embed secret messages into the audio and use the spectrum analyser to view them. But I got a job and life got busy. So the idea has languished on my queue of unfinished projects for decades.
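For anyone who hasn’t built one: each column of a spectrogram is essentially the magnitude of an FFT taken over a short, windowed slice of audio. A quick sketch of that step in Rust, assuming the rustfft crate (not necessarily what the generated code ended up using):

```rust
use rustfft::{num_complex::Complex, FftPlanner};

/// One spectrogram column: magnitudes of the FFT of a windowed frame.
/// Sketch only - a real implementation would reuse the planner, overlap
/// frames and convert to a dB scale.
fn spectrogram_column(frame: &[f32]) -> Vec<f32> {
    let mut planner = FftPlanner::<f32>::new();
    let fft = planner.plan_fft_forward(frame.len());

    // Apply a Hann window to reduce spectral leakage, then run the FFT in place.
    let mut buffer: Vec<Complex<f32>> = frame
        .iter()
        .enumerate()
        .map(|(n, &x)| {
            let w = 0.5
                - 0.5 * (2.0 * std::f32::consts::PI * n as f32 / frame.len() as f32).cos();
            Complex::new(x * w, 0.0)
        })
        .collect();
    fft.process(&mut buffer);

    // Keep only the first half (positive frequencies) as magnitudes.
    buffer[..frame.len() / 2].iter().map(|c| c.norm()).collect()
}
```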
Others got further than I did - back in 1999 Aphex Twin included a face in the track 'Equation':
The 2016 Doom soundtrack includes hidden spectrogram images, and others have encoded pictures of cats, people and more.
But it seemed like an interesting project for AI - sufficiently niche to not be common in the training data, requiring visualization (something models struggle with) and about the right size to give a model a challenge but not overwhelm it.
So off to Claude. The first step was to get some relevant context into Claude. Context matters - more relevant information means better output. But Claude is not very good at figuring out what it needs. And there’s only so much work it will do in each turn of a conversation. So I tend to spend the first few prompts building up relevant context. In this case we had a discussion about steganography. Following that I got Claude to write the design.
Just as with humans, getting a good AI design requires decent requirements. And this often requires iteration - as I’m reviewing the design I’ll realise that I missed a requirement. Or want to change something.
In the early days I used to iterate back and forth with Claude on the design in subsequent prompts. But these days I’m much more likely to go back and edit the original prompt - if you hover over a completed prompt there is a little 'Edit' button. Press this and you can branch the conversation.
So you can go back, add in the missing requirements (or change something) and set Claude off again. Of course, there’s a risk the new answer is not as good as an earlier one - models are non-deterministic. Each time you press 'Edit' you are rolling the dice and the new answer might be worse. But if that happens you can just hit edit and try again.
The other trick is to tell Claude to ask questions about the requirements - it can often spot things you might have missed. Although sometimes it just ignores me and proceeds without asking any questions.
From design to code
Then I can give the design to the desktop-commander MCP to implement. As ever, I’m using Rust. Rust is a great language for simple AI-built tools - the compiler is very strict and rules out whole classes of concurrency and memory issues. Sure, it’s harder and slower to write Rust - but I’m not doing the writing, so that’s fine.
The first version didn’t work very well - this is meant to be 'HELLO':
But things rapidly improved:
Notice the missing letters. For some reason Claude had decided to implement only part of the alphabet. This remains a common problem with AI: ask a model to do the same thing repeatedly and it will often skip a few of the repetitions. That’s a problem for coding; you often want to make the same (or similar) change in multiple places - being told by Claude that it has left some letters for you 'to complete yourself' isn’t helpful. Worse is when Claude doesn’t tell you it has skipped some work; in this case, I discovered the missing letters myself when testing.
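The code in question was essentially a bitmap font table: dozens of near-identical entries, which is exactly the kind of repetition a model is tempted to truncate. Something along these lines - a hypothetical reconstruction, not Claude’s actual output:

```rust
// Hypothetical 3x5 bitmap glyphs, one row of bools per scanline. This is
// exactly the repetitive table a model is tempted to leave half-finished.
fn glyph(c: char) -> [[bool; 3]; 5] {
    const X: bool = true;
    const O: bool = false;
    match c {
        'H' => [[X, O, X], [X, O, X], [X, X, X], [X, O, X], [X, O, X]],
        'E' => [[X, X, X], [X, O, O], [X, X, X], [X, O, O], [X, X, X]],
        'L' => [[X, O, O], [X, O, O], [X, O, O], [X, O, O], [X, X, X]],
        'O' => [[X, X, X], [X, O, X], [X, O, X], [X, O, X], [X, X, X]],
        // ...the remaining letters are where the gaps appeared.
        _ => [[O; 3]; 5],
    }
}
```

Each lit cell typically becomes a short burst of tone at that row’s frequency, which is what draws the letter in the spectrogram - so a missing table entry means a missing letter in the image.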
But then we started heading into the coding doom spiral. This is where models can’t resist tinkering; ask them to fix something and they’ll rewrite a different bit of the code, or change something unnecessarily. My mistake was to ask Claude to add support for higher resolution characters - not only did the new characters not work, but they also broke the existing lower resolution ones.
Cue an attempt to fix. But six prompts later, partway through a modification, I ran out of context:
This is not good. Desktop commander was partway through making a series of changes - but they weren’t complete. So now I had a codebase that didn’t compile - and no easy way of getting back to a clean state. The 'edit-a-prompt' trick doesn’t work here - desktop commander has changed the contents of the files on disk.
What I should have done was to commit the changes after every prompt completed (I could have got desktop commander to do that for me). That would have given me a way to go back.
With no alternative I pressed on. I opened a new session, told Claude to investigate the project directory and summarize what it found. And then asked it to complete the changes. Six prompts later I was out of context - again partway through a series of changes.
I was still trying to get Claude to restore the basic function; I’d started with two files in the codebase - a single main.rs plus Cargo.toml. But when I looked at the codebase, a disaster was growing. Instead of two files, there were now 189. Claude had created multiple versions of main.rs - main_adaptive.rs, main_complete.rs, main_fixed.rs, main_backup.rs… It had created 8 supporting Python scripts. 23 batch files. And 120 demo wave files.
Yikes.
AI versus human
What’s interesting is the difference between how I’d describe the project and how Claude described it. Here’s what Claude thought:
My view? It’s a non-functional mess. And that’s being polite.
It’s the kind of project that needs a radical overhaul before making any further changes. Where you need to take a big step backwards in order to move forwards.
Now I can get Claude to do that tidy-up for me. But it’s me driving Claude, not Claude making that decision itself. And I can put the scaffolding in place to commit after each change, so that I can go back. But again - it’s me driving Claude.
And that’s the key takeaway. Claude is useful - very useful - at writing code. But it’s got limitations. Claude just isn’t a very good software engineer. It launches into coding without understanding requirements or thinking about the design. It doesn’t use source control to enable it to revert bad changes. It doesn’t recognize when the codebase has turned into a mess.
It only goes forward - and as fast as possible.
The other lesson I learnt was recognizing when to cut your losses and start over. I threw away the project, improved the requirements and design with what I’d learnt, and now I have a working tool.
It’s a different way of working - but AI enables rapid iteration. Maybe it’s the coding equivalent of 'if at first you don’t succeed, try, try again'?
My model with Claude is to 1) make a lot of very fast progress, getting something from zero to working in a few hours, and then 2) spend a few days making it right: working properly, rewritten so it’s maintainable, with a good API, tested, documented, etc. It is rare that I give Claude a lot of latitude. I may give it a lot of code/docs, but I am careful to work on architecture and design first, and then get it to produce a bit at a time, unless I truly am starting from scratch.