The name's Claude. Claude, Claude, Claude
When your job becomes managing the thing that manages the things
As a kid, the secret agent James Bond featured regularly in my life - the family gathering around the TV to watch his latest escapade. Or, more often, a re-run of an older film recorded on a not-quite-long-enough VHS tape, hoping against hope that the ending wasn’t cut off…
But these days it’s a different kind of agent that’s featuring in my life. Claude agents.
One of the neat things Claude Code added a while back was the ability to use agents. The pitch was "create specialized agents - design/coding/debugging." But how do you know which agent to use? And when? And how many? Can they interact with each other? So many questions…
But as I’ve used them more, I’ve realised trying to stereotype the agents misses the point. Just using agents is a significant unlock.
There are a couple of things they do well:
First, they increase parallelisation. Claude will often offer a list of things to do and ask you to pick one. But with agents you can tell Claude to do all the options in parallel. Claude is pretty good at coordinating the agents to avoid clashes (so two agents won’t end up fighting over the same code). Going faster is good.
Second, they preserve context in the main conversation - the one you interact with. Managing context matters. Sure, Claude will automatically compact context. But each time it does it risks throwing away something critical. And even if you don’t run out, having the context polluted with irrelevant data is hard for the model - where should it focus its attention? A subagent can sort out a bug and just report back "fixed". Or do a web search and produce the necessary nugget without any unnecessary kerfuffle. The main context is preserved for what matters. It’s like a senior manager who is focused on the high-level picture.
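For reference, Claude Code lets you define your own subagents as Markdown files with YAML frontmatter (in `.claude/agents/` for a project). A minimal sketch of the kind of "report back one line" agent described above might look like this - the agent name, tool list and prompt here are my own illustration, not anything from Anthropic:

```markdown
---
name: bug-fixer
description: Fixes a single failing test and reports back with a one-line summary
tools: Read, Edit, Bash
---

You are a debugging specialist. Fix the failing test you are given,
run the test suite to confirm, and reply with a single line
("fixed: <what changed>") - keep the gory details out of the
main conversation.
```

The `description` is the interesting bit: it’s how the main Claude instance decides when to delegate to this agent rather than doing the work itself.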
But it’s not just Claude agents that Claude is happy managing. It will quite happily boss Gemini and Codex around as well…
As an example, let’s get Claude to use Gemini to fix some failing tests…
And, like me, Claude was excited when Gemini made progress:
But it wasn’t all plain sailing - Claude had some pointed feedback for Gemini:
Once again we see one of the problems with agents - the controlling instance decides that the subagent isn’t up to the task and that it’s easier to just do the work itself. There are definite echoes here of inexperienced managers who decide it’s quicker to do a task themselves than to figure out how to get their team to do it.
But we see the benefit of two models working together - they riff off each other - and produce better results. There are interesting echoes of a paper from Google last week which found that advanced reasoning models don't just reason better because of longer chains of thought. Instead they implicitly simulate a "society of thought" internally. The model effectively runs a multi-agent debate within itself, with different internal "perspectives" with distinct personality traits arguing, questioning and reconciling multiple views.
But getting back to the problem in hand. How would Claude rate Gemini?
AI interviews
The obvious next step is to run an interview with Gemini, Codex & Claude as the candidates…
Claude proposed a rather dull interview test: writing a tokenizer.
I had a better idea. Let’s do something in 16-bit DOS, with a 30-year-old compiler…
And soon we had our interview question:
All three completed the task - although the time taken and amount of struggle varied.
And would you believe it, the hire recommendation was, err, Claude!
And so?
You know what? Having worked with these models over the past few months, I’ve come to similar conclusions. Gemini has many strengths. But for this type of systems-level coding it is too slapdash. We’ve recently been fixing bugs in my Win16 emulator introduced when Gemini decided to use fastcall rather than pascal as the calling convention. It’s a bit like the junior who doesn’t properly understand - and rather than go and learn, just makes assumptions.
And Codex? Codex is smart. But oh, so slow. And that’s on the "high" thinking level, not the even slower "Extra high" level.
But what’s really interesting here is that I seem to be moving up. Two years ago GitHub Copilot assisted me. Last year I found myself managing a team of agents. And now? Now, I seem to be on the brink of being a second-line manager - managing Claude, who manages the team doing the work.
Two years from Copilot assistant to second-line manager. Amazing career progression. If the pattern holds, where am I in another two? Third-line, I guess, managing the Claude that manages the Claudes that manage the Claudes.
Bond always knew what his role was, but these days I’m starting to wonder if I'm climbing a ladder or being gently escorted to the exit.