The 168 hour week
How hard can I work my AI friends?
The other day the conversation around the dinner table turned to how work life differs from school, especially the hours worked. It’s possible I imagined it, but I sensed a flicker of disappointment as it dawned on my kids that work consumes more hours than school - perhaps they had realised there would be less time for video games once they entered the world of work…
But the conversation also reminded me that there are 168 hours per week. And we only work ~40 of them. But, as my son pointed out, AIs don’t need to sleep - or play video games. Which got me thinking. I’ve got access to a swarm of AI coding agents. Could I get them working 24x7?
Merges…
One of the problems with Codex is that it shrinks software. In the past a 100kloc codebase could accommodate a team of ten with ease. Those engineers could all happily work on changes with minimal conflicts. But now?
One of my projects is about 200kloc. But even having two people on it turns into a nightmare of merge conflicts. I found myself rushing to get changes in to avoid being 'it' - the person who has to shepherd Codex through a couple of hours of merge conflicts. And if you are the person who has the most recent check-ins then a different pressure emerges - the pressure to stay ahead. I found myself trying to check in something every couple of hours to try to stay ahead and avoid merge pain (sorry Matt).
Codex is a drug. One agent is never enough. Once you realise you can run multiple in parallel - implementing multiple changes in parallel - you’ll never look back. Conventional wisdom says you should make these changes in independent branches. So I tried that - using git worktrees (i.e. multiple branches checked out side by side from the same local repo). You can probably guess how that went. Not content with having conflicts with another human, now I had extra conflicts with myself. Oops.
Worktrees were abandoned quickly. Nice idea, but one for an older time.
So what should you do? So far I’ve found two things.
Breaking with convention
The first is to break conventional wisdom. Conventional wisdom says you should never make multiple changes at the same time. But what if you ignore that and allow multiple agents to make changes simultaneously to the same repo? Sure, they’ll clash. But I’ve found they seem able to cope. I warn my agents via agents.md that other agents are working on this codebase and they should expect to find things changing under their feet.
I can imagine all sorts of ways this will go wrong (multiple agents trying to fix the same build issue simultaneously?). But, so far, it has worked. Having said that, I try to minimise the chance of conflict by getting them to work on different changes - so conflict only arises in common areas like config or build systems. I’m sure I could make it go much worse by getting them to fix the same bug…
One disadvantage of this system is that check-ins contain multiple features. Not ideal, but I’ll happily sacrifice that to avoid merge pain.
Sequential, not parallel
The second is harder to implement. Remember those 168 hours per week? What if I can have a single agent working for 168 hours rather than four in parallel for 40 hours? One change at a time. No conflicts.
It sounded good. But I soon discovered a couple of practical problems.
First, I need to sleep. Which means eight hours away from Claude and Codex. Current agents can’t run for eight hours unattended. The best you can hope for is a couple of hours - at least for the majority of tasks.
Now, I love Claude and Codex. But I’m not getting up through the night to unblock them. I did enough of that when my kids were little; those days are over.
OK - so that means I can’t exploit overnight as effectively as I’d like. But there are still evenings and weekends. If I have a set of tasks queued up, then I can feed Codex something new every few hours. And that does work. Provided you have a pipeline…
You need a set of tasks to feed the model with. And tasks fall into three camps:
1. Those where I know what needs to be done.
2. Those I know need doing but I’m not yet sure how to do (often because they depend on previous tasks).
3. Those I don’t know about yet.
Ideally I want all my tasks to be in the first category - because then I can sketch them out during the working week and feed them to the model over time. Tasks in category two can often be semi-sketched during the working week; but, by definition, you can’t do anything about category three ahead of time.
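To make the three camps concrete, here is a minimal sketch of how such a queue might be modelled. Everything in it is an assumption for illustration: the `Task` and `Category` types, the `next_task` helper and the task names are hypothetical, not part of any real tool I use. Category three is deliberately absent, since tasks I don't know about can't be queued.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class Category(Enum):
    KNOWN = 1    # I know what needs to be done
    UNCLEAR = 2  # needs doing, but I'm not yet sure how


@dataclass
class Task:
    name: str
    category: Category
    depends_on: List[str] = field(default_factory=list)


def next_task(queue: List[Task], done: Set[str]) -> Optional[Task]:
    """Return the first fully sketched task whose dependencies are all done."""
    for task in queue:
        if task.name in done:
            continue  # already completed
        if task.category is not Category.KNOWN:
            continue  # still needs a human to work out the "how"
        if all(dep in done for dep in task.depends_on):
            return task
    return None  # nothing is ready: time for me to step in


# Hypothetical task names, purely for illustration.
queue = [
    Task("add-export-endpoint", Category.KNOWN),
    Task("tune-export-performance", Category.UNCLEAR,
         depends_on=["add-export-endpoint"]),
    Task("update-docs", Category.KNOWN, depends_on=["add-export-endpoint"]),
]

first = next_task(queue, done=set())
second = next_task(queue, done={"add-export-endpoint"})
stuck = next_task(queue, done={"add-export-endpoint", "update-docs"})
```

Here `first` is the endpoint task, `second` skips the category two task and picks the docs update, and `stuck` is `None` because only a category two task remains. That last case is the point where the pipeline stalls until a human sketches the task out.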
In theory this means checking in on the models every few hours, including evenings and weekends, to keep them moving along: check the previous task has completed, queue up the next task and leave them to it.
In practice it’s much messier than that. It seems many of my tasks fall into category two. So there’s often a bit of tweaking required to keep things moving. It turns out I quite like this (and it works for me since I work at home). But it’s obviously not viable for everyone.
I also have a set of filler tasks - things that have value but which aren’t critical. These are things like running (and fixing) the test suite. Or improving code coverage. Or updating documentation. Or reviewing all the code and applying markups. Or exploratory testing. Those turn out to be nicely shaped tasks which can be run independently overnight - and which are best done when there are no other changes being made to the codebase.
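One way the filler tasks could be handed out is a simple nightly rotation that only fires when nothing else is touching the codebase. This is a sketch under my own assumptions, not a description of an existing setup: the `pick_filler` function, the `night_index` counter and the `other_changes_in_flight` flag are all hypothetical, and the rotation order is arbitrary.

```python
from typing import Optional

# The filler tasks listed above, in no particular order.
FILLER_TASKS = [
    "run and fix the test suite",
    "improve code coverage",
    "update documentation",
    "review all the code and apply markups",
    "exploratory testing",
]


def pick_filler(night_index: int, other_changes_in_flight: bool) -> Optional[str]:
    """Pick tonight's filler task, or None if the codebase is busy."""
    if other_changes_in_flight:
        # Filler tasks are best done when no other changes are being made.
        return None
    # Round-robin through the list so each task gets a turn.
    return FILLER_TASKS[night_index % len(FILLER_TASKS)]


tonight = pick_filler(night_index=0, other_changes_in_flight=False)
next_week = pick_filler(night_index=6, other_changes_in_flight=False)
busy = pick_filler(night_index=2, other_changes_in_flight=True)
```

The rotation is the least interesting part. The guard on `other_changes_in_flight` is what matters, because it keeps these tasks out of the way of real feature work.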
So I can get quite a bit more than 40 hours from my AI friends - probably nearer to 80 hours in a good week. It’s still quite a bit short of 168 hours, so there’s more I could do. And it requires me being willing to (a) build a pipeline and (b) babysit the AI models. Which is apt - getting the most from AI feels like looking after very young kids - at best you get a few hours to yourself before you are needed again. Fortunately neither Codex nor Claude has been sick over me. Plus I can choose to ignore them and nothing bad will happen. So there are some benefits.
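For anyone who wants the arithmetic spelled out, here is a quick sketch using only the figures already mentioned: four agents for a 40 hour week, one agent for all 168 hours, and my rough 80 hour good week. The variable and function names are just for illustration.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def agent_hours(agents: int, hours_each: float) -> float:
    """Total agent-hours for a number of agents each running for some hours."""
    return agents * hours_each


parallel = agent_hours(4, 40)                # four agents, working week only
sequential = agent_hours(1, HOURS_PER_WEEK)  # one agent, around the clock
good_week = agent_hours(1, 80)               # roughly what I manage today

# Fraction of the full week a single agent is actually kept busy.
utilisation = good_week / HOURS_PER_WEEK

print(f"parallel:   {parallel} agent-hours")
print(f"sequential: {sequential} agent-hours")
print(f"good week:  {good_week} agent-hours ({utilisation:.0%} of the week)")
```

So one agent running non-stop would slightly beat four agents on office hours (168 versus 160 agent-hours), and my good week keeps a single agent busy a little under half the time.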
And so?
Is this sustainable long term? No. It’s clearly absurd. But step back for a second and consider what I’m doing. Some of it (auto code review, auto coverage improvements) is scriptable and can be scheduled to run every night. We’re going to see tools built to solve these problems.
It also seems likely we’ll see the rise of team lead agents (TLAs?) - which can co-ordinate herds of Codexs and Claudes. I’ve not tried yet, but it seems plausible a Claude or Codex could handle at least some of my category two tasks - taking the output from one task, interpreting it and then using it to create the next task. These TLAs can also enforce good discipline (part of my job is to ensure the agents follow good software dev practices).
And with that my role will get pushed further and further up the stack. I’ll get my evenings and weekends back. It feels like we’re still some way away from the tools being able to architect autonomously. But given the progress over the past year, who knows where we’ll be in a year?
And once we’ve got swarms of coding agents who can work autonomously 24x7, what does that mean for the poor old human developer?

