Claude in charge
What could possibly go wrong?
At this point everyone knows about AI. Many have even tried it. But the people who actually benefit from it? That's a much smaller group. And a hierarchy is emerging.
There are those who are actively against AI. For whom the mere mention of ‘LLM’ results in angry outbursts on X.
There are those who are ambivalent. They don’t use AI and aren’t interested in exploring it.
Then there are those who use it as a Google search replacement. They ask one question at a time - and don’t normally engage the models in a discussion. Occasionally they might branch out and use a model to create a document.
And then there are the power users - the ones who use AI on a daily basis to create things. The people finding ways to use the tech to give them a significant boost. (Crudely, these are the people who know of Claude.)
Hard to believe but there was a time when most people didn’t know what Google was. Or thought paying for a burger with a credit card was wild.
In reality, this behaviour isn’t a surprise. It’s predicted by the technology adoption curve: early adopters gain experience and efficiency while others naturally take time to see the practical benefits.
But AI is different. First, the productivity gains may be too significant to ignore for long. And second, in many ways AI adoption is harder than previous tech.
Why is it harder? Firstly, the hype cycle of recent years has over-promised and under-delivered - much of what was promised still hasn’t arrived. Take Apple Intelligence, which continues to disappoint. Here’s an example my wife got the other day:
This is a summary created from two separate emails from two different people - one was a shipping email about a photobook. The other an email from a friend about a pair of shoes. Apple Intelligence has merrily conflated them. Rather than save time, this cost time to unpick. My wife only keeps Apple Intelligence enabled because I’m interested in watching it evolve. Not because it is useful.
With each passing day, this harms Apple’s brand. And it harms public perception of the usefulness of AI.
And it’s not just Apple. All of the established big tech players have so far failed to deliver genuinely transformative AI. Grok is happy to stir up racism. Gemini - well the less said the better. Shipping undercooked products to a sceptical world does not end well. A lot of damage has been done - public perception is currently in a worse place than it was before the AI hype cycle began.
Secondly, there are few paved paths. Learning how to use AI involves experimentation and persistence. It needs guidance. It requires a sense of smell. It’s hard.
To really understand the benefit AI can bring you need to spend time in an area you are familiar with. And it needs to be an area AI can help with. Ethan Mollick advocates for spending 10 hours or so with a frontier model, learning what it can do. I agree.
So what does effective AI use actually look like in practice? Here's a case study...
A case study
January is the time we traditionally book our summer holiday. Typically we have a couple of action-packed weeks visiting multiple cities and taking in as many museums, aquariums, science centres and battleships as we can. Fitting everything in requires careful planning - something my wife is an expert at. But this year she decided to see if Claude could help her. Here’s what she found:
Claude is like a cross between a puppy and Albert Einstein. It has a very waggy tail - always enthusiastic and eager to please. But it also knows a lot. About everything.
Claude was genuinely useful - it found a set of interesting museums that we’d otherwise have missed. For example, museums located just outside the big cities that didn’t show up in normal searches.
It proposed imaginative itineraries that took into account the things our family likes to do. It considered the weather, scheduling outdoor things for the cooler mornings and leaving indoor things to the hotter afternoons.
It was happy to discuss - it’s like having an enthusiastic, always interested, knowledgeable colleague working alongside you.
It can make you feel good - suggest an idea and Claude will often reply with “That’s a great idea”. Although, at times, it can feel rather sycophantic - maybe it’s what being surrounded by “yes-people” is like?
The free version’s usage allowance runs out too quickly - and even on the paid plan she occasionally hit the limit (at which point Claude tells you to come back in a couple of hours).
But there were some less good things. Planning this holiday couldn’t be achieved in a single chat - there are too many variables and too much iteration required. So multiple chats are needed. And that’s where the current UI creaks - there are many simple quality-of-life improvements that would make a big difference.
As chats get longer they progressively use more tokens (remember, everything in the chat up to that point is fed to the model with each subsequent question, so token use is not constant - later questions ‘cost’ more). So Claude shows this warning message.
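To make that growth concrete, here’s a rough back-of-the-envelope sketch in Python. It uses a crude ~4-characters-per-token estimate (not Claude’s real tokenizer) and an invented three-turn planning chat, purely to show the shape of the growth:

```python
# Why later questions in a chat 'cost' more: the full history is
# resent with every request, so the input grows each turn.
# ~4 chars/token is a crude estimate, not Claude's actual tokenizer.

def rough_tokens(text: str) -> int:
    return max(1, len(text) // 4)

history = []  # the running chat, as (role, text) pairs

turns = [
    ("Plan a two-week trip taking in museums and aquariums.",
     "Here is a draft itinerary... " * 40),
    ("Swap day 3 and day 5, and add a science centre.",
     "Here is the updated itinerary... " * 50),
    ("Now estimate the driving times between cities.",
     "Estimated driving times are... " * 60),
]

for i, (question, reply) in enumerate(turns, start=1):
    history.append(("user", question))
    # Everything accumulated so far is sent as input for this turn.
    input_tokens = sum(rough_tokens(text) for _, text in history)
    print(f"turn {i}: ~{input_tokens} input tokens")
    history.append(("assistant", reply))
```

Each turn resends every earlier question and answer, so input cost grows roughly quadratically over the life of a chat - hence the warning.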
The problem with starting a new chat? You lose all previous context. What’s really needed is a “summarize this chat and move it to a new chat” button. But that doesn’t exist. For now, you have to manually ask Claude to summarize the old chat and then copy and paste the summary into the new one.
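If you’re willing to drop down to the API you can approximate the missing button yourself. Here’s a minimal sketch using the Anthropic Python SDK - the model id is an assumption (check the current docs), and the old chat is assumed to end with an assistant reply:

```python
# DIY "summarize this chat and move to a new chat", sketched with the
# Anthropic Python SDK (pip install anthropic). The API key is read
# from the ANTHROPIC_API_KEY environment variable; the model id is an
# assumption - substitute whatever is current.

import anthropic

client = anthropic.Anthropic()
MODEL = "claude-3-5-sonnet-latest"  # assumed model id

def start_fresh_chat(old_messages: list[dict]) -> list[dict]:
    """Summarize an old conversation and seed a new one with the summary."""
    # old_messages is assumed to end with an assistant reply, so
    # appending one more user turn keeps the roles alternating.
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=old_messages + [{
            "role": "user",
            "content": ("Summarize this conversation for a fresh start: "
                        "decisions made, open questions, and the current "
                        "state of the itinerary."),
        }],
    )
    summary = response.content[0].text
    # The new chat carries a compact summary instead of the full,
    # token-hungry history.
    return [{"role": "user",
             "content": "Context carried over from a previous chat:\n" + summary}]
```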
Artefacts have recently been updated to support in-place editing - you can now watch Claude delete and add text in real time. Except it’s buggy. Sometimes Claude will claim to have made updates and produced a new artefact. There’s one small problem - it hasn’t. The new artefact is identical to the previous one. When this happens, the only option is to summarize the chat and start a new one.
Claude’s naming scheme for artefacts is, err, random. Here are some of the names it came up with:
“Key days of trip”
“Final road trip itinerary”
“Modified trip - final itinerary”
“Trip - complete itinerary”
“Trip - final complete itinerary”
And many more.
Tracking versions is something you have to do yourself. And it’s not easy.
Searching old chats is next to impossible. Claude maintains a history of recent chats, but they are only searchable by title - and the title is generated from the first question in the chat. Sometimes the title makes sense, other times less so. There’s a certain irony in a tool that is so good with text being so poor at dealing with, err, text. Ultimately, previous replies probably need to end up in a RAG database and be searchable on demand.
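For what it’s worth, even a crude content index over chat transcripts would help. Here’s a rough sketch using TF-IDF from scikit-learn as a stand-in for a proper embedding-plus-vector-store RAG setup - the chat archive below is invented for illustration:

```python
# Searching old chats by content rather than by title. TF-IDF
# (scikit-learn) stands in for a real embedding + vector-store setup.
# The archive below is invented for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

chats = [  # (title, full transcript) - hypothetical archive
    ("Key days of trip", "Day 1: the science centre. Day 2: the aquarium."),
    ("Final road trip itinerary", "Drive north, then the naval museum and battleship."),
    ("Trip - complete itinerary", "Smaller museums just outside the big cities."),
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(transcript for _, transcript in chats)

def search(query: str, top_k: int = 3) -> list[tuple[str, float]]:
    """Rank chats by content similarity to the query, best first."""
    scores = cosine_similarity(vectorizer.transform([query]), matrix)[0]
    ranked = sorted(zip(scores, chats), key=lambda pair: -pair[0])
    return [(title, round(float(score), 3)) for score, (title, _) in ranked[:top_k]]

print(search("museums outside the city"))
```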
And then there are other problems.
The ‘Claude’ you get varies from session to session. Sometimes you have an amazingly productive session. Other times it is a bit, well, meh, and you end up talking at cross purposes. Of course this might be down to the human on the other side of the chat - but it’s something multiple people have mentioned.
Every so often there’s an unhelpful hallucination. For example:
But Claude doesn’t have access to Google Maps… so it can’t have checked them. And it’s all too easy to take the hallucination at face value.
Sometimes it panics. For example, in one follow-on chat Claude reviewed the plan (which it had previously created) and reported:
The problem was Claude had mixed up a couple of museums. So this turned out to be a false alarm. But it was interesting to see Claude panic - not something I’ve seen before.
It fought hard to have us stay in an airport hotel to recover after our flight - even though it was significantly more expensive and left us with more driving the following day. This felt like the model’s safety training seeping through.
It isn’t honest about its training-data cut-off date and will happily hallucinate hotels, opening times, prices, traffic and more. You need to tread carefully when asking for detailed information. It is good for the high-level big picture, but poor at detail.
It never tells me, “you know what, there aren’t any good options”. It always wants to please and find something. Sometimes a bit more brutal honesty would be appreciated.
At one point it offered to draw a map. There was a little disappointment when Claude produced something that looked like it had been drawn on the back of a napkin (see below) - rather than the Google Maps my wife was expecting!
So what?
Don’t get me wrong. Claude was immensely useful. It transformed holiday planning from tedious Google searching into an engaging collaborative experience. It felt like working with a knowledgeable, endlessly enthusiastic colleague who never got tired or grumpy. Yes, the UI is clunky. Yes, it occasionally hallucinates. And yes, sometimes it panics about problems that don't exist. But it found us places we'd have missed, saved countless hours of searching, and made the whole process enjoyable rather than a chore.
And that's the thing about AI adoption. It's not like previous tech where the benefits are obvious from day one. You need to spend time with it. Learn its quirks. Figure out how to work with it effectively. Those who do gain a significant edge. Those who don't... well, they'll probably continue complaining about it on X.
So that's our summer sorted. Assuming Claude hasn't hallucinated any of the museums. And all the cities actually exist.
If I disappear over the summer you’ll know the reason why :-).