The new Claude 3.5
What does it mean for Microsoft? And why can't AI companies do versioning?
This week Anthropic released a new version of Claude (model card here). It seems to be a significant step forward, with 5-10% improvements across the board. Intriguingly, it introduced a new "computer use" feature which uses screenshots and pixel counting to control the mouse and "drive" a computer. For now it's a bit basic - the Anthropic demos show promise but humans still have to provide a lot of guidance.
But it starts to prise open the door to conversational, AI-driven UIs on the PC. The future seems highly likely to be one where we talk to our AI helpers (or agents) and get them to complete tasks rather than push keys and wiggle mice. Talk to your phone and it sorts out the rest, whether that’s figuring out how to get a good view of the Golden Gate Bridge at sunrise, or organizing our calendars, or negotiating world peace.
The neat part of Claude’s new Computer Use feature is that it doesn’t require a new OS, or changes to any of our applications. It just uses the same human-optimized interfaces we already have. Imagine it integrated into the “brain” of a robot - that robot suddenly has access to all the computer-based tools that you or I already have. And that includes everything we’ve built over the past 50 years. It can drive a 1970s Apple II just as well as the latest MacBook Pro.
And this principle of AI using human interfaces extends far beyond desktop computing. Consider self-driving cars. Rather than build dedicated hardware with radar and optical sensors (the current approach), imagine a world where a personal robot assistant drives for you. We already know that vision and audio are sufficient for driving - that’s what we humans use. So it must be possible. And such a robot would be instantly compatible with all the cars we’ve ever made. And planes. And trains. And diggers, tractors, motorbikes, scooters… OK - I’ll stop.
Making general AI that can interface with our existing world is vastly more powerful than adding AI to every application that already exists. In time (and we might be talking only 6-12 months) Claude’s Computer Use feature will add AI support to every application that’s ever been made.
Consider Microsoft. This is bad news. It means every version of Word, Excel and PowerPoint that’s ever been made will become AI enabled. Why pay to upgrade Office to a newer version when you can retrofit across the board? There’s an increasing risk Microsoft Copilot becomes irrelevant.
Longer term, what role will the PC play? Many day-to-day tasks (online banking, shopping, etc.) are at least as easy on a phone as a PC. Conversational UIs will improve those interfaces. Plus any future agent on my phone will have access to a lot more personal data - and join it up to provide more utility. We might have reservations about giving it access to our data - but I expect many (most?) of us will trade the loss of privacy against the increase in utility, especially if the data is only processed by a model local to the device. In this world, where the phone becomes even more dominant, Apple and Google are the winners. The PC market, on the other hand, continues to shrink.
Of course, Microsoft isn’t just Windows and Office these days. Azure seems likely to remain one of the dominant cloud providers. But with no foundation model of its own, and its relationship with OpenAI seeming to sour, will Microsoft become little more than an infrastructure provider?
There are historical precedents. I’m reminded of Sun Microsystems. Sun gave us Java, helped create Unix, pioneered open source and drove the internet boom. They coined the phrase 'the network is the computer' decades before cloud computing. But the company that predicted the future couldn't survive to see it. Sun’s story was one where being right too early was just as dangerous as being wrong.
Take open source. Sun predicted software would be free and companies would pay for support and services. They spent large sums and much energy explaining their vision to a skeptical market. But by the time they open-sourced their OS, Solaris, they were too late to compete with Linux. If they had got the timing right, it’s entirely conceivable Solaris would be the OS of the datacenter and Linux would be a footnote in history.
Microsoft spotted the potential of AI. And moved first and fast. But adding AI to Windows and Office won’t be enough - something more radical is needed. And there may not be an answer as Windows becomes increasingly irrelevant.
But returning to Claude. The last version was 3.5. What’s the new version number? Inexplicably it’s also 3.5. Yup, the same as the old version. Even Claude thinks this is a bad idea:
I recommend against keeping version 3.5 since you've made significant changes including a new PC control feature and 5-10% performance improvements. Using the same version number could confuse users and make it harder to track capabilities and issues. Following industry practice (like GPT-3 to GPT-3.5), a version change would better communicate these improvements and maintain transparency with your users.
This follows on the heels of OpenAI spawning a new versioning scheme with o1 and Gemini’s confusing naming (Gemini-Pro → Gemini-1.5-Pro → Gemini-1.5-Pro-002).
I’m left wondering: why don’t AI labs use their models to help them name their products?

