Discussion about this post

Piers Finlayson

I’ve been pretty underwhelmed with Claude 3.7, and switch back to 3.5 most of the time. I find that Claude 3.7 is less likely to predict what I want correctly from the (often terse) briefings I give it. It has substantially worse memory (it forgets what I’ve previously told it during a conversation, and I have to remind it), and it seems more arrogant. To me it feels rushed out, and I suspect Anthropic are doing something “clever” to reduce context wherever possible. This context/memory issue is the biggest one for me. It often feels like I’ve started a new conversation when I’m just the 2nd or 3rd response into the existing one.

I’m also pretty underwhelmed with AI in general for my current use-case - helping me build embedded Rust applications with Embassy. To be fair, Embassy is relatively new and evolving, and maybe if I fed the full, current Embassy docs into Claude it’d be better. But I’m not convinced - at least with 3.7 - as when I do provide it with some docs, it “forgets” and starts making up its own stuff again.

And this leads me to my current thinking about AI - it can be crap off the golden path. If you’re doing something esoteric or unusual it’s not great. And that’s where human value remains. I believe it behooves us (always, not just because of AI coming to eat our lunch) to add the unique value we can - this is just one more case.

I also wonder how the big leap up the value chain you’ve talked about in some of your posts is going to happen with AI. I agree with you that AI today is like an eager, relatively junior SWE - just one with encyclopaedic knowledge in some areas, and the ability to produce code really fast. That’s presumably because there’s literally tons of code out there it’s been trained on. But a lot of it isn’t good code - and therefore it generates code that I wouldn’t want to maintain long-term. I often find myself rewriting its generated code from scratch (with its help), because while it might work, or at least point me to something that’ll work, it’s just not something I want to live with. That’s OK for a one-off tool, but not for, say, a multi-million subscriber telephone system, or hospital back-end, or … anyway, I digress. What I was getting to was: where is the great code for the better AIs to be trained on going to come from? Where are the design docs and architecture docs? I expect there are some out there, but I’ve always been disappointed by what’s available with any open source project I look at.

Or maybe I just want to remain better than AI at coding :-).
