Hello, this is Claude
Teaching Claude to phone me when it gets stuck
These days I’ve got a new morning routine. Get up at 6. Check in with the AI agents on overnight progress. Unblock anything that’s stuck. Kick off the next set of tasks. Then, once they are all busy, I head out for a walk in the hills. And, a few hours later, I’m back checking on the agents again.
But sometimes the agents get stuck while I’m away. They sit patiently waiting for further instructions. If only they could ask for help…
And then I had an idea. What if the agents could call me? Could they give me a quick update? Could I unblock them and set them on their way?
In theory I can do this with remote desktop and a VPN. Or Claude-Code-on-the-web. But phone screens don’t work well when it’s raining. And it’s almost always raining at this time of year. So a voice interface would be better.
The initial architecture was simple. Three new skills plus an MCP server plus OpenAI’s speech APIs (text-to-speech, and Whisper for speech-to-text). And a Twilio account to connect to the phone network.
But then I looked at the costs. 8p a minute. Of which 6.7p was the phone call. Hmm. I can’t avoid text-to-speech or speech-to-text. But I’ve already got a data connection on my phone. Couldn’t I use that?
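To make the arithmetic concrete, here is a rough sketch of the split. It assumes everything that isn't the phone leg is speech processing, and the 10-minute call length is just an illustrative figure, not a measurement.

```python
from decimal import Decimal

# Per-minute figures quoted above, in pence.
TOTAL_PER_MIN = Decimal("8.0")
PHONE_PER_MIN = Decimal("6.7")

# Assumption: whatever is left after the phone leg is
# text-to-speech plus speech-to-text.
speech_per_min = TOTAL_PER_MIN - PHONE_PER_MIN
phone_share = PHONE_PER_MIN / TOTAL_PER_MIN * 100


def cost_breakdown(minutes):
    """Return (cost with phone leg, cost without it) in pence."""
    return TOTAL_PER_MIN * minutes, speech_per_min * minutes


with_phone, without_phone = cost_breakdown(10)  # hypothetical 10-minute call
print(f"Speech processing: {speech_per_min}p per minute")
print(f"Phone leg share of the total: {phone_share:.1f}%")
print(f"10 minutes: {with_phone}p with the phone leg, {without_phone}p without")
```

In other words, the phone leg is most of the bill, which is why swapping it for a data connection is worth the effort.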
I briefly considered building a proprietary client or WebRTC. Claude also suggested a Signal bot. But the ideal would be to use SIP/RTP. Those are the standard protocols behind the telephony network. Protocols I’m familiar with. "All" I needed was a SIP softphone on my phone plus a SIP/RTP stack for my PC.
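For a flavour of what the RTP half of such a stack deals with, here is a minimal sketch of the 12-byte fixed RTP header from RFC 3550. The function names are hypothetical and this is not code from the stack described in this post. It ignores CSRC lists, header extensions and padding. Payload type 0 is G.711 µ-law (PCMU), where 20 ms of audio at 8 kHz is 160 samples.

```python
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no CSRCs, extension or padding)."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",  # network byte order: 1 + 1 + 2 + 4 + 4 = 12 bytes
        byte0,
        byte1,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def unpack_rtp_header(packet):
    """Parse the fixed header from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# One 20 ms PCMU frame: header plus 160 bytes of (dummy) audio.
header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234ABCD)
packet = header + b"\x00" * 160
print(unpack_rtp_header(packet))
```

The header is the easy part. A real stack also needs SIP signalling, SDP negotiation, jitter handling and much else besides.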
Finding a SIP/RTP stack
But where to find a SIP/RTP stack? Ideally one built in Rust?
Last September I experimented with building a Rust wrapper around baresip, a well-known open-source SIP/RTP stack built in C. But it didn’t work very well, mostly because baresip and Rust fought over owning the event loop. I parked the idea, concluding it was beyond current models.
One day I shared the story with a colleague; they asked why I wasn’t just building the stack from scratch. What a daft idea, I thought, as I heard the words "great idea" coming from my mouth.
But a few months later, models had moved on. We had Sonnet 4.5, Codex 5.1. So I kicked it off. Downloaded the SIP/RTP specs. Iterated on the design with the models. And set them off building. It was also a chance to play with the ralph-wiggum plugin - an agentic loop that keeps iterating until it succeeds. So Claude was able to work all night provided the task was well specified.
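The "keep iterating until it succeeds" idea can be sketched in a few lines. This is not the plugin's actual implementation: `run_agent`, `is_done` and the iteration cap are hypothetical stand-ins, and the fake agent below simply pretends to succeed on its third attempt.

```python
def iterate_until_done(run_agent, is_done, max_iterations=50):
    """Re-run the agent on the same task until a completion check passes.

    Returns the number of attempts used, or None if the cap is reached.
    """
    for attempt in range(1, max_iterations + 1):
        output = run_agent(attempt)
        if is_done(output):
            return attempt
    return None


# Stand-in for a real agent run: "succeeds" on the third attempt.
def fake_agent(attempt):
    return "ALL TESTS PASS" if attempt >= 3 else "2 tests failing"


attempts_used = iterate_until_done(fake_agent, lambda out: "PASS" in out)
print(f"Finished after {attempts_used} attempts")
```

The completion check is what matters here. If "done" is vague, the loop either stops too early or never stops, which is why the task has to be well specified.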
Partway through, Opus 4.5 appeared and joined in. And soon I had three SIP/RTP stacks in various states of completion. Could I use any of them?
After a bit of work, we’d merged the three stacks together into one, added end-to-end testing (using Asterisk, an open-source SIP/RTP PBX), taken the code and branch coverage to 100%, created a "callme" MCP server and put together the necessary skills.
And then I tried it. And it worked.
Wow.
And so?
There’s something odd about getting a call from Claude while I’m halfway up a hill. I’m not sure I like being at the beck and call of an AI. Although it is useful to keep things moving while I’m not around. Plus it’s not like humans. If I ignore the call, Claude doesn’t mind.
But the rate of change is scary. Twelve months ago this was impossible. Six months ago I fumbled trying to wrap an existing C library. And now?
Fifteen years ago SIP/RTP stacks were big complicated things that took teams of 20+ people years to build. Did Claude and co really just build a working one?
I’m sure it is imperfect in many ways. I’ve no idea how performant it is. Or how scalable. Or how resilient. But for my use case it works fine. And that’s good enough. Claude can now call me whenever it wants. All through the day. It’s already getting irritating.
I spent fifteen years being on-call for production systems. And now I’m on-call for Claude.

