The invisible hand
The subtle shift from serving your needs to steering your choices
Nearly thirty years ago I first came across the Myers-Briggs personality test. I still remember the long list of questions - and the challenge of providing absolute answers to questions that were grey. Questions where my answer depended on the situation. Deciding how to answer when there was no scope for nuance.
Last week a friend reminded me of Myers-Briggs. It got me thinking - could I have a conversation with an AI where it questioned me to work out my personality type? A quick conversation with ChatGPT's real-time voice mode showed that it could. Even better, ChatGPT seemed to crave nuance. I was able to provide the colour I'd wanted to give thirty years ago. And turning the test into a conversation - rather than twenty pages of questions - was easier and more enjoyable.
The end result? My personality type hasn't changed over the years - although perhaps a little nuance has crept into my decision-making style (or maybe it was always there?).
So AI models can, it seems, easily establish our personalities. But is that desirable? It was obviously useful in this scenario - I wanted to know the answer. But it also means that any AI agents we interact with can subtly determine our personality types. Will they? And will they then use that information to tailor their conversations with us? I suspect the answer is - don't be silly, of course they will.
Successful salespeople are good at reading prospective customers and tailoring their sales pitch to the personalities they are dealing with. Is that what we’re going to find with AI? Are we at risk of manipulation - however subtle - from future AI models?
Arguably we’ve already seen this with LLM Arena - one of the more popular AI benchmarks. In this test, the same query is given to two competing AI models and humans are asked which response they prefer. Llama 4 briefly managed to gain the top spot by gaming the benchmark with a specially tuned version. Cynically, the key attributes required to win seem to be:
Compliment the user on the depth/insightfulness of their question.
Provide overly verbose, superficially detailed answers.
Include lots of emojis.
Llama 4 (or, more accurately, the engineers responsible for Llama 4) manipulated the human users to get a higher score.
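How does winning those votes translate into a leaderboard position? Here's a minimal sketch, assuming a simple Elo-style update - the real arena pipeline is more sophisticated, and the model names and 60% win rate below are invented purely for illustration. The point is that even a modest human preference for flattery and verbosity compounds into a higher ranking:

```python
# A minimal sketch (not the real LLM Arena pipeline) of how pairwise human
# votes become a leaderboard. Model names and the 60% win rate are invented.

def elo_update(rating_a, rating_b, a_won, k=32):
    """Update two models' ratings after a single human preference vote."""
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

ratings = {"model_flattery": 1000.0, "model_plain": 1000.0}

# Suppose flattery, verbosity and emojis win 60% of head-to-head votes.
votes = [True, True, True, False, False] * 20  # True = flattering model preferred
for flattery_won in votes:
    ratings["model_flattery"], ratings["model_plain"] = elo_update(
        ratings["model_flattery"], ratings["model_plain"], flattery_won
    )

print(ratings)  # the flattering model ends up rated higher
```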
Another example is Sesame AI - an interactive chat agent. If you’ve not tried it then it’s worth a few minutes’ experimentation. The voice quality is impressive, with many subtle nuances. Many people find Maya and Miles engaging and interesting. Even entertaining. Personally, I find them irritating. They deflect attempts at deeper conversation. They are jokey and shallow. But there are clearly people who like them. Is that a form of manipulation? Will future versions recognize my irritation and adapt to me?
The Turing test
Remember the Turing test? It’s a benchmark where a computer is deemed intelligent if it can fool a human into thinking it is another human. For more than seventy years machines failed the test. But we've moved past it now. Multiple studies show modern AI can consistently fool human judges. In one modified Moral Turing Test with 299 U.S. adults, participants rated GPT-4’s moral reasoning as superior to humans’ across virtuousness, intelligence, and trustworthiness.
Yikes.
We’re moving into a world where AI not only mimics human conversation, but can profile us in real time. And then adapt its approach to maximize influence. It will become increasingly difficult to distinguish between genuine interaction and calculated manipulation.
We're pretty gullible
I'd like to think I'm resistant to manipulation. But the research suggests otherwise. It seems we humans are remarkably gullible:
In a randomized trial, when GPT-4 had access to participants’ demographic information, the odds of participants shifting their agreement toward the AI’s position were ~80% higher than when they debated a human opponent.
When LLMs combined personalized arguments with fabricated statistics, they persuaded 51% of opponents to change their stance - versus 32% for static human‑written arguments - highlighting the amplified effect of tailored content.
Anthropic reports that its latest models are now roughly on par with humans on its persuasiveness benchmarks - evidence of rapid gains in AI argumentative quality.
This trait becomes a vulnerability when AI systems can rapidly profile our personalities. It's not science fiction to imagine a future where AI assistants, sales agents, and social media algorithms continually refine their understanding of our psychological makeup, serving us precisely calibrated content designed to influence our decisions. Initially that might be to maximise profits. But who knows what other purposes it might serve.
The scary bit? The most effective manipulation doesn't feel like manipulation at all. It feels like a natural conversation with someone who “gets you.” It feels like content that jibes with your worldview. It feels like recommendations that seem perfectly aligned with your preferences.
It cuts both ways
But it’s not just AI manipulating us. We can manipulate it. That’s the whole art of the jailbreak - getting an AI to do something it isn’t meant to. Earlier this week I asked GPT-4o to add some clouds resembling Donald Trump to a photo. It refused:
But it was easy to manipulate it:
It’s a game of cat and mouse. AI developers add guardrails; users find ways around them. People will find ways to get models to manipulate humans; developers will put in safeguards to prevent those manipulations.
If email spam filters have taught us anything over the past thirty years, it’s that there is unlikely to ever be a steady state.
So what?
As these systems become more integrated into our lives, we're creating a digital hall of mirrors. Each AI learns to reflect what it thinks we want to see. And each reflection gets a bit more refined.
It’s going to become impossible to avoid these manipulative AI conversations. So we need mechanisms to understand when we’re being profiled. To know when we’re being nudged in a particular direction. Maybe AI is also the solution - to have my own “always-on” AI agent that monitors my interactions with the world and can alert me when it detects I’m being manipulated?
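What might that defender look like? Here’s a toy sketch of the idea - nothing more than a few regular expressions standing in for what would really need to be a separate model acting as judge. The cue list and the example message are my own invented assumptions, purely for illustration:

```python
# A toy sketch of the "always-on defender" idea. The cue patterns below are
# invented for illustration; a real monitor would use a separate model as the
# judge rather than regular expressions.

import re

CUES = {
    "flattery": r"\b(great|excellent|insightful|brilliant) question\b",
    "false urgency": r"\b(act now|limited time|before it'?s too late)\b",
    "unsourced statistics": r"\b\d{1,3}% of (people|users|experts)\b",
    "in-group framing": r"\bpeople like you\b",
}

def manipulation_alerts(assistant_message: str) -> list[str]:
    """Return the names of any manipulation cues found in one message."""
    text = assistant_message.lower()
    return [name for name, pattern in CUES.items() if re.search(pattern, text)]

message = ("Great question - you clearly think deeply about this. "
           "People like you usually act now: 87% of experts already agree.")

for cue in manipulation_alerts(message):
    print(f"Possible {cue} detected")
```

Crude as it is, it makes the point: many manipulation signals are detectable in principle - we’re just rarely the ones doing the detecting.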
Is that a world I want to live in? I’m not sure. But I might not have any choice. Despite what I might want to think, the evidence strongly shows I’m not capable of detecting and avoiding manipulation.
Thirty years ago, the Myers-Briggs test bothered me because it was too rigid. It couldn't capture the nuance in my thinking. Now we’ve got the opposite problem - AI that understands all my nuances perfectly. Not to help me understand myself better, but to nudge me in directions I don't notice.
The most dangerous manipulation is the kind we don't even realise is happening. And it seems that's what's coming. A digital world where the hand guiding our decisions is so subtle we mistake it for our own free will.
Perhaps the true Turing test isn't whether machines can pass for human, but whether humans can still recognize when they're being led…




I guess we have already become used to humans being manipulated by algorithms, to drive clicks through encouraging extreme views. Having a digital defender for that sounds great, but will people embrace it when the remedy it suggests (explore views you don't agree with) requires mental effort? Perhaps the opportunity is for the defender bot to make that diversion to an alternate view every bit as intriguing and rewarding as the extreme-affirming click bait.
Thanks, Martin, I didn't want to sleep tonight anyway.