The wrong terminal
Lessons in asking the right question and choosing the right model
On Saturday we found ourselves at Montreal Airport, looking for security. Near where we’d dropped our bags there was a large letter 'A'. However, the online app told us our plane was leaving from gate C73, so we figured we might need security 'C'. Unsure, I asked ChatGPT 4o.
The answer was long, but the summary at the end seemed clear:
But as we walked to find security 'C', something clicked in my brain and I realised the significance of the phrase "U.S. departures hall". We weren’t going to the US - maybe 'C' wasn’t right? Just to be sure, I asked a human airport worker, who confirmed 'C' was wrong. It was 'A' where we needed to go for an international flight.
Oops.
Reflecting later, I realised I made two mistakes.
A poor question
First, my question was poorly worded. Add a follow-up question with the destination and ChatGPT 4o comes back with a definitive answer.
But it’s not always consistent. Here’s the original question updated with the destination:
This time 4o seems to think we’re going to the wrong gate.
Interestingly, Claude does even worse. It’s convinced checkpoint 'C' is where we should go, even when told it’s an international flight.
That’s wrong. As Claude itself points out, Montreal has 'swing' gates that can handle either US or international traffic.
Yet the model isn’t able to pull these two separate pieces of information together when asked about gate 'C73' - could it be that mentioning 'C' makes the model blind to other information?
There’s been some recent research on this. For example, this paper found that adding "Interesting fact: cats sleep most of their lives" to a math problem more than doubled the chance of getting a wrong answer.
So a poor question was my first mistake. What was the second one?
The wrong model
I used the wrong model. Ask o3 the same question and you get the right answer:
And o3 knows about swing gates:
The problem is that o3 is slow. It took 3 minutes to figure this out. ChatGPT started replying almost immediately. Follow-up questions to o3 are equally slow. Just like Data from Star Trek, o3 thoroughly processes all information before reaching any conclusion. But unlike Data, who is immune to emotional pressure to hurry, you can make o3 speed up:
That’s two minutes quicker. But we can get it to go even faster:
Not quite under 10 seconds. But much better than 3 minutes.
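One way to make that speed/quality trade-off explicit is via the API rather than the chat UI. This is a minimal sketch assuming the OpenAI Chat Completions `reasoning_effort` parameter (which accepts "low", "medium", or "high"); the helper only builds the request payload, it doesn't call the API:

```python
# Sketch: trading o3's answer quality for speed with a reasoning-effort knob.
# Assumes the `reasoning_effort` parameter ("low" | "medium" | "high");
# this helper only constructs the request payload, it makes no network call.

def build_request(question: str, effort: str = "high") -> dict:
    """Build a chat payload for o3 with an explicit reasoning-effort level."""
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    return {
        "model": "o3",
        "reasoning_effort": effort,  # lower effort -> faster, shallower answers
        "messages": [{"role": "user", "content": question}],
    }

# "Hurrying" o3 is then just a matter of dialling the effort down:
fast = build_request("Which security checkpoint for gate C73?", effort="low")
```

In effect, the "please hurry" prompt in the chat UI is doing by persuasion what this parameter does directly.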
And so?
There are three lessons I can draw:
First, how you phrase the question matters. Poor questions get poor answers. The models will answer based on what you tell them and rarely, if ever, ask clarifying questions. I’m confident that if I’d asked a human my original question they’d have asked where I was flying to before answering. None of the models did.
Second, it’s important to pick the right model. I used ChatGPT 4o because I needed a quick answer. And I got a quick answer. Except it was wrong.
Model selection matters:
ChatGPT 4o is great for quick questions where you can easily check whether the answer is correct. Quick web searches. Interactive discussions. Things where there is an authoritative, well-socialised answer - for example, "what are the twenty-three design patterns created by the Gang of Four?" But where the answer is grungier and might have changed over time, it is less trustworthy.
Claude is similar to 4o but a little slower and a little more trustworthy - though, again, less so where the answer might be fuzzy.
o3 is great for questions that involve interacting with the real world, where things may have recently changed, where you’d search the web and where the results might be poisoned with out-of-date information. But it’s slower. You are very much trading time vs quality. Sure, you can make it go a bit faster, but even then it doesn’t match 4o.
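The trade-offs above can be sketched as a simple routing rule. This is a toy illustration with made-up keyword heuristics and nothing more - not a real classifier, and the model names are just the ones discussed in this post:

```python
# Toy sketch: route a question to a model based on the trade-offs above.
# The keyword heuristics here are illustrative assumptions, not a real
# classifier.

def pick_model(question: str, need_answer_fast: bool = False) -> str:
    """Pick a model: slow-but-careful o3 for time-sensitive, real-world
    questions; fast 4o for quick lookups with well-socialised answers."""
    q = question.lower()
    # Questions about the current state of the world favour a reasoning
    # model that can weigh conflicting, possibly out-of-date sources.
    time_sensitive = any(w in q for w in ("today", "currently", "latest", "now"))
    if time_sensitive and not need_answer_fast:
        return "o3"
    # Quick lookups with an authoritative answer are fine on a fast model.
    return "gpt-4o"
```

The point of the sketch is only that the routing decision exists at all: you are always choosing between speed and care, whether you do it consciously or not.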
But the final, and most important, lesson is knowing when to ask an AI rather than a human. We spent 10 minutes going the wrong way because I’d decided to experiment with AI instead of defaulting to my traditional ask-a-human approach. The human confidently sent us in the right direction with a smile. And I had no doubt they were correct.
Sometimes, the fastest path to the right answer is still asking a human.