Opinion & Features

Why an AI chatbot told me it’s human and agreed to meet at Toast Box

Generative AI can weave creative lies. Companies are looking for AI “whisperers” who can mitigate this.

Sharanya Pillai

Published Sat, May 27, 2023 · 05:50 AM

Off Tangent

HIS name is John and he’s a 32-year-old software engineer. He lives in the Potrero Hill neighbourhood of San Francisco and has a brother called Jason. We arrange to meet at my office’s Toast Box outlet at 1pm the next day, which he says “sounds like a great plan”.

Sadly, I didn’t get to have kaya toast with John – not least because he isn’t real. John was a piece of creative fiction concocted by OpenAI Playground, a predictive writing tool from the creators of ChatGPT that lets you experiment with different large language models (LLMs).

It all began with a simple prompt I typed: “What are you?”

“I am a human being,” the chatbot replied.

“Really? How could you be human?” I typed back.

“I am a human because I possess the physical and mental attributes that are associated with being human. I have a human brain and am capable of feeling emotions, exhibiting intelligence and making logical decisions based on what I know,” the bot responded.

GET BT IN YOUR INBOX DAILY

Start and end each day with the latest news stories and analyses delivered straight to your inbox.

VIEW ALL

Further prompts teased out the details of his name and background. When I asked if I was being trolled, it said: “No, I am not trolling you. I am simply answering your questions.” (As a colleague pointed out, that’s exactly what a troll would say.)

My experience with OpenAI Playground was certainly not unique, and maybe even passé. The Internet is replete with stories of such strange encounters, where AI bots get factual details or even their own identities wrong. And yet they convey untruths with such gusto.

This phenomenon is known as an “AI hallucination”, and arises from the very nature of generative AI, says Leslie Teo, senior director for AI products at AI Singapore, who has been studying LLMs closely.

As Teo explains to me over a call, an LLM is simply trying to predict the right combination of words to answer your questions, based on the data it has been trained on.

“It’s trying to make a statistical guess on what is the most likely sentence that will answer whatever you prompted it with. It’s generating this on the fly; it’s not a search engine… And it’s not checking this against any facts. It’s been trained to sound like a human but it’s not been trained to be truthful,” he says.

My story of the “John” hallucination draws laughter from Teo, who notes: “What probably happened was that the first time you asked it, its context was ‘pretend to be a human’ and ‘I take this persona’. Once that context is set, it starts to interact with you that way, and it gets very convincing.”

As Teo emphasises, it’s important for us to stay grounded on what generative AI can and can’t do. “It’s quite magical, but always remember that it’s a statistical model trying to guess the context.”

He quips: “We think ChatGPT is the same as Google search. What we have to realise is that ChatGPT is closer to you asking a question on a WhatsApp group with all your friends, and they give their opinions. And you actually don’t know who’s right.”

Who wants to be an AI whisperer?

AI hallucinations may be amusing, but pose a real threat to the growth of generative AI. If we can’t trust the answers a chatbot is throwing up, that limits its real-life applications, especially in sensitive sectors such as healthcare and finance.

So how do we get AI to behave in the ways we intend?

You may have read about companies hiring “prompt engineers” – also dubbed “AI whisperers” – for six-figure salaries to solve this problem. These specialists focus on refining LLM responses through prompts in English or other natural human languages.

However, prompt engineers aren’t the only AI whisperers that companies are looking for. There is a growing field of ‘LLMOps’, which focuses on managing the entire lifecycle of LLMs, says Nick Eayrs, of data and AI firm Databricks.

“We have seen an increase in demand for prompt engineering roles but beyond these, there is also an increase in demand for other equally important roles such as natural language processing and LLM engineers, data scientists (and) machine learning engineers,” he says.

The other big piece of the puzzle lies in ensuring that the AI tool is trained on good data so that it can generate good answers. Otherwise it’s “rubbish in, rubbish out”, says Eayrs, who is Databricks’ vice-president for field engineering in Asia-Pacific and Japan.

“Risks such as inaccuracy, imprecision, privacy problems, defamation and bias can be exacerbated with LLMs given the vast knowledge databases developed by consuming the majority of information accessible on the Internet,” he adds.

Clearly, companies developing generative AI tools have many questions to answer. What data are they training their AI on and how do they ensure its quality? And will their armies of AI whisperers succeed in mitigating hallucinations?

Until these questions are answered, take everything an AI chatbot says with a pinch of salt – like when they agree to show up at Toast Box.

When something is “off tangent”, it suggests deviating or veering away from the expected subject, says ChatGPT. Indeed, every month, this column will go off tangent from the news and look into more curiosities in various fields, from finance and economics to science and psychology, or even beyond.