Did a robot write this? The need for watermarks to spot AI

Published Mon, Dec 12, 2022 · 03:53 PM

Parmy Olson

A TALENTED scribe with stunning creative abilities is having a sensational debut.

ChatGPT, a text-generation system from San Francisco-based OpenAI, has been writing essays, screenplays and limericks after its recent release to the public. These are usually produced in seconds, and often to a high standard. Even its jokes can be funny. Many scientists in the field of artificial intelligence (AI) have marvelled at how humanlike it sounds.

And, remarkably, it will soon get better. OpenAI is widely expected to release its next iteration of the system, known as GPT-4, in the coming months; early testers say it is better than anything before.

But all these improvements come with a price. The better AI gets, the harder it will be to distinguish between human and machine-made text. OpenAI needs to prioritise labelling the work of machines, or we could soon be overwhelmed with a confusing mishmash of real and fake information online.

For now, the company is putting the onus of honesty on users. Its policy for ChatGPT states that users should clearly indicate AI-generated content “in a way that no reader could possibly miss” or misunderstand.


To that, I say, “good luck”.

AI will almost certainly kill college essays. A student in New Zealand has already admitted that they used the technology to help boost their grades. Governments will use it to flood social networks with propaganda, spammers to write fake Amazon reviews and ransomware gangs to write more convincing phishing emails.

None of these will point to the machine behind the curtain.

And you will just have to take my word for it when I say this column was fully drafted by a human, too. 

AI-generated text desperately needs some kind of watermark, similar to how stock photo companies protect their images and film studios deter piracy. OpenAI already flags the output of another of its content-generating tools, Dall-E, with an embedded signature in each image it produces. But it is much harder to track the provenance of text. How do you put a secret, hard-to-remove label on words?

The most promising approach is cryptography.

In a guest lecture last month at the University of Texas at Austin, OpenAI research scientist Scott Aaronson gave a rare glimpse into how the company might distinguish text generated by the even more humanlike GPT-4 tool.

Aaronson, who was hired by OpenAI this year to tackle the issue of provenance, explained that text can be broken into a string of tokens, each representing a word, part of a word or a punctuation mark, drawn from a vocabulary of roughly 100,000 tokens. The GPT system could then choose among those tokens in a subtly biased way, so that the pattern of choices could later be detected by anyone holding a cryptographic key known only to OpenAI.

“This won’t make any detectable difference to the end user,” he said, adding that he had a working prototype.

He also said that anyone who uses a GPT tool would find it hard to scrub this watermark, even by rearranging the words or removing punctuation. The best way to defeat it would be to paraphrase the output using another AI system. But that takes effort, and not everyone would do that.
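Aaronson has not published the details of his prototype, but the general idea can be sketched in code. The snippet below is a minimal illustration, not OpenAI's method: it assumes a simplified "green list" scheme in which a keyed hash secretly splits the vocabulary at each step, the generator leans towards "green" tokens, and a detector holding the key counts how often they turn up. The key, function names and toy candidate lists are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key held by the provider (for illustration only).
SECRET_KEY = b"provider-secret-key"

VOCAB_SIZE = 100_000  # roughly the vocabulary size mentioned in the lecture


def is_green(prev_token: int, candidate: int) -> bool:
    """Pseudorandomly mark about half the vocabulary as 'green' for a given
    context, using a keyed hash so only the key holder can recompute it."""
    msg = f"{prev_token}:{candidate}".encode()
    digest = hmac.new(SECRET_KEY, msg, hashlib.sha256).digest()
    return digest[0] % 2 == 0


def watermark_choice(prev_token: int, ranked_candidates: list[int]) -> int:
    """At generation time, prefer the highest-ranked candidate that falls in
    the green set; fall back to the top candidate if none do."""
    for tok in ranked_candidates:
        if is_green(prev_token, tok):
            return tok
    return ranked_candidates[0]


def detect(tokens: list[int]) -> float:
    """A detector with the key counts the fraction of green tokens.
    Ordinary text scores near 0.5; watermarked text scores much higher."""
    if len(tokens) < 2:
        return 0.0
    hits = sum(is_green(prev, cur) for prev, cur in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)


if __name__ == "__main__":
    # Toy usage: pretend these are token IDs produced with the biased sampler.
    text = [42]
    for _ in range(50):
        # Hypothetical ranked candidates from a language model at each step.
        candidates = [(text[-1] * 7 + i) % VOCAB_SIZE for i in range(1, 6)]
        text.append(watermark_choice(text[-1], candidates))
    print(f"green fraction: {detect(text):.2f}")
```

On ordinary text, roughly half the token choices would land in the green set by chance; on text produced with the biased sampler, the fraction is far higher. That statistical gap, invisible to the reader, is the signal a key holder could check.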

Still, even assuming Aaronson's method works outside a lab, OpenAI has a quandary. Does it release the watermark keys to the public, or hold them privately?

If the keys are made public, professors everywhere could check their students’ essays to make sure they are not machine-generated, in the same way that many do now to check for plagiarism. But that would also make it possible for bad actors to detect the watermark and remove it.

Keeping the keys private, meanwhile, creates a potentially powerful business model for OpenAI: charging people for access. IT administrators could pay a subscription to scan incoming e-mail for phishing attacks, while colleges could pay a group fee for their professors – and the price to use the tool would have to be high enough to put off ransomware gangs and propaganda writers. OpenAI could, essentially, make money from halting the misuse of its own creation. 

We also should bear in mind that technology companies do not have the best track record for preventing misuse of their systems, especially when they are unregulated and profit-driven. (OpenAI says it is a hybrid for-profit and non-profit company that will cap its future income.) But the strict filters that OpenAI has already put in place to stop its text and image tools from generating offensive content are a good start.

Now, OpenAI needs to prioritise a watermarking system for its text. Our future looks set to become awash with machine-generated information. This will come not just from OpenAI's increasingly popular tools, but from a broader rise in "synthetic" data, used to train AI models and replace human-made data. Images, videos, music and more will increasingly be artificially generated to suit our hyper-personalised tastes.

It is possible, of course, that our future selves will not care if a catchy song or cartoon originated from AI. Human values change over time. For instance, we care much less now about memorising facts and driving directions than we did 20 years ago. At some point, watermarks might not seem so necessary.

But for now, with tangible value placed on human ingenuity that others pay for, or grade, and with the near certainty that OpenAI’s tools will be misused, we need to know where the human brain stops and machines begin. A watermark would be a good start. BLOOMBERG
