Big Tech is trapped in a glass house on AI data snatching

After years of exploiting user data, Big Tech firms are finding the tables turned as they grab it from each other

Taking text from the transcripts of YouTube videos suggests OpenAI has been digging between the proverbial couch cushions, even at the risk of breaking someone’s rules. PHOTO: REUTERS

Parmy Olson

Published Tue, Apr 9, 2024 · 05:22 PM

A FEW weeks ago, Mira Murati, the chief technology officer of OpenAI, was asked if her company had used YouTube videos to train its artificial intelligence (AI) systems.

First, she gave a blank stare. Then there was a grimace. Finally, she gave an answer that avoided the messy and furtive world in which her company and other tech firms were operating: “Actually, I’m not sure about that.”

According to a report by The New York Times, OpenAI had in fact trained its AI on “more than one million hours of YouTube videos”, which it transcribed using a speech recognition tool called Whisper. All the conversational text from the transcriptions was used to train GPT-4, the flagship large language model that underpins ChatGPT.
