China proposes blacklist of training data for generative AI models

Published Fri, Oct 13, 2023 · 09:03 AM

CHINA has published proposed security requirements for firms offering services powered by generative artificial intelligence (AI), including a blacklist of sources that cannot be used to train AI models.

Generative AI, popularised by the success of OpenAI’s ChatGPT chatbot, learns how to take actions from past data, and creates new content such as text or images based on that training.

The requirements were published on Wednesday (Oct 11) by the National Information Security Standardization Committee, which includes officials from the Cyberspace Administration of China (CAC), the Ministry of Industry and Information Technology, and the police.

The committee proposes conducting a security assessment of each body of content used to train public-facing generative AI models, with those containing “more than 5 per cent of illegal and harmful information” to be blacklisted.

Such information includes “advocating terrorism” or violence, as well as “overthrowing the socialist system”, “damaging the country’s image”, and “undermining national unity and social stability”.

The draft rules also state that information censored on the Chinese Internet should not be used to train models.

Its publication comes just over a month after regulators allowed several Chinese tech firms, including search engine giant Baidu, to launch their generative AI-driven chatbots to the public.

Since April, the CAC has said it wants firms to submit security assessments to authorities before launching generative AI-driven services to the public.

In July, the cyberspace regulator published measures governing such services that analysts said were far less onerous than measures outlined in an April draft.

The draft security requirements published on Wednesday require organisations training these AI models to seek the consent of individuals whose personal information, including biometric data, is used for training purposes.

They also lay out detailed guidelines on how to avoid intellectual property violations.

Countries globally are grappling with setting guardrails for the technology. China sees AI as an area in which it wants to rival the US and has set its sights on becoming a world leader in the field by 2030. REUTERS