China Issues National Standards on AI-Generated Synthetic Content for Comment
Published 25 September 2024
Yu Du
On 14 September 2024, the Cyberspace Administration of China (CAC) released the draft Measures for Identifying AI-Generated Synthetic Content (Draft Measures), inviting public comments until 14 October 2024. A supporting draft mandatory national standard, Cybersecurity Technology: Methods for Identifying AI-Generated Synthetic Content (Draft Standards), was released concurrently, with feedback due by 13 November 2024.
The purpose of these draft measures and standards is to address the challenges posed by the rapid development of AI-generated content, such as text, images, audio, and video. AI technologies, with their ability to produce realistic synthetic content, raise concerns about misinformation, fraud, and national security risks. China aims to establish mechanisms to clearly identify such content, ensuring transparency and maintaining cybersecurity while protecting public interests.
[Key Provisions of the Draft Measures]
The Draft Measures put forward a clear requirement that both explicit and implicit markers must be applied to AI-generated content. The provisions apply to organizations and individuals that provide AI-based services, generate or synthesize content using AI, or distribute such content on various platforms. These rules do not apply to organizations that develop AI technologies but do not provide services to the domestic public.
1. Definition of AI-Generated Content
AI-generated content includes any information created or synthesized using AI, such as text, images, audio, and video. The measures require both explicit and implicit labeling of AI-generated content. Explicit labels are visible to users and presented as text, audio cues, graphics, and so on, while implicit labels are embedded within the data of the AI-generated content and are not easily detectable by users.
2. Explicit Labeling Requirements (Article 4)
Service providers must add explicit labels to AI-generated content in specific ways:
1) Text - Labels must be added at appropriate positions (beginning, middle, or end) or in the user interface (see the text-labeling sketch after this list).
2) Audio - Voice prompts or rhythm indicators should be included at suitable points (beginning, middle, or end).
3) Images - Clear labels must be placed in appropriate areas.
4) Videos - Prominent labels should be added at the start and during playback.
5) Virtual Scenarios - Labels should be added at the start and throughout the virtual environment.
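As a concrete illustration, here is a minimal sketch of how a provider might attach an explicit label to generated text. Only the label wording borrows the standardized phrasing quoted later from the Draft Standards ("Generated by AI"); the function and its parameters are hypothetical and not taken from the drafts themselves.

```python
# A minimal sketch of explicit labeling for text output, assuming a
# hypothetical provider-side pipeline. Only the label wording comes from
# the Draft Standards; the function name and parameters are illustrative.
AI_LABEL = "[Generated by AI]"

def label_generated_text(text: str, position: str = "beginning") -> str:
    """Attach an explicit label at the beginning or end of AI-generated text."""
    if position == "beginning":
        return f"{AI_LABEL} {text}"
    if position == "end":
        return f"{text} {AI_LABEL}"
    raise ValueError("position must be 'beginning' or 'end'")

print(label_generated_text("A short synthetic paragraph."))
# -> [Generated by AI] A short synthetic paragraph.
```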
3. Implicit Labeling Requirements
Providers must embed metadata in AI-generated content files, including details like the content’s attributes, provider name or code, and content identification number. They are encouraged to use digital watermarks as implicit labels.
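For illustration, the sketch below shows what such an embedded payload could look like. The field names are assumptions modeled on the description above (content attribute, provider name or code, content identification number); the authoritative keys and container format are defined in the Draft Standards.

```python
# A minimal sketch of an implicit-label payload; the field names are
# assumptions modeled on the Draft Measures' description, not the
# authoritative keys defined in the Draft Standards.
import json
import uuid

def build_implicit_label(provider_code: str) -> dict:
    """Assemble the metadata an implicit label is described as carrying."""
    return {
        "content_attribute": "AI-generated",  # nature of the content
        "provider": provider_code,            # service provider name or code
        "content_id": uuid.uuid4().hex,       # content identification number
    }

# The serialized payload would be written into the file's metadata
# container (e.g. an XMP field for images); a digital watermark would
# instead embed comparable data in the signal itself.
print(json.dumps(build_implicit_label("example-provider-001"), indent=2))
```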
4. Content Dissemination Platforms’ Responsibilities (Article 6)
Content dissemination platforms must take measures to regulate AI-generated content:
1) Check for implicit labels in the file's metadata; if found, add clear visible labels to inform users that the content is AI-generated.
2) If no implicit label is detected but the user declares the content to be AI-generated, the platform must still add visible labels.
3) If neither an implicit label nor a user declaration is present but traces of AI generation are detected, the platform should add a “suspected AI-generated” label.
4) For confirmed or suspected AI-generated content, platforms must also add the relevant metadata (a sketch of this decision flow follows the list).
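This decision flow can be summarized in a short sketch. The function below is a hypothetical illustration: the metadata key it inspects matches the implicit-label sketch above, and the detection of AI traces is abstracted into a boolean input.

```python
# A minimal sketch of the platform-side decision flow described above.
# All names are illustrative; a real system would pair this logic with a
# metadata parser and an AI-content classifier, and would also write the
# relevant metadata for confirmed or suspected content (step 4).
from typing import Optional

def visible_label(implicit_label: Optional[dict],
                  user_declared_ai: bool,
                  ai_traces_detected: bool) -> Optional[str]:
    """Return the visible label a platform should attach, if any."""
    if implicit_label and implicit_label.get("content_attribute") == "AI-generated":
        return "AI-generated"            # 1) implicit label found in metadata
    if user_declared_ai:
        return "AI-generated"            # 2) uploader declares AI origin
    if ai_traces_detected:
        return "suspected AI-generated"  # 3) traces detected, no label or declaration
    return None                          # nothing indicates AI generation

print(visible_label(None, user_declared_ai=False, ai_traces_detected=True))
# -> suspected AI-generated
```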
5. Platform and Service Provider Obligations
Service providers must inform users of the methods and styles of AI-content labeling in their service agreements. They must retain logs relating to unmarked AI-generated content for at least six months and must not maliciously delete or alter labels.
[Key Points in the Draft Standards]
The Draft Standards set out the technical detail of the identification methods for AI-generated content, specifying requirements for both explicit and implicit markers and how these identifiers are to be implemented across various forms of media.
1. Technical Requirements
The methods outlined must reliably identify AI-generated content through various techniques, such as metadata analysis, visual or audio markers, or embedded digital signatures. These markers are intended to trace back to the source or the AI model used in generating the content, ensuring accountability.
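One way to picture the traceability requirement is a record that cryptographically binds content to the model that produced it. The sketch below uses an HMAC for this purpose; it is purely illustrative, as the Draft Standards specify their own signing and watermarking mechanisms, and the key handling shown is a placeholder.

```python
# An illustrative sketch of a traceability record using a hypothetical
# HMAC scheme; the Draft Standards' actual mechanisms may differ.
import hashlib
import hmac

PROVIDER_KEY = b"example-secret-key"  # placeholder for a provider-held signing key

def trace_record(content: bytes, model_id: str) -> dict:
    """Bind content bytes to the generating model for later attribution."""
    signature = hmac.new(PROVIDER_KEY, content, hashlib.sha256).hexdigest()
    return {"model_id": model_id, "signature": signature}

print(trace_record(b"synthetic image bytes", model_id="example-model-v1"))
```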
2. Implementation Guidelines
Platforms and content providers must implement automated systems capable of detecting and labeling AI-generated content. Such systems should be designed to function across different formats, including text, image, audio, and video content. The detection and labeling process should be seamless, ensuring that AI-generated content is always appropriately marked, regardless of the platform where it is shared.
3. Mandatory Labeling Specifications
AI-generated content must be labeled using standardized terms like “Generated by AI” or “AI-Synthesized”. These labels should remain intact and clearly visible, even if the content is modified or shared across platforms. This ensures that users are informed about the origin of the content at all times, promoting transparency and reducing the risk of deception.
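A simple conformance check illustrates the persistence requirement. The helper below is hypothetical, testing only whether one of the standardized label strings survived an edit; it is not drawn from the Draft Standards' own test procedures.

```python
# A minimal sketch of a label-persistence check, assuming the two
# standardized label strings quoted above; the function itself is
# illustrative, not drawn from the Draft Standards' test procedures.
STANDARD_LABELS = ("Generated by AI", "AI-Synthesized")

def label_survives(content_text: str) -> bool:
    """Check whether a standardized explicit label is still present."""
    return any(label in content_text for label in STANDARD_LABELS)

edited = "Generated by AI: a short synthetic paragraph, lightly edited."
print(label_survives(edited))  # -> True
```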
[Comment]
These draft measures and standards reflect China’s growing attention to AI-generated content and the potential risks associated with it. By setting clear rules and guidelines for identifying such content, the government aims to ensure that users can easily distinguish between human-created and AI-generated materials. This transparency is essential in preventing the spread of misinformation and maintaining trust in online content. At the same time, these measures encourage the responsible development and use of AI technologies, while mitigating potential risks to society. The introduction of both explicit and implicit markers for AI-generated content is a practical step. We will closely follow how these measures and standards are applied in practice, as they will play a crucial role in shaping the future of AI governance in China.