QUICK REVIEW

[Paper Digest] Security and Privacy on Generative Data in AIGC: A Survey

Tao Wang, Yushu Zhang|arXiv (Cornell University)|Sep 18, 2023

Privacy-Preserving Technologies in Data|Cited by 7

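One-Sentence Summary

This survey analyzes the security and privacy of generative data in AIGC through four information security properties (privacy, controllability, authenticity, and compliance) and reviews state-of-the-art countermeasures for each.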

ABSTRACT

The advent of artificial intelligence-generated content (AIGC) represents a pivotal moment in the evolution of information technology. With AIGC, it can be effortless to generate high-quality data that is challenging for the public to distinguish. Nevertheless, the proliferation of generative data across cyberspace brings security and privacy issues, including privacy leakages of individuals and media forgery for fraudulent purposes. Consequently, both academia and industry begin to emphasize the trustworthiness of generative data, successively providing a series of countermeasures for security and privacy. In this survey, we systematically review the security and privacy on generative data in AIGC, particularly for the first time analyzing them from the perspective of information security properties. Specifically, we reveal the successful experiences of state-of-the-art countermeasures in terms of the foundational properties of privacy, controllability, authenticity, and compliance, respectively. Finally, we show some representative benchmarks, present a statistical analysis, and summarize the potential exploration directions from each of these properties.

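Research Motivation and Objectives

Assess how generative data affects the privacy of real data and identify privacy threats in AIGC.
Examine controllability mechanisms that prevent misuse of generative data and copyright problems.
Evaluate authenticity, including detection and attribution methods for verifying generative data.
Analyze regulatory and compliance requirements and offer guidance for trustworthy generative data.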

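Proposed Approach

Classify the security and privacy requirements of generative data by information security property (privacy, controllability, authenticity, compliance).
Review and synthesize state-of-the-art countermeasures for each property (e.g., memorization protection, differential privacy, watermarking, access control, traceability).
Compare with existing surveys, highlighting the gap this survey fills by focusing on generative data rather than on AIGC in general.
Summarize the challenges identified and future directions to guide work on trustworthy generative data.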
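
To make the watermarking and traceability countermeasure named above more concrete, here is a minimal sketch, assuming a classic least-significant-bit (LSB) scheme on raw 8-bit samples. LSB embedding is a deliberately simple stand-in: it is fragile under compression or editing, and it is not claimed to be one of the robust methods the survey reviews. The function names `embed_watermark` and `extract_watermark`, the carrier bytes, and the 8-bit ID are all hypothetical.

```python
def embed_watermark(samples: bytes, bits: list[int]) -> bytes:
    """Write each watermark bit into the least significant bit of one byte."""
    if len(bits) > len(samples):
        raise ValueError("watermark is longer than the carrier")
    out = bytearray(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the watermark bit.
        out[i] = (out[i] & 0xFE) | (bit & 1)
    return bytes(out)


def extract_watermark(samples: bytes, length: int) -> list[int]:
    """Read back the first `length` least significant bits."""
    return [b & 1 for b in samples[:length]]


# Hypothetical data: 16 grayscale pixel values and an 8-bit source ID.
carrier = bytes(range(100, 116))
mark = [1, 0, 1, 1, 0, 0, 1, 0]

stamped = embed_watermark(carrier, mark)
recovered = extract_watermark(stamped, len(mark))
```

Each carrier byte changes by at most 1, which is why the mark is hard to see, and also why any lossy re-encoding would likely destroy it. Robust traceability needs schemes designed to survive such transformations.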

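Experimental Results

Research Questions

RQ1: What are the privacy risks to the real data used to train generative models, and what protections exist (privacy in AIGC vs. AIGC for privacy)?
RQ2: How can controllability (access control and traceability) be achieved to proactively prevent misuse of generative data?
RQ3: What methods exist for ensuring the authenticity of generative data (detection and attribution), and how effective are they?
RQ4: What regulatory and compliance requirements, including non-toxicity and factuality, apply to generative data?
RQ5: What open challenges and future directions remain for securing and protecting generative data in AIGC?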

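Key Findings

Privacy threats include large models memorizing training data and reproducing it in their outputs.
Techniques such as differential privacy, deduplication, and memorization rejection can mitigate privacy risks but may reduce utility.
AIGC for privacy uses virtual (synthetic) content to protect the privacy of real data; diffusion models offer strong generative capability for this purpose.
Controllability strategies include access control through perturbation and robust traceability through watermarking (of both models and data).
Watermarking supports copyright protection, authenticity checks, and content traceability across the whole generation pipeline.
Compliance concerns center on the non-toxicity and factuality of generative data.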
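
The privacy-utility trade-off noted for differential privacy can be illustrated with a small sketch. This is the textbook Laplace mechanism applied to a counting query, not a method taken from the survey (which concerns generative models); the helper names `private_count` and `mean_abs_error`, the true count of 1000, and the epsilon values are assumptions chosen for illustration.

```python
import random


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count under epsilon-DP using Laplace noise.

    A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rate = epsilon  # rate = 1 / scale, with scale = 1 / epsilon
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


def mean_abs_error(epsilon: float, trials: int = 20000, seed: int = 0) -> float:
    """Average absolute error of the noisy count over many releases."""
    rng = random.Random(seed)
    true_count = 1000  # hypothetical number of matching records
    total = sum(abs(private_count(true_count, epsilon, rng) - true_count)
                for _ in range(trials))
    return total / trials


strong_privacy_error = mean_abs_error(epsilon=0.1)
weak_privacy_error = mean_abs_error(epsilon=1.0)
```

The expected absolute error of Laplace noise equals its scale, so the error grows roughly as 1/epsilon: a smaller epsilon gives stronger privacy but a less accurate answer, which is the utility cost the finding refers to.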


This digest was generated by AI and reviewed by human editors.