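[Paper Review] Towards Probabilistic Verification of Machine Unlearning
The paper proposes a probabilistic framework for verifying machine unlearning in MLaaS through backdoor data poisoning. It formalizes verification as a hypothesis test and shows high-confidence detection at realistic user participation levels.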
The right to be forgotten, also known as the right to erasure, is the right of individuals to have their data erased from an entity storing it. The status of this long held notion was legally solidified recently by the General Data Protection Regulation (GDPR) in the European Union. Consequently, there is a need for mechanisms whereby users can verify if service providers comply with their deletion requests. In this work, we take the first step in proposing a formal framework to study the design of such verification mechanisms for data deletion requests -- also known as machine unlearning -- in the context of systems that provide machine learning as a service (MLaaS). Our framework allows the rigorous quantification of any verification mechanism based on standard hypothesis testing. Furthermore, we propose a novel backdoor-based verification mechanism and demonstrate its effectiveness in certifying data deletion with high confidence, thus providing a basis for quantitatively inferring machine unlearning. We evaluate our approach over a range of network architectures such as multi-layer perceptrons (MLP), convolutional neural networks (CNN), residual networks (ResNet), and long short-term memory (LSTM), as well as over 5 different datasets. We demonstrate that our approach has minimal effect on the ML service's accuracy but provides high confidence verification of unlearning. Our proposed mechanism works even if only a handful of users employ our system to ascertain compliance with data deletion requests. In particular, with just 5% of users participating, modifying half their data with a backdoor, and with merely 30 test queries, our verification mechanism has both false positive and false negative ratios below $10^{-3}$. We also show the effectiveness of our approach by testing it against an adaptive adversary that uses a state-of-the-art backdoor defense method.
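Research Motivation and Objectives
- Formalize the verification of machine unlearning as a hypothesis testing problem, so that compliance with deletion requests can be quantified.
- Introduce a backdoor-based mechanism in which privacy-conscious users poison a fraction of their own data to leave a verifiable trace.
- Provide theoretical guarantees and closed-form expressions for verification confidence under different server behaviors (adaptive and non-adaptive).
- Validate the approach empirically on multiple datasets and architectures, showing high confidence with limited participation and few test queries.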
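The poisoning step in the second objective can be illustrated with a minimal sketch. The trigger form (a few fixed pixels set to a constant value), the helper names `make_backdoor`, `apply_backdoor`, and `poison_dataset`, and the image size are all assumptions chosen for illustration; the summary does not specify the exact backdoor pattern.

```python
import random

def make_backdoor(height, width, num_pixels, num_classes, rng):
    """Draw a user-specific trigger: a set of pixel positions and a target label.
    (Hypothetical trigger design, assumed for this sketch.)"""
    all_positions = [(r, c) for r in range(height) for c in range(width)]
    positions = rng.sample(all_positions, num_pixels)
    target_label = rng.randrange(num_classes)
    return positions, target_label

def apply_backdoor(image, positions, trigger_value=1.0):
    """Return a copy of `image` with the trigger pixels overwritten."""
    poisoned = [row[:] for row in image]
    for r, c in positions:
        poisoned[r][c] = trigger_value
    return poisoned

def poison_dataset(samples, positions, target_label, fraction, rng):
    """Backdoor a `fraction` of (image, label) samples and relabel them
    with the target label; the remaining samples are passed through unchanged."""
    n_poison = int(round(fraction * len(samples)))
    chosen = set(rng.sample(range(len(samples)), n_poison))
    result = []
    for i, (image, label) in enumerate(samples):
        if i in chosen:
            result.append((apply_backdoor(image, positions), target_label))
        else:
            result.append((image, label))
    return result

rng = random.Random(0)
positions, target = make_backdoor(28, 28, num_pixels=4, num_classes=10, rng=rng)
# Toy data: 20 all-zero 28x28 images with labels 0..9 (placeholder, not a real dataset).
clean = [([[0.0] * 28 for _ in range(28)], i % 10) for i in range(20)]
poisoned = poison_dataset(clean, positions, target, fraction=0.5, rng=rng)
```

After a deletion request, the user would query the service with freshly backdoored inputs and count how often the target label is returned; that count feeds the hypothesis test described in the method.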
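Proposed Method
- Model unlearning verification as a hypothesis test that distinguishes the scenario in which the data was deleted from the one in which it was not.
- Use user-side backdoor poisoning to leave a detectable trace in any model trained on the poisoned data.
- Define p and q as the backdoor success probabilities under H1 (data not deleted) and H0 (data deleted), respectively.
- Derive a closed-form expression for the deletion confidence $\rho_{A,\alpha}(s, n)$ from the binomial distribution and the test threshold.
- Analyze how p and q can be estimated from a finite number of queries, and give relaxed conditions for the single-user setting.
- Evaluate on five datasets (EMNIST, FEMNIST, CIFAR10, ImageNet, AG News) and four architectures (MLP, CNN, ResNet, LSTM).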
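The binomial test behind the deletion confidence can be sketched as follows. This is a minimal illustration, not the paper's exact derivation. It assumes the decision rule "reject H0 (declare the data not deleted) when at least a threshold number of the n backdoored queries return the target label", and it takes the confidence to be 1 minus the Type II error at a fixed Type I error bound alpha. The values p = 0.9 and q = 0.1 are hypothetical and are not results from the paper.

```python
from math import comb

def pmf(k, n, prob):
    """P[X = k] for X ~ Binomial(n, prob)."""
    return comb(n, k) * prob**k * (1 - prob) ** (n - k)

def upper_tail(k_min, n, prob):
    """P[X >= k_min] for X ~ Binomial(n, prob)."""
    return sum(pmf(k, n, prob) for k in range(k_min, n + 1))

def deletion_confidence(n, p, q, alpha):
    """Pick the smallest success-count threshold whose Type I error
    (rejecting H0 although the data was deleted) is at most `alpha`,
    then return (threshold, achieved_alpha, rho) with rho = 1 - beta.

    q: backdoor success probability under H0 (data deleted)
    p: backdoor success probability under H1 (data not deleted)
    """
    for threshold in range(n + 2):
        achieved_alpha = upper_tail(threshold, n, q)
        if achieved_alpha <= alpha:
            # Type II error: fewer than `threshold` successes although H1 holds.
            beta = sum(pmf(k, n, p) for k in range(threshold))
            return threshold, achieved_alpha, 1.0 - beta
    raise AssertionError("unreachable: threshold n + 1 always has zero Type I error")

# Hypothetical success probabilities, for illustration only.
threshold, achieved_alpha, rho = deletion_confidence(n=30, p=0.9, q=0.1, alpha=1e-3)
```

When p and q are well separated, a few dozen queries are enough to push both error types down, which is the intuition behind the small query budgets reported in the abstract.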
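Experimental Results
Research Questions
- RQ1: Can a backdoor-based strategy provide high-confidence verification of data deletion in MLaaS?
- RQ2: How many backdoored test samples (n) and what user participation rate (f_user) are needed to reach low false positive and false negative rates?
- RQ3: How does an adaptive server that defends against backdoors affect verification performance?
- RQ4: Do the results generalize across datasets and model architectures?
- RQ5: How can multiple privacy enthusiasts jointly improve verification without harming model utility?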
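Key Findings
- In the experiments, the backdoor-based mechanism reaches false positive and false negative rates below $10^{-3}$ when participating users poison 50% of their data and issue 30 test queries.
- Across 5 datasets and 4 architectures, the method delivers high verification confidence while preserving model accuracy.
- An adaptive server that applies a backdoor defense lowers the backdoor success rate, but unlearning can still be verified with high confidence.
- The framework remains effective with only 5% of users participating, and combining multiple users improves verification performance.
- The paper gives a closed-form analytical expression for the deletion confidence and discusses how to estimate p and q in practice.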
This review was generated by AI and checked by a human editor.