QUICK REVIEW

[Paper Review] Exact Certification of Data-Poisoning Attacks Using Mixed-Integer Programming

Philip Sosnin, Jodie Knapp|arXiv (Cornell University)|Feb 18, 2026

Adversarial Robustness in Machine Learning | Cited by 0

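One-Sentence Summary

The paper proposes a sound and complete verification framework that formulates the entire training, attack, and evaluation process as a single MIQCP, certifying the robustness of neural network training against data poisoning and yielding both the optimal poisoning attack and an exact robustness guarantee for a fixed training pipeline.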

ABSTRACT

This work introduces a verification framework that provides both sound and complete guarantees for data poisoning attacks during neural network training. We formulate adversarial data manipulation, model training, and test-time evaluation in a single mixed-integer quadratic programming (MIQCP) problem. Finding the global optimum of the proposed formulation provably yields worst-case poisoning attacks, while simultaneously bounding the effectiveness of all possible attacks on the given training pipeline. Our framework encodes both the gradient-based training dynamics and model evaluation at test time, enabling the first exact certification of training-time robustness. Experimental evaluation on small models confirms that our approach delivers a complete characterization of robustness against data poisoning.
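
To make "sound and complete" concrete, here is a minimal toy sketch. It is not the paper's MIQCP: it swaps the solver for brute-force enumeration and swaps the threat model for label flips on at most `k` training points, so that the attack set is finite. The 1-D linear model, the hinge-loss SGD loop, and the names `train`, `eval_loss`, and `certify` are all illustrative assumptions. Because every admissible attack is tried against a deterministic training run (fixed initialization and data order), the returned value is the exact worst case for this toy setting.

```python
from itertools import combinations


def train(data, lr=0.1, epochs=2):
    """Deterministic SGD on the hinge loss for a 1-D linear model w*x + b."""
    w, b = 0.0, 0.0  # fixed initialization (assumption of this sketch)
    for _ in range(epochs):
        for x, y in data:  # fixed data order
            if y * (w * x + b) < 1.0:  # hinge loss active: take a subgradient step
                w += lr * y * x
                b += lr * y
    return w, b


def eval_loss(params, test_data):
    """Mean hinge loss of the trained model on the test points."""
    w, b = params
    return sum(max(0.0, 1.0 - y * (w * x + b)) for x, y in test_data) / len(test_data)


def certify(train_data, test_data, k):
    """Try every label-flip attack on at most k points; return (worst loss, flipped indices)."""
    worst = (eval_loss(train(train_data), test_data), ())  # the empty attack
    for size in range(1, k + 1):
        for idx in combinations(range(len(train_data)), size):
            poisoned = [(x, -y if i in idx else y)
                        for i, (x, y) in enumerate(train_data)]
            loss = eval_loss(train(poisoned), test_data)
            if loss > worst[0]:
                worst = (loss, idx)
    return worst


if __name__ == "__main__":
    train_data = [(-2.0, -1), (-1.0, -1), (1.0, 1), (2.0, 1)]
    test_data = [(-1.5, -1), (1.5, 1)]
    print(certify(train_data, test_data, k=1))
```

Enumeration scales exponentially in `k` and cannot handle continuous feature perturbations, which is one reason a mixed-integer formulation solved to global optimality is attractive for the same question.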

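Motivation and Objectives

Motivate the need for formal verification of training-time robustness against data-poisoning attacks.
Propose a joint training, attack, and evaluation formulation that yields exact certificates.
Develop an MIQCP-based framework that encodes data perturbations, training dynamics, and test-time evaluation.
Demonstrate exact certification on small models and provide strategies that improve scalability.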

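Proposed Method

Formulate data perturbation, training dynamics, and test-time evaluation as a single MIQCP.
Encode both bounded and arbitrary data-poisoning threat models using binary and continuous variables.
Represent ReLU activations and the hinge loss in a MIP-friendly form, using big-M constraints and bilinear inequalities.
Encode the forward and backward passes, parameter updates, and test-time predictions within the MIQCP.
Solve the MIQCP to global optimality to obtain the worst-case poisoning attack and an exact robustness certificate.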
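
As a hedged illustration of the big-M idea, the sketch below uses the standard textbook encoding of y = max(0, x) with one binary variable z and pre-activation bounds L <= x <= U (L < 0 < U): y >= 0, y >= x, y <= x - L(1 - z), y <= U z. Whether the paper uses exactly this form is an assumption; the function names are hypothetical. Instead of calling a solver, the code computes the feasible range of y for each value of z and shows that the only feasible output is the true ReLU value, which is what makes the encoding exact rather than a relaxation.

```python
def relu_bigm_interval(x, z, lower, upper):
    """Feasible range of y under the big-M ReLU constraints for fixed x and z.

    Constraints (assuming lower <= x <= upper and lower < 0 < upper):
        y >= 0,  y >= x,  y <= x - lower * (1 - z),  y <= upper * z
    Returns (lo, hi), or None when no y satisfies all four constraints.
    """
    lo = max(0.0, x)
    hi = min(x - lower * (1 - z), upper * z)
    return (lo, hi) if lo <= hi else None


def feasible_outputs(x, lower, upper):
    """Map each feasible value of the binary z to its feasible y-range."""
    out = {}
    for z in (0, 1):
        rng = relu_bigm_interval(x, z, lower, upper)
        if rng is not None:
            out[z] = rng
    return out


if __name__ == "__main__":
    for x in (-2.0, -1.0, 0.0, 1.5, 3.0):
        print(x, feasible_outputs(x, lower=-2.0, upper=3.0))
```

The constants `lower` and `upper` play the role of the big-M values: they must be valid bounds for the encoding to be correct, and looser bounds typically weaken the solver's continuous relaxation.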

Figure 1: Comparison of formulation tightness and optimization progress for the Iris dataset under an unbounded attack model ( $n=8$ ). Left: Tightness of the test-data Big-M constants, defined as $|U^{(i,\mathrm{test})}-L^{(i,\mathrm{test})}|$ . Right: Objective value (dashed) and dual bound (dotte

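Experimental Results

Research Questions

RQ1: Under a given threat model, can the training-time robustness of a specific training procedure be certified exactly?
RQ2: How can the entire training pipeline be encoded as a single MIQCP so as to avoid relaxation and over-approximation?
RQ3: Which solution strategies improve the practical performance of MIQCP-based exact certification for small models?
RQ4: What do optimal poisoning strategies look like empirically under the bounded and arbitrary-replacement threat models?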

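Key Findings

With fixed initialization and data ordering, the MIQCP formulation yields sound and complete certificates of robustness to data poisoning.
The method computes provably optimal poisoning attacks and exact robustness guarantees under the specified threat model.
Empirical results on small models show that exact certification is achievable and reveal the structure of optimal poisoning strategies.
The proposed improvements (heuristics, bound tightening, and an auxiliary-variable formulation) help control computational cost and tighten relaxations.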
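
Bound tightening matters because each big-M constant is derived from a lower bound L and an upper bound U on some quantity, and the gap |U - L| is the tightness measure reported in Figure 1. The sketch below shows generic interval arithmetic for one affine pre-activation. It is not the paper's tightening procedure, and it assumes fixed weights for simplicity, whereas in a training-time formulation the weights are themselves decision variables. The name `affine_bounds` is hypothetical.

```python
def affine_bounds(weights, bias, lo, hi):
    """Interval bounds on sum(w_j * x_j) + bias for x in the box [lo, hi].

    Assumes the weights are fixed numbers (a simplification of this sketch).
    """
    lower = upper = bias
    for w, l, u in zip(weights, lo, hi):
        if w >= 0:
            lower += w * l  # a positive weight attains its minimum at the lower input bound
            upper += w * u
        else:
            lower += w * u  # a negative weight attains its minimum at the upper input bound
            upper += w * l
    return lower, upper


if __name__ == "__main__":
    wide = affine_bounds([1.0, -2.0], 0.5, lo=[0.0, 0.0], hi=[1.0, 1.0])
    tight = affine_bounds([1.0, -2.0], 0.5, lo=[0.0, 0.0], hi=[0.5, 0.5])
    # A smaller input box gives a smaller gap U - L, hence smaller big-M constants.
    print(wide, wide[1] - wide[0])
    print(tight, tight[1] - tight[0])
```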

Figure 2: Optimal denial of service attack on the Diabetes dataset with $n=50,\epsilon=0,\nu=0.5$ . The red squares depict the points poisoned by the adversary. The rightmost figure depicts the training loss for the poisoned (green) vs original (blue) datasets.

This review was generated by AI and reviewed by a human editor.