QUICK REVIEW

[Paper Review] Radioactive data: tracing through training

Alexandre Sablayrolles, Matthijs Douze | arXiv (Cornell University) | Feb 3, 2020

Adversarial Robustness in Machine Learning | 38 references | 33 citations

TL;DR

The paper introduces radioactive data, a method that imprints imperceptible marks on a training dataset so that any model trained on it can be statistically identified as having used that data, with p-values below 1e-4 even when only 1% of the training data is marked.

ABSTRACT

We want to detect whether a particular image dataset has been used to train a model. We propose a new technique, radioactive data, that makes imperceptible changes to this dataset such that any model trained on it will bear an identifiable mark. The mark is robust to strong variations such as different architectures or optimization methods. Given a trained model, our technique detects the use of radioactive data and provides a level of confidence (p-value). Our experiments on large-scale benchmarks (Imagenet), using standard architectures (Resnet-18, VGG-16, Densenet-121) and training procedures, show that we can detect usage of radioactive data with high confidence (p < 10^-4) even when only 1% of the data used to train our model is radioactive. Our method is robust to data augmentation and the stochasticity of deep network optimization. As a result, it offers a much higher signal-to-noise ratio than data poisoning and backdoor methods.

Motivation & Objective

Enable traceability to determine if a dataset was used to train a model with statistical guarantees.
Develop a data marking technique that preserves task performance and is robust to training variations.
Provide both white-box and black-box detection methods for identifying radioactive data usage.

Proposed method

Introduce a class-specific additive mark (data isotopes) in latent space before the classification layer.
Backpropagate marks to image pixels to create visually imperceptible modifications (PSNR around 42 dB).
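In the white-box setting, when the suspect model uses a retrained or different feature extractor φ, align its feature space with that of the marking network using a linear mapping M fitted by regression.
Test for radioactive data using the cosine similarity between the carrier direction u and the learned classifier weights; under the null hypothesis this cosine follows an incomplete beta distribution, which yields the p-value.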
Combine multiple p-values across classes with Fisher’s method when marking multiple classes.
Provide black-box detection by comparing losses on marked vs vanilla samples or via distilled student models.
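
The statistical test in the list above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the null hypothesis is that the carrier u is a uniformly random direction in R^d relative to the classifier weights, so the cosine has density proportional to (1 - t^2)^((d-3)/2), whose tail is the incomplete beta function. The function names (cosine_similarity, cosine_pvalue, fisher_combine), the Simpson integration, and the demo values are all assumptions made for this sketch. It uses only the Python standard library.

```python
import math

def cosine_similarity(u, w):
    """Cosine of the angle between carrier u and classifier weights w."""
    dot = sum(a * b for a, b in zip(u, w))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_w = math.sqrt(sum(b * b for b in w))
    return dot / (norm_u * norm_w)

def cosine_pvalue(tau, d, n=20000):
    """One-sided p-value P(cos >= tau) for a uniformly random direction in R^d.

    Integrates the null density of the cosine with Simpson's rule
    (n must be even). Assumes d >= 3.
    """
    if d < 3:
        raise ValueError("this sketch assumes d >= 3")
    tau = max(-1.0, min(1.0, tau))
    expo = (d - 3) / 2.0
    # log of the normalising constant Gamma(d/2) / (sqrt(pi) * Gamma((d-1)/2))
    log_norm = (math.lgamma(d / 2.0) - math.lgamma((d - 1) / 2.0)
                - 0.5 * math.log(math.pi))

    def density_kernel(t):
        # max() guards against tiny negative values from rounding near t = 1
        return max(0.0, 1.0 - t * t) ** expo

    h = (1.0 - tau) / n
    total = density_kernel(tau) + density_kernel(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * density_kernel(tau + i * h)
    return math.exp(log_norm) * total * h / 3.0

def fisher_combine(pvalues):
    """Fisher's method: combine k independent p-values into one.

    The statistic -2 * sum(log p) is chi-square with 2k degrees of freedom
    under the null; for even degrees of freedom its tail has a closed form.
    """
    k = len(pvalues)
    half = -sum(math.log(max(p, 1e-300)) for p in pvalues)  # statistic / 2
    term, acc = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        acc += term
    return math.exp(-half) * acc

if __name__ == "__main__":
    # Hypothetical per-class cosines in a 512-dimensional feature space.
    per_class = [cosine_pvalue(c, 512) for c in (0.05, 0.08, 0.02)]
    print(per_class, fisher_combine(per_class))
```

The test is one-sided because marking is expected to pull the classifier toward u, so only large positive cosines count as evidence. In high dimensions the null cosine concentrates near zero with standard deviation about 1/sqrt(d), which is why even small alignments can produce very low p-values.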

Experimental results

Research questions

RQ1: Can a dataset be marked with imperceptible changes that persist through training across architectures and optimizers?
RQ2: Does a statistical test on the learned classifier (or latent space) reveal the presence of marked data with high confidence?
RQ3: How robust is the marking technique to data augmentation, architecture transfers, and training from scratch?
RQ4: What is the minimal fraction of marked data needed to detect radioactive data with a given p-value?
RQ5: How does the technique compare to backdoor and data-poisoning methods in detectability and robustness?

Key findings

Radioactive marks can be detected with high confidence (p < 1e-4) when as little as 1% of training data is marked.
Detection is robust to data augmentation and stochastic training procedures across architectures (ResNet-18, ResNet-50, VGG-16, DenseNet-121).
Marking preserves model accuracy within about ±0.1% when 1% of data is marked.
White-box and black-box detection are feasible, with white-box often yielding stronger signals and center-crop augmentations increasing detectability.
Transfer to different datasets and architectures still yields strong detection signals; for example, marking Places205 with an Imagenet-pretrained marking network shows detectable radioactivity when 10% or more of the data is marked.
Ablation analyses indicate the mark aligns the classifier along a carrier direction, while the semantic direction remains influential, explaining limited accuracy loss.

This review was created by AI and reviewed by human editors.