QUICK REVIEW

[Paper Review] Backdoor Attack in the Physical World

Yiming Li, Tongqing Zhai|arXiv (Cornell University)|Apr 6, 2021

Adversarial Robustness in Machine Learning|16 references|36 citations

TL;DR

The paper shows that backdoor attacks with static triggers are fragile when the test-time trigger differs in location or appearance from the training-time trigger, as happens in the physical world. It proposes a transformation-based defense and an enhanced attack that is robust to such transformations, evaluated on CIFAR-10 and in a physical-world demonstration.

ABSTRACT

Backdoor attack intends to inject hidden backdoor into the deep neural networks (DNNs), such that the prediction of infected models will be maliciously changed if the hidden backdoor is activated by the attacker-defined trigger. Currently, most existing backdoor attacks adopted the setting of static trigger, i.e., triggers across the training and testing images follow the same appearance and are located in the same area. In this paper, we revisit this attack paradigm by analyzing trigger characteristics. We demonstrate that this attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training. As such, those attacks are far less effective in the physical world, where the location and appearance of the trigger in the digitized image may be different from that of the one used for training. Moreover, we also discuss how to alleviate such vulnerability. We hope that this work could inspire more explorations on backdoor properties, to help the design of more advanced backdoor attack and defense methods.

Motivation & Objective

Investigate whether static-trigger backdoor attacks remain effective when testing-time triggers differ in location or appearance from training-time triggers.
Assess the vulnerability of existing static-trigger backdoor attacks to image transformations.
Propose a transformation-based defense to mitigate such attacks without model or data changes.
Propose an enhanced backdoor attack that remains effective under common image transformations.
Demonstrate the connection between enhanced attacks and physical-world backdoor scenarios.

Proposed method

Model and data setting: BadNets on CIFAR-10 using VGG-19 and ResNet-34 with a 3x3 black-gray trigger.
Characterize backdoor trigger by two independent attributes: location and appearance (minimum covering box and trigger pattern).
Evaluate attack success rate (ASR) under small location shifts and appearance changes of the trigger at inference.
Propose a transformation-based defense that preprocesses testing images via transformations (e.g., flipping, scaling).
Develop an enhanced attack by training with a set of transformed poison images using a parameterized transformation family Theta; use a sampling approach to approximate the full transformation space.
Show how enhanced attacks relate to and can succeed in physical-world settings where digitization induces transformations.
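
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`stamp_trigger`, `shrink_pad`, `random_flip`, `make_enhanced_poison`), the trigger pixel values, the bottom-right placement, and the exact shrink-then-pad behavior are all hypothetical choices made for the example, since the review does not give those details. The same `shrink_pad` step serves both as a test-time preprocessing defense and as one member of the sampled transformation family used to build enhanced poison images.

```python
import numpy as np

def stamp_trigger(image, trigger, top, left):
    """Paste `trigger` (h, w, C) into a copy of `image` (H, W, C) at (top, left)."""
    out = image.copy()
    h, w = trigger.shape[:2]
    out[top:top + h, left:left + w] = trigger
    return out

def random_flip(image, rng):
    """Horizontally flip the image with probability 0.5."""
    return image[:, ::-1].copy() if rng.random() < 0.5 else image

def shrink_pad(image, shrink, rng):
    """Assumed ShrinkPad-style step: shrink by `shrink` pixels using
    nearest-neighbour sampling, then zero-pad back to the original size
    at a random offset."""
    h, w = image.shape[:2]
    new_h, new_w = h - shrink, w - shrink
    rows = np.arange(new_h) * h // new_h   # source row for each output row
    cols = np.arange(new_w) * w // new_w   # source column for each output column
    small = image[rows][:, cols]
    out = np.zeros_like(image)
    top = int(rng.integers(0, shrink + 1))
    left = int(rng.integers(0, shrink + 1))
    out[top:top + new_h, left:left + new_w] = small
    return out

def make_enhanced_poison(image, trigger, target_label, rng, max_shrink=4):
    """Stamp the trigger, then apply one randomly sampled transformation,
    approximating training over a transformation family by sampling."""
    h, w = image.shape[:2]
    th, tw = trigger.shape[:2]
    poisoned = stamp_trigger(image, trigger, h - th, w - tw)  # bottom-right corner (assumed)
    poisoned = random_flip(poisoned, rng)
    shrink = int(rng.integers(0, max_shrink + 1))
    poisoned = shrink_pad(poisoned, shrink, rng)
    return poisoned, target_label

rng = np.random.default_rng(0)
image = np.full((32, 32, 3), 0.8)      # stand-in for one CIFAR-10-sized image
trigger = np.zeros((3, 3, 3))          # hypothetical 3x3 black-gray pattern
trigger[1, :, :] = 0.5
poisoned, label = make_enhanced_poison(image, trigger, target_label=0, rng=rng)
```

A standard static-trigger poison sample would stop after `stamp_trigger`; the enhanced variant differs only in passing the stamped image through a freshly sampled transformation each time it is drawn.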

Experimental results

Research questions

RQ1: Do static-trigger backdoor attacks remain effective when the test-time trigger differs in location or appearance from the training trigger?
RQ2: Can a simple transformation-based preprocessing defense reduce backdoor effectiveness without model or data access?
RQ3: Can backdoor attacks be enhanced to remain robust under common transformations, including those encountered in the physical world?
RQ4: How do enhanced attacks perform under transformation-based defenses, and do they translate to physical-world effectiveness?

Key findings

Static-trigger attacks are sensitive to trigger location; small shifts (a few pixels) can drop ASR from near 100% to below 50%.
Changing trigger appearance, even modestly, degrades ASR significantly, indicating vulnerability to appearance changes.
The ShrinkPad4 defense reduces ASR by over 90% across the examined attacks and models, and Flip defends effectively against some attacks; Auto-Encoder is generally less effective at reducing ASR while preserving clean accuracy.
Enhanced backdoor attacks (with random transformations during training) maintain high ASR under transformation-based defenses, outperforming standard attacks in most tested configurations.
In physical-world tests, BadNets+ (enhanced attack) succeeds across real-world captures, whereas standard BadNets fails, demonstrating the practical link between enhancement and physical backdoors.
The work connects defense via transformations to robustness against physical-world trigger variations and shows the potential to inspire more robust attack/defense methods.
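
The findings above are stated in terms of attack success rate (ASR). As a rough sketch of how such a number can be computed, the snippet below measures the fraction of triggered images that a classifier assigns to the target label. Everything here is an assumption for illustration: the helper names, the convention of excluding images whose true label already equals the target, and especially `toy_predict`, which is a hand-written stand-in that fires only when the bottom-right 3x3 corner is entirely black. It is not a trained model and does not reproduce the paper's numbers.

```python
import numpy as np

def attack_success_rate(predict, images, labels, add_trigger, target_label):
    """Fraction of triggered, non-target-class images predicted as the target."""
    labels = np.asarray(labels)
    keep = labels != target_label            # assumed convention: skip target-class images
    triggered = np.stack([add_trigger(x) for x in np.asarray(images)[keep]])
    preds = np.asarray(predict(triggered))
    return float(np.mean(preds == target_label))

def paste_black_patch(image, top, left, size=3):
    """Hypothetical trigger: a black square pasted at (top, left)."""
    out = image.copy()
    out[top:top + size, left:left + size] = 0.0
    return out

def toy_predict(batch, target_label=0):
    """Toy stand-in for a backdoored model: outputs the target label only
    when the bottom-right 3x3 corner is entirely black, otherwise class 1."""
    corner = batch[:, -3:, -3:, :]
    fired = np.all(corner == 0.0, axis=(1, 2, 3))
    return np.where(fired, target_label, 1)

images = np.full((8, 32, 32, 3), 0.5)
labels = np.array([0, 1, 2, 3, 1, 2, 3, 0])

# Trigger at the training location versus shifted two pixels up and left.
asr_aligned = attack_success_rate(
    toy_predict, images, labels, lambda x: paste_black_patch(x, 29, 29), 0)
asr_shifted = attack_success_rate(
    toy_predict, images, labels, lambda x: paste_black_patch(x, 27, 27), 0)
```

By construction, the toy detector gives an ASR of 1.0 for the aligned trigger and 0.0 for the shifted one. A real backdoored network degrades less abruptly, but the same measurement applies: swap `toy_predict` for the model's prediction function and `add_trigger` for the shifted, altered, or defended trigger pipeline being studied.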

This review was created by AI and reviewed by human editors.