QUICK REVIEW

[Paper Review] Adversarial Robustness for Code

Pavol Bielik, Martin Vechev | arXiv (Cornell University) | Feb 11, 2020
Adversarial Robustness in Machine Learning | References: 45 | Citations: 33
One-line summary

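Summary: The paper introduces a framework for building accurate and adversarially robust models of code by abstaining from uncertain predictions, applying adversarial training in the code domain, and refining the input representation. Applied to type inference, it yields notable robustness gains while maintaining accuracy.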

ABSTRACT

Machine learning and deep learning in particular has been recently used to successfully address many tasks in the domain of code such as finding and fixing bugs, code completion, decompilation, type inference and many others. However, the issue of adversarial robustness of models for code has gone largely unnoticed. In this work, we explore this issue by: (i) instantiating adversarial attacks for code (a domain with discrete and highly structured inputs), (ii) showing that, similar to other domains, neural models for code are vulnerable to adversarial attacks, and (iii) combining existing and novel techniques to improve robustness while preserving high accuracy.

Motivation and objectives

  • Motivate and address robustness gaps in neural models of code across tasks like type inference and code analysis.
  • Propose a three-component framework combining abstention, adversarial training, and representation refinement.
  • Demonstrate that robustness can be improved while preserving or enhancing accuracy on code tasks.

Proposed method

  • Train a model with an abstention mechanism to separate predictions into confident and abstain regions.
  • Instantiate adversarial training by considering valid program modifications and optimizing worst-case loss.
  • Refine input representations by learning a program-specific abstraction that retains only the parts relevant to the prediction.
  • Train multiple specialized models, each learning a representation tailored to a subset of the data, to improve overall robustness and coverage.
  • Represent programs as graphs and use graph neural networks; formulate refinement as an edge-sparsification problem solvable via ILP.
  • Iteratively annotate data with robust predictions to further expand accurate and robust coverage.
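
The abstention idea and the notion of robustness under valid program modifications can be illustrated with a minimal Python sketch. Everything in it is hypothetical and not taken from the paper: `toy_model`, the token-tuple program representation, the confidence `threshold`, and the assumption that a sample counts as robust when every modified program is either predicted correctly or abstained on. The only modification shown is variable renaming, which preserves program semantics.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical representation: a program is a tuple of tokens.
Program = Tuple[str, ...]
Model = Callable[[Program], Tuple[str, float]]  # returns (label, confidence)
Modification = Callable[[Program], Program]


def predict_or_abstain(model: Model, program: Program, threshold: float) -> Optional[str]:
    """Return the predicted label, or None (abstain) if confidence is too low."""
    label, confidence = model(program)
    return label if confidence >= threshold else None


def rename_variable(program: Program, old: str, new: str) -> Program:
    """A semantics-preserving modification: rename every occurrence of a variable."""
    return tuple(new if token == old else token for token in program)


def is_robust(model: Model, program: Program, target: str,
              modifications: Sequence[Modification], threshold: float) -> bool:
    """Assumed criterion: no listed modification yields a confident wrong label."""
    for modify in modifications:
        outcome = predict_or_abstain(model, modify(program), threshold)
        if outcome is not None and outcome != target:
            return False
    return True


def toy_model(program: Program) -> Tuple[str, float]:
    """Stand-in for a neural type predictor that over-relies on variable names."""
    if "count" in program:
        return "number", 0.95
    if "name" in program:
        return "string", 0.70
    return "number", 0.50


program = ("count", "=", "count", "+", "1")
modifications = [
    lambda p: rename_variable(p, "count", "name"),
    lambda p: rename_variable(p, "count", "x"),
]

confident_label = predict_or_abstain(toy_model, program, threshold=0.8)
robust_without_abstention = is_robust(toy_model, program, "number", modifications, threshold=0.0)
robust_with_abstention = is_robust(toy_model, program, "number", modifications, threshold=0.8)
```

In this toy setting, renaming `count` to `name` flips the prediction to `string`, so the sample is not robust when the model must always answer. With a threshold of 0.8, the model abstains on both renamed programs and the sample counts as robust under the assumed criterion.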

Experimental results

Research questions

  • RQ1: Can abstention improve robustness without sacrificing overall accuracy in code models?
  • RQ2: How much can adversarial training improve robustness for code tasks, and is it sufficient alone?
  • RQ3: Does learning a refined, task-relevant representation (abstraction) enhance robustness to adversarial program changes?
  • RQ4: Does training multiple specialized models increase robust coverage of a dataset compared to a single model?

Key findings

Model | Accuracy (%) | Robustness (%)
LSTM | 88.2 ± 0.2 | 44.9 ± 1.3
DeepTyper | 88.4 ± 0.2 | 52.4 ± 1.2
GCN | 82.6 ± 0.6 | 49.1 ± 1.1
GNT | 89.3 ± 0.9 | 47.4 ± 1.0
GGNN | 86.7 ± 0.4 | 52.1 ± 0.4
LSTM (adv trained) | 87.5 ± 0.4 | 51.9 ± 1.3
DeepTyper (adv trained) | 87.1 ± 0.3 | 55.1 ± 2.6
GCN (adv trained) | 81.9 ± 0.5 | 49.3 ± 3.1
GNT (adv trained) | 88.3 ± 0.4 | 50.0 ± 0.5
GGNN (adv trained) | 86.1 ± 0.2 | 57.9 ± 1.5
GNT (abstain+adv+refinement, t_acc=1.00) | 99.93 | 99.98
GGNN (abstain+adv+refinement, t_acc=1.00) | 99.80 | 99.01
GNT (abstain+adv+refinement, t_acc=0.00) | 86.6 | 62.3
GGNN (abstain+adv+refinement, t_acc=0.00) | 87.7 | 67.0

  • For type prediction in JavaScript/TypeScript, the full approach with GGNN at t_acc=0.00 reaches 87.7% accuracy (baseline GGNN: 86.7%) while raising robustness from 52.1% to 67.0%.
  • With abstention, adversarial training, and representation refinement, some models reach near-perfect robust performance on subsets of data (e.g., 99.93% accuracy and 99.98% robustness for 29% of samples under t_acc=1.00).
  • Training multiple robust models improves coverage: at t_acc=1.00, GNT is robustly accurate on approximately 30% of samples (99.98% robustness), and GGNN reaches 99.01% robustness in the same setting.
  • Adversarial training alone yields limited robustness gains (at most about 7 percentage points, for LSTM: 44.9% to 51.9%).
  • Representation refinement via learned abstractions and ILP-based edge pruning substantially enhances robustness when combined with abstention and adversarial training.
  • The results show a robustness-accuracy tradeoff controlled by t_acc and highlight the benefit of decomposing the problem into abstention, adversarial training, and representation refinement.
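
The gains quoted above can be re-derived from the mean values in the results table. The short Python sketch below does only that arithmetic; the dictionary and variable names are illustrative, and the ± standard deviations are ignored, so the differences are point estimates rather than statistically tested gaps.

```python
# Mean robustness (%) copied from the results table; standard deviations omitted.
baseline = {"LSTM": 44.9, "DeepTyper": 52.4, "GCN": 49.1, "GNT": 47.4, "GGNN": 52.1}
adv_trained = {"LSTM": 51.9, "DeepTyper": 55.1, "GCN": 49.3, "GNT": 50.0, "GGNN": 57.9}
# Full pipeline (abstain + adv + refinement) at t_acc=0.00, i.e. no samples abstained on.
full_pipeline = {"GNT": 62.3, "GGNN": 67.0}

# Gain in percentage points from adversarial training alone.
adv_gain = {m: round(adv_trained[m] - baseline[m], 1) for m in baseline}

# Gain in percentage points from the full pipeline over the plain baseline.
full_gain = {m: round(full_pipeline[m] - baseline[m], 1) for m in full_pipeline}

# Model that benefits most from adversarial training alone.
best_model = max(adv_gain, key=adv_gain.get)

for model, gain in adv_gain.items():
    print(f"{model}: +{gain} points from adversarial training")
for model, gain in full_gain.items():
    print(f"{model}: +{gain} points from the full pipeline")
```

On these numbers, adversarial training alone adds between 0.2 and 7.0 points, whereas the full pipeline adds 14.9 points for both GNT and GGNN even when the model never abstains.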

This review was generated by AI and checked by a human editor.