QUICK REVIEW

[Paper Review] Misleading Authorship Attribution of Source Code using Adversarial Learning

Erwin Quiring, Alwin Maier|arXiv (Cornell University)|May 29, 2019

Advanced Malware Detection Techniques|Cited by 41

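One-sentence summary

This paper presents a black-box adversarial attack on machine-learning-based source code authorship attribution. Semantics-preserving code transformations, guided by Monte-Carlo tree search, deceive the classifier, sharply reduce its accuracy, and make it possible to impersonate other developers.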

ABSTRACT

In this paper, we present a novel attack against authorship attribution of source code. We exploit that recent attribution methods rest on machine learning and thus can be deceived by adversarial examples of source code. Our attack performs a series of semantics-preserving code transformations that mislead learning-based attribution but appear plausible to a developer. The attack is guided by Monte-Carlo tree search that enables us to operate in the discrete domain of source code. In an empirical evaluation with source code from 204 programmers, we demonstrate that our attack has a substantial effect on two recent attribution methods, whose accuracy drops from over 88% to 1% under attack. Furthermore, we show that our attack can imitate the coding style of developers with high accuracy and thereby induce false attributions. We conclude that current approaches for authorship attribution are inappropriate for practical application and there is a need for resilient analysis techniques.

Motivation and objectives

Examine how robust ML-based source code authorship attribution is when an adversary actively manipulates the code.
Demonstrate that semantics-preserving transformations can mislead attribution methods.
Propose a black-box attack framework combining code transformations with Monte-Carlo tree search.
Assess the feasibility of untargeted and targeted impersonation attacks on real programmer data.

Proposed method

Develop semantics-preserving code transformations, grouped into five transformation families, on top of the Clang front end.
Represent code as an abstract syntax tree (AST), a control-flow graph (CFG) with use-define chains, and declaration-reference mappings so that transformations can be applied safely.
Construct a transformation sequence via Monte-Carlo tree search to reach a target in feature space without breaking semantics.
Operate under a black-box threat model that only uses classifier outputs for guidance.
Evaluate untargeted and targeted attacks on two attribution methods (the random forest of Caliskan et al. and the LSTM of Abuhamad et al.) using a Google Code Jam dataset.
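
The search step described above can be illustrated with a toy sketch. It is not the authors' implementation: the "program" is reduced to a tuple of four style counts, the transformers (`for_to_while`, `printf_to_cout`, and their inverses) only shift counts instead of rewriting C++ through Clang, and `black_box_score` is a hypothetical stand-in for the classifier output. Only the overall shape (selection, expansion, random rollout, backpropagation, with the classifier score as the sole feedback) is meant to carry over.

```python
import math
import random

def move(src, dst):
    """Hypothetical transformer: shift one unit of style count from src to dst."""
    def apply(state):
        if state[src] == 0:
            return None  # transformer is not applicable to this program
        new = list(state)
        new[src] -= 1
        new[dst] += 1
        return tuple(new)
    return apply

# State layout (assumed for this toy): (for loops, while loops, printf calls, cout calls)
TRANSFORMERS = {
    "for_to_while": move(0, 1), "while_to_for": move(1, 0),
    "printf_to_cout": move(2, 3), "cout_to_printf": move(3, 2),
}

def black_box_score(state, author_profile):
    """Stand-in for the classifier: score in (0, 1] for the original author."""
    dist = sum(abs(a - b) for a, b in zip(state, author_profile))
    return 1.0 / (1.0 + dist)

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children, self.visits, self.value = [], 0, 0.0
        self.untried = [n for n, t in TRANSFORMERS.items() if t(state) is not None]

def ucb1(child, parent_visits, c=1.4):
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def mcts_step(root_state, reward, rng, iterations=200, rollout_depth=3):
    """Run one tree search and return the most-visited first transformation."""
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded nodes via UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # Expansion: try one transformer that has not been applied here yet.
        if node.untried:
            name = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(TRANSFORMERS[name](node.state), node, name)
            node.children.append(child)
            node = child
        # Simulation: apply a few random applicable transformers.
        state = node.state
        for _ in range(rollout_depth):
            options = [t(state) for t in TRANSFORMERS.values() if t(state) is not None]
            if not options:
                break
            state = rng.choice(options)
        r = reward(state)  # the only feedback is the classifier score
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    best = max(root.children, key=lambda ch: ch.visits)
    return best.action, best.state

def untargeted_attack(source_state, max_steps=10, stop_below=0.25, seed=0):
    """Repeat the search, keeping the variant with the lowest original-author score."""
    rng = random.Random(seed)
    reward = lambda s: 1.0 - black_box_score(s, source_state)
    state, applied = source_state, []
    best, best_seq = source_state, []
    for _ in range(max_steps):
        action, state = mcts_step(state, reward, rng)
        applied.append(action)
        if black_box_score(state, source_state) < black_box_score(best, source_state):
            best, best_seq = state, list(applied)
        if black_box_score(best, source_state) < stop_below:
            break
    return best, best_seq

best_state, sequence = untargeted_attack((2, 0, 3, 0))
```

Because each toy transformer only moves a count between two equivalent constructs, the totals per construct pair stay fixed, which loosely mirrors the requirement that every step preserves program semantics. The threshold `stop_below` and the iteration counts are arbitrary illustration values.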

Experimental results

Research questions

RQ1: Can a black-box adversary meaningfully reduce the accuracy of ML-based source code authorship attribution?
RQ2: Are untargeted attacks (dodging) effective at misleading attribution to any other author?
RQ3: Are targeted impersonation attacks feasible, enabling attribution to a chosen developer?
RQ4: How do code transformations preserve semantics while altering the stylistic features used for attribution?
RQ5: How practical and plausible is the transformed code, that is, does it hide the manipulation while still looking legitimate?
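
The difference between the untargeted (RQ2) and targeted (RQ3) settings comes down to the success condition checked against the classifier's prediction. A minimal sketch, assuming the classifier exposes per-author scores as a dictionary (the author names and score values below are made up for illustration):

```python
def predict(scores):
    """Return the author with the highest classifier score."""
    return max(scores, key=scores.get)

def untargeted_success(scores, true_author):
    """Dodging: any prediction other than the true author counts as success."""
    return predict(scores) != true_author

def targeted_success(scores, target_author):
    """Impersonation: the prediction must be exactly the chosen target author."""
    return predict(scores) == target_author

# Hypothetical scores for a transformed file originally written by "alice".
scores = {"alice": 0.2, "bob": 0.5, "carol": 0.3}
dodged = untargeted_success(scores, "alice")
impersonated_carol = targeted_success(scores, "carol")
```

Here the file dodges attribution to its true author, but an attempt to impersonate "carol" would still count as a failure because the classifier predicts "bob". This is why the targeted setting is the stricter of the two.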

Key findings

Method	Feature Types Used	Classifier	Accuracy (mean ± std)
Caliskan et al. [9]	Lexical; syntactic	RF	90.4% ± 1.7%
Abuhamad et al. [1]	Lexical only	LSTM	88.4% ± 3.7%

Attack reduces accuracy of two recent attribution methods from over 88% to 1%.
Targeted impersonation succeeds in 77%–81% of cases on average across developers.
A study with 15 participants indicates that transformed code remains plausible and is hard to distinguish from unmodified code.
The experimental setup uses 1,632 C++ files from 204 authors, each solving the same 8 Google Code Jam (GCJ) challenges, with k-fold cross-validation across challenges.
Transformations are purely lexical and syntactic (no layout changes) and are implemented as 36 transformers organized into five families.
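
The dataset figures are consistent with each other: 204 authors times 8 challenges gives 1,632 files. The sketch below checks that arithmetic and shows one way to split "across challenges". It assumes one held-out challenge per fold (so k = 8), which the review does not state explicitly; the integer author and challenge IDs are placeholders for real files.

```python
from itertools import product

N_AUTHORS, N_CHALLENGES = 204, 8

# One (author_id, challenge_id) pair stands in for each C++ file.
files = list(product(range(N_AUTHORS), range(N_CHALLENGES)))

def challenge_folds(files, n_challenges):
    """Yield (train, test) splits where each test fold is a single challenge."""
    for held_out in range(n_challenges):
        train = [f for f in files if f[1] != held_out]
        test = [f for f in files if f[1] == held_out]
        yield train, test

folds = list(challenge_folds(files, N_CHALLENGES))
```

Grouping by challenge keeps solutions to the same problem out of both the training and test sets at once, so a classifier cannot score well simply by recognising the problem being solved rather than the author's style.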

This review was generated by AI and checked by a human editor.