QUICK REVIEW

[Paper Review] Context2Name: A Deep Learning-Based Approach to Infer Natural Variable Names from Usage Contexts

Rohan Bavishi, Michael Pradel | arXiv (Cornell University) | Aug 31, 2018

Software Engineering Research | References: 37 | Citations: 43

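One-Line Summary

Context2Name combines a token-based static analysis with a sequence auto-encoder and an RNN to predict natural variable names from the usage contexts of minified JavaScript identifiers. Its accuracy is comparable to JSNice and JSNaughty, it is more efficient, and it correctly predicts identifiers that both tools miss.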

ABSTRACT

Most of the JavaScript code deployed in the wild has been minified, a process in which identifier names are replaced with short, arbitrary and meaningless names. Minified code occupies less space, but also makes the code extremely difficult to manually inspect and understand. This paper presents Context2Name, a deep learning-based technique that partially reverses the effect of minification by predicting natural identifier names for minified names. The core idea is to predict from the usage context of a variable a name that captures the meaning of the variable. The approach combines a lightweight, token-based static analysis with an auto-encoder neural network that summarizes usage contexts and a recurrent neural network that predicts natural names for a given usage context. We evaluate Context2Name with a large corpus of real-world JavaScript code and show that it successfully predicts 47.5% of all minified identifiers while taking only 2.9 milliseconds on average to predict a name. A comparison with the state-of-the-art tools JSNice and JSNaughty shows that our approach performs comparably in terms of accuracy while improving in terms of efficiency. Moreover, Context2Name complements the state-of-the-art by predicting 5.3% additional identifiers that are missed by both existing tools.

Motivation and Objectives

Address the poor readability of minified JavaScript code by inferring meaningful variable names.
Develop a deep learning framework that predicts natural names from usage contexts without heavy language-specific program analysis.
Show empirical effectiveness and efficiency on a large real-world JavaScript corpus and compare with state-of-the-art tools.

Proposed Method

Extract usage contexts as sequences of tokens around each minified name.
Convert usage contexts to sparse vectors via one-hot encoding of tokens.
Compress contexts with a sequence auto-encoder into dense embeddings.
Predict names with a recurrent neural network that outputs a ranked list of candidate names.
Greedily assign predicted names while ensuring that the renamed code preserves the semantics of the original.
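
The first and last steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `usage_contexts` and `greedy_assign`, the `<PAD>` token, and the window size of 2 are hypothetical choices made for illustration, and the collision rule used here (no two identifiers in one scope may receive the same name) is an assumed simplification of the semantics-preserving constraint. The ranked candidate lists stand in for the output of the recurrent network, which is not modeled.

```python
from typing import Dict, List, Set

PAD = "<PAD>"  # hypothetical padding token for occurrences near the file edges


def usage_contexts(tokens: List[str], name: str, window: int = 2) -> List[List[str]]:
    """Collect `window` tokens before and after every occurrence of `name`."""
    contexts = []
    for i, tok in enumerate(tokens):
        if tok != name:
            continue
        before = [tokens[j] if j >= 0 else PAD for j in range(i - window, i)]
        after = [tokens[j] if j < len(tokens) else PAD
                 for j in range(i + 1, i + 1 + window)]
        contexts.append(before + after)
    return contexts


def greedy_assign(ranked: Dict[str, List[str]], reserved: Set[str]) -> Dict[str, str]:
    """Give each minified name its best-ranked candidate that is still free.

    Assumed simplification: all identifiers share one scope. Names in
    `reserved` (e.g. globals) and all minified names are blocked up front,
    so keeping a minified name as a fallback can never cause a clash.
    """
    taken = set(reserved) | set(ranked)
    result = {}
    for minified, candidates in ranked.items():
        chosen = next((c for c in candidates if c not in taken), minified)
        taken.add(chosen)
        result[minified] = chosen
    return result


if __name__ == "__main__":
    toks = ["var", "a", "=", "b", "(", "a", ")"]
    print(usage_contexts(toks, "a"))
    # Hypothetical ranked predictions standing in for the RNN output.
    ranked = {"a": ["request", "data"], "b": ["request", "callback"]}
    print(greedy_assign(ranked, reserved={"data"}))
```

In this sketch, a minified name keeps its original short form when every candidate is already taken, which trades some readability for a guarantee that the renaming never merges two distinct variables.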

Experimental Results

Research Questions

RQ1: How effectively can Context2Name predict natural names for minified variables and functions in real-world JavaScript?
RQ2: How does Context2Name compare to JSNice and JSNaughty in accuracy and efficiency?
RQ3: Is the approach efficient and scalable for large programs?

Key Findings

Metric	Context2Name	JSNice	JSNaughty	JSNaughty ∞	Baseline
Local-Once	47.5%	48.3%	39.4%	55.3%	0.0%
Local-Repeat	49.8%	55.3%	41.3%	59.2%	0.0%
All-Once	55.4%	56.0%	47.7%	61.9%	15.0%
All-Repeat	58.1%	62.6%	49.3%	65.8%	16.4%

Context2Name exactly recovers 47.5% of the minified local names in the dataset (the Local-Once metric).
Prediction takes 2.9 ms per name and 110.7 ms per file on average.
Compared to JSNice and JSNaughty, Context2Name achieves comparable accuracy with improved efficiency.
Context2Name recovers 5.3% additional identifiers that are missed by both JSNice and JSNaughty.
The approach processes files quickly and does not rely on heavy language-specific program analysis.


This review was generated by AI and checked by a human editor.