[Paper Review] Tree-to-tree Neural Networks for Program Translation
Introduces a tree-to-tree neural network with attention (including a parent attention feeding mechanism) to translate programs by aligning source sub-trees to target sub-trees; outperforms state-of-the-art baselines on CoffeeScript↔JavaScript and Java→C# benchmarks.
Program translation is an important tool to migrate legacy code in one language into an ecosystem built in a different language. In this work, we are the first to employ deep neural networks toward tackling this problem. We observe that program translation is a modular procedure, in which a sub-tree of the source tree is translated into the corresponding target sub-tree at each step. To capture this intuition, we design a tree-to-tree neural network to translate a source tree into a target one. Meanwhile, we develop an attention mechanism for the tree-to-tree model, so that when the decoder expands one non-terminal in the target tree, the attention mechanism locates the corresponding sub-tree in the source tree to guide the expansion of the decoder. We evaluate the program translation capability of our tree-to-tree model against several state-of-the-art approaches. Compared against other neural translation models, we observe that our approach is consistently better than the baselines with a margin of up to 15 points. Further, our approach can improve the previous state-of-the-art program translation approaches by a margin of 20 points on the translation of real-world projects.
Motivation & Objective
- Motivate program translation as a modular, tree-structured task that can benefit from neural models.
- Propose a tree-to-tree encoder–decoder architecture that translates source parse trees into target parse trees.
- Incorporate an attention mechanism to locate corresponding source sub-trees during target tree expansion.
- Enhance the model with a parent-attention feeding mechanism to propagate attention information down the decoding tree.
Proposed method
- Convert source and target parse trees into binary (Left-Child Right-Sibling) representations.
- Use a Tree-LSTM encoder to compute embeddings for source trees and sub-trees.
- Decode into a target tree with a queue-driven expansion: nodes awaiting expansion are held in a queue, and each node's value is predicted via softmax over the target vocabulary.
- Compute attention weights over the source sub-tree embeddings to form a weighted-sum embedding e_s, then combine e_s with the decoder hidden state to produce the attention vector e_t.
- Apply a parent attention feeding mechanism: the attention vector e_t computed at a parent node is fed as additional input to the decoders that generate its left and right children.
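The conversion and attention steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the class and function names (`Node`, `BinNode`, `to_lcrs`, `attend`, `child_input`) are hypothetical, the bilinear score and the tanh combination are assumed forms of the attention step, and concatenation is an assumed way of feeding e_t to the child decoders. The Tree-LSTM encoder itself is omitted; `source_states` stands in for its per-sub-tree embeddings.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Node:
    """An ordinary parse-tree node with any number of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class BinNode:
    """A binary node: left is the first child, right is the next sibling."""
    label: str
    left: Optional["BinNode"] = None
    right: Optional["BinNode"] = None


def to_lcrs(node: Node, siblings: Sequence[Node] = ()) -> BinNode:
    """Left-Child Right-Sibling conversion of `node` and its following siblings."""
    out = BinNode(node.label)
    if node.children:
        out.left = to_lcrs(node.children[0], node.children[1:])
    if siblings:
        out.right = to_lcrs(siblings[0], siblings[1:])
    return out


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def attend(source_states, h, w0, w1, w2):
    """Return (weights, e_s, e_t) for decoder state h.

    Assumed forms (not stated in the review text):
      score_i = h_i^T W0 h,  weights = softmax(scores),
      e_s = sum_i weights_i * h_i,  e_t = tanh(W1 e_s + W2 h).
    """
    w0_h = matvec(w0, h)
    scores = [sum(a * b for a, b in zip(hs, w0_h)) for hs in source_states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    e_s = [sum(w * hs[d] for w, hs in zip(weights, source_states))
           for d in range(len(h))]
    e_t = [math.tanh(a + b) for a, b in zip(matvec(w1, e_s), matvec(w2, h))]
    return weights, e_s, e_t


def child_input(parent_value_embedding, e_t):
    """Parent attention feeding (assumed): concatenate the parent's value embedding with e_t."""
    return list(parent_value_embedding) + list(e_t)


# Usage: Assign(x, Add(a, b)) becomes Assign -left-> x -right-> Add -left-> a -right-> b.
tree = Node("Assign", [Node("x"), Node("Add", [Node("a"), Node("b")])])
binary = to_lcrs(tree)

identity = [[1.0, 0.0], [0.0, 1.0]]  # toy weights, purely illustrative
weights, e_s, e_t = attend([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0],
                           identity, identity, identity)
```

In the toy call, the decoder state matches the first source embedding, so the first sub-tree receives the larger attention weight. This is the sense in which attention "locates" a source sub-tree for the node being expanded.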
Experimental results
Research questions
- RQ1: Can a tree-to-tree neural architecture outperform sequence-to-sequence and tree-based baselines for program translation?
- RQ2: Does an attention mechanism over source parse trees improve translation quality, and how does the proposed parent-attention feeding affect performance?
- RQ3: How well does the model scale to different language pairs and program lengths in real-world codebases?
Key findings
- The tree-to-tree model consistently outperforms baselines on program translation tasks, with up to 15-point gains in program accuracy on benchmarks.
- On CoffeeScript→JavaScript datasets, the model improves on the best baselines by up to about 20 points, with the largest gains on longer programs.
- Without attention, the model's accuracy is near 0% on several tasks, while adding attention raises it to over 90% in some settings.
- For Java→C#, the tree-to-tree approach outperforms previous SMT-based methods by about 20 points in program accuracy on real-world projects (the margin varies by project).
- Incorporating the parent-attention feeding mechanism yields substantial performance gains over variants without it, particularly as tree size grows.
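The findings above are reported in points of program accuracy. As a rough sketch, assuming program accuracy means the percentage of predicted programs that exactly match the reference translation, the metric could be computed as below. The function name `program_accuracy` and the sample strings are hypothetical, not taken from the paper.

```python
from typing import Sequence


def program_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Percentage of predictions identical to their reference (assumed exact-match metric)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    exact = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * exact / len(references)


# Hypothetical data: three of four predicted programs match exactly.
refs = ["a = 1;", "b = a + 2;", "return b;", "f(a, b);"]
preds = ["a = 1;", "b = a + 2;", "return a;", "f(a, b);"]
score = program_accuracy(preds, refs)  # 75.0
```

Under this reading, a gain of "20 points" is an absolute difference between two such percentages (for example, 70.0 versus 90.0), not a 20% relative improvement.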
This review was created by AI and reviewed by human editors.