[Paper Review] Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples
NGDS combines deductive symbolic search with neural guidance to quickly synthesize user-intended programs from few input-output examples, achieving real-time performance while matching PROSE accuracy and outperforming state-of-the-art neural methods in speed.
Synthesizing user-intended programs from a small number of input-output examples is a challenging problem with several important applications like spreadsheet manipulation, data wrangling and code refactoring. Existing synthesis systems either completely rely on deductive logic techniques that are extensively hand-engineered or on purely statistical models that need massive amounts of data, and in general fail to provide real-time synthesis on challenging benchmarks. In this work, we propose Neural Guided Deductive Search (NGDS), a hybrid synthesis technique that combines the best of both symbolic logic techniques and statistical models. Thus, it produces programs that satisfy the provided specifications by construction and generalize well on unseen examples, similar to data-driven systems. Our technique effectively utilizes the deductive search framework to reduce the learning problem of the neural component to a simple supervised learning setup. Further, this allows us to both train on sparingly available real-world data and still leverage powerful recurrent neural network encoders. We demonstrate the effectiveness of our method by evaluating on real-world customer scenarios by synthesizing accurate programs with up to 12x speed-up compared to state-of-the-art systems.
Research Motivation and Goals
- Motivate faster, reliable program synthesis from few examples by merging symbolic deduction with neural guidance.
- Leverage a branch-and-bound controller with learned scores to prune unproductive sub-branches in the DSL search space.
- Retain correctness and generalization guarantees from symbolic methods while improving real-time performance.
Proposed Method
- Use a PROSE-style deductive search over a DSL for string transformations (FlashFill DSL).
- Train neural score models to predict the best generalization potential for each DSL production given the spec.
- Integrate a branch selection controller (threshold-based or branch-and-bound) to explore only promising branches.
- Reduce learning to a supervised regression task: given a spec and a production, predict the best score actually realized by programs in that branch.
- Optionally train separate models per DSL level or per production to simplify learning.
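The two controller styles named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the `explore` callback, the production names used as dictionary keys, and the example scores are all hypothetical, and the predicted scores stand in for the output of a trained score model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical representation: branch (production) name -> predicted score.
Scores = Dict[str, float]


def threshold_controller(predicted: Scores, theta: float) -> List[str]:
    """Keep every branch whose predicted score is within `theta` of the best one."""
    if not predicted:
        return []
    best = max(predicted.values())
    return [branch for branch, score in predicted.items() if best - score <= theta]


def branch_and_bound_controller(
    predicted: Scores,
    explore: Callable[[str], Optional[float]],
) -> Tuple[Optional[str], List[str]]:
    """Visit branches in descending predicted score and stop as soon as the
    next prediction cannot beat the best score actually realized so far.

    `explore` is an assumed callback that runs the deductive search on one
    branch and returns the score of its best program, or None if no program
    in that branch satisfies the spec.
    """
    best_branch: Optional[str] = None
    best_actual = float("-inf")
    visited: List[str] = []
    ordered = sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
    for branch, pred in ordered:
        if pred <= best_actual:
            break  # remaining branches are predicted to be no better
        visited.append(branch)
        actual = explore(branch)
        if actual is not None and actual > best_actual:
            best_branch, best_actual = branch, actual
    return best_branch, visited


# Illustrative numbers only; they are not taken from the paper.
predicted = {"Concat": 0.9, "SubStr": 0.7, "ConstStr": 0.2}
realized = {"Concat": 0.5, "SubStr": 0.8, "ConstStr": 0.1}

kept = threshold_controller(predicted, theta=0.25)
winner, visited = branch_and_bound_controller(predicted, realized.get)
```

In this toy run the threshold controller keeps two of the three branches, and the branch-and-bound controller skips the lowest-ranked branch because its predicted score is already below the best realized score. Both controllers trade some search completeness for speed, which is why their effect on accuracy is evaluated separately in the results.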
Experimental Results
Research Questions
- RQ1: Can neural predictions reliably indicate which DSL branches lead to the best generalizable programs?
- RQ2: Does NGDS improve synthesis speed without sacrificing accuracy compared to purely symbolic or purely neural approaches?
- RQ3: How do different branch-selection controllers affect accuracy and runtime in real-world tasks?
- RQ4: Is a supervised learning formulation sufficient to guide branch selection given PROSE’s existing witness functions?
- RQ5: Do separate models per DSL level or production yield better generalization and efficiency?
Key Results
| Method | Validation Accuracy (%) | Test Accuracy (%) | Speed-up over PROSE (×) | Branches taken (%) |
|---|---|---|---|---|
| PROSE | 67.12 | 67.12 | 1.00 | 100.00 |
| NGDS(T1, Thr) | 59.57 | 67.12 | 1.27 | 62.72 |
| NGDS(T1, BB) | 63.83 | 68.49 | 1.22 | 51.78 |
| NGDS(T1, BB 0.2) | 61.70 | 67.12 | 1.22 | 63.16 |
| NGDS(T1+PP, Thr) | 59.57 | 67.12 | 0.97 | 56.41 |
| NGDS(T1+PP, BB) | 61.70 | 72.60 | 0.89 | 50.22 |
| NGDS(T1+PP, BB 0.2) | 61.70 | 67.12 | 0.86 | 56.43 |
| NGDS(T1+POS, Thr) | 61.70 | 67.12 | 1.93 | 55.63 |
| NGDS(T1+POS, BB) | 63.83 | 68.49 | 1.67 | 50.44 |
| NGDS(T1+POS, BB 0.2) | 63.83 | 67.12 | 1.73 | 55.73 |
- NGDS achieves 68.49% generalization accuracy on 73 test tasks with one example, slightly above PROSE's 67.12%, while delivering speed-ups of up to 12x on challenging tasks.
- Compared to RobustFill and DeepCoder, NGDS with a single example is more accurate and significantly faster.
- Various NGDS ablations show performance depends on the combination of score model (e.g., T1, PP, POS) and controller (threshold or branch-and-bound).
- Table 1 indicates a PROSE baseline test accuracy of 67.12% at a speed-up of 1.00x; NGDS variants reach comparable or higher test accuracy with speed-ups of 1.22–1.93x in several configurations.
- Table 2 shows ablations where speed-ups range from 0.86x to 1.93x depending on the controller and model combination (values below 1.00x are slowdowns relative to PROSE), with test accuracies at or above the PROSE level, from 67.12% to 72.60%.
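The accuracy and speed trade-off described above can be checked directly against the table. The sketch below copies the table rows verbatim and filters them; the tuple layout and the helper name `faster_without_test_loss` are assumptions made for illustration only.

```python
# Rows copied from the results table:
# (method, validation accuracy %, test accuracy %, speed-up over PROSE, branches taken %)
ROWS = [
    ("PROSE", 67.12, 67.12, 1.00, 100.00),
    ("NGDS(T1, Thr)", 59.57, 67.12, 1.27, 62.72),
    ("NGDS(T1, BB)", 63.83, 68.49, 1.22, 51.78),
    ("NGDS(T1, BB 0.2)", 61.70, 67.12, 1.22, 63.16),
    ("NGDS(T1+PP, Thr)", 59.57, 67.12, 0.97, 56.41),
    ("NGDS(T1+PP, BB)", 61.70, 72.60, 0.89, 50.22),
    ("NGDS(T1+PP, BB 0.2)", 61.70, 67.12, 0.86, 56.43),
    ("NGDS(T1+POS, Thr)", 61.70, 67.12, 1.93, 55.63),
    ("NGDS(T1+POS, BB)", 63.83, 68.49, 1.67, 50.44),
    ("NGDS(T1+POS, BB 0.2)", 63.83, 67.12, 1.73, 55.73),
]


def faster_without_test_loss(rows, baseline="PROSE"):
    """Return methods that are faster than the baseline and at least as accurate on test."""
    base = next(r for r in rows if r[0] == baseline)
    return [
        r[0]
        for r in rows
        if r[0] != baseline and r[3] > base[3] and r[2] >= base[2]
    ]


winners = faster_without_test_loss(ROWS)
fastest = max(ROWS, key=lambda r: r[3])[0]
most_accurate = max(ROWS, key=lambda r: r[2])[0]
```

On these rows, every T1 and T1+POS configuration is both faster than PROSE and at least as accurate on the test set, while the three T1+PP configurations are slower than PROSE. The configuration with the highest test accuracy, NGDS(T1+PP, BB), is one of the slower ones, so the highest accuracy and the highest speed-up do not coincide in this table.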
This review was generated by AI and reviewed by a human editor.