QUICK REVIEW

[Paper Review] NL2Dashboard: A Lightweight and Controllable Framework for Generating Dashboards with LLMs

Boshen Shi, Kexin Yang|arXiv (Cornell University)|Jan 4, 2026

Data Visualization and Analytics|Citations: 0

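One-sentence summary

NL2Dashboard introduces an IR-based two-stage workflow (Prompt-to-IR and IR-to-Dashboard) and a multi-agent system that generates and modifies dashboards, improving token efficiency and fine-grained controllability over baselines.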

ABSTRACT

While Large Language Models (LLMs) have demonstrated remarkable proficiency in generating standalone charts, synthesizing comprehensive dashboards remains a formidable challenge. Existing end-to-end paradigms, which typically treat dashboard generation as a direct code generation task (e.g., raw HTML), suffer from two fundamental limitations: representation redundancy due to massive tokens spent on visual rendering, and low controllability caused by the entanglement of analytical reasoning and presentation. To address these challenges, we propose NL2Dashboard, a lightweight framework grounded in the principle of Analysis-Presentation Decoupling. We introduce a structured intermediate representation (IR) that encapsulates the dashboard's content, layout, and visual elements. Therefore, it confines the LLM's role to data analysis and intent translation, while offloading visual synthesis to a deterministic rendering engine. Building upon this framework, we develop a multi-agent system in which the IR-driven algorithm is instantiated as a suite of tools. Comprehensive experiments conducted with this system demonstrate that NL2Dashboard significantly outperforms state-of-the-art baselines across diverse domains, achieving superior visual quality, significantly higher token efficiency, and precise controllability in both generation and modification tasks.

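Motivation and objectives

Tackle the generation of comprehensive dashboards with LLMs, rather than standalone charts.
Decouple data analysis from visual rendering through a structured intermediate representation (IR).
Enable controllable, iterative dashboard generation and modification through deterministic rendering and guided prompts.
Introduce an agentic system equipped with executable tools to assemble dashboards.
Validate the improved reliability and efficiency both theoretically and empirically.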

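Proposed method

Introduces a lightweight, structured IR that encodes a dashboard's content, layout, and visual elements.
Adopts a two-stage workflow: Prompt-to-IR (analysis and IR generation) and IR-to-Dashboard (deterministic rendering from base templates).
Implements a modification pipeline that translates edit intents into atomic actions (change, swap, delete, add) and applies them with IR update operators.
Develops a multi-agent system consisting of a planner, a coder, and a critic, together with a dashboard assembly toolkit (IRGen, DBCompile, and IRModify), to coordinate generation and modification.
Provides a theoretical analysis based on entropy decomposition and Fano's inequality to justify the improved reliability and reduced visual entropy.
Empirically compares NL2Dashboard with baselines across domains in terms of quality, token efficiency, and controllability.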
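
The two-stage workflow and the atomic modification actions described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Widget` and `DashboardIR` fields, the HTML template, and the action dictionary format are hypothetical and are not taken from the paper. Here `render` loosely plays the role the review attributes to DBCompile, and `apply_action` the role of IRModify.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Widget:
    """One dashboard element (hypothetical schema, not the paper's IR)."""
    id: str
    kind: str    # e.g. "bar", "line", "kpi" (illustrative vocabulary)
    title: str
    data: dict   # analysis result computed upstream, e.g. {"Q1": 10}


@dataclass
class DashboardIR:
    """Hypothetical IR: a title plus ordered widgets (order acts as layout)."""
    title: str
    widgets: list = field(default_factory=list)


def render(ir: DashboardIR) -> str:
    """IR-to-Dashboard step: a fixed template, so equal IRs give equal HTML."""
    cards = "".join(
        f'<section class="{escape(w.kind)}" id="{escape(w.id)}">'
        f"<h2>{escape(w.title)}</h2>"
        f"<pre>{escape(str(sorted(w.data.items())))}</pre></section>"
        for w in ir.widgets
    )
    return f"<main><h1>{escape(ir.title)}</h1>{cards}</main>"


def _index(ir: DashboardIR, widget_id: str) -> int:
    """Return the position of the widget with the given id."""
    for i, w in enumerate(ir.widgets):
        if w.id == widget_id:
            return i
    raise KeyError(widget_id)


def apply_action(ir: DashboardIR, action: dict) -> DashboardIR:
    """Apply one atomic action (add, delete, change, swap) to the IR in place."""
    op = action["op"]
    if op == "add":
        ir.widgets.append(action["widget"])
    elif op == "delete":
        del ir.widgets[_index(ir, action["id"])]
    elif op == "change":
        w = ir.widgets[_index(ir, action["id"])]
        for key, value in action["fields"].items():
            setattr(w, key, value)
    elif op == "swap":
        i, j = _index(ir, action["a"]), _index(ir, action["b"])
        ir.widgets[i], ir.widgets[j] = ir.widgets[j], ir.widgets[i]
    else:
        raise ValueError(f"unknown action: {op}")
    return ir


# Usage: an LLM would only need to emit the small IR and the action dicts.
ir = DashboardIR("Sales", [
    Widget("w1", "bar", "Revenue", {"Q1": 10}),
    Widget("w2", "kpi", "Orders", {"total": 42}),
])
apply_action(ir, {"op": "swap", "a": "w1", "b": "w2"})
apply_action(ir, {"op": "change", "id": "w1", "fields": {"kind": "line"}})
page = render(ir)
```

In this sketch an edit request touches only a few IR fields and the page is re-rendered from the template, which is the intuition behind the token-efficiency and controllability claims. The real IR and tools are presumably richer than this.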

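Experimental results

Research questions

RQ1: How well does NL2Dashboard generate high-quality dashboards across diverse domains?
RQ2: How faithfully does it carry out user-specified dashboard modifications?
RQ3: How does NL2Dashboard's token efficiency, measured by the Generative Overhead Ratio (GOR), compare with end-to-end baselines?
RQ4: How does critic-driven iterative optimization affect dashboard quality, and when do diminishing returns set in?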

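Key findings

NL2Dashboard achieved the highest quality scores on the evaluation metrics, improving on the second-best baseline by 8.4% on generation tasks and 7.3% on modification tasks.
On modification tasks, NL2Dashboard completed every task correctly, and its success rate exceeded the baselines by 35%–62% as task difficulty increased.
NL2Dashboard's token efficiency (GOR) shows an initial value well below 1, indicating lower token overhead than baselines that generate code or scripts.
Critic-based optimization improves dashboard quality along multiple dimensions, but returns diminish beyond roughly one optimization round.
Ablation studies show that IR-based decoupling reduces layout-related failures and stabilizes modifications, addressing the spatial-reasoning and instruction-following problems observed in the baselines.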
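
The diminishing-returns finding for critic-based optimization suggests stopping the refinement loop early. The following is a minimal sketch of such a loop, not the paper's implementation: `refine`, `critique`, `revise`, `max_rounds`, and `min_gain` are hypothetical names, and the toy critic at the bottom only stands in for an LLM-based one.

```python
def refine(dashboard, critique, revise, max_rounds=3, min_gain=0.01):
    """Revise until the critic's score stops improving by at least min_gain.

    critique(dashboard) -> float score (higher is better); hypothetical.
    revise(dashboard)   -> a revised dashboard; hypothetical.
    """
    best, best_score = dashboard, critique(dashboard)
    history = [best_score]
    for _ in range(max_rounds):
        candidate = revise(best)
        score = critique(candidate)
        history.append(score)
        if score - best_score < min_gain:
            break  # diminishing returns: keep the previous best
        best, best_score = candidate, score
    return best, history


# Toy stand-ins: the "dashboard" is a count of applied fixes, and the
# critic's score saturates after two fixes.
best, history = refine(
    0,
    critique=lambda d: min(d, 2) / 2,
    revise=lambda d: d + 1,
    max_rounds=5,
)
```

With a saturating critic like this one, the loop stops as soon as a round yields no measurable gain, so extra rounds do not spend tokens on revisions that do not improve the score.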

This review was generated by AI and checked by a human editor.