QUICK REVIEW

[Paper Review] Athanor: Authoring Action Modification-based Interactions on Static Visualizations via Natural Language

Can Liu, Jaeuk Lee|arXiv (Cornell University)|Jan 25, 2026
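Data Visualization and Analytics | Cited by 0
One-Sentence Summary

Athanor turns static visualizations into interactive ones authored through natural language, using an action-modification design space, a multi-agent requirement analyzer, and an implementation-agnostic visualization abstraction transformer.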

ABSTRACT

Interactivity is crucial for effective data visualizations. However, it is often challenging to implement interactions for existing static visualizations, since the underlying code and data for existing static visualizations are often not available, and it also takes significant time and effort to enable interactions for them even if the original code and data are available. To fill this gap, we propose Athanor, a novel approach to transform existing static visualizations into interactive ones using multimodal large language models (MLLMs) and natural language instructions. Our approach introduces three key innovations: (1) an action-modification interaction design space that maps visualization interactions into user actions and corresponding adjustments, (2) a multi-agent requirement analyzer that translates natural language instructions into an actionable operational space, and (3) a visualization abstraction transformer that converts static visualizations into flexible and interactive representations regardless of their underlying implementation. Athanor allows users to effortlessly author interactions through natural language instructions, eliminating the need for programming. We conducted two case studies and in-depth interviews with target users to evaluate our approach. The results demonstrate the effectiveness and usability of our approach in allowing users to conveniently enable flexible interactions for static visualizations.

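Motivation and Objectives

  • Motivate and address the difficulty of adding interactivity to static visualizations when the source data and code are unavailable.
  • Propose a structured design space that maps user actions to visualization modifications for interaction authoring.
  • Introduce a multi-agent system that translates natural language requirements into executable interaction specifications.
  • Propose an implementation-agnostic visualization representation that supports diverse visualization toolkits.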

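Proposed Method

  • Define an action-modification interaction design space that links user actions (e.g., hover, click, zoom) to visualization modifications (e.g., emphasis, filtering, encoding).
  • Develop a multi-agent requirement analyzer, with translation, correction, and guidance agents, that turns natural language requirements into executable specifications.
  • Create a visualization abstraction transformer that converts static visualizations into a constraint-based, implementation-agnostic representation built on a constraint model with control points.
  • Use an MLLM-based parser to convert SVG visualizations into the constraint-based representation by extracting elements, identifying their roles, and constructing constraints.
  • Enable constraint-based modifications that update the visualization interactively while preserving its appearance, guided by the spatial relationships among control points.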
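The constraint model with control points can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class names `ControlPoint`, `OffsetConstraint`, and `ConstraintModel`, and the single "fixed offset" constraint type, are hypothetical and are not taken from the paper, which may use a richer set of constraints. The idea shown is that moving one control point re-satisfies the constraints attached to it, so dependent elements keep their spatial relationship.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ControlPoint:
    """A named point on a visual element (hypothetical structure)."""
    x: float
    y: float


@dataclass
class OffsetConstraint:
    """Keeps `target` at a fixed offset from `source` (illustrative constraint type)."""
    source: str
    target: str
    dx: float
    dy: float


@dataclass
class ConstraintModel:
    points: Dict[str, ControlPoint] = field(default_factory=dict)
    constraints: List[OffsetConstraint] = field(default_factory=list)

    def link(self, source: str, target: str) -> None:
        """Record the current spatial relationship between two control points."""
        s, t = self.points[source], self.points[target]
        self.constraints.append(OffsetConstraint(source, target, t.x - s.x, t.y - s.y))

    def move(self, name: str, x: float, y: float) -> None:
        """Move one control point, then update every point constrained to it."""
        self.points[name] = ControlPoint(x, y)
        self._propagate(name, {name})

    def _propagate(self, name: str, visited: Set[str]) -> None:
        s = self.points[name]
        for c in self.constraints:
            if c.source == name and c.target not in visited:
                self.points[c.target] = ControlPoint(s.x + c.dx, s.y + c.dy)
                visited.add(c.target)
                self._propagate(c.target, visited)


# A bar whose value label sits 5 units above its top edge (made-up coordinates).
model = ConstraintModel()
model.points["bar_top"] = ControlPoint(40.0, 60.0)
model.points["label"] = ControlPoint(40.0, 55.0)
model.link("bar_top", "label")

# Growing the bar moves its top edge; the label follows via the constraint.
model.move("bar_top", 40.0, 30.0)
print(model.points["label"])  # ControlPoint(x=40.0, y=25.0)
```

Because the relationship is stored between control points rather than in toolkit-specific code, the same update logic would apply whichever library produced the original chart, which is the sense in which such a representation is implementation-agnostic.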
Figure 1 : Users can upload existing visualizations in the chart view and present their authoring requirements (a) in the dialogue view. The multi-agent requirement analyzer employs translation, correction, and guidance agents that work collaboratively to translate users’ requirements into specifica
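The division of labor among the translation, correction, and guidance agents can be sketched as a three-stage pipeline. The Python below is a toy stand-in, not the paper's implementation: keyword tables replace the MLLM calls, and the function names, the synonym lists, and the two-slot specification format are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Assumed synonym tables: raw words mapped to the design-space vocabulary.
ACTION_SYNONYMS = {"hover": "hover", "mouseover": "hover",
                   "click": "click", "tap": "click", "zoom": "zoom"}
MODIFICATION_SYNONYMS = {"highlight": "emphasis", "emphasize": "emphasis",
                         "filter": "filtering", "hide": "filtering",
                         "recolor": "encoding"}

Spec = Dict[str, Optional[str]]


def translation_agent(requirement: str) -> Spec:
    """Extract raw action and modification words (stand-in for an MLLM call)."""
    spec: Spec = {"action": None, "modification": None}
    for token in requirement.lower().split():
        word = token.strip(".,!?")
        if word in ACTION_SYNONYMS and spec["action"] is None:
            spec["action"] = word
        elif word in MODIFICATION_SYNONYMS and spec["modification"] is None:
            spec["modification"] = word
    return spec


def correction_agent(spec: Spec) -> Spec:
    """Normalize raw words to the canonical vocabulary; unknown values become None."""
    return {
        "action": ACTION_SYNONYMS.get(spec["action"]),
        "modification": MODIFICATION_SYNONYMS.get(spec["modification"]),
    }


def guidance_agent(spec: Spec) -> List[str]:
    """Ask a follow-up question for every slot that is still missing."""
    questions = []
    if spec["action"] is None:
        questions.append("Which user action should trigger the interaction?")
    if spec["modification"] is None:
        questions.append("How should the visualization change in response?")
    return questions


def analyze(requirement: str) -> Tuple[Spec, List[str]]:
    """Run the three stages in order and return the spec plus open questions."""
    spec = correction_agent(translation_agent(requirement))
    return spec, guidance_agent(spec)


print(analyze("Highlight the bar when I tap it"))
print(analyze("Filter the points"))
```

In this sketch the first request yields a complete specification (`click` paired with `emphasis`) and no follow-up questions, while the second lacks a trigger action, so the guidance stage returns one question for the user.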

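Experimental Results

Research Questions

  • RQ1: How can natural language instructions be reliably translated into executable interaction specifications for static visualizations?
  • RQ2: Can a constraint-based, implementation-agnostic representation enable interactive modifications while supporting diverse visualization types and toolkits?
  • RQ3: To what extent can the multi-agent analyzer correct and guide user requirements so that they yield feasible interactivity for static charts?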

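Key Findings

  • The three-component Athanor architecture lets users author interactions for static visualizations effectively through natural language.
  • The action-modification design space provides a structured framework for describing potential interactions as actions and modifications.
  • The MLLM-based constraint parser can extract chart elements and reconstruct their data-encoding relationships in the constraint model.
  • The multi-agent requirement analyzer improves accuracy and usability by translating, correcting, and guiding user input toward feasible specifications.
  • The case studies and user interviews indicate that the approach is usable and covers diverse interactivity requirements for static visualizations.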
Figure 2 : The design space of visualization modifications.
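To make the action-modification pairing concrete, the sketch below encodes an interaction as an (action, modification) pair and applies it to a list of chart elements. The enum members mirror the examples named in the summary (hover, click, zoom; emphasis, filtering, encoding), but the `Interaction` type, the `apply_interaction` function, the element dictionaries, and the 0.3 opacity value are hypothetical choices for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Action(Enum):
    HOVER = "hover"
    CLICK = "click"
    ZOOM = "zoom"


class Modification(Enum):
    EMPHASIS = "emphasis"
    FILTERING = "filtering"
    ENCODING = "encoding"


@dataclass(frozen=True)
class Interaction:
    """One point in the design space: a user action paired with a modification."""
    action: Action
    modification: Modification


def apply_interaction(rule: Interaction, elements: List[Dict],
                      event: Action, target_id: str) -> List[Dict]:
    """Return the modified elements if the event matches the rule's action."""
    if event is not rule.action:
        return elements  # the rule does not fire for this event
    if rule.modification is Modification.EMPHASIS:
        # Dim every element except the target (0.3 is an arbitrary choice).
        return [dict(e, opacity=1.0 if e["id"] == target_id else 0.3)
                for e in elements]
    if rule.modification is Modification.FILTERING:
        # Keep only the target element.
        return [e for e in elements if e["id"] == target_id]
    raise NotImplementedError("encoding changes are omitted from this sketch")


bars = [{"id": "a", "opacity": 1.0}, {"id": "b", "opacity": 1.0}]
rule = Interaction(Action.HOVER, Modification.EMPHASIS)
print(apply_interaction(rule, bars, Action.HOVER, "a"))
# [{'id': 'a', 'opacity': 1.0}, {'id': 'b', 'opacity': 0.3}]
```

Keeping the action and the modification as separate dimensions means that a new pairing, such as click with filtering, needs only a new `Interaction` value and no new handler code.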

This review was generated by AI and reviewed by human editors.