QUICK REVIEW

[Paper Review] Orthogonal Hierarchical Decomposition for Structure-Aware Table Understanding with Large Language Models

Bin Cao, Huixian Lu|arXiv (Cornell University)|Feb 2, 2026
Data Quality and Management | Cited by 0
One-Sentence Summary

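This paper proposes the Orthogonal Hierarchical Decomposition (OHD) framework, which combines Orthogonal Tree Induction (OTI) with a dual-pathway association protocol to give LLMs structure-aware table understanding, reaching state-of-the-art results on AITQA and HiTab.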

ABSTRACT

Complex tables with multi-level headers, merged cells and heterogeneous layouts pose persistent challenges for LLMs in both understanding and reasoning. Existing approaches typically rely on table linearization or normalized grid modeling. However, these representations struggle to explicitly capture hierarchical structures and cross-dimensional dependencies, which can lead to misalignment between structural semantics and textual representations for non-standard tables. To address this issue, we propose an Orthogonal Hierarchical Decomposition (OHD) framework that constructs structure-preserving input representations of complex tables for LLMs. OHD introduces an Orthogonal Tree Induction (OTI) method based on spatial-semantic co-constraints, which decomposes irregular tables into a column tree and a row tree to capture vertical and horizontal hierarchical dependencies, respectively. Building on this representation, we design a dual-pathway association protocol to symmetrically reconstruct the semantic lineage of each cell, and incorporate an LLM as a semantic arbitrator to align multi-level semantic information. We evaluate the OHD framework on two complex table question answering benchmarks, AITQA and HiTab. Experimental results show that OHD consistently outperforms existing representation paradigms across multiple evaluation metrics.

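Research Motivation and Goals

  • Address the challenges of understanding complex tables with multi-level headers, merged cells, and irregular layouts.
  • Decompose table structure into independent column and row hierarchies so that hierarchical semantics are preserved.
  • Reconstruct the semantic lineage of each table cell through dual-pathway association and LLM-based arbitration.
  • Demonstrate robustness to non-standard table layouts on benchmark datasets (AITQA, HiTab).
  • Validate the contributions of semantic predicates, the dual pathway, and arbitration through ablation studies.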

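Proposed Method

  • Proposes Orthogonal Hierarchical Decomposition (OHD), which factorizes a table into a column tree and a row tree, guided by semantic-spatial synergy.
  • Develops Orthogonal Tree Induction (OTI) in two stages, header skeleton induction and adaptive data anchoring, which build the trees from semantic predicates and spatial constraints.
  • Applies Dual-Pathway Association Reconstruction, which builds a structured context for each data cell from the primary axis and the orthogonal axis, with boundary-aware anchoring.
  • Uses multi-path semantic arbitration, in which an LLM takes both inputs and refines and synthesizes the final structure-aware representation.
  • Produces the final structure-aware textual proxy through zero-shot prompting of the LLM, optimizing for logical coherence, completeness, and readability.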
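To make the column-tree / row-tree idea concrete, here is a minimal Python sketch of the spatial half of the decomposition only. It is not the authors' implementation: `HeaderNode`, `build_tree`, `path_to`, and `cell_lineage` are hypothetical names, the toy table is invented for illustration, and the semantic predicates, boundary-aware anchoring, and LLM arbitration described above are not modeled. Headers are nested purely by span containment, and each data cell is then described by one path from each tree.

```python
from dataclasses import dataclass, field


@dataclass
class HeaderNode:
    """A header cell and the inclusive index span it covers on its axis."""
    text: str
    start: int
    end: float
    children: list["HeaderNode"] = field(default_factory=list)


def build_tree(levels):
    """Nest header levels (outermost first) by span containment.

    Spatial rule only: no semantic predicates are applied in this sketch.
    """
    root = HeaderNode("ROOT", 0, float("inf"))
    placed = [root]  # every node attached so far, shallowest first
    for level in levels:
        new_nodes = []
        for text, start, end in level:
            node = HeaderNode(text, start, end)
            # Deepest already-placed header whose span contains this one.
            parent = next(p for p in reversed(placed)
                          if p.start <= start and end <= p.end)
            parent.children.append(node)
            new_nodes.append(node)
        placed.extend(new_nodes)
    return root


def path_to(node, index, trail=()):
    """Header texts from just below the root down to the leaf covering `index`."""
    for child in node.children:
        if child.start <= index <= child.end:
            return path_to(child, index, trail + (child.text,))
    return trail


def cell_lineage(col_tree, row_tree, row, col, value):
    """Pair the column-axis path and the row-axis path for one data cell."""
    return {
        "column_path": path_to(col_tree, col),
        "row_path": path_to(row_tree, row),
        "value": value,
    }


# Toy table (hypothetical): "Revenue" spans columns 0-1, "Europe" spans rows 0-1.
col_tree = build_tree([
    [("Revenue", 0, 1), ("Cost", 2, 2)],
    [("2023", 0, 0), ("2024", 1, 1)],
])
row_tree = build_tree([
    [("Europe", 0, 1)],
    [("Germany", 0, 0), ("France", 1, 1)],
])
lineage = cell_lineage(col_tree, row_tree, row=1, col=1, value="42")
print(lineage)
```

In this sketch the two paths play the role of the two pathways; in the framework summarized above, an LLM would then reconcile them into a single textual description of the cell.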
Figure 1: Illustration of table complexity and structural diversity. The examples encompass several challenging non-standard layouts. (a): Tables featuring multi-level nested column headers and merged data cells; (b): Tables characterized by deep hierarchical row header structures; (c): Complex

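Experimental Results

Research Questions

  • RQ1: How can complex tables with hierarchical headers and irregular layouts be decomposed into orthogonal structural representations that preserve semantics?
  • RQ2: Do orthogonal (row/column) hierarchies combined with dual-pathway association improve LLM reasoning over non-standard tables?
  • RQ3: How do semantic predicates and LLM-based arbitration affect table question answering performance?
  • RQ4: How does OHD perform on challenging benchmarks such as AITQA and HiTab compared with linearization and schema-based baselines?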

Key Findings

| Method | AITQA EM | AITQA LLM Eval Avg. | HiTab EM | HiTab LLM Eval Avg. | HiTab Subset EM | HiTab Subset LLM Eval Avg. |
|---|---|---|---|---|---|---|
| Chain-of-Table (Qwen2-72b) | 49.32 | 62.02 | 44.26 | 62.92 | 50.25 | 67.06 |
| E5 (Qwen2-72b) | 56.40 | 58.97 | 43.56 | 47.93 | 50.13 | 58.78 |
| St-Raptor (Qwen2-72b) | 60.55 | 71.03 | 53.83 | 60.71 | 55.73 | 61.93 |
| Ours (Qwen2-72b) | 69.34 | 89.12 | 60.07 | 67.15 | 64.74 | 70.66 |
| TableLLaMA-7B | 68.35 | 85.61 | 64.71 | 66.99 | 66.75 | 71.56 |
| Ours (TableLLaMA-7B) | 73.83 | 87.95 | 63.62 | 66.24 | 68.37 | 74.23 |

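  • OHD outperforms the baselines on EM and LLM-based evaluation on AITQA and HiTab in most settings; the one exception in the table above is the full HiTab set with the TableLLaMA-7B backbone, where the TableLLaMA-7B baseline is slightly higher (64.71 vs. 63.62 EM, 66.99 vs. 66.24 LLM Eval Avg.).
  • Using the two orthogonal trees (column tree and row tree) markedly improves robustness to flexible headers and atypical layouts.
  • Semantic predicates and LLM-based arbitration are essential; removing them significantly degrades performance.
  • Compared with OHD's dual-pathway topology, directly using lineage representations (Markdown/HTML) performs markedly worse on complex tables.
  • Ablation results show that the full OHD pipeline achieves the best results on every backbone (Qwen2-72B and TableLLaMA-7B).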
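The size of the EM gaps can be read directly off the results table. The short Python sketch below recomputes them as percentage-point differences; the scores are copied from the table, while the `EM` dictionary and the `em_gain` helper are illustrative names and not part of the paper.

```python
# EM scores copied from the results table, ordered as
# (AITQA, HiTab, HiTab Subset).
EM = {
    "Chain-of-Table (Qwen2-72b)": (49.32, 44.26, 50.25),
    "E5 (Qwen2-72b)": (56.40, 43.56, 50.13),
    "St-Raptor (Qwen2-72b)": (60.55, 53.83, 55.73),
    "Ours (Qwen2-72b)": (69.34, 60.07, 64.74),
    "TableLLaMA-7B": (68.35, 64.71, 66.75),
    "Ours (TableLLaMA-7B)": (73.83, 63.62, 68.37),
}


def em_gain(ours, baseline):
    """Per-benchmark EM difference in percentage points (ours minus baseline)."""
    return tuple(round(o - b, 2) for o, b in zip(EM[ours], EM[baseline]))


# Compare each OHD variant with the strongest baseline on the same backbone.
qwen_gain = em_gain("Ours (Qwen2-72b)", "St-Raptor (Qwen2-72b)")
llama_gain = em_gain("Ours (TableLLaMA-7B)", "TableLLaMA-7B")
print("Qwen2-72b:", qwen_gain)
print("TableLLaMA-7B:", llama_gain)
```

A negative entry marks a benchmark on which the baseline scores higher than the OHD variant.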
Figure 2: Workflow of the OHD framework. The process begins with a Categorized Table Input where each cell is pre-identified as a Row Header, Column Header, or Data unit. The pipeline then proceeds in three stages: (1) Orthogonal Tree Induction (OTI) to decompose the table into independent row and

This review was generated by AI and reviewed by human editors.