QUICK REVIEW

[Paper Review] Validating RDF with Shape Expressions.

Iovka Boneva, José Emilio Labra Gayo|arXiv (Cornell University)|Apr 4, 2014

Semantic Web and Ontologies | References: 34 | Cited by: 23

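ONE-SENTENCE SUMMARY

This paper proposes shape expression schemas (ShEx), a schema formalism for validating RDF graphs that uses regular bag expressions (RBEs) to constrain the neighborhood of nodes. It shows that multi-type validation is tractable for deterministic schemas using single-occurrence regular bag expressions (SORBEs), whereas single-type validation is intractable in general. When types are preassigned to some nodes, single-type validation for deterministic SORBE schemas can nevertheless be performed in a single pass over the graph.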

ABSTRACT

We propose shape expression schema (ShEx), a novel schema formalism for describing the topology of an RDF graph that uses regular bag expressions (RBEs) to define constraints on the admissible neighborhood for the nodes of a given type. We provide two alternative semantics, multi- and single-type, depending on whether or not a node may have more than one type. We study the expressive power of ShEx and study the complexity of the validation problem. We show that the single-type semantics is strictly more expressive than the multi-type semantics, single-type validation is generally intractable and multi-type validation is feasible for a small class of RBEs. To further curb the high computational complexity of validation, we propose a natural notion of determinism and show that multi-type validation for the class of deterministic schemas using single-occurrence regular bag expressions (SORBEs) is tractable. Finally, we consider the problem of validating only a fragment of a graph with preassigned types for some of its nodes, and argue that for deterministic ShEx using SORBEs, multi-type validation can be performed efficiently and single-type validation can be performed with a single pass over the graph.

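RESEARCH MOTIVATION AND GOALS

Address the need for a schema language that can precisely describe the topology of an RDF graph and constrain the neighborhood of its nodes.
Define and compare two semantics for RDF schema validation, multi-type and single-type, which differ in whether a node may have more than one type.
Analyze the computational complexity of RDF validation under both semantics and identify tractable subclasses.
Introduce determinism and single-occurrence regular bag expressions (SORBEs) to reduce validation complexity.
Enable efficient validation of a graph fragment for deterministic ShEx schemas, in particular when types are preassigned to some of its nodes.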

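PROPOSED METHOD

Propose shape expression schemas (ShEx), a schema formalism that uses regular bag expressions (RBEs) to define the admissible neighborhood of the nodes of a given type.
Define two semantics with different validation behavior: multi-type, where a node may have several types, and single-type, where a node has exactly one type.
Analyze the expressive power of ShEx and show that the single-type semantics is strictly more expressive than the multi-type semantics.
Introduce a notion of determinism for ShEx schemas that rules out ambiguous neighborhood patterns and thereby lowers validation complexity.
Focus on single-occurrence regular bag expressions (SORBEs), a subclass of RBEs that admits efficient validation.
Show that multi-type validation is tractable for deterministic schemas using SORBEs. For a graph fragment in which some nodes have preassigned types, multi-type validation can be performed efficiently and single-type validation can be performed in a single pass over the graph.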
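
The multi-type check described above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a hypothetical, heavily restricted schema encoding in which each shape lists every predicate at most once together with one target type and an occurrence interval, and it treats literals as ordinary nodes of a made-up type Literal. Under those assumptions the target type of every edge is forced, so a maximal typing can be computed by refinement: start with every type on every node and repeatedly discard types whose neighborhood constraint fails. All names (SCHEMA, satisfies, maximal_typing, the sample graph) are illustrative.

```python
from collections import Counter

# Hypothetical schema encoding (not from the paper): each shape maps a
# predicate to (target_type, min_count, max_count); None means unbounded.
# A predicate appears at most once per shape, so the type required of an
# edge's target is forced by the source type and the predicate.
SCHEMA = {
    "Bug": {
        "descr": ("Literal", 1, 1),
        "reportedBy": ("User", 1, 1),
        "related": ("Bug", 0, None),
    },
    "User": {"name": ("Literal", 1, 1), "email": ("Literal", 0, 1)},
    "Literal": {},  # simplification: literals modeled as nodes without edges
}

# Sample graph: node -> list of (predicate, target) outgoing edges.
NODES = ["bug1", "bug2", "user1", "lit1", "lit2", "lit3"]
EDGES = {
    "bug1": [("descr", "lit1"), ("reportedBy", "user1"), ("related", "bug2")],
    "bug2": [("descr", "lit2"), ("reportedBy", "user1")],
    "user1": [("name", "lit3")],
}


def satisfies(node, shape, edges, typing):
    """Check the outgoing neighborhood of `node` against one shape."""
    out = edges.get(node, [])
    counts = Counter(predicate for predicate, _ in out)
    # Predicates not mentioned by the shape are not allowed.
    if any(predicate not in shape for predicate in counts):
        return False
    # Every predicate must occur within its interval.
    for predicate, (_, low, high) in shape.items():
        n = counts[predicate]
        if n < low or (high is not None and n > high):
            return False
    # Every target must currently carry the type the shape requires.
    return all(shape[p][0] in typing[target] for p, target in out)


def maximal_typing(nodes, edges, schema):
    """Refine from 'all types everywhere' until nothing more can be removed."""
    typing = {node: set(schema) for node in nodes}
    changed = True
    while changed:
        changed = False
        for node in nodes:
            for type_name in list(typing[node]):
                if not satisfies(node, schema[type_name], edges, typing):
                    typing[node].discard(type_name)
                    changed = True
    return typing


typing = maximal_typing(NODES, EDGES, SCHEMA)
# Multi-type reading: the graph is accepted if every node keeps some type.
is_valid = all(typing[node] for node in NODES)
```

Each refinement round removes at least one (node, type) pair or stops, so the loop runs a polynomial number of times for this restricted encoding. General RBEs need a real bag-membership test in place of the interval check, which is where the complexity results summarized above come in.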

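RESULTS

Research Questions

RQ1: How does the expressive power of the single-type semantics compare with that of the multi-type semantics in ShEx?
RQ2: What is the computational complexity of RDF graph validation under the multi-type and single-type ShEx semantics?
RQ3: Does determinism in ShEx schemas reduce validation complexity, and if so, under what conditions?
RQ4: Can validation be performed efficiently when only a fragment of the graph is validated and some node types are preassigned?
RQ5: Under the single-type semantics, can deterministic ShEx schemas using SORBEs be validated in a single pass?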

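Key Findings

The single-type semantics is strictly more expressive than the multi-type semantics.
Single-type validation is intractable in general, since it must reason over many candidate type assignments.
Multi-type validation is feasible only for a small class of RBEs, but becomes tractable once determinism and SORBEs are imposed.
For deterministic ShEx schemas using SORBEs, multi-type validation is tractable and can also be performed efficiently on a graph fragment with preassigned types.
For deterministic ShEx schemas using SORBEs, single-type validation of a fragment with preassigned types can be performed in a single pass over the graph.
Determinism and SORBEs together keep the computational cost of validation under control, which the review presents as making ShEx practical for real-world RDF data.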
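
The single-pass finding can be sketched in the same spirit. The code below is an illustration under stated assumptions, not the procedure from the paper: it reuses a hypothetical encoding in which each shape maps a predicate to one target type and an occurrence interval, so that a node's type and an edge's predicate determine the type of the target. Starting from the preassigned nodes, types are propagated along edges, and validation fails as soon as a node would need two different types. The names SCHEMA, EDGES and validate_fragment are made up for the example.

```python
from collections import Counter, deque

# Hypothetical encoding (an assumption, not the paper's syntax):
# shape = {predicate: (target_type, min_count, max_count)}, None = unbounded.
SCHEMA = {
    "Bug": {
        "descr": ("Literal", 1, 1),
        "reportedBy": ("User", 1, 1),
        "related": ("Bug", 0, None),
    },
    "User": {"name": ("Literal", 1, 1), "email": ("Literal", 0, 1)},
    "Literal": {},  # simplification: literals modeled as nodes without edges
}

# Sample graph: node -> list of (predicate, target) outgoing edges.
EDGES = {
    "bug1": [("descr", "lit1"), ("reportedBy", "user1"), ("related", "bug2")],
    "bug2": [("descr", "lit2"), ("reportedBy", "user1")],
    "user1": [("name", "lit3")],
}


def validate_fragment(preassigned, edges, schema):
    """Propagate types from preassigned nodes; return the typing or None."""
    assigned = dict(preassigned)
    queue = deque(assigned)  # each node is enqueued at most once
    while queue:
        node = queue.popleft()
        shape = schema[assigned[node]]
        out = edges.get(node, [])
        counts = Counter(predicate for predicate, _ in out)
        # Reject predicates the shape does not mention.
        if any(predicate not in shape for predicate in counts):
            return None
        # Reject occurrence counts outside the declared interval.
        for predicate, (_, low, high) in shape.items():
            n = counts[predicate]
            if n < low or (high is not None and n > high):
                return None
        # The required target type is forced, so propagate it.
        for predicate, target in out:
            required = shape[predicate][0]
            if target not in assigned:
                assigned[target] = required
                queue.append(target)
            elif assigned[target] != required:
                return None  # single-type: a node cannot hold two types
    return assigned


result = validate_fragment({"bug1": "Bug"}, EDGES, SCHEMA)
```

Every reachable node is dequeued once and each of its edges is inspected a constant number of times, which is what makes this a single pass over the reachable fragment. Nodes that cannot be reached from a preassigned node are left unchecked, matching the idea of validating only a fragment.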

This review was generated by AI and reviewed by human editors.