QUICK REVIEW

[Paper Review] A Checklist to Publish Collections as Data in GLAM Institutions

Gustavo Candela, Nele Gabriëls | arXiv (Cornell University) | Apr 5, 2023
Advanced Data Storage Technologies | Cited by 8
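One-Sentence Summary

This paper proposes a methodology that helps GLAM institutions create and apply a practical checklist for publishing digital collections as data, so that small and medium-sized institutions can also support computational reuse.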

ABSTRACT

Large-scale digitization in Galleries, Libraries, Archives and Museums (GLAM) created the conditions for providing access to collections as data. It opened new opportunities to explore, use and reuse digital collections. Strong proponents of collections as data are the Innovation Labs which provided numerous examples of publishing datasets under open licenses in order to reuse digital content in novel and creative ways. Within the current transition to the emerging data spaces, clouds for cultural heritage and open science, the need to identify practices which support more GLAM institutions to offer datasets becomes a priority, especially within the smaller and medium-sized institutions. This paper answers the need to support GLAM institutions in facilitating the transition into publishing their digital content and to introduce collections as data services; this will also help their future efficient contribution to data spaces and cultural heritage clouds. It offers a checklist that can be used for both creating and evaluating digital collections suitable for computational use. The main contributions of this paper are i) a methodology for devising a checklist to create and assess digital collections for computational use; ii) a checklist to create and assess digital collections suitable for use with computational methods; iii) the assessment of the checklist against the practice of institutions innovating in the Collections as data field; and iv) the results obtained after the application and recommendations for the use of the checklist in GLAM institutions.

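Research Motivation and Objectives

  • Prompt GLAM institutions to recognize the need for Collections as Data, and identify the diversity of publication practices.
  • Develop a simple, easy-to-use checklist for creating and evaluating datasets suitable for computational use.
  • Identify the information needs and questions of GLAM practitioners in order to inform the checklist and its use.
  • Describe the methodology for building, testing, and applying the checklist to GLAM datasets and practices.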

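Proposed Method

  • Conducted a literature review to identify best practices, data quality considerations, and existing checklists relevant to publishing collections as data.
  • Ran an observational survey of GLAM and research institutions (October 2022) to understand their experience and information needs.
  • Built the checklist by synthesizing the literature findings with practitioner insights (a four-stage process).
  • Presented the checklist as a structured table, with a detailed explanation of each item.
  • Applied the checklist to assess existing GLAM datasets and demonstrated its use in institutional contexts.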
Figure 1. Survey results: "What is the level of your experience with preparing Collections as data?"

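Experimental Results

Research Questions

  • RQ1: What are the key practices and considerations for making GLAM collections available as data for computational use?
  • RQ2: How can a simple, actionable checklist be designed to support small and medium-sized GLAM institutions?
  • RQ3: What information needs and common problems do practitioners encounter when implementing Collections as Data?
  • RQ4: How can the checklist be applied to create and evaluate machine-actionable collections?
  • RQ5: What are the implications of applying the checklist for joining data spaces and cultural heritage clouds?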

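Key Findings

  • Developed a clear, actionable 11-item checklist that guides licensing, citation, documentation, access, structure, machine-readable metadata, collaborative platforms, APIs, portal pages, and terms of use.
  • Survey results show that many GLAM respondents have limited experience with Collections as data and felt under-informed when starting out, which underlines the need for structured guidance.
  • Practitioner input identified the main needs in data preparation, dataset structure, standards, and documentation, all aimed at making it easier to get started with Collections as data.
  • The checklist supports both creating and evaluating datasets, offering a prioritized path for institutions at different levels of maturity.
  • Examples and documentation (including use cases and experimental environments) are central to promoting reuse and collaboration among researchers.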
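
To make the "create and evaluate" use of the checklist concrete, here is a minimal sketch in Python of how an institution might record which topic areas a dataset already covers. It is an illustration only, not the paper's tooling: the topic identifiers are paraphrases of the areas named in the findings above (the summary mentions 11 items but names ten areas, so only those ten are used), and the `assess` function, the `Assessment` class, and the sample dataset description are all hypothetical.

```python
from dataclasses import dataclass

# Topic areas named in the findings above. The identifiers are illustrative
# paraphrases, not the official wording of the paper's checklist items.
CHECKLIST_TOPICS = (
    "license",
    "citation",
    "documentation",
    "access",
    "structure",
    "machine_readable_metadata",
    "collaborative_platform",
    "api",
    "portal_page",
    "terms_of_use",
)


@dataclass(frozen=True)
class Assessment:
    """Result of checking one dataset description against the topic list."""
    satisfied: tuple
    missing: tuple

    @property
    def coverage(self) -> float:
        # Fraction of topics with evidence; 0.0 if there are no topics at all.
        total = len(self.satisfied) + len(self.missing)
        return len(self.satisfied) / total if total else 0.0


def assess(dataset: dict) -> Assessment:
    """Treat a topic as satisfied when the description has a truthy value for it."""
    satisfied = tuple(t for t in CHECKLIST_TOPICS if dataset.get(t))
    missing = tuple(t for t in CHECKLIST_TOPICS if not dataset.get(t))
    return Assessment(satisfied, missing)


# Hypothetical dataset description; the values are placeholders.
sample = {
    "license": "CC0-1.0",
    "citation": "How to cite this dataset",
    "documentation": "README.md",
}
report = assess(sample)
print(f"coverage: {report.coverage:.0%}")
print("still missing:", ", ".join(report.missing))
```

The list of missing topics can serve as a to-do list for an institution that is just starting out, while the coverage figure gives a rough way to compare datasets. A real assessment would need the paper's full item definitions rather than a simple presence check.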
Figure 2. Survey results: "How well-informed do you feel/did you feel when starting to move towards Collections as data?"


This review was generated by AI and reviewed by a human editor.