QUICK REVIEW

[Paper Review] Open Research Knowledge Graph: Next Generation Infrastructure for Semantic Scholarly Knowledge

Mohamad Yaser Jaradeh, Allard Oelen|arXiv (Cornell University)|Jan 1, 2019

Semantic Web and Ontologies|48 references|35 citations

TL;DR

This paper introduces the Open Research Knowledge Graph (ORKG), a next-generation infrastructure that captures scholarly knowledge in machine-actionable, structured form through a hybrid approach combining crowdsourcing and automated natural language processing. The user evaluation shows strong user acceptance and interest in contributing structured research contributions, validating the feasibility of scalable, fine-grained scholarly knowledge curation in a knowledge graph environment.

ABSTRACT

Despite improved digital access to scholarly knowledge in recent decades, scholarly communication remains exclusively document-based. In this form, scholarly knowledge is hard to process automatically. We present the first steps towards a knowledge graph based infrastructure that acquires scholarly knowledge in machine actionable form thus enabling new possibilities for scholarly knowledge curation, publication and processing. The primary contribution is to present, evaluate and discuss multi-modal scholarly knowledge acquisition, combining crowdsourced and automated techniques. We present the results of the first user evaluation of the infrastructure with the participants of a recent international conference. Results suggest that users were intrigued by the novelty of the proposed infrastructure and by the possibilities for innovative scholarly knowledge processing it could enable.

Motivation & Objective

To address the limitations of document-based scholarly communication by enabling machine-processable, semantically rich scholarly knowledge representations.
To develop an infrastructure that captures scholarly knowledge at the point of creation, particularly during the research lifecycle.
To evaluate user willingness and system usability for contributing structured research contributions through a crowdsourced knowledge curation platform.
To integrate automated NLP techniques with human input to enable multi-modal scholarly knowledge acquisition.

Proposed method

The ORKG infrastructure models scholarly knowledge as structured 'ResearchContributions' with key elements like problem, method, and result.
It employs a front-end interface that allows users to manually curate and annotate research contributions using guided forms.
Automated NLP pipelines, including named entity recognition and relation extraction with tools such as TextRazor and MeaningCloud, are integrated to pre-extract and suggest content.
The system supports multi-modal input, including text, tables, and figures, to enrich knowledge graph nodes.
A mock-up interface (Figure 5) demonstrates real-time integration of automated text extraction to guide user input.
The infrastructure supports versioning, user authentication, and future integration with nanopublications and scholarly visualization tools.
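To make the 'ResearchContributions' model above more concrete, the following is a minimal sketch of how one contribution with a problem, a method, and a result could be flattened into subject-predicate-object statements for a knowledge graph. The class names, the predicate names (hasContribution, addressesProblem, usesMethod, yieldsResult), and the contribution identifier are hypothetical illustrations, not the ORKG schema or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object statement in the graph."""
    subject: str
    predicate: str
    obj: str


@dataclass
class ResearchContribution:
    """A structured description of one contribution of a paper.

    The three fields mirror the key elements named in the review
    (problem, method, result); everything else is an assumption.
    """
    paper_title: str
    problem: str
    method: str
    result: str

    def to_triples(self, contribution_id: str = "contribution/1") -> List[Triple]:
        # Predicate names are illustrative placeholders, not ORKG predicates.
        return [
            Triple(self.paper_title, "hasContribution", contribution_id),
            Triple(contribution_id, "addressesProblem", self.problem),
            Triple(contribution_id, "usesMethod", self.method),
            Triple(contribution_id, "yieldsResult", self.result),
        ]


example = ResearchContribution(
    paper_title="Open Research Knowledge Graph",
    problem="document-based scholarly communication",
    method="crowdsourcing combined with automated NLP",
    result="positive user evaluation",
)
example_triples = example.to_triples()
```

Each statement can then be stored or queried independently, which is what makes the description machine-actionable rather than locked inside document text.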

Experimental results

Research questions

RQ1: Are authors willing to contribute structured descriptions of the key research contribution(s) published in their articles using a fit-for-purpose infrastructure, and what is the user acceptance of the infrastructure?
RQ2: Can the infrastructure effectively integrate crowdsourcing and automated techniques for multi-modal scholarly knowledge acquisition?
RQ3: How can automated information extraction be integrated into the user workflow to support and ease manual curation?

Key findings

User evaluation at a conference workshop showed strong positive feedback on the ORKG front end, with all usability aspects rated above average except 'Guidance Needed', indicating high system usability.
Participants expressed interest in the ORKG and suggested integrating it with digital libraries and academic institutions, indicating potential for institutional adoption.
The coverage metric for entity linking showed Falcon as the most promising NLP tool, with a ζ (coverage) value of 0.78 on the gold-standard dataset.
The prototype successfully demonstrated the integration of automated NLP features into the user workflow, such as highlighting relevant text zones to support input.
The results suggest that authors are willing to contribute structured knowledge, which supports the feasibility of large-scale, crowdsourced curation of scholarly knowledge.
The system is extensible and supports future enhancements like discussion features, versioning, and interoperability with nanopublications.
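The review reports a coverage value ζ for entity linking but does not define it. One plausible reading, assumed here purely for illustration, is the fraction of gold-standard entity mentions for which a tool returns a link. The function name and the sample mentions below are hypothetical, and the paper's actual definition may differ.

```python
from typing import Iterable


def coverage(gold_mentions: Iterable[str], linked_mentions: Iterable[str]) -> float:
    """Share of gold-standard mentions that the tool linked at all.

    Assumed definition: |gold ∩ linked| / |gold|. It ignores whether
    the link target is correct, so it measures recall of mentions only.
    """
    gold = set(gold_mentions)
    if not gold:
        raise ValueError("gold standard must contain at least one mention")
    return len(gold & set(linked_mentions)) / len(gold)


# Toy data: the tool links three of the four gold mentions,
# plus one mention that is not in the gold standard.
gold = ["knowledge graph", "crowdsourcing", "entity linking", "nanopublication"]
linked = ["knowledge graph", "crowdsourcing", "entity linking", "workshop"]
zeta = coverage(gold, linked)
```

Under this reading, a value of 0.78 would mean that roughly four out of five gold-standard mentions received some link.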

This review was created by AI and reviewed by human editors.