QUICK REVIEW

[Paper Review] Wembedder: Wikidata Entity Embedding Web Service

Finn Årup Nielsen|arXiv (Cornell University)|Oct 11, 2017
Topic Modeling|15 references|11 citations
TL;DR

Wembedder is a REST web service that serves pre-trained entity embeddings for over 600,000 Wikidata items and properties. The embeddings are trained with Gensim's Word2Vec on simple graph walks over the Wikidata knowledge graph. Because Wikidata identifiers are language-independent, combining the service with the Wikidata API gives similarity queries whose results can be labelled in many languages.

ABSTRACT

I present a web service for querying an embedding of entities in the Wikidata knowledge graph. The embedding is trained on the Wikidata dump using Gensim's Word2Vec implementation and a simple graph walk. A REST API is implemented. Together with the Wikidata API the web service exposes a multilingual resource for over 600'000 Wikidata items and properties.

Motivation & Objective

  • To provide a scalable, accessible web service for semantic embeddings of Wikidata entities and properties.
  • To enable multilingual semantic similarity queries across Wikidata's structured knowledge graph.
  • To integrate with the Wikidata API for enhanced knowledge retrieval and interoperability.
  • To support researchers and developers with pre-trained, transferable embeddings for downstream NLP and knowledge graph applications.

Proposed method

  • Training entity embeddings using Gensim's Word2Vec on simple graph walks over the Wikidata dump.
  • Constructing graph walks that traverse Wikidata entities and their relationships to capture semantic and structural context.
  • Generating dense vector representations for Wikidata items and properties in a multilingual setting.
  • Exposing the embeddings through a standardized REST API for programmatic access.
  • Combining the Wembedder API with the Wikidata API to enable cross-referenced knowledge retrieval.
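The training steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's pipeline: the triples are a toy sample, the helper names (`build_adjacency`, `graph_walks`, `train_embedding`) and all parameter values are hypothetical, and the walk strategy (uniform random choice of an outgoing statement) is one simple option among several. Each walk becomes a "sentence" of alternating item and property identifiers, which is the input shape Word2Vec expects.

```python
import random
from collections import defaultdict

# Toy (subject, property, object) triples with illustrative identifiers.
# A real run would stream these from a Wikidata dump instead.
TOY_TRIPLES = [
    ("Q42", "P31", "Q5"),
    ("Q42", "P106", "Q36180"),
    ("Q36180", "P31", "Q28640"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (property, object) pairs."""
    adjacency = defaultdict(list)
    for subject, prop, obj in triples:
        adjacency[subject].append((prop, obj))
    return adjacency


def graph_walks(triples, walk_length=2, walks_per_node=1, seed=0):
    """Return walks as token lists: item, property, item, property, item, ...

    Walks follow statements from subject to object only and stop early
    when the current item has no outgoing statements.
    """
    rng = random.Random(seed)
    adjacency = build_adjacency(triples)
    sentences = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            sentence = [start]
            node = start
            for _ in range(walk_length):
                edges = adjacency.get(node)
                if not edges:
                    break
                prop, node = rng.choice(edges)
                sentence.extend([prop, node])
            sentences.append(sentence)
    return sentences


def train_embedding(sentences, vector_size=100):
    """Train Word2Vec on the walks (assumes gensim 4.x is installed).

    Hyperparameters here are placeholders, not the values used for Wembedder.
    """
    from gensim.models import Word2Vec

    return Word2Vec(
        sentences=sentences,
        vector_size=vector_size,  # named `size` in gensim releases before 4.0
        window=2,
        min_count=1,
        sg=1,
        seed=0,
    )
```

Calling `graph_walks(TOY_TRIPLES)` yields one walk per subject item. Passing the result to `train_embedding` would produce a model whose `wv` attribute holds one vector per item or property identifier seen in the walks.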

Experimental results

Research questions

  • RQ1: How can entity embeddings from Wikidata be efficiently exposed as a web service for broad accessibility?
  • RQ2: To what extent do graph walk-based Word2Vec embeddings capture meaningful semantic relationships in Wikidata?
  • RQ3: Can a multilingual embedding model trained on Wikidata support effective semantic similarity queries across diverse languages?
  • RQ4: How does the integration of Wembedder with the Wikidata API enhance knowledge graph querying capabilities?

Key findings

  • The Wembedder service exposes pre-trained embeddings for over 600,000 Wikidata items and properties via a REST API.
  • The embeddings are trained on graph walks over language-independent Wikidata identifiers, so similarity queries are not tied to any single language.
  • Combined with the Wikidata API, the service forms a multilingual resource for semantic search and knowledge retrieval.
  • Applying Word2Vec to Wikidata's graph structure yields vector representations that can be reused in knowledge graph applications.
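To make the similarity and multilingual points concrete, here is a small sketch of how a client might rank entities and then resolve the resulting identifiers to labels. The assumptions: cosine similarity is used as the similarity measure (a common choice for Word2Vec vectors, not confirmed by the review), the three vectors in `TOY_EMBEDDING` are made up, and the helper names are hypothetical. The label lookup uses the Wikidata API's `wbgetentities` action; no Wembedder endpoint is called or assumed.

```python
import math
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

# Made-up 3-dimensional vectors, for illustration only.
TOY_EMBEDDING = {
    "Q42": [0.9, 0.1, 0.0],
    "Q5": [0.8, 0.2, 0.1],
    "Q36180": [0.1, 0.9, 0.2],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(entity, embedding, topn=3):
    """Rank all other entities by cosine similarity to `entity`."""
    query = embedding[entity]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embedding.items()
        if other != entity
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


def label_query_url(entity_ids, language="en"):
    """Build a wbgetentities URL that asks for labels in one language."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(entity_ids),
        "props": "labels",
        "languages": language,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def extract_labels(response, language="en"):
    """Map entity IDs to labels, falling back to the ID if none exists."""
    labels = {}
    for entity_id, entity in response.get("entities", {}).items():
        label = entity.get("labels", {}).get(language)
        labels[entity_id] = label["value"] if label else entity_id
    return labels
```

Because the ranking works purely on identifiers, switching the output language only means changing the `language` argument of the label lookup; the embedding itself is untouched.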

This review was created by AI and reviewed by human editors.