[Paper Review] Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
The paper introduces WKLM, a weakly supervised pretraining objective that enforces entity-centric knowledge learning from unstructured text, improving entity-related QA and fine-grained entity typing over BERT baselines. It uses entity replacement training on Wikipedia to inject real-world entity knowledge without extra downstream memory or architecture changes.
Recent breakthroughs of pretrained language models have shown the effectiveness of self-supervised learning for a wide range of natural language processing (NLP) tasks. In addition to standard syntactic and semantic NLP tasks, pretrained models achieve strong improvements on tasks that involve real-world knowledge, suggesting that large-scale language modeling could be an implicit method to capture knowledge. In this work, we further investigate the extent to which pretrained models such as BERT capture knowledge using a zero-shot fact completion task. Moreover, we propose a simple yet effective weakly supervised pretraining objective, which explicitly forces the model to incorporate knowledge about real-world entities. Models trained with our new objective yield significant improvements on the fact completion task. When applied to downstream tasks, our model consistently outperforms BERT on four entity-related question answering datasets (i.e., WebQuestions, TriviaQA, SearchQA and Quasar-T) with an average 2.7 F1 improvements and a standard fine-grained entity typing dataset (i.e., FIGER) with 5.7 accuracy gains.
Motivation & Objective
- Investigate whether pretrained models implicitly capture real-world entity knowledge, and quantify how much, using a zero-shot fact completion task.
- Introduce a weakly supervised knowledge learning objective that explicitly teaches models about real-world entities from unstructured text.
- Show that knowledge-enriched pretraining improves entity-related QA datasets and fine-grained entity typing beyond standard BERT baselines.
Proposed method
- Entity-centric pretraining with weak supervision via entity replacement: replace mentions with same-type entities and train the model to detect replacement.
- Use boundary-word representations of entities to predict P(e|C) and distinguish true vs false knowledge statements.
- Combine the knowledge-learning objective with masked language model (MLM) loss in a multi-task pretraining setup on Wikipedia and BooksCorpus.
- Keep the standard BERT architecture, with no extra memory or architectural changes needed for downstream tasks.
- Perform ablations to compare WKLM against MLM-only and extended MLM baselines to isolate the knowledge-learning contribution.
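The entity-replacement step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Mention` type, the `ENTITIES_BY_TYPE` table, the `replace_prob` parameter, and the function name `make_replacement_example` are all hypothetical, and a real pipeline would obtain mentions and entity types from an annotated corpus rather than a hand-written dictionary.

```python
import random
from dataclasses import dataclass


@dataclass
class Mention:
    start: int   # index of the first token of the mention
    end: int     # index one past the last token of the mention
    entity: str  # name of the entity the mention refers to
    etype: str   # coarse entity type, e.g. "person"


# Hypothetical toy inventory of entities grouped by type (an assumption).
ENTITIES_BY_TYPE = {
    "person": ["Marie Curie", "Alan Turing", "Ada Lovelace"],
    "city": ["Paris", "Warsaw", "London"],
}


def make_replacement_example(tokens, mentions, replace_prob=0.5, rng=None):
    """Randomly swap mentions for other entities of the same type.

    Returns the new token list and, per mention, a ((start, end), label)
    pair where label is 1 for an original mention and 0 for a replaced one.
    """
    rng = rng or random.Random()
    out, labels, cursor = [], [], 0
    for m in sorted(mentions, key=lambda m: m.start):
        out.extend(tokens[cursor:m.start])  # copy text before the mention
        candidates = [e for e in ENTITIES_BY_TYPE[m.etype] if e != m.entity]
        replaced = bool(candidates) and rng.random() < replace_prob
        if replaced:
            new_tokens = rng.choice(candidates).split()
        else:
            new_tokens = tokens[m.start:m.end]
        span = (len(out), len(out) + len(new_tokens))  # boundaries in output
        out.extend(new_tokens)
        labels.append((span, 0 if replaced else 1))
        cursor = m.end
    out.extend(tokens[cursor:])  # copy text after the last mention
    return out, labels


tokens = "Marie Curie was born in Warsaw".split()
mentions = [
    Mention(0, 2, "Marie Curie", "person"),
    Mention(5, 6, "Warsaw", "city"),
]
corrupted, labels = make_replacement_example(tokens, mentions, rng=random.Random(0))
print(corrupted, labels)
```

The recorded span boundaries mark where a classifier over boundary-word representations would read its input, and the 0/1 labels are the targets for the replacement-detection loss that is trained alongside MLM.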
Experimental results
Research questions
- RQ1: Can large-scale pretraining encode explicit entity-level knowledge beyond standard MLM objectives?
- RQ2: Does a weakly supervised knowledge-learning objective improve entity-related tasks without external knowledge bases?
- RQ3: How does WKLM perform on zero-shot fact completion and downstream entity-centric QA and typing tasks compared to BERT and GPT-2?
- RQ4: What is the impact of MLM ratio and, separately, entity-replacement objectives on downstream performance?
Key findings
- WKLM achieves best results on 8 of 10 fact-completion relations in zero-shot evaluation.
- On open-domain QA, WKLM outperforms BERT on entity-related datasets by an average of 2.7 F1 points when ranking scores are not used; with ranking, it attains near state-of-the-art results on three datasets.
- On fine-grained entity typing (FIGER), WKLM sets a new state-of-the-art with accuracy 60.21, Ma-F1 81.99, Mi-F1 77.00.
- Ablation shows that combining the WKLM objective with MLM yields the best downstream performance; using too high an MLM masking ratio (15%) can hurt knowledge learning.
- WKLM requires no additional data processing or memory during fine-tuning and works with the original BERT architecture.
- Compared to ERNIE, WKLM provides larger absolute gains on FIGER, suggesting text-based knowledge extraction is effective without external KBs.
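For readers unfamiliar with the three FIGER numbers, the sketch below shows how strict accuracy, loose macro-F1, and loose micro-F1 are commonly computed for multi-label entity typing. It assumes that the "Ma-F1" and "Mi-F1" figures above refer to these loose macro and micro variants; the function name `typing_metrics` and the toy labels are illustrative only and do not come from the paper.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def typing_metrics(gold, pred):
    """Compute (strict accuracy, loose macro-F1, loose micro-F1).

    gold and pred are equal-length lists of label sets, one per mention.
    """
    pairs = list(zip(gold, pred))
    n = len(pairs)

    # Strict accuracy: the predicted label set must match the gold set exactly.
    strict = sum(g == p for g, p in pairs) / n

    # Loose macro: average per-mention precision and recall, then take F1.
    ma_p = sum(len(g & p) / len(p) if p else 0.0 for g, p in pairs) / n
    ma_r = sum(len(g & p) / len(g) if g else 0.0 for g, p in pairs) / n

    # Loose micro: pool label counts over all mentions, then take F1.
    overlap = sum(len(g & p) for g, p in pairs)
    n_pred = sum(len(p) for _, p in pairs)
    n_gold = sum(len(g) for g, _ in pairs)
    mi_p = overlap / n_pred if n_pred else 0.0
    mi_r = overlap / n_gold if n_gold else 0.0

    return strict, f1(ma_p, ma_r), f1(mi_p, mi_r)


# Toy example: the first mention is missing its fine-grained label.
gold = [{"person", "person/artist"}, {"location"}]
pred = [{"person"}, {"location"}]
strict, macro_f1, micro_f1 = typing_metrics(gold, pred)
print(strict, macro_f1, micro_f1)
```

In the toy example the first prediction is only partially correct, so it counts as a miss for strict accuracy but still earns partial credit under both F1 variants.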
This review was created by AI and reviewed by human editors.