[Paper Review] PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model
The paper fine-tunes a pre-trained BERT model for patent classification and shows state-of-the-art performance on large USPTO datasets, even using patent claims alone.
In this work we focus on fine-tuning a pre-trained BERT model and applying it to patent classification. When applied to large datasets of over two million patents, our approach outperforms the previous state of the art, an approach using a CNN with word embeddings. In addition, we focus on patent claims alone, without the other parts of patent documents. Our contributions include: (1) a new state-of-the-art method for patent classification based on fine-tuning a pre-trained BERT model, (2) a large dataset, USPTO-3M, at the CPC subclass level, with SQL statements that can be used by future researchers, and (3) showing that patent claims alone are sufficient for the classification task, in contrast to conventional wisdom.
Motivation & Objective
- Improve patent classification by applying modern pre-trained NLP models.
- Demonstrate effectiveness of fine-tuned BERT over CNN-based approaches.
- Show that patent claims alone can achieve strong classification performance.
- Provide a large, reusable dataset (USPTO-3M) for CPC subclass classification.
Proposed method
- Fine-tune a pre-trained BERT model on patent data for CPC subclass classification.
- Compare with a CNN-based word-embedding approach as a baseline.
- Focus experiments on patent claims, excluding other patent document parts.
- Publish a large dataset (USPTO-3M) along with SQL statements that future researchers can reuse.
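The data preparation implied by the steps above can be sketched in Python. This is a minimal illustration, not the authors' pipeline: the record layout (`claims`, `cpc`), the helper names, and the sample patents are all hypothetical, and it assumes a patent can carry several CPC symbols, so each target is a multi-hot vector over subclasses. The only fact relied on is that a CPC subclass is the first four characters of a CPC symbol (section letter, two-digit class, subclass letter), for example `G06F` in `G06F 16/31`.

```python
from typing import Dict, List


def cpc_subclass(symbol: str) -> str:
    """Return the 4-character CPC subclass (e.g. 'G06F') of a full CPC symbol."""
    compact = symbol.replace(" ", "").upper()
    if len(compact) < 4:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    return compact[:4]


def build_label_index(records: List[dict]) -> Dict[str, int]:
    """Map every subclass seen in the data to a stable column index."""
    subclasses = sorted({cpc_subclass(s) for r in records for s in r["cpc"]})
    return {name: i for i, name in enumerate(subclasses)}


def to_example(record: dict, label_index: Dict[str, int]) -> dict:
    """Keep only the claims text and encode the subclasses as a multi-hot vector."""
    labels = [0] * len(label_index)
    for symbol in record["cpc"]:
        labels[label_index[cpc_subclass(symbol)]] = 1
    # Assumption: all claims are joined; description, abstract, and title are dropped.
    return {"text": " ".join(record["claims"]), "labels": labels}


# Hypothetical records, invented for illustration only.
records = [
    {"claims": ["A method for indexing documents."],
     "cpc": ["G06F 16/31", "G06N 3/08"]},
    {"claims": ["A pharmaceutical composition comprising a compound."],
     "cpc": ["A61K 31/00"]},
]

label_index = build_label_index(records)
examples = [to_example(r, label_index) for r in records]
```

The resulting `text`/`labels` pairs are the kind of input a sequence classifier such as a fine-tuned BERT model would consume; tokenization and training are left out here.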
Experimental results
Research questions
- RQ1: Can fine-tuning a pre-trained BERT model outperform CNN-based methods for patent subclass classification?
- RQ2: Are patent claims sufficient for effective patent classification compared to using full patent documents?
- RQ3: What is the impact of dataset scale on classification performance in patent CPC subclasses?
Key findings
- The fine-tuned BERT approach achieves state-of-the-art performance on patent classification.
- The method outperforms a CNN baseline with word embeddings on large patent datasets.
- Patent claims alone are sufficient for the classification task, contrary to conventional wisdom.
- The authors provide USPTO-3M, a large dataset at the CPC subclass level, with SQL statements for future use.
This review was created by AI and reviewed by human editors.