QUICK REVIEW

[Paper Review] A Deep Patent Landscaping Model using Transformer and Graph Convolutional Network

Seokkyu Choi, Hyeonju Lee|arXiv (Cornell University)|2019. 03. 14.

Intellectual Property and Patents | References: 9 | Citations: 1

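One-Line Summary

This paper proposes a deep learning model that combines a modified Transformer for analyzing patent text with a graph convolutional network (GCN) for patent metadata, with the aim of automating patent landscaping. Evaluated on 12 newly curated benchmark datasets, the model reaches a mean classification accuracy of 98%, which the authors report as state-of-the-art.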

ABSTRACT

Patent landscaping is a method that is employed for searching related patents during the process of a research and development (R&D) project. To avoid the risk of patent infringement and to follow the current trends of technology development, patent landscaping is a crucial task that needs to be conducted during the early stages of an R&D project. Generally, the process of patent landscaping requires several advanced resources and can be tedious. Furthermore, the patent landscaping process has to be repeated throughout the duration of an R&D project. Owing to such reasons, the demand for automated patent landscaping is gradually increasing. However, the shortage of well-defined benchmarking datasets and comparable models makes it difficult to find related research studies. In this paper, an automated patent landscaping model based on deep learning is proposed. The proposed model comprises a modified transformer structure for analyzing textual data present in patent documents and a graph convolutional network for analyzing patent metadata. Twelve patent landscaping benchmarking datasets, which were processed by the Korean patent attorney, are proposed for determining the resources required for comparing related research studies. Obtained results indicate that the proposed model with the proposed datasets can attain state-of-the-art performance, and mean classification accuracy of 98% can be achieved.

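Research Motivation and Goals

To meet the growing demand for automated patent landscaping, which R&D projects use to avoid infringement and track technology trends.
To address the shortage of well-defined benchmark datasets and comparable models in patent landscaping research.
To develop a deep learning framework that effectively integrates the textual and metadata features of patent documents.
To establish a reproducible, extensible benchmark framework for future research.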

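Proposed Method

A modified Transformer architecture processes the text of patent documents and extracts semantic representations.
A graph convolutional network (GCN) models the relationships between patents using structured metadata such as affiliated organizations, inventors, and technology classifications.
Text embeddings derived from the Transformer are fused with graph-based representations derived from the GCN to produce a unified patent embedding.
The model is trained end to end to classify patents into the relevant technology fields based on the learned representations.
The 12 benchmark datasets were curated by Korean patent attorneys for standardized evaluation, taking the diversity of technology fields into account.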
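
The graph convolution and fusion steps listed here can be illustrated with a minimal sketch. It is not the authors' implementation: the review does not state the propagation rule or the fusion operator, so the sketch assumes the commonly used symmetric normalization D^-1/2 (A + I) D^-1/2 for the GCN layer and plain concatenation for fusion. The graph (patents as nodes, an edge when two patents share a metadata item), the feature values, and the names `normalized_adjacency`, `gcn_layer`, and `fuse` are all hypothetical, and the text embeddings are hard-coded stand-ins for Transformer output.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 (assumed normalization, not from the review)."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj, features, weight):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weight)
    return [[max(0.0, v) for v in row] for row in propagated]

def fuse(text_emb, graph_emb):
    """Fuse by concatenation (an assumption; the review only says 'fused')."""
    return [t + g for t, g in zip(text_emb, graph_emb)]

# Toy graph of 3 patents: patents 0 and 1 share a metadata item, patent 2 is isolated.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
meta_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical metadata features
weight = [[1.0, 0.0], [0.0, 1.0]]                     # identity weights, for readability
text_emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]       # stand-in for Transformer output

graph_emb = gcn_layer(adj, meta_features, weight)
patent_emb = fuse(text_emb, graph_emb)
print(patent_emb[0])
```

In this toy graph the two connected patents end up with the same graph embedding because each averages its own features with its neighbor's, while the isolated patent keeps its own features. In a trained model the weight matrix would be learned jointly with the text encoder and a classification head.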

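Experimental Results

Research Questions

RQ1: Can a deep learning model effectively integrate natural language understanding with graph-based relationships to improve patent landscaping accuracy?
RQ2: How does the proposed model compare with existing methods in classification performance on standardized patent datasets?
RQ3: How much does integrating textual and metadata features contribute to detecting relevant underlying technologies in patent analysis?
RQ4: Are the proposed benchmark datasets suitable for evaluating and comparing future automated patent landscaping systems?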

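Key Findings

The proposed model achieved a mean classification accuracy of 98% across the 12 benchmark datasets, demonstrating state-of-the-art performance.
Combining Transformer-based text modeling with GCN-based metadata analysis substantially improved classification reliability compared with applying either one separately.
The curated benchmark datasets provide a standardized, reliable evaluation framework for future research on automated patent landscaping.
The model's high accuracy, validated on expert-processed datasets, suggests that it generalizes well across diverse technology fields.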
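
For readers who want to see how a mean classification accuracy over several datasets is typically computed, a small sketch follows. It assumes an unweighted mean of per-dataset accuracies; the review does not say whether the paper weights datasets by size. The dataset names and labels are placeholders invented for illustration and are not results from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def mean_accuracy(results):
    """Unweighted mean of per-dataset accuracies (assumed averaging scheme)."""
    scores = [accuracy(y_true, y_pred) for y_true, y_pred in results.values()]
    return sum(scores) / len(scores)

# Placeholder labels for two hypothetical datasets (NOT results from the paper).
toy_results = {
    "dataset_a": ([1, 0, 1, 1], [1, 0, 1, 0]),  # 3 of 4 correct
    "dataset_b": ([0, 0, 1, 1], [0, 0, 1, 1]),  # 4 of 4 correct
}
toy_mean = mean_accuracy(toy_results)
print(toy_mean)
```

An unweighted mean treats every dataset equally regardless of size, so a small dataset influences the headline figure as much as a large one.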

This review was generated by AI and reviewed by a human editor.