QUICK REVIEW

[Paper Review] Towards Robustness Against Natural Language Word Substitutions

Xinshuai Dong, Anh Tuan Luu|arXiv (Cornell University)|2021. 07. 28.

Adversarial Robustness in Machine Learning | References: 41 | Citations: 63

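One-line summary

The paper introduces Adversarial Sparse Convex Combination (ASCC), which models the word-substitution attack space as a convex hull, and uses it for adversarial training (ASCC-defense) to improve robustness across several NLP tasks and architectures.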

ABSTRACT

Robustness against word substitutions has a well-defined and widely acceptable form, i.e., using semantically similar words as substitutions, and thus it is considered as a fundamental stepping-stone towards broader robustness in natural language processing. Previous defense methods capture word substitutions in vector space by using either $l_2$-ball or hyper-rectangle, which results in perturbation sets that are not inclusive enough or unnecessarily large, and thus impedes mimicry of worst cases for robust training. In this paper, we introduce a novel Adversarial Sparse Convex Combination (ASCC) method. We model the word substitution attack space as a convex hull and leverage a regularization term to enforce perturbation towards an actual substitution, thus aligning our modeling better with the discrete textual space. Based on the ASCC method, we further propose ASCC-defense, which leverages ASCC to generate worst-case perturbations and incorporates adversarial training towards robustness. Experiments show that ASCC-defense outperforms the current state-of-the-arts in terms of robustness on two prevailing NLP tasks, i.e., sentiment analysis and natural language inference, concerning several attacks across multiple model architectures. Besides, we also envision a new class of defense towards robustness in NLP, where our robustly trained word vectors can be plugged into a normally trained model and enforce its robustness without applying any other defense techniques.

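Research motivation and goals

Improve robustness against word substitutions that preserve meaning and syntax.
Model the substitution space as a convex hull so that the perturbation set is inclusive yet compact.
Develop ASCC to generate adversarial examples inside the convex hull while keeping them aligned with the discrete text space.
Propose ASCC-defense, which improves robustness through adversarial training on ASCC-generated perturbations.
Demonstrate robustness gains across multiple datasets and model architectures.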

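Proposed method

Model the substitutions of each word as the convex hull of its substitution vectors: every adversarial vector is expressed as a convex combination of the substitution vectors with weights w_ij.
Relax the constraints on the weights (non-negative and summing to one) with a softmax parameterization so that gradient-based optimization is possible.
Add an entropy-based regularizer on w_i to encourage sparsity and better agreement with discrete substitutions.
Define the ASCC objective as maximizing the loss under ASCC perturbations together with the sparsity regularizer (the entropy term).
Embed ASCC in adversarial training (ASCC-defense): maximize the loss over ASCC perturbations, then minimize that worst-case loss to obtain robust parameters.
Solve the inner maximization with Adam, and perform the outer minimization over the model parameters for robustness.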
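
The inner maximization described above can be illustrated with a minimal sketch for a single word position. It is not the authors' implementation: the linear `toy_loss`, the gradient vector `g`, the function name `ascc_inner_max`, and the values of `alpha`, `lr`, and `steps` are all assumptions made for illustration, and plain gradient ascent stands in for Adam. The sketch only shows the three ingredients named in the list: softmax-parameterized weights, a convex combination of substitution vectors, and an entropy penalty that pushes the weights toward a single substitution.

```python
import math


def softmax(logits):
    """Map unconstrained logits to weights that are positive and sum to one."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(w):
    """Shannon entropy of a weight vector; low entropy means a near one-hot w."""
    return -sum(p * math.log(p) for p in w if p > 0.0)


def convex_combination(w, vectors):
    """Point inside the convex hull of `vectors`, selected by the weights w."""
    dim = len(vectors[0])
    return [sum(wj * v[d] for wj, v in zip(w, vectors)) for d in range(dim)]


def toy_loss(x, g):
    """Hypothetical stand-in for the model loss: linear in the embedding x."""
    return sum(gi * xi for gi, xi in zip(g, x))


def ascc_inner_max(vectors, g, alpha=0.5, lr=0.5, steps=200):
    """Gradient ascent on J(w_hat) = toy_loss(x, g) - alpha * entropy(w),
    where w = softmax(w_hat) and x = sum_j w_j * vectors[j]."""
    logits = [0.0] * len(vectors)  # start from uniform weights
    for _ in range(steps):
        w = softmax(logits)
        # dJ/dw_j: the linear loss contributes g . v_j, and the entropy
        # penalty contributes alpha * (log w_j + 1).
        q = [toy_loss(v, g) + alpha * (math.log(max(p, 1e-12)) + 1.0)
             for v, p in zip(vectors, w)]
        q_bar = sum(p * qj for p, qj in zip(w, q))
        # Chain rule through softmax: dJ/dw_hat_k = w_k * (q_k - q_bar).
        logits = [z + lr * p * (qj - q_bar) for z, p, qj in zip(logits, w, q)]
    w = softmax(logits)
    return w, convex_combination(w, vectors)


if __name__ == "__main__":
    subs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # toy substitution vectors
    w, x = ascc_inner_max(subs, g=[1.0, 2.0])
    print("weights:", w)
    print("perturbed embedding:", x)
```

In this toy setting the weights concentrate on the substitution vector with the highest loss, so the perturbed embedding ends up close to an actual substitution rather than an arbitrary interior point of the hull. In ASCC-defense as summarized above, an outer step would then update the model parameters to minimize the loss on this perturbed input; that step is omitted here.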

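Experimental results

Research questions

RQ1: Can word-substitution perturbations be captured effectively as a convex hull in embedding space to improve robustness?
RQ2: Does sparsity regularization on the convex-combination weights lead to perturbations that better reflect actual substitutions?
RQ3: Can ASCC-defense produce models that are more robust to common NLP attacks (Genetic, PWWS) across several architectures?
RQ4: Do the robust word vectors learned with ASCC-defense transfer robustness to a standard model without any additional defense?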

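Key findings

ASCC-defense consistently improves robustness over state-of-the-art defenses on IMDB and SNLI and across several architectures.
Under the Genetic attack on IMDB, ASCC-defense with an LSTM reaches 79.0% accuracy, above the previous 75.0%.
ASCC-defense is robust across attacks (Genetic and PWWS) and architectures (LSTM, CNN, BOW, DCOM).
Used to initialize a standard model, ASCC yields robust word vectors that improve robustness without any additional defense (e.g., under the Genetic attack, an LSTM initialized with ASCC-V reaches 73.4% robust accuracy versus 7.9% with GloVe).
The sparsity regularization term pushes perturbations toward actual substitutions and aligns them with the discrete text space.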


This review was generated by AI and checked by a human editor.