QUICK REVIEW

[Paper Review] Evaluating the Zero-shot Robustness of Instruction-tuned Language Models

Jiuding Sun, Chantal Shaib|arXiv (Cornell University)|2023. 06. 20.

Topic Modeling|Citations: 12

One-line summary

This paper shows that instruction-tuned LLMs are sensitive to unobserved instruction phrasings in zero-shot tasks, and proposes a soft-prompt alignment method to improve robustness.

ABSTRACT

Instruction fine-tuning has recently emerged as a promising approach for improving the zero-shot capabilities of Large Language Models (LLMs) on new tasks. This technique has shown particular strength in improving the performance of modestly sized LLMs, sometimes inducing performance competitive with much larger model variants. In this paper we ask two questions: (1) How sensitive are instruction-tuned models to the particular phrasings of instructions, and, (2) How can we make them more robust to such natural language variation? To answer the former, we collect a set of 319 instructions manually written by NLP practitioners for over 80 unique tasks included in widely used benchmarks, and we evaluate the variance and average performance of these instructions as compared to instruction phrasings observed during instruction fine-tuning. We find that using novel (unobserved) but appropriate instruction phrasings consistently degrades model performance, sometimes substantially so. Further, such natural instructions yield a wide variance in downstream performance, despite their semantic equivalence. Put another way, instruction-tuned models are not especially robust to instruction re-phrasings. We propose a simple method to mitigate this issue by introducing "soft prompt" embedding parameters and optimizing these to maximize the similarity between representations of semantically equivalent instructions. We show that this method consistently improves the robustness of instruction-tuned models.

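Motivation and objectives

Evaluate how instruction-tuned language models from the Flan, Alpaca, and T0 families respond to novel, semantically equivalent instructions at test time.
Quantify the drop in performance on the MMLU and Big-Bench Lite benchmarks when unobserved instructions are used.
Propose a lightweight method that improves robustness by aligning the representations of semantically equivalent instructions through soft prompts.
Assess whether robustness improves as model size scales up or when in-context learning (ICL) is used.
Release the instruction dataset collected for the robustness analysis to support future research.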
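
One way to make the "quantify the drop" objective concrete is to compare the mean accuracy and the spread across instruction phrasings. The sketch below assumes per-instruction accuracies are already available; the `summarize` helper and all numbers are made up for illustration and are not results from the paper.

```python
from statistics import mean, pstdev

def summarize(accuracies):
    """Return (mean, population standard deviation) of per-instruction accuracies."""
    return mean(accuracies), pstdev(accuracies)

# Hypothetical per-instruction accuracies for one task (NOT from the paper).
observed = [0.62, 0.60, 0.61]
unobserved = [0.58, 0.49, 0.55, 0.52]

obs_mean, obs_std = summarize(observed)
unobs_mean, unobs_std = summarize(unobserved)

# Gap in accuracy points: positive means unobserved phrasings do worse.
gap_points = 100 * (obs_mean - unobs_mean)

print(f"observed:   mean={obs_mean:.3f} std={obs_std:.3f}")
print(f"unobserved: mean={unobs_mean:.3f} std={unobs_std:.3f}")
print(f"gap: {gap_points:.1f} points")
```

Reporting the standard deviation alongside the mean matters here because the abstract highlights both effects: a lower average and a wider variance under unobserved phrasings.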

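Proposed method

Collect 319 hand-written instructions for 75 tasks from 36 NLP researchers to serve as unobserved instructions (the abstract describes the task set as over 80 unique tasks).
Evaluate the gap between observed and unobserved instructions on MMLU and Big-Bench Lite using Flan-T5, Alpaca, and T0 variants.
Analyze the representational similarity between observed and unobserved instructions (penultimate-layer representations, visualized with t-SNE).
Introduce a soft-prompt alignment objective that includes a KL-divergence term to align semantically equivalent instructions.
Fine-tune only the soft-prompt parameters (prefix tokens) while keeping the base model frozen.
Use GPT-4 to paraphrase the reference instructions, producing the paraphrase sets used for consistency training.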
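
The alignment idea can be sketched on a toy problem: a frozen model, a small trainable soft prompt, and a loss that combines a task term with a KL-divergence term between the predictions for a reference instruction and its paraphrase. Everything below is an assumption made for illustration: the tiny linear "model", the way the prompt is injected, the loss weighting `lam`, and the finite-difference optimizer. It is not the paper's implementation, which prepends learned prefix embeddings to a real LLM.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Frozen toy "model" weights (hypothetical): 2-d representation -> 3 class logits.
FROZEN_W = [[1.0, -0.5], [-0.3, 0.8], [0.2, 0.1]]

def predict(instruction_vec, soft_prompt):
    # Toy stand-in for prepending prefix embeddings: the soft prompt shifts
    # the instruction representation before the frozen layer.
    rep = [math.tanh(x + p) for x, p in zip(instruction_vec, soft_prompt)]
    logits = [sum(w * r for w, r in zip(row, rep)) for row in FROZEN_W]
    return softmax(logits)

def alignment_loss(soft_prompt, ref_vec, para_vec, label, lam=1.0):
    p_ref = predict(ref_vec, soft_prompt)
    p_para = predict(para_vec, soft_prompt)
    # Task term (cross-entropy on the gold label) plus the KL alignment term.
    task = -math.log(p_ref[label]) - math.log(p_para[label])
    return task + lam * kl_divergence(p_ref, p_para)

def train_soft_prompt(ref_vec, para_vec, label, steps=200, lr=0.1, eps=1e-5):
    prompt = [0.0] * len(ref_vec)  # the only trainable parameters
    for _ in range(steps):
        base = alignment_loss(prompt, ref_vec, para_vec, label)
        grad = []
        for i in range(len(prompt)):
            bumped = list(prompt)
            bumped[i] += eps
            # Forward finite difference approximates the gradient.
            grad.append((alignment_loss(bumped, ref_vec, para_vec, label) - base) / eps)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt

# Hypothetical representations of a reference instruction and its paraphrase.
ref_vec, para_vec, label = [1.0, 0.2], [0.4, 0.9], 0
initial = alignment_loss([0.0, 0.0], ref_vec, para_vec, label)
prompt = train_soft_prompt(ref_vec, para_vec, label)
final = alignment_loss(prompt, ref_vec, para_vec, label)
print(f"loss before: {initial:.4f}, after: {final:.4f}")
```

`FROZEN_W` is never updated, which mirrors the frozen base model; only `prompt` changes. The task term is included because a pure KL objective could be satisfied by collapsing both predictions to the same uninformative distribution.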

Figure 1 : How well do models trained on instruction-tuning datasets generalize to novel instructions (unobserved in training)? Our analysis suggests that they do not do so very well. Above we show a case where pairing an example with an observed instruction yields the correct output, while providin

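Experimental results

Research questions

RQ1: How sensitive are instruction-tuned LMs to variation in instruction phrasing at test time?
RQ2: Do semantically equivalent but novel instructions degrade zero-shot performance across model families and benchmarks?
RQ3: Can a lightweight alignment objective improve robustness to unobserved instructions without full-model fine-tuning?
RQ4: Does robustness improve with scale or with in-context learning (ICL)?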

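Key findings

Unobserved but semantically equivalent instructions consistently lower accuracy across models and tasks (an average drop of more than 5 points in several settings).
Classification tasks are especially affected by unseen instruction phrasings, with larger drops observed on BC/MC tasks.
A simple soft-prompt alignment method, which adds learnable prefix embeddings and a KL-divergence loss, improves robustness and narrows the performance gap on unobserved instructions.
Scaling model size up to 11B does not fully close the robustness gap, suggesting that larger models or additional techniques may be needed.
ICL slightly reduces sensitivity to unseen instructions but does not eliminate the robustness gap.
Explicitly aligning the representations of semantically equivalent instructions correlates with higher accuracy (unobserved-instruction representations lie closer to observed ones).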
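
The relationship between representational closeness and accuracy can be checked with a cosine similarity and a Pearson correlation. The sketch below is a minimal illustration under assumed inputs: the vectors stand in for penultimate-layer instruction representations, and both the vectors and the accuracies are hypothetical, not values from the paper.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical representation of an observed instruction (NOT from the paper).
observed_rep = [1.0, 0.0, 0.5]

# Hypothetical (representation, accuracy) pairs for unobserved instructions.
unobserved = [
    ([0.9, 0.1, 0.5], 0.60),
    ([0.5, 0.6, 0.2], 0.52),
    ([0.1, 0.9, 0.0], 0.41),
]

sims = [cosine_similarity(observed_rep, rep) for rep, _ in unobserved]
accs = [acc for _, acc in unobserved]
r = pearson(sims, accs)
print(f"similarities: {[round(s, 3) for s in sims]}")
print(f"Pearson r between similarity and accuracy: {r:.3f}")
```

A positive `r` on real data would be consistent with the finding above, though a correlation alone would not show that closeness causes the accuracy gain.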

(a) Average zero-shot performance over all tasks when using observed and unobserved instructions.

This review was generated by AI and reviewed by a human editor.