QUICK REVIEW

[Paper Review] Instruct-FinGPT: Financial Sentiment Analysis by Instruction Tuning of General-Purpose Large Language Models

Boyu Zhang, Hongyang Yang|arXiv (Cornell University)|2023. 06. 22.

Stock Market Forecasting Methods|Citations: 11

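One-Line Summary

This paper instruction-tunes a general-purpose LLM (LLaMA-7B) on a small financial sentiment dataset and outperforms FinBERT and ChatGPT on financial sentiment analysis, with a focus on numerical sensitivity and contextual understanding.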

ABSTRACT

Sentiment analysis is a vital tool for uncovering insights from financial articles, news, and social media, shaping our understanding of market movements. Despite the impressive capabilities of large language models (LLMs) in financial natural language processing (NLP), they still struggle with accurately interpreting numerical values and grasping financial context, limiting their effectiveness in predicting financial sentiment. In this paper, we introduce a simple yet effective instruction tuning approach to address these issues. By transforming a small portion of supervised financial sentiment analysis data into instruction data and fine-tuning a general-purpose LLM with this method, we achieve remarkable advancements in financial sentiment analysis. In the experiment, our approach outperforms state-of-the-art supervised sentiment analysis models, as well as widely used LLMs like ChatGPT and LLaMAs, particularly in scenarios where numerical understanding and contextual comprehension are vital.

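Research Motivation and Goals

Show that instruction tuning a general-purpose LLM can improve financial sentiment analysis.
Address numerical sensitivity in financial text so that sentiment is read more accurately from figures.
Assess the role of contextual understanding, strengthened by the LLM's prior knowledge.
Compare the instruction-tuned LLaMA-7B against FinBERT and ChatGPT on financial sentiment tasks.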

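Proposed Method

Convert a sentiment classification dataset into an instruction-tuning format built from 10 human-written instructions.
Fine-tune LLaMA-7B on the formatted instruction data with a supervised sequence-to-sequence loss.
Map the autoregressive outputs to three sentiment labels (positive, negative, neutral).
Run comparative evaluations against FinBERT and LLaMA-7B to assess contextual and numerical sensitivity.
Train for 10 epochs with the specified hyperparameters on 8 A100 GPUs using DeepSpeed.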
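
The first and third steps above (formatting labeled examples as instruction data, then mapping generated text back to a label) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the instruction wordings, the record field names (`instruction`, `input`, `output`), the helper names, and the fallback to "neutral" are all assumptions made for this example.

```python
import random

# Hypothetical instruction pool. The paper uses 10 human-written
# instructions; their exact wording is not reproduced in this review.
INSTRUCTIONS = [
    "What is the sentiment of this text? Answer with positive, negative, or neutral.",
    "Classify the financial sentiment of the following text as positive, negative, or neutral.",
]

LABELS = ("positive", "negative", "neutral")


def to_instruction_sample(text, label, rng):
    """Turn one (text, label) pair into an instruction-tuning record."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label}")
    return {
        "instruction": rng.choice(INSTRUCTIONS),  # one instruction drawn at random
        "input": text,
        "output": label,
    }


def map_output_to_label(generated):
    """Map free-form generated text to one of the three labels.

    Picks the label word that appears earliest in the text and falls back
    to "neutral" when none is found (an assumption, not the paper's rule).
    """
    lowered = generated.lower()
    hits = [(lowered.find(name), name) for name in LABELS if name in lowered]
    return min(hits)[1] if hits else "neutral"
```

For instance, `to_instruction_sample("Revenue rose 12% year over year.", "positive", random.Random(0))` yields a record whose `output` is `"positive"` (the sentence is made up for illustration), and `map_output_to_label("The sentiment is Negative.")` returns `"negative"`.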

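Experimental Results

Research Questions

RQ1: How can an instruction-tuned LLM improve numerical sensitivity in financial sentiment analysis?
RQ2: How does contextual understanding drawn from general LLM knowledge affect financial sentiment prediction?
RQ3: How does the instruction-tuned FinGPT differ from the traditional FinBERT and general-purpose LLMs on financial sentiment tasks?
RQ4: Can a general-purpose LLM reach state-of-the-art performance with a small amount of instruction data?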

Key Results

Name	Size	Metrics	FinBERT	LLaMA-7B	Instruct-FinGPT-7B
Twitter Val	2388	Acc / F1 / Testing Time	0.725 / 0.668 / 18 seconds (1 GPU)	0.54 / 0.36 / 498 seconds (8 GPUs)	0.880 / 0.841 / 498 seconds (8 GPUs)
Numerical	117	Acc / F1	0.633 / 0.630	0.60 / 0.42	0.837 / 0.795
Contextual	20	Acc / F1	0.50 / 0.22	0.55 / 0.34	0.80 / 0.63
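
As a reminder of how the Acc and F1 columns can be computed for a three-class task, here is a small sketch. The review does not state which F1 averaging the paper uses, so macro-averaging here is an assumption, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=("positive", "negative", "neutral")):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no gold and no predicted examples scores 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

A class the model never predicts contributes an F1 of 0 to the macro average, which is one way an F1 score can fall well below accuracy on a small or imbalanced test set.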

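Instruct-FinGPT-7B outperforms FinBERT and LLaMA-7B in accuracy and F1 on every evaluation dataset.
The model shows strong numerical sensitivity, correctly interpreting the sentiment tied to financial figures in several examples.
The instruction-tuned LLM's contextual understanding yields better sentiment interpretation when context is missing or ambiguous.
In the zero-shot FPB evaluation, Instruct-FinGPT-7B outperforms ChatGPT-3.5 and LLaMA-7B, suggesting good generalization.
Training cost is modest: about 58 minutes on 8 A100 GPUs with a small amount of instruction data.
The approach achieves strong performance with far fewer training resources than BloombergGPT.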


This review was generated by AI and reviewed by a human editor.