QUICK REVIEW

[Paper Review] Unveiling the Potential of Sentiment: Can Large Language Models Predict Chinese Stock Price Movements?

Haohan Zhang, Fengrui Hua | arXiv (Cornell University) | Jun 25, 2023

Stock Market Forecasting Methods | Citations: 8

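One-Line Summary

This paper benchmarks three LLM approaches to extracting sentiment from Chinese financial news (a ChatGPT baseline, the Erlangshen-RoBERTa Chinese model, and Chinese FinBERT), evaluates their trading performance through standardized backtests, and concludes that Erlangshen-110M-Sentiment is the most effective.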

ABSTRACT

The rapid advancement of Large Language Models (LLMs) has spurred discussions about their potential to enhance quantitative trading strategies. LLMs excel in analyzing sentiments about listed companies from financial news, providing critical insights for trading decisions. However, the performance of LLMs in this task varies substantially due to their inherent characteristics. This paper introduces a standardized experimental procedure for comprehensive evaluations. We detail the methodology using three distinct LLMs, each embodying a unique approach to performance enhancement, applied specifically to the task of sentiment factor extraction from large volumes of Chinese news summaries. Subsequently, we develop quantitative trading strategies using these sentiment factors and conduct back-tests in realistic scenarios. Our results will offer perspectives about the performances of Large Language Models applied to extracting sentiments from Chinese news texts.

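Research Motivation and Objectives

Evaluate how effective LLMs are at extracting sentiment factors from Chinese financial news for use in trading decisions.
Provide a standardized benchmark and backtesting procedure that allows objective comparison across models.
Compare a generative LLM, a language-specific pre-trained LLM, and an LLM fine-tuned for the financial domain on this task.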

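Proposed Method

Apply three models to extract sentiment from 394,429 Chinese news summaries published before market open.
For ChatGPT, use a prompt that classifies sentiment as Good (1), Not Sure (0), or Bad (-1), and average the scores across sources.
Use Erlangshen-RoBERTa-110M-Sentiment, pre-trained on the WuDao Chinese corpus, to output sentiment probabilities.
Develop Chinese FinBERT, a domain-specific classifier fine-tuned on manually labeled data.
Build trading portfolios from the sentiment rankings and backtest them with standardized trading parameters.
Evaluate excess return, risk-adjusted return, and win rate within a unified framework.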
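
The scoring and ranking steps above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the helper names `sentiment_factor` and `top_k`, the ticker codes, and the choice of a simple top-k long portfolio are all assumptions made for the example. Only the Good/Not Sure/Bad mapping to 1/0/-1 and the averaging step come from the review.

```python
from collections import defaultdict
from statistics import mean

# Label-to-score mapping described for the ChatGPT approach.
LABEL_SCORE = {"Good": 1, "Not Sure": 0, "Bad": -1}

def sentiment_factor(records):
    """Average per-item scores into one factor value per stock.

    `records` is an iterable of (stock_code, label) pairs, one per
    news item (hypothetical input format).
    """
    scores = defaultdict(list)
    for stock, label in records:
        scores[stock].append(LABEL_SCORE[label])
    return {stock: mean(vals) for stock, vals in scores.items()}

def top_k(factor, k):
    """Return the k stocks with the highest factor values.

    Ties are broken alphabetically by stock code (an assumption).
    """
    ranked = sorted(factor.items(), key=lambda kv: (-kv[1], kv[0]))
    return [stock for stock, _ in ranked[:k]]

# Hypothetical pre-market labels for three placeholder tickers.
news = [
    ("AAA", "Good"), ("AAA", "Not Sure"),
    ("BBB", "Bad"), ("BBB", "Bad"),
    ("CCC", "Good"), ("CCC", "Good"),
]
factor = sentiment_factor(news)
portfolio = top_k(factor, k=2)
```

For the two probability-based models, the same ranking step could presumably take a probability-derived score in place of the averaged label, though the review does not say how those probabilities are converted into a factor.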

Figure 1: Demonstration of Prompts Structured for Sentiment Analysis and the Response by ChatGPT

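Experimental Results

Research Questions

RQ1: Can different LLMs (generative, language-specific pre-trained, and domain-specific fine-tuned) effectively extract sentiment factors from Chinese financial news?
RQ2: How do sentiment-derived factors translate into trading performance under a standardized backtesting framework?
RQ3: For sentiment extraction in the Chinese financial domain, are language-specific pre-training and domain-specific fine-tuning more advantageous than relying on very large models?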

Key Findings

Factor Name	Annual Excess Return (%)	Annual Net Asset Return (%)	Win Rate(%)	Sharpe Ratio
Chinese-GPT	23.1	11.04	57.49	0.6406
Chinese-FinBERT	19.79	7.73	57.19	0.4797
Erlangshen-110M	24.01	11.95	58.38	0.678
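
The review does not state how the paper defines the columns in this table. As a rough sketch under common conventions (252 trading days per year, arithmetic annualization, a zero risk-free rate, and win rate as the share of positive-return days), metrics of this kind could be computed from daily return series as below. All function names and the sample returns are hypothetical.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization constant

def annualized_return(daily):
    """Arithmetic annualization: mean daily return times trading days."""
    return mean(daily) * TRADING_DAYS

def win_rate(daily):
    """Share of days with a strictly positive return."""
    return sum(r > 0 for r in daily) / len(daily)

def sharpe_ratio(daily, risk_free_daily=0.0):
    """Annualized Sharpe ratio using the sample standard deviation."""
    excess = [r - risk_free_daily for r in daily]
    return mean(excess) / stdev(excess) * math.sqrt(TRADING_DAYS)

# Hypothetical daily returns for a factor portfolio and its benchmark.
portfolio_returns = [0.01, -0.005, 0.003, 0.002]
benchmark_returns = [0.004, -0.002, 0.001, 0.003]

# Excess return is taken here as portfolio minus benchmark, day by day.
excess_returns = [p - b for p, b in zip(portfolio_returns, benchmark_returns)]

annual_excess = annualized_return(excess_returns)
portfolio_win_rate = win_rate(portfolio_returns)
portfolio_sharpe = sharpe_ratio(portfolio_returns)
```

Other conventions (geometric compounding, a non-zero risk-free rate, win rate measured per trade or against the benchmark) would give different numbers, so values computed this way are not directly comparable to the table without knowing the paper's definitions.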

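Erlangshen-110M-Sentiment outperforms the other factors on every metric: annual excess return, annual net asset return, win rate, and Sharpe ratio.
In the group analysis, higher Erlangshen factor values are consistently associated with higher excess returns.
Within the benchmark, the small Erlangshen model outperforms the larger models.
Language-specific pre-training and domain-specific fine-tuning may yield strong sentiment signals in the Chinese financial domain without relying on very large models.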

Figure 2: Excess Returns of All Three Sentiment Factors

This review was generated by AI and checked by a human editor.