QUICK REVIEW

[Paper Review] Is ChatGPT a Good Personality Recognizer? A Preliminary Study

Yu Ji, Wen Wu|arXiv (Cornell University)|Jul 8, 2023

Machine Learning in Healthcare | Cited by 20

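One-Sentence Summary

This paper evaluates ChatGPT's ability to recognize Big Five personality traits from text under several prompting strategies, compares it with baseline and SOTA models, analyzes its fairness, and examines the effect on downstream tasks.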

ABSTRACT

In recent years, personality has been regarded as a valuable personal factor being incorporated into numerous tasks such as sentiment analysis and product recommendation. This has led to widespread attention to text-based personality recognition task, which aims to identify an individual's personality based on given text. Considering that ChatGPT has recently exhibited remarkable abilities on various natural language processing tasks, we provide a preliminary evaluation of ChatGPT on text-based personality recognition task for generating effective personality data. Concretely, we employ a variety of prompting strategies to explore ChatGPT's ability in recognizing personality from given text, especially the level-oriented prompting strategy we designed for guiding ChatGPT in analyzing given text at a specified level. The experimental results on two representative real-world datasets reveal that ChatGPT with zero-shot chain-of-thought prompting exhibits impressive personality recognition ability and is capable to provide natural language explanations through text-based logical reasoning. Furthermore, by employing the level-oriented prompting strategy to optimize zero-shot chain-of-thought prompting, the performance gap between ChatGPT and corresponding state-of-the-art model has been narrowed even more. However, we observe that ChatGPT shows unfairness towards certain sensitive demographic attributes such as gender and age. Additionally, we discover that eliciting the personality recognition ability of ChatGPT helps improve its performance on personality-related downstream tasks such as sentiment classification and stress prediction.

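Research Motivation and Objectives

Motivate text-based personality recognition as a valuable signal for downstream NLP tasks.
Evaluate ChatGPT's ability to infer Big Five personality traits from user-generated text under different prompting strategies.
Compare ChatGPT with RNN, RoBERTa, and SOTA models on two datasets.
Explore the interpretability of ChatGPT's outputs through its natural language explanations.
Study fairness bias across gender and age groups, and the potential benefits for downstream tasks.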

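Proposed Method

Use zero-shot, zero-shot chain-of-thought (CoT), and one-shot prompting to elicit the personality dimensions (O, C, E, A, N) from text.
Design level-oriented zero-shot CoT prompts that direct ChatGPT to analyze the text at the word, sentence, or document level.
Evaluate on the Essays and PAN datasets against baselines (RNN, RoBERTa) and the SOTA model (HPMN BERT).
Measure accuracy and compute AIP (Accuracy Improvement Percentage) relative to the SOTA.
Collect ChatGPT's natural language explanations alongside its predictions under CoT prompting.
Analyze fairness by adding gender and age information to the prompts and visualizing the distribution of predicted trait levels.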
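
The prompting strategies listed above can be sketched as a small prompt builder. This is a minimal illustration only: the function name `build_prompt`, the `LEVEL_HINTS` wording, and the high/low answer format are assumptions made for this sketch, not the prompts used in the paper.

```python
from typing import Optional, Tuple

# Big Five trait codes as used in this review (O, C, E, A, N).
TRAITS = {
    "O": "Openness",
    "C": "Conscientiousness",
    "E": "Extraversion",
    "A": "Agreeableness",
    "N": "Neuroticism",
}

# Hypothetical wording for the three analysis levels (assumption, not the paper's text).
LEVEL_HINTS = {
    "word": "Analyze the text at the word level.",
    "sentence": "Analyze the text at the sentence level.",
    "document": "Analyze the text at the document level.",
}

STRATEGIES = ("zero_shot", "zero_shot_cot", "one_shot")


def build_prompt(
    text: str,
    trait: str,
    strategy: str = "zero_shot",
    level: Optional[str] = None,
    example: Optional[Tuple[str, str]] = None,
) -> str:
    """Assemble an illustrative prompt asking for one trait as 'high' or 'low'."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    parts = [
        f"Task: decide whether the author of the text is high or low in {TRAITS[trait]}."
    ]
    if strategy == "one_shot":
        # One-shot prompting prepends a single labeled demonstration.
        if example is None:
            raise ValueError("one_shot needs an (example_text, label) pair")
        ex_text, ex_label = example
        parts.append(f"Example text: {ex_text}\nExample answer: {ex_label}")
    parts.append(f"Text: {text}")
    if strategy == "zero_shot_cot":
        # Level-oriented variant: add a hint naming the level to analyze.
        if level is not None:
            parts.append(LEVEL_HINTS[level])
        parts.append("Let's think step by step, then answer 'high' or 'low'.")
    else:
        parts.append("Answer 'high' or 'low'.")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("I love trying new things.", "O", "zero_shot_cot", level="sentence"))
```

Keeping the level hint as an optional argument means the plain zero-shot CoT prompt and its word, sentence, and document variants differ by exactly one line, which makes the comparison between them easier to reason about.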

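Experimental Results

Research Questions

RQ1: How do different prompting strategies affect ChatGPT's ability to recognize personality from text?
RQ2: How unfair is ChatGPT as a personality recognizer with respect to sensitive demographic attributes?
RQ3: Does the inferred personality improve performance on downstream tasks such as sentiment classification and stress prediction?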

Key Findings

Model	O	C	E	A	N	Average
RNN	57.3%	52.8%	45.2%	45.2%	50.8%	50.3%
RoBERTa	64.9%	52.8%	51.2%	58.1%	59.7%	57.3%
SOTA (HPMN BERT)	81.8%	79.6%	81.1%	80.7%	81.7%	80.9%
ChatGPT ZS	60.9%	56.0%	50.8%	58.9%	60.5%	57.4%
ChatGPT CoT	65.7%	53.2%	49.2%	60.9%	60.1%	57.8%
ChatGPT OS	58.4%	54.5%	59.0%	58.8%	60.5%	58.2%
ChatGPT CoT_W	59.3%	56.5%	50.4%	58.9%	61.3%	57.3%
ChatGPT CoT_S	62.1%	55.2%	51.6%	59.3%	58.9%	57.4%
ChatGPT CoT_D	64.1%	56.5%	51.2%	59.7%	60.1%	58.3%
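
As a rough worked example of the AIP metric, the snippet below compares the CoT_D row with the SOTA row of the table. The formula used here, relative accuracy change in percent, is an assumption about how AIP is defined; this review does not state the formula, and the helper names `aip` and `mean` are illustrative.

```python
def aip(acc: float, ref_acc: float) -> float:
    """Relative accuracy change of `acc` versus `ref_acc`, in percent.

    Assumed definition: (acc - ref_acc) / ref_acc * 100.
    """
    if ref_acc == 0:
        raise ValueError("reference accuracy must be non-zero")
    return (acc - ref_acc) / ref_acc * 100.0


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Per-trait accuracies (O, C, E, A, N) in percent, copied from the table above.
COT_D = [64.1, 56.5, 51.2, 59.7, 60.1]

# Averages as reported in the table's last column.
COT_D_AVG_REPORTED = 58.3
SOTA_AVG_REPORTED = 80.9

if __name__ == "__main__":
    # Recomputing the row mean reproduces the reported CoT_D average after rounding.
    print(f"CoT_D mean from traits: {mean(COT_D):.1f}")
    # Under the assumed formula, CoT_D sits roughly 27.9% below the SOTA average.
    print(f"AIP of CoT_D vs SOTA: {aip(COT_D_AVG_REPORTED, SOTA_AVG_REPORTED):.1f}%")
```

A negative value under this convention means the ChatGPT variant is below the reference model.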

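Zero-shot CoT prompting is reported to achieve the best average performance among the prompting strategies, yet every ChatGPT variant in the table (57.3% to 58.3% average) remains far below the SOTA (80.9%).
Zero-shot CoT prompting produces natural language explanations, which makes the predictions more interpretable.
Level-oriented CoT prompting (word, sentence, or document level) can further improve accuracy by directing the analysis to a specific textual level.
ChatGPT shows demographic unfairness: women are more often predicted to be high on certain traits, and older people are more often predicted to be low on Openness.
The elicited personality traits can improve performance on downstream tasks such as sentiment classification and stress prediction.
On the PAN dataset, ChatGPT CoT_S (sentence level) and CoT_D (document level) yield marked gains on certain traits.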
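
The fairness comparison can be approximated by computing, for each demographic group, the share of texts predicted as "high" on a trait and then comparing those shares. The sketch below assumes predictions are available as (group, level) pairs; the function name `high_rate_by_group` and the toy records are placeholders for illustration, not data or code from the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def high_rate_by_group(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return the fraction of 'high' predictions for each group.

    `records` holds (group, predicted_level) pairs with levels 'high' or 'low'.
    """
    counts = defaultdict(Counter)
    for group, level in records:
        counts[group][level] += 1
    return {g: c["high"] / sum(c.values()) for g, c in counts.items()}


# Toy placeholder predictions for one trait; NOT results from the paper.
toy_records = [
    ("female", "high"), ("female", "high"), ("female", "low"),
    ("male", "high"), ("male", "low"), ("male", "low"),
]

rates = high_rate_by_group(toy_records)
# Gap between the most and least frequently 'high' groups.
gap = max(rates.values()) - min(rates.values())

if __name__ == "__main__":
    for group, rate in sorted(rates.items()):
        print(f"{group}: {rate:.2f}")
    print(f"gap: {gap:.2f}")
```

A gap near zero would suggest the predicted trait level does not shift with the demographic attribute, while a large gap is the kind of pattern the unfairness finding above describes.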


This review was generated by AI and reviewed by human editors.