QUICK REVIEW

[Paper Digest] When AI Writes, Whose Voice Remains? Quantifying Cultural Marker Erasure Across World English Varieties in Large Language Models

Satyam Kumar Navneet, Joydeep Chandra|arXiv (Cornell University)|Feb 25, 2026

Computational and Text Analysis Methods | Cited by 0

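One-Sentence Summary

The paper quantifies how large language models erase the cultural markers of World English varieties during rewriting, reveals a "Semantic Preservation Paradox" in which meaning is preserved while cultural voice is weakened, and shows that prompting can mitigate this erasure.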

ABSTRACT

Large Language Models (LLMs) are increasingly used to "professionalize" workplace communication, often at the cost of linguistic identity. We introduce "Cultural Ghosting", the systematic erasure of linguistic markers unique to non-native English varieties during text processing. Through analysis of 22,350 LLM outputs generated from 1,490 culturally marked texts (Indian, Singaporean, & Nigerian English) processed by five models under three prompt conditions, we quantify this phenomenon using two novel metrics: Identity Erasure Rate (IER) & Semantic Preservation Score (SPS). Across all prompts, we find an overall IER of 10.26%, with model-level variation from 3.5% to 20.5% (5.9x range). Crucially, we identify a Semantic Preservation Paradox: models maintain high semantic similarity (mean SPS = 0.748) while systematically erasing cultural markers. Pragmatic markers (politeness conventions) are 1.9x more vulnerable than lexical markers (71.5% vs. 37.1% erasure). Our experiments demonstrate that explicit cultural-preservation prompts reduce erasure by 29% without sacrificing semantic quality.

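Research Motivation and Goals

Formalize the concept of Cultural Ghosting in AI-assisted writing.
Introduce the Identity Erasure Rate (IER) and Semantic Preservation Score (SPS) as metrics for large-scale analysis.
Evaluate open-source large language models across several World English varieties and prompt conditions.
Quantify how vulnerability to erasure differs by marker type (lexical, pragmatic, syntactic).
Identify simple and algorithmic mitigations that preserve cultural voice without sacrificing semantics.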

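Proposed Method

Build a corpus of 1,490 culturally marked texts covering Indian English, Singaporean English, Nigerian English, and American English.
Annotate 108 cultural markers across lexical, pragmatic, and syntactic categories.
Generate 22,350 outputs from five open-source LLMs under three prompt conditions (baseline, neutral, preservation).
Compute the Identity Erasure Rate (IER) and Semantic Preservation Score (SPS) for every output.
Validate the proxy metrics by showing high agreement with annotated data and LLM-based judgments.
Analyze variance across models and marker categories; test mitigation strategies, including preservation prompts and algorithmic techniques.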
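
The digest does not give the paper's exact formulas for IER and SPS, so the following is only a minimal sketch under stated assumptions: IER is taken to be the share of markers found in the source text that no longer appear in the output, and SPS is approximated with a bag-of-words cosine similarity as a crude stand-in for whatever similarity model the authors used. The marker list, the example sentences, and the function names (`find_markers`, `identity_erasure_rate`, `bow_cosine`) are hypothetical.

```python
import math
import re
from collections import Counter


def find_markers(text, markers):
    """Return the markers that occur in text as whole words or phrases."""
    found = set()
    for marker in markers:
        pattern = r"(?<!\w)" + re.escape(marker) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(marker)
    return found


def identity_erasure_rate(source, output, markers):
    """Assumed definition: share of source markers missing from the output."""
    in_source = find_markers(source, markers)
    if not in_source:
        return 0.0  # nothing to erase
    kept = find_markers(output, in_source)
    return 1.0 - len(kept) / len(in_source)


def bow_cosine(a, b):
    """Bag-of-words cosine similarity; a stand-in, not the paper's SPS."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical marker list and texts, for illustration only.
markers = ["kindly", "do the needful", "lah"]
source = "Kindly do the needful and revert by Friday."
output = "Please take the necessary action and reply by Friday."

ier = identity_erasure_rate(source, output, markers)
sps = bow_cosine(source, output)
print(f"IER = {ier:.2f}, SPS proxy = {sps:.2f}")
```

In this toy pair both markers present in the source are gone from the rewrite, so the assumed IER is at its maximum even though the request itself is unchanged. A word-overlap proxy understates semantic similarity for paraphrases, which is why an embedding-based score would be the more natural choice in practice.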

Figure 1. Conceptual illustration of cultural ghosting: LLM-based writing assistants transform culturally marked expressions (e.g., "Kindly…", "Lah!", "Respected Sir…") into semantically preserved but culturally flattened outputs (e.g., "Please…", "Hello", "Respond"), demonstrating how meaning is r

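Experimental Results

Research Questions

RQ1: To what extent do LLMs erase culturally marked features, and how does the degree of erasure differ across models?
RQ2: Are some marker categories (lexical, pragmatic, syntactic) more vulnerable to erasure than others?
RQ3: Can explicit cultural-preservation prompts reduce erasure without sacrificing semantic quality?
RQ4: Can algorithmic mitigations such as constrained decoding and reranking deliver scalable preservation of cultural voice?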
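
To make the reranking idea in RQ4 concrete, here is a hedged sketch of one possible marker-aware reranker: among several candidate rewrites, it keeps those whose similarity to the source clears a threshold and then prefers the one that retains the most source markers. This is an illustration only, not the paper's algorithm; the function names, the `min_similarity` threshold, the marker list, and the similarity scores attached to each candidate are all hypothetical placeholders.

```python
import re


def has_marker(text, marker):
    """True if marker occurs in text as a whole word or phrase."""
    pattern = r"(?<!\w)" + re.escape(marker) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def count_retained(source, candidate, markers):
    """Number of markers present in both the source and the candidate."""
    return sum(
        1 for m in markers if has_marker(source, m) and has_marker(candidate, m)
    )


def rerank_by_marker_retention(source, candidates, markers, min_similarity=0.7):
    """Pick a rewrite from (text, similarity) pairs.

    Candidates below min_similarity are dropped first; if none survive,
    all candidates are considered. Ties on retained markers are broken
    by the higher similarity score.
    """
    eligible = [c for c in candidates if c[1] >= min_similarity]
    pool = eligible or candidates
    best = max(pool, key=lambda c: (count_retained(source, c[0], markers), c[1]))
    return best[0]


# Hypothetical inputs; the similarity values are made-up placeholders.
markers = ["kindly", "lah"]
source = "Kindly send the report lah."
candidates = [
    ("Please send the report.", 0.90),
    ("Kindly send the report.", 0.88),
    ("Kindly send it lah, the report.", 0.60),
]

chosen = rerank_by_marker_retention(source, candidates, markers)
print(chosen)
```

The threshold encodes the trade-off the research question asks about: the third candidate keeps both markers but is filtered out for low similarity, so the reranker settles on the rewrite that keeps one marker at nearly the same similarity as the fully flattened option.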

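Key Findings

The mean Identity Erasure Rate (IER) across all outputs is 0.1026 (10.26%), with substantial variation between models (3.5% to 20.5%).
The mean Semantic Preservation Score (SPS) is 0.7482, so semantic fidelity stays high despite marker erasure.
Pragmatic markers have the highest erasure rate at 71.5%, followed by syntactic markers at 56.3% and lexical markers at 37.1%.
Explicit preservation prompts reduce IER by 29% without sacrificing SPS; constrained decoding achieves a 47% reduction in IER.
Differences in erasure are driven by model alignment strategy rather than model size (roughly a 5.9x spread between models).
Algorithmic mitigations such as marker-aware decoding and contrastive reranking show promise for lowering IER while maintaining SPS.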
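
The headline ratios follow directly from the reported figures, and a few lines of arithmetic make the derivation explicit. The snippet below uses only numbers stated in the abstract and findings; the variable names are illustrative.

```python
# Figures as reported in the abstract and findings.
texts, models, prompt_conditions = 1490, 5, 3
pragmatic_erasure, lexical_erasure = 71.5, 37.1  # percent
ier_max, ier_min = 20.5, 3.5                     # percent, per model

# Each text is processed by every model under every prompt condition.
total_outputs = texts * models * prompt_conditions

# Relative vulnerability of pragmatic versus lexical markers.
vulnerability_ratio = pragmatic_erasure / lexical_erasure

# Spread between the most and least erasing models.
model_spread = ier_max / ier_min

print(total_outputs)
print(round(vulnerability_ratio, 1))
print(round(model_spread, 1))
```

The product reproduces the 22,350 outputs, and the two quotients round to the reported 1.9x and 5.9x.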

Figure 2. End-to-end experimental pipeline for measuring cultural ghosting. The workflow progresses from dataset construction (1,490 texts) through cultural marker annotation (108 markers), LLM processing, forensic computation of Identity Erasure Rate (IER) & Semantic Preservation Score (SPS),


This digest was generated by AI and reviewed by a human editor.