QUICK REVIEW

[Paper Review] When AI Writes, Whose Voice Remains? Quantifying Cultural Marker Erasure Across World English Varieties in Large Language Models

Satyam Kumar Navneet, Joydeep Chandra|arXiv (Cornell University)|2026. 02. 25.

Computational and Text Analysis Methods | Citations: 0

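One-Line Summary

This paper quantifies the extent to which LLMs erase culturally marked features of World Englishes during rewriting, exposes a Semantic Preservation Paradox in which meaning is preserved while cultural voice is weakened, and shows that prompting can mitigate this erasure.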

ABSTRACT

Large Language Models (LLMs) are increasingly used to "professionalize" workplace communication, often at the cost of linguistic identity. We introduce "Cultural Ghosting", the systematic erasure of linguistic markers unique to non-native English varieties during text processing. Through analysis of 22,350 LLM outputs generated from 1,490 culturally marked texts (Indian, Singaporean, & Nigerian English) processed by five models under three prompt conditions, we quantify this phenomenon using two novel metrics: Identity Erasure Rate (IER) & Semantic Preservation Score (SPS). Across all prompts, we find an overall IER of 10.26%, with model-level variation from 3.5% to 20.5% (5.9x range). Crucially, we identify a Semantic Preservation Paradox: models maintain high semantic similarity (mean SPS = 0.748) while systematically erasing cultural markers. Pragmatic markers (politeness conventions) are 1.9x more vulnerable than lexical markers (71.5% vs. 37.1% erasure). Our experiments demonstrate that explicit cultural-preservation prompts reduce erasure by 29% without sacrificing semantic quality.

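Research Motivation and Goals

Formalize the concept of cultural ghosting in AI-assisted writing.
Introduce two metrics for large-scale analysis: Identity Erasure Rate (IER) and Semantic Preservation Score (SPS).
Evaluate open-source LLMs across multiple World English varieties and prompt conditions.
Quantify the differential vulnerability of lexical, pragmatic, and syntactic marker types.
Identify simple and algorithmic mitigations that preserve cultural voice without harming meaning.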

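Proposed Method

Build a corpus of 1,490 culturally marked texts drawn from Indian, Singaporean, Nigerian, and American English.
Annotate 108 cultural markers spanning lexical, pragmatic, and syntactic categories.
Generate 22,350 outputs from five open-source LLMs under three prompt conditions: Baseline, Neutral, and Preservation.
Compute the Identity Erasure Rate (IER) and Semantic Preservation Score (SPS) for each output.
Validate the proxy metrics, which show high agreement with annotated data and LLM judgments.
Analyze variance across models and marker categories, and test mitigation strategies including preservation prompts and algorithmic techniques.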
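To make the two per-output metrics concrete, here is a minimal Python sketch. This review does not give the paper's exact formulas, so both definitions are assumptions: IER is taken as the share of annotated markers found in the source text that no longer appear in the output, and SPS is stood in for by a bag-of-words cosine similarity, where an embedding-based similarity would be the more likely choice in practice. The function names, the sample sentence, and the marker list are hypothetical.

```python
import math
import re
from collections import Counter


def _contains(text, marker):
    # Case-insensitive whole-word (or whole-phrase) match.
    return re.search(rf"\b{re.escape(marker.lower())}\b", text.lower()) is not None


def identity_erasure_rate(source, output, markers):
    """ASSUMED definition: fraction of markers present in `source`
    that are absent from `output`. Returns 0.0 if no marker is present."""
    present = [m for m in markers if _contains(source, m)]
    if not present:
        return 0.0
    erased = [m for m in present if not _contains(output, m)]
    return len(erased) / len(present)


def toy_similarity(a, b):
    """Bag-of-words cosine similarity, a stand-in for SPS (assumption)."""
    ca = Counter(re.findall(r"[a-z']+", a.lower()))
    cb = Counter(re.findall(r"[a-z']+", b.lower()))
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical example text and marker list, for illustration only.
markers = ["kindly", "do the needful", "revert", "lah"]
source = "Kindly do the needful and revert at the earliest, lah."
output = "Kindly take the necessary action and reply soon."

ier = identity_erasure_rate(source, output, markers)
sps = toy_similarity(source, output)
print(f"IER={ier:.2f}  SPS(toy)={sps:.2f}")
```

In this toy case three of the four markers present in the source are gone from the output, so the assumed IER is 0.75. A surface-overlap similarity like this one penalizes paraphrase more heavily than an embedding-based score would, so it should not be compared against the SPS values reported in the paper.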

Figure 1. Conceptual illustration of cultural ghosting: LLM-based writing assistants transform culturally marked expressions (e.g., "Kindly…", "Lah!", "Respected Sir…") into semantically preserved but culturally flattened outputs (e.g., "Please…", "Hello", "Respond"), demonstrating how meaning is r

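Experimental Results

Research Questions

RQ1: To what extent do LLMs erase culturally marked features, and how does erasure vary across models?
RQ2: Are particular marker categories (lexical, pragmatic, syntactic) more vulnerable to erasure?
RQ3: Can explicit cultural-preservation prompts reduce erasure without harming semantic quality?
RQ4: Do algorithmic mitigations such as constrained decoding and reranking offer scalable preservation of cultural voice?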

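Key Results

The mean Identity Erasure Rate (IER) across all outputs is 0.1026 (10.26%), with large variation across models (from 3.5% to 20.5%).
The mean Semantic Preservation Score (SPS) is 0.7482, indicating high semantic fidelity despite marker erasure.
Pragmatic markers show the highest erasure rate at 71.5%, followed by syntactic markers at 56.3% and lexical markers at 37.1%.
Explicit preservation prompts reduce IER by 29% without harming SPS, and constrained decoding reduces IER by 47%.
Model alignment strategy, rather than model size, drives the variation in erasure; the difference between models reaches 5.9x.
Algorithmic mitigations such as marker-aware decoding and contrastive reranking show promising results in reducing IER while preserving SPS.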
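The headline ratios can be cross-checked from the figures reported in this review. The short Python sketch below uses only numbers quoted above; the variable names are illustrative and nothing here comes from the paper's code.

```python
# Corpus size: texts x models x prompt conditions.
texts, models, prompts = 1490, 5, 3
total_outputs = texts * models * prompts

# Spread between the highest and lowest model-level IER (in percent).
model_range = 20.5 / 3.5

# Pragmatic vs. lexical marker erasure (in percent).
pragmatic_vs_lexical = 71.5 / 37.1

print(total_outputs)                   # 22350
print(round(model_range, 1))           # 5.9
print(round(pragmatic_vs_lexical, 1))  # 1.9
```

The three values match the 22,350 outputs, the 5.9x model range, and the 1.9x pragmatic-to-lexical ratio stated in the abstract.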

Figure 2. End-to-end experimental pipeline for measuring cultural ghosting. The workflow progresses from dataset construction (1,490 texts) through cultural marker annotation (108 markers), LLM processing, forensic computation of Identity Erasure Rate (IER) & Semantic Preservation Score (SPS),


This review was generated by AI and reviewed by a human editor.