QUICK REVIEW

[Paper Review] PromptSplit: Revealing Prompt-Level Disagreement in Generative Models

Mehdi Lotfian, Mohammad Jalali|arXiv (Cornell University)|2026. 02. 03.

Generative Adversarial Networks and Image Synthesis | Citations: 0

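One-Line Summary

PromptSplit is a kernel-based, prompt-aware framework that detects and analyzes prompt-dependent disagreement between generative models. It couples each prompt with its output in a joint kernel and spectrally analyzes the difference between the models' kernel covariance matrices. It provides a scalable random-projection approximation with theoretical guarantees and demonstrates detection of prompt-level disagreement across text-to-image, text-to-text, and image-captioning tasks.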

ABSTRACT

Prompt-guided generative AI models have rapidly expanded across vision and language domains, producing realistic and diverse outputs from textual inputs. The growing variety of such models, trained with different data and architectures, calls for principled methods to identify which types of prompts lead to distinct model behaviors. In this work, we propose PromptSplit, a kernel-based framework for detecting and analyzing prompt-dependent disagreement between generative models. For each compared model pair, PromptSplit constructs a joint prompt--output representation by forming tensor-product embeddings of the prompt and image (or text) features, and then computes the corresponding kernel covariance matrix. We utilize the eigenspace of the weighted difference between these matrices to identify the main directions of behavioral difference across prompts. To ensure scalability, we employ a random-projection approximation that reduces computational complexity to $O(nr^2 + r^3)$ for projection dimension $r$. We further provide a theoretical analysis showing that this approximation yields an eigenstructure estimate whose expected deviation from the full-dimensional result is bounded by $O(1/r^2)$. Experiments across text-to-image, text-to-text, and image-captioning settings demonstrate that PromptSplit accurately detects ground-truth behavioral differences and isolates the prompts responsible, offering an interpretable tool for detecting where generative models disagree.

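Motivation and Objectives

Motivate the need to compare prompt-guided generative models beyond aggregate quality metrics.
Develop a prompt-aware spectral framework for identifying the prompt categories that cause models to disagree.
Propose a joint prompt-output kernel covariance difference that highlights the prompt-response modes on which the models differ.
Provide a scalable technique, with theoretical guarantees, for estimating the eigenspace on large datasets.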

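Proposed Method

Construct tensor-product embeddings of prompts and outputs to form a joint feature map.
Define the covariance-difference operator (indexed by $X,Y|T$) as $C_{T \otimes X} - \beta\, C_{T \otimes Y}$ and use it to identify directions of disagreement.
Form a block kernel matrix $K_{X,Y|T}$ whose top eigenpairs correspond to the eigenvectors of the covariance difference.
Apply random projection (RP) for dimensionality reduction, achieving a cost of $O((m+n) r^2 + r^3)$ with an $O(1/r^2)$ bound on the eigenspace error.
Provide a theoretical result (Theorem 1) showing that the RP method preserves the eigenstructure accurately.
Demonstrate applicability in real-model and ground-truth experiments, covering text-to-image and text-to-text settings.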
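The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Gaussian kernels with one shared bandwidth `sigma` (so the product of the prompt and output kernels equals a Gaussian kernel on the concatenated vector), uses random Fourier features as the random projection, and uses uncentered second-moment matrices as the kernel covariances. The names `joint_rff` and `disagreement_modes` are hypothetical.

```python
import numpy as np

def joint_rff(prompts, outputs, W, b):
    """Random Fourier features of the concatenated (prompt, output) vector.

    Assumption: Gaussian kernels with a shared bandwidth, so the product
    kernel k_T(t, t') * k_X(x, x') is a Gaussian kernel on the concatenation
    and these r features approximate the tensor-product feature map.
    """
    z = np.concatenate([prompts, outputs], axis=1)         # (n, d_t + d_x)
    return np.sqrt(2.0 / W.shape[1]) * np.cos(z @ W + b)   # (n, r)

def disagreement_modes(T_x, X, T_y, Y, beta=1.0, r=256, sigma=1.0,
                       top_k=3, seed=0):
    """Top eigenpairs of C_{T(x)X} - beta * C_{T(x)Y} in r-dim feature space."""
    rng = np.random.default_rng(seed)
    dim = T_x.shape[1] + X.shape[1]
    # Frequencies and phases are shared by both models so that the two
    # covariance matrices live in the same feature space.
    W = rng.normal(scale=1.0 / sigma, size=(dim, r))
    b = rng.uniform(0.0, 2.0 * np.pi, size=r)
    phi_x = joint_rff(T_x, X, W, b)                        # test model, (n, r)
    phi_y = joint_rff(T_y, Y, W, b)                        # reference, (m, r)
    # Uncentered second-moment matrices (an assumption of this sketch).
    C_x = phi_x.T @ phi_x / phi_x.shape[0]
    C_y = phi_y.T @ phi_y / phi_y.shape[0]
    # The difference is symmetric, so eigh applies; sort descending.
    evals, evecs = np.linalg.eigh(C_x - beta * C_y)
    order = np.argsort(evals)[::-1][:top_k]
    return evals[order], evecs[:, order], phi_x
```

Forming the two $r \times r$ matrices costs on the order of $(m+n) r^2$ operations and the eigendecomposition on the order of $r^3$, which is the shape of the cost quoted above. Large positive eigenvalues point to prompt-output modes that the test model expresses more strongly than the reference.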

Figure 1 : Overview of PromptSplit for discovering different types of (prompt,answer) between two models. (a) From NQ-Open questions, we generate outputs from the test model (Qwen3) and reference model (Gemma3). (b) Two high-scoring modes found by PromptSplit: for each mode we show representative pr

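Experimental Results

Research Questions

RQ1: Assuming matched prompt distributions, can prompt-conditional spectral differences reveal where two generative models disagree?
RQ2: Which principal directions (eigenvectors) in the prompt-output space capture prompt-induced behavioral differences between models?
RQ3: How does random projection keep the analysis tractable on large-scale prompt-output datasets while preserving the relevant eigenspace structure?
RQ4: How does PromptSplit perform in real-model settings (text-to-image and text-to-text) and in synthetic ground-truth scenarios?
RQ5: Can the prompt-dependent disagreement found by PromptSplit be used to guide a diffusion model toward a reference distribution?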

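Key Findings

PromptSplit accurately identifies the prompt clusters for which a model pair's outputs diverge, in both synthetic and real settings.
The top eigenvectors of the covariance-difference operator correspond to the main prompt categories driving disagreement.
Random projection enables scalable analysis, with an $O(1/r^2)$ bound on the eigenspace deviation from the full-dimensional result.
In text-to-image models, PromptSplit reveals families of prompts on which the models disagree in style, composition, and alignment.
Applied to LLMs, it surfaces prompt-dependent disagreement clusters by topic, for example among NQ-Open questions compared across QA models.
PromptSplit-based guidance can improve a diffusion model's distributional alignment by including a disagreement-based term in the guidance objective.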
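To make the link between an eigenvector and the prompts it highlights concrete, here is a small sketch of one way to read off the prompts behind a mode. It assumes each sample already has a joint prompt-output feature vector and ranks samples by their squared projection onto the chosen direction. The helper `top_prompts_for_mode` and this scoring rule are illustrative assumptions and may differ from the scoring used in the paper.

```python
import numpy as np

def top_prompts_for_mode(prompts, features, direction, k=5):
    """Rank prompts by the squared projection of their features on a direction.

    prompts:   list of n prompt strings
    features:  (n, r) array of joint prompt-output features
    direction: (r,) eigenvector of the covariance difference (any scale or sign)
    """
    direction = direction / np.linalg.norm(direction)
    # Squaring removes the arbitrary sign of an eigenvector.
    scores = (features @ direction) ** 2
    order = np.argsort(scores)[::-1][:k]
    return [(prompts[i], float(scores[i])) for i in order]
```

Inspecting the highest-scoring prompts for each of the leading eigenvectors gives a rough, human-readable description of the corresponding disagreement mode.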

Figure 2 : PromptSplit identified top clusters of prompts with distinct images. Top: 20 clusters of prompts with sample images for test and reference dataset. Bottom: Top 16 eigenvalues showing top 10 disagreement causing prompts.


This review was generated by AI and reviewed by a human editor.