QUICK REVIEW

[Paper Review] Addressing LLM Diversity by Infusing Random Concepts

Pulin Agrawal, Prasoon Goyal|arXiv (Cornell University)|Jan 26, 2026

Topic Modeling | Citations: 0

One-Line Summary

The paper tests a lightweight prompting method that prepends random words or sentences to prompts to increase diversity (unique items and entropy) in LLM list-generation tasks across multiple models.

ABSTRACT

Large language models (LLMs) are known to produce outputs with limited diversity. In this work, we study whether infusing random concepts in the prompts can improve the diversity of the generated outputs. To benchmark the approach, we design a systematic evaluation protocol which involves prompting an LLM with questions of the form "Name 10 Hollywood actors", and analyzing diversity measures of the resulting LLM outputs. Our experiments on multiple LLMs show that prepending random words/sentences unrelated to the prompt results in greater diversity in the outputs of LLMs. We believe that this promising result and the evaluation protocol open up interesting avenues for future work, such as how infusing randomness into LLMs could be applied to other domains. Further, the evaluation protocol could also inspire research into benchmarking LLM diversity more systematically.

Research Motivation and Objectives

Investigate the extent to which infusing random concepts in prompts increases LLM output diversity.
Develop a lightweight, prompt-based method to mitigate long-tail bias and mode collapse in LLMs.
Provide an evaluation protocol for measuring diversity in list-based prompts across models and datasets.

Proposed Method

Define prompts asking for a list of K items from a fuzzy set and query the model M times per prompt.
Prepend a random word or random sentence to prompts to create random-context prompts.
Measure diversity via count of unique items and entropy of the output distribution.
Compare Regular Prompts, Random Word, and Random Sentence across ordered and unordered prompt settings.
Analyze changes in distribution shape and statistical significance across models.
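
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the word pool, the newline separator between the random word and the prompt, the lowercase/strip normalization of items, and the base-2 logarithm are all assumptions, and the names `WORD_POOL`, `random_word_prompt`, and `diversity` are hypothetical.

```python
import math
import random
from collections import Counter

# Hypothetical word pool; the review does not say how random words are sampled.
WORD_POOL = ["lantern", "glacier", "velvet", "orbit", "pepper", "harbor"]


def random_word_prompt(prompt, rng, n_words=1):
    """Prepend n_words random, unrelated words to a list-style prompt.

    Joining with a newline is an assumption made for this sketch.
    """
    words = rng.sample(WORD_POOL, n_words)
    return " ".join(words) + "\n" + prompt


def diversity(responses):
    """Return (unique item count, Shannon entropy in bits).

    `responses` is a list of M lists, each holding the K items returned
    by one query. Items are pooled across all M queries; lowercasing and
    stripping whitespace is an assumed normalization step.
    """
    counts = Counter(item.strip().lower() for r in responses for item in r)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return len(counts), entropy
```

A call such as `random_word_prompt("Name 10 Hollywood actors", random.Random(0))` yields the random-word variant of a prompt, and `diversity` can then be applied to the M parsed responses for each condition. A flatter distribution over more distinct items raises both numbers.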

Experimental Results

Research Questions

RQ1: Does prepending random concepts to prompts increase the number of unique items generated by LLMs for list-based tasks?
RQ2: Does the approach increase the entropy of the output distribution, indicating reduced bias toward common responses?
RQ3: Is diversity improvement consistent across different model sizes and between ordered/unordered prompt settings?
RQ4: How does the method scale with more random context (e.g., more random words)?

Key Findings

Model	Dataset	Regular Prompt (Entropy)	Regular Prompt (Count)	With Random Word (Entropy)	With Random Word (Count)	With Random Sentence (Entropy)	With Random Sentence (Count)
Gemma3:4b	ordered	3.89	22	4.09	32	4.08	30
Gemma3:4b	unordered	3.85	18	4.16	35	4.16	31
Nova Pro	ordered	4.15	31	4.24	34	4.22	32
Nova Pro	unordered	4.13	31	4.25	33	4.21	33
Claude 3.5 Sonnet	ordered	3.65	16	3.70	17	3.68	16
Claude 3.5 Sonnet	unordered	3.65	16	3.69	18	3.66	17
Mistral Large	ordered	4.03	22	4.15	26	4.16	26
Mistral Large	unordered	3.93	21	4.08	26	4.14	28
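
To make the table easier to read, the differences against the regular prompt can be computed directly. The snippet below is only a convenience sketch over the eight rows above; the names `ROWS` and `gains` are hypothetical, and differences are rounded to two decimals to match the table's precision.

```python
# Rows copied from the table above:
# (model, setting, regular entropy, regular count,
#  random-word entropy, random-word count,
#  random-sentence entropy, random-sentence count)
ROWS = [
    ("Gemma3:4b", "ordered", 3.89, 22, 4.09, 32, 4.08, 30),
    ("Gemma3:4b", "unordered", 3.85, 18, 4.16, 35, 4.16, 31),
    ("Nova Pro", "ordered", 4.15, 31, 4.24, 34, 4.22, 32),
    ("Nova Pro", "unordered", 4.13, 31, 4.25, 33, 4.21, 33),
    ("Claude 3.5 Sonnet", "ordered", 3.65, 16, 3.70, 17, 3.68, 16),
    ("Claude 3.5 Sonnet", "unordered", 3.65, 16, 3.69, 18, 3.66, 17),
    ("Mistral Large", "ordered", 4.03, 22, 4.15, 26, 4.16, 26),
    ("Mistral Large", "unordered", 3.93, 21, 4.08, 26, 4.14, 28),
]


def gains(row):
    """Return the change relative to the regular prompt for one table row."""
    model, setting, e0, c0, e_word, c_word, e_sent, c_sent = row
    return {
        "model": model,
        "setting": setting,
        "word_entropy": round(e_word - e0, 2),
        "word_count": c_word - c0,
        "sentence_entropy": round(e_sent - e0, 2),
        "sentence_count": c_sent - c0,
    }


for row in ROWS:
    print(gains(row))
```

On these rows every entropy difference is positive. The largest count gain is for Gemma3:4b in the unordered setting (18 to 35 with a random word), while the random-sentence count for Claude 3.5 Sonnet in the ordered setting is unchanged at 16.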

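Across models (Gemma3 4b, Nova Pro, Claude 3.5 Sonnet, Mistral Large) and both ordered/unordered settings, random-context prompts yield at least as many unique responses as regular prompts, and more in every case except one tie (Claude 3.5 Sonnet, ordered, random sentence: 16 vs. 16).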
Entropy of outputs increases with random-context prompts, indicating a flatter distribution and reduced long-tail bias.
Random words and random sentences generally improve diversity with statistically significant differences (p < 0.05) compared to regular prompts.
Increasing the number of random words shows mixed impact on entropy, suggesting potential saturation in the randomness effect.
The approach is orthogonal to temperature settings and can be combined with other diversity-enhancing methods.
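
This summary does not state which statistical test produced the p < 0.05 result. As one plausible way to check such a claim, the sketch below runs a paired sign-flip permutation test on per-prompt entropy values measured under the two conditions. The choice of test, the function name `paired_permutation_p_value`, and the example numbers in the usage note are all assumptions made for illustration, not the paper's procedure or data.

```python
import random


def paired_permutation_p_value(regular, randomized, n_resamples=5000, seed=0):
    """Two-sided paired permutation test on per-prompt differences.

    Under the null hypothesis that the prompt variant has no effect, the
    sign of each paired difference is equally likely to be + or -, so we
    flip signs at random and count how often the resampled total is at
    least as extreme as the observed one.
    """
    if len(regular) != len(randomized):
        raise ValueError("both conditions need one value per prompt")
    rng = random.Random(seed)
    diffs = [r - g for g, r in zip(regular, randomized)]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(n_resamples):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)
```

For example, with made-up per-prompt entropies `[3.6, 3.7, 3.5, 3.8, 3.6, 3.9, 3.7, 3.6]` for regular prompts and `[3.9, 4.0, 3.8, 4.1, 3.8, 4.2, 4.0, 3.9]` for random-word prompts, every difference is positive, so the estimated p-value falls well below 0.05. A paired design is used here because the same prompts are evaluated under each condition; an unpaired test would be the alternative if that did not hold.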

This review was generated by AI and checked by a human editor.