QUICK REVIEW

[Paper Review] Automatic Sarcasm Detection: A Survey

Aditya Joshi, Pushpak Bhattacharyya|arXiv (Cornell University)|Feb 10, 2016

Identification and Quantification in Food|42 references|109 citations

TL;DR

This survey compiles past work on automatic sarcasm detection, covering problem definitions, datasets, approaches (rule-based, statistical, deep learning), trends (pattern discovery and context usage), and open issues.

ABSTRACT

Automatic sarcasm detection is the task of predicting sarcasm in text. This is a crucial step to sentiment analysis, considering prevalence and challenges of sarcasm in sentiment-bearing text. Beginning with an approach that used speech-based features, sarcasm detection has witnessed great interest from the sentiment analysis community. This paper is the first known compilation of past work in automatic sarcasm detection. We observe three milestones in the research so far: semi-supervised pattern extraction to identify implicit sentiment, use of hashtag-based supervision, and use of context beyond target text. In this paper, we describe datasets, approaches, trends and issues in sarcasm detection. We also discuss representative performance values, shared tasks and pointers to future work, as given in prior works. In terms of resources that could be useful for understanding state-of-the-art, the survey presents several useful illustrations - most prominently, a table that summarizes past papers along different dimensions such as features, annotation techniques, data forms, etc.

Motivation & Objective

Summarize the aims and motivations behind automatic sarcasm detection research.
Catalog datasets, problem formulations, and annotation approaches used in sarcasm detection.
Review methodological approaches (rule-based, statistical, and deep learning) and their features.
Identify major trends (pattern discovery, hashtag supervision, context incorporation) and prevailing issues.
Provide guidance on future directions and resources for state-of-the-art sarcasm detection research.

Proposed method

Conduct a comprehensive literature review of sarcasm detection research from datasets to approaches.
Classify approaches into rule-based, statistical, and deep learning-based categories.
Discuss pattern discovery techniques for sarcasm indicators and their use as features or rules.
Examine the role of contextual information beyond the target text (author, conversation, topical context).
Illustrate resources via a consolidating table of past papers across dimensions (features, annotation, data forms).
Summarize reported performance and shared tasks to situate state-of-the-art.
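
As a rough illustration of the rule-based category listed above, the sketch below flags a sentence when a positive sentiment word co-occurs with a negative situation phrase. The rule itself, the two toy lexicons, and the function names `tokenize` and `is_sarcastic_by_rule` are assumptions made for this example; they are not taken from the survey and are far smaller than anything a real system would use.

```python
import re

# Hypothetical toy lexicons, invented for illustration only.
POSITIVE_WORDS = {"love", "great", "awesome", "enjoy"}
NEGATIVE_SITUATIONS = {"being ignored", "waiting in line", "getting stuck in traffic"}


def tokenize(text):
    """Lowercase the text and extract word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def is_sarcastic_by_rule(text):
    """Return True when a positive word co-occurs with a negative situation phrase."""
    tokens = tokenize(text)
    # Rejoin tokens so multi-word phrases can be matched without punctuation noise.
    normalized = " ".join(tokens)
    has_positive = any(tok in POSITIVE_WORDS for tok in tokens)
    has_negative_situation = any(
        phrase in normalized for phrase in NEGATIVE_SITUATIONS
    )
    return has_positive and has_negative_situation


if __name__ == "__main__":
    for sentence in ["I love being ignored.", "I love sunny days."]:
        print(sentence, is_sarcastic_by_rule(sentence))
```

A hand-written rule like this is easy to inspect but only covers phrases already in its lexicons, which is one reason the field moved toward learned patterns and statistical classifiers.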

Experimental results

Research questions

RQ1: What datasets (short text, long text, other) have been used for sarcasm detection and how are they labeled?
RQ2: What features and learning algorithms have proven effective for sarcasm detection across data forms?
RQ3: How has contextual information beyond the target text been incorporated and what impact does it have?
RQ4: What trends and issues have emerged in sarcasm detection, including data labeling and annotation reliability?
RQ5: What shared tasks exist and what do they reveal about the state of the field?

Key findings

Tweets are the predominant data form for sarcasm detection, with long text and other datasets also explored.
Hashtag-based supervision, in which hashtags supplied by the author serve as labels, has been widely used to label sarcastic content, though the quality of such labels is a known concern and validation across datasets is common.
Context beyond the target text (author history, conversation context, and topical context) has emerged as a key trend.
Early work progressed from rule-based to supervised and semi-supervised methods, with pattern discovery as a core technique; more recent studies emphasize contextual information.
A range of features (unigrams, sentiment lexicons, patterns, semantic relatedness, and even features derived from eye tracking) and classifiers (SVM, Naive Bayes, logistic regression, sequence models) have been explored, with performance varying by dataset and task.
Deep learning approaches are beginning to appear, leveraging word embeddings, user embeddings, and hybrid architectures.
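
The hashtag-based supervision mentioned in the findings above can be sketched in a few lines: a tweet is labeled sarcastic when it carries a designated hashtag, and that hashtag is then removed so a classifier cannot simply memorize it. The tag set `SARCASM_TAGS` and the function name `label_by_hashtag` are assumptions for this example; individual datasets choose their own tags and cleaning steps.

```python
import re

# Assumed label hashtags; real datasets choose their own set.
SARCASM_TAGS = {"#sarcasm", "#sarcastic"}

HASHTAG_RE = re.compile(r"#\w+")


def label_by_hashtag(tweet):
    """Return (cleaned_text, label); label is 1 if a sarcasm hashtag is present, else 0."""
    tags = {tag.lower() for tag in HASHTAG_RE.findall(tweet)}
    label = 1 if tags & SARCASM_TAGS else 0

    def drop_label_tag(match):
        # Remove only the label hashtags; keep every other hashtag as-is.
        return "" if match.group(0).lower() in SARCASM_TAGS else match.group(0)

    cleaned = HASHTAG_RE.sub(drop_label_tag, tweet)
    # Collapse the whitespace left behind by removed hashtags.
    cleaned = " ".join(cleaned.split())
    return cleaned, label


if __name__ == "__main__":
    for tweet in ["Great, another Monday #sarcasm", "Nice weather today #sunny"]:
        print(label_by_hashtag(tweet))
```

Labels obtained this way are cheap but noisy, since they rely on authors tagging their own sarcasm consistently, which is the quality concern raised in the findings.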

This review was created by AI and reviewed by human editors.