QUICK REVIEW

[Paper Review] Naive Bayes and Exemplar-Based approaches to Word Sense Disambiguation Revisited

Gerard Escudero, Lluís Màrquez | ArXiv.org | Jul 7, 2000

Natural Language Processing Techniques | 21 references | 57 citations

TL;DR

This paper revisits Naive Bayes and exemplar-based learning for word sense disambiguation (WSD), proposing a positive-only representation that improves efficiency without sacrificing accuracy. Exemplar-based methods with the MVDM metric and example weighting significantly outperform Naive Bayes, especially on rich feature sets, with the Positive Exemplar-Based (PEB) approach achieving 68.8% accuracy on a broad-coverage corpus using SetB features.

ABSTRACT

This paper describes an experimental comparison between two standard supervised learning methods, namely Naive Bayes and Exemplar-based classification, on the Word Sense Disambiguation (WSD) problem. The aim of the work is twofold. Firstly, it attempts to contribute to clarify some confusing information about the comparison between both methods appearing in the related literature. In doing so, several directions have been explored, including: testing several modifications of the basic learning algorithms and varying the feature space. Secondly, an improvement of both algorithms is proposed, in order to deal with large attribute sets. This modification, which basically consists in using only the positive information appearing in the examples, allows to improve greatly the efficiency of the methods, with no loss in accuracy. The experiments have been performed on the largest sense-tagged corpus available containing the most frequent and ambiguous English words. Results show that the Exemplar-based approach to WSD is generally superior to the Bayesian approach, especially when a specific metric for dealing with symbolic attributes is used.

Motivation & Objective

To resolve conflicting findings in prior literature comparing Naive Bayes and exemplar-based WSD methods.
To improve computational efficiency of both methods when handling large attribute sets.
To evaluate the impact of feature space richness and metric choice on WSD performance.
To investigate whether supervised learning methods can achieve high accuracy despite the knowledge acquisition bottleneck.
To propose and validate a positive-only representation that enhances efficiency without accuracy loss.

Proposed method

Proposes a positive-only representation that uses only positive attribute values from training examples, discarding negative ones to improve efficiency.
Applies the MVDM (Modified Value Difference Metric) for symbolic attributes in exemplar-based learning to better handle categorical features.
Employs example weighting and attribute weighting in exemplar-based classification to improve accuracy and robustness.
Uses k-Nearest-Neighbour with Hamming distance and MVDM as similarity measures in exemplar-based learning.
Implements Naive Bayes with and without attribute weighting, comparing performance across different feature sets.
Tests all variants on a single large sense-tagged corpus at two scales: a 15-word subset and the full 191-word corpus with 192,800 examples.
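The difference between the two distance measures named above can be illustrated with a minimal sketch. It assumes the standard MVDM definition, in which the distance between two symbolic values is the sum over senses of |P(sense | v1) - P(sense | v2)|; the exact variant used in the paper may differ. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

def fit_value_sense_probs(examples, senses):
    """Estimate P(sense | attribute i has value v) from training data."""
    counts = defaultdict(Counter)  # (attribute index, value) -> sense counts
    for features, sense in zip(examples, senses):
        for i, value in enumerate(features):
            counts[(i, value)][sense] += 1
    probs = {}
    for key, sense_counts in counts.items():
        total = sum(sense_counts.values())
        probs[key] = {s: n / total for s, n in sense_counts.items()}
    return probs

def hamming_distance(x, y):
    """Number of attributes whose symbolic values differ."""
    return sum(vx != vy for vx, vy in zip(x, y))

def mvdm_distance(x, y, probs, sense_set):
    """Sum, over attributes, of the difference between the sense
    distributions associated with the two values (assumed MVDM form)."""
    total = 0.0
    for i, (vx, vy) in enumerate(zip(x, y)):
        px = probs.get((i, vx), {})
        py = probs.get((i, vy), {})
        total += sum(abs(px.get(s, 0.0) - py.get(s, 0.0)) for s in sense_set)
    return total

# Hypothetical toy data: (left word, right word) around an ambiguous "bank".
train_x = [("the", "river"), ("a", "river"), ("the", "loan"), ("a", "loan")]
train_y = ["shore", "shore", "finance", "finance"]
probs = fit_value_sense_probs(train_x, train_y)
sense_set = set(train_y)

d_hamming = hamming_distance(("the", "river"), ("a", "river"))
d_mvdm_same = mvdm_distance(("the", "river"), ("a", "river"), probs, sense_set)
d_mvdm_diff = mvdm_distance(("the", "river"), ("the", "loan"), probs, sense_set)
```

In this toy case Hamming distance counts "the" versus "a" as a full mismatch, while MVDM treats them as equivalent because both values co-occur with the same sense distribution. The extra cost is the table of per-value sense probabilities, which grows with the number of distinct attribute values.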

Experimental results

Research questions

RQ1: Does the exemplar-based approach outperform Naive Bayes in word sense disambiguation when using richer feature sets and better metrics?
RQ2: Can a positive-only representation significantly improve the efficiency of both Naive Bayes and exemplar-based learning without reducing accuracy?
RQ3: How does the choice of distance metric (Hamming vs. MVDM) affect the performance of exemplar-based WSD?
RQ4: Why do some prior studies report contradictory results between Naive Bayes and exemplar-based methods?
RQ5: Is there a computationally feasible trade-off between accuracy and efficiency in large-scale WSD?

Key findings

The exemplar-based approach with the MVDM metric and example weighting significantly outperforms Naive Bayes, achieving 70.2% accuracy on the 15-word subset using SetA.
On the full 191-word corpus, the Positive Exemplar-Based (PEB h,7,e) method achieves 68.8% accuracy with SetB, outperforming Naive Bayes and other variants.
The positive-only representation reduces CPU time by a factor of 80 for Naive Bayes and 15 for exemplar-based learning, making large-scale WSD feasible.
Naive Bayes does not improve accuracy when moving from SetA to SetB, indicating a limitation in handling richer feature sets.
The MVDM metric is more effective than Hamming distance for symbolic attributes, but its computation is too costly for large attribute sets.
The PEB h,7,e variant using SetB, Hamming distance, and example weighting offers the best balance of accuracy and efficiency in realistic settings.
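The positive-only idea behind these findings can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes each example is stored as the set of features that are present, that similarity is the number of shared features, and that "example weighting" means each neighbour votes with a weight equal to its similarity. All names and data below are hypothetical.

```python
from collections import defaultdict

def positive_features(context_words):
    """Keep only the features that occur in the example."""
    return frozenset(context_words)

def overlap(a, b):
    """Similarity as the count of shared positive features."""
    return len(a & b)

def classify(query, training, k=7):
    """Similarity-weighted vote among the k most similar training examples.

    training is a list of (feature_set, sense) pairs.
    """
    nearest = sorted(training, key=lambda ex: overlap(query, ex[0]), reverse=True)[:k]
    votes = defaultdict(float)
    for features, sense in nearest:
        votes[sense] += overlap(query, features)  # assumed weighting scheme
    return max(votes, key=votes.get)

# Hypothetical toy training set for the ambiguous word "bank".
training = [
    (positive_features(["river", "water", "fishing"]), "shore"),
    (positive_features(["river", "mud"]), "shore"),
    (positive_features(["loan", "money", "interest"]), "finance"),
    (positive_features(["money", "deposit"]), "finance"),
]
query = positive_features(["deposit", "money", "interest"])
predicted = classify(query, training, k=3)
```

Because each comparison touches only the features that are present, the cost per comparison depends on the size of the two contexts rather than on the size of the full attribute set, which is the intuition behind the reported speed-ups.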

This review was created by AI and reviewed by human editors.