QUICK REVIEW

[Paper Review] Exemplar-Based Word Sense Disambiguation: Some Recent Improvements

Hwee Tou Ng | arXiv.org | Jun 10, 1997

Natural Language Processing Techniques | 23 references | 67 citations

TL;DR

This paper improves exemplar-based word sense disambiguation by using 10-fold cross validation to automatically select the number of nearest neighbors ($k$), which raises disambiguation accuracy. The resulting classifier achieves accuracy comparable to the Naive-Bayes algorithm, previously reported as the most accurate of seven state-of-the-art machine learning methods, showing that exemplar-based learning is effective for word sense disambiguation when $k$ is properly tuned.

ABSTRACT

In this paper, we report recent improvements to the exemplar-based learning approach for word sense disambiguation that have achieved higher disambiguation accuracy. By using a larger value of $k$, the number of nearest neighbors to use for determining the class of a test example, and through 10-fold cross validation to automatically determine the best $k$, we have obtained improved disambiguation accuracy on a large sense-tagged corpus first used by Ng and Lee (1996). The accuracy achieved by our improved exemplar-based classifier is comparable to the accuracy on the same data set obtained by the Naive-Bayes algorithm, which was reported by Mooney (1996) to have the highest disambiguation accuracy among seven state-of-the-art machine learning algorithms.

Motivation & Objective

To improve the accuracy of exemplar-based word sense disambiguation by optimizing the number of nearest neighbors ($k$).
To evaluate whether exemplar-based learning can match or exceed the performance of the Naive-Bayes algorithm, previously reported as the best-performing method on the same corpus.
To investigate the impact of $k$ on classifier performance, especially in cases where $k=1$ underperforms.
To demonstrate that automated hyperparameter selection via cross validation can significantly enhance exemplar-based learning in WSD.

Proposed method

The exemplar-based learning algorithm Pebls is used. It measures the distance between two values of a feature with a value difference metric: the sum, over all senses, of the absolute difference between the conditional probabilities of each sense given the two values, estimated from training counts.
The distance between two examples is the sum of these feature-wise distances.
The $k$-nearest neighbors are selected based on minimum distance, and the majority class among them is assigned to the test example.
A 10-fold cross validation procedure is applied to the training set to automatically determine the optimal $k$ value that minimizes error rate.
The performance of the optimized Pebls classifier is compared against the Naive-Bayes algorithm on a large, sense-tagged corpus from Ng and Lee (1996).
Feature pruning is avoided to preserve potentially useful collocation features, as prior pruning was found to reduce accuracy.
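
The steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the Pebls implementation: every function name (`fit_value_stats`, `value_distance`, `rank_neighbours`, `vote`, `choose_k`) is hypothetical, and the toy data is made up. The handling of feature values unseen in training, the interleaved fold assignment, and the tie-breaking rules are assumptions that the review does not specify.

```python
from collections import Counter, defaultdict


def fit_value_stats(examples, senses):
    """Count how often each sense co-occurs with each (feature index, value)."""
    counts = defaultdict(Counter)
    for features, sense in zip(examples, senses):
        for i, value in enumerate(features):
            counts[(i, value)][sense] += 1
    return counts, sorted(set(senses))


def value_distance(counts, sense_set, i, v1, v2):
    """Sum over senses of |P(sense | v1) - P(sense | v2)| for feature i."""
    if v1 == v2:
        return 0.0
    c1 = counts.get((i, v1), Counter())
    c2 = counts.get((i, v2), Counter())
    n1, n2 = sum(c1.values()), sum(c2.values())
    # Assumption: a value never seen in training gets probability 0 for every sense.
    return sum(
        abs((c1[s] / n1 if n1 else 0.0) - (c2[s] / n2 if n2 else 0.0))
        for s in sense_set
    )


def rank_neighbours(counts, sense_set, train_x, query):
    """Return training indices sorted by summed feature-wise distance to query."""
    scored = []
    for idx, example in enumerate(train_x):
        d = sum(
            value_distance(counts, sense_set, i, a, b)
            for i, (a, b) in enumerate(zip(example, query))
        )
        scored.append((d, idx))
    return [idx for _, idx in sorted(scored)]


def vote(ranked, train_y, k):
    """Majority sense among the k nearest neighbours (ties: first sense seen)."""
    return Counter(train_y[idx] for idx in ranked[:k]).most_common(1)[0][0]


def choose_k(xs, ys, candidate_ks, folds=10):
    """Pick the candidate k with the fewest cross-validation errors."""
    errors = Counter()
    for f in range(folds):
        held_out = set(range(f, len(xs), folds))  # assumption: interleaved folds
        tr_x = [x for i, x in enumerate(xs) if i not in held_out]
        tr_y = [y for i, y in enumerate(ys) if i not in held_out]
        counts, sense_set = fit_value_stats(tr_x, tr_y)
        for i in held_out:
            ranked = rank_neighbours(counts, sense_set, tr_x, xs[i])
            for k in candidate_ks:
                errors[k] += vote(ranked, tr_y, k) != ys[i]
    # Assumption: ties in error count go to the smaller k.
    return min(candidate_ks, key=lambda k: (errors[k], k))


# Made-up toy data: (previous word, next word) around an ambiguous noun.
toy_x = [("bank", "rate"), ("bank", "rate"), ("high", "rate"),
         ("keen", "in"), ("keen", "in"), ("great", "in")]
toy_y = ["money", "money", "money", "attention", "attention", "attention"]

best_k = choose_k(toy_x, toy_y, [1, 3], folds=3)
```

Ranking the neighbours once per held-out example and then voting for every candidate $k$ keeps the cross-validation loop cheap, since the ranking does not depend on $k$.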

Experimental results

Research questions

RQ1: Can increasing the number of nearest neighbors ($k$) in an exemplar-based classifier improve word sense disambiguation accuracy?
RQ2: Does 10-fold cross validation for $k$ selection lead to better performance than fixed $k$ values, such as $k=1$?
RQ3: Can an exemplar-based approach achieve accuracy comparable to the Naive-Bayes algorithm, which was previously reported as the top-performing method on the same dataset?
RQ4: Why does the exemplar-based method require larger $k$ values when it fails to outperform the most frequent class baseline?
RQ5: How does the choice of distance metric in Pebls compare to the Hamming distance used in other nearest-neighbor WSD systems?

Key findings

Using $k=20$ in the exemplar-based classifier Pebls achieves disambiguation accuracy comparable to the Naive-Bayes algorithm on the same corpus.
Selecting $k$ by 10-fold cross validation yields accuracy slightly higher than that of the Naive-Bayes algorithm, indicating that automatic hyperparameter tuning enhances exemplar-based learning.
For 13 out of 191 words, the best $k$ value found via cross validation was 85 or higher, indicating that the method falls back to behavior close to a most-frequent-class classifier when it struggles to outperform that baseline.
The performance of Pebls with $k=1$ is significantly worse than Naive-Bayes, but increasing $k$ to 20 closes the performance gap substantially.
The study shows that feature pruning, as used in prior work, can be detrimental, as it removes useful collocation features that improve accuracy.
The results confirm that exemplar-based learning is a viable and competitive approach for word sense disambiguation when properly tuned, especially with cross-validated $k$ selection.
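
The link between very large $k$ and majority-class behavior can be illustrated with a small sketch. The nearest-first list of senses below is made up and the helper `knn_vote` is hypothetical; the point is only that once $k$ approaches the number of training examples, the vote is decided by overall sense frequency rather than by proximity.

```python
from collections import Counter


def knn_vote(neighbour_senses, k):
    """Majority sense among the first k entries of a nearest-first sense list."""
    return Counter(neighbour_senses[:k]).most_common(1)[0][0]


# Hypothetical senses of one test example's training neighbours, nearest first.
ranked_senses = ["s2", "s2", "s1", "s1", "s1", "s2", "s1", "s1", "s1", "s1"]

# The most-frequent-sense baseline ignores distance entirely.
most_frequent = Counter(ranked_senses).most_common(1)[0][0]

small_k = knn_vote(ranked_senses, 1)                   # follows the closest exemplar
large_k = knn_vote(ranked_senses, len(ranked_senses))  # follows overall frequency
```

With $k=1$ the vote follows the single closest exemplar, while with $k$ equal to the training set size it coincides with the most frequent sense, which is consistent with reading the large cross-validated $k$ values as a fallback to the baseline.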

This review was created by AI and reviewed by human editors.