[Paper Review] Exemplar-Based Word Sense Disambiguation: Some Recent Improvements
This paper improves exemplar-based word sense disambiguation by using a larger number of nearest neighbors ($k$) and by selecting $k$ automatically through 10-fold cross validation. The resulting classifier achieves accuracy comparable to the Naive-Bayes algorithm, previously reported as the most accurate of seven state-of-the-art machine learning methods, which suggests that exemplar-based learning is effective for word sense disambiguation when $k$ is properly tuned.
In this paper, we report recent improvements to the exemplar-based learning approach for word sense disambiguation that have achieved higher disambiguation accuracy. By using a larger value of $k$, the number of nearest neighbors to use for determining the class of a test example, and through 10-fold cross validation to automatically determine the best $k$, we have obtained improved disambiguation accuracy on a large sense-tagged corpus first used in \cite{ng96}. The accuracy achieved by our improved exemplar-based classifier is comparable to the accuracy on the same data set obtained by the Naive-Bayes algorithm, which was reported in \cite{mooney96} to have the highest disambiguation accuracy among seven state-of-the-art machine learning algorithms.
Motivation & Objective
- To improve the accuracy of exemplar-based word sense disambiguation by optimizing the number of nearest neighbors ($k$).
- To evaluate whether exemplar-based learning can match or exceed the performance of the Naive-Bayes algorithm, previously reported as the best-performing method on the same corpus.
- To investigate the impact of $k$ on classifier performance, especially in cases where $k=1$ underperforms.
- To demonstrate that automated hyperparameter selection via cross validation can significantly enhance exemplar-based learning in WSD.
Proposed method
- The exemplar-based learning algorithm Pebls is used, which computes the distance between examples using a value difference metric estimated from how often each feature value co-occurs with each sense in the training data.
- The distance between two examples is the sum of feature-wise distances, where the distance between two values of a feature is the sum, over all senses, of the absolute difference between the conditional probabilities of that sense given each value.
- The $k$-nearest neighbors are selected based on minimum distance, and the majority class among them is assigned to the test example.
- A 10-fold cross validation procedure is applied to the training set to automatically determine the optimal $k$ value that minimizes error rate.
- The performance of the optimized Pebls classifier is compared against the Naive-Bayes algorithm on a large, sense-tagged corpus from Ng and Lee (1996).
- Feature pruning is avoided to preserve potentially useful collocation features, as prior pruning was found to reduce accuracy.
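The steps listed above can be sketched in a few lines of Python. This is a minimal illustration and not the Pebls implementation: the function names, the toy examples for the word "interest", the penalty of 2.0 for feature values unseen in training, and the round-robin fold assignment are all assumptions made for the sketch.

```python
from collections import Counter, defaultdict


def fit_value_tables(X, y):
    """Count, per feature position, how often each value co-occurs with each sense."""
    classes = sorted(set(y))
    counts = [defaultdict(Counter) for _ in X[0]]
    for features, label in zip(X, y):
        for f, value in enumerate(features):
            counts[f][value][label] += 1
    return classes, counts


def value_distance(table, v1, v2, classes):
    """Sum over senses of |P(sense | v1) - P(sense | v2)| for one feature."""
    if v1 == v2:
        return 0.0
    c1, c2 = table.get(v1), table.get(v2)
    if not c1 or not c2:
        # Assumption: a value never seen in training is treated as maximally distant.
        return 2.0
    n1, n2 = sum(c1.values()), sum(c2.values())
    return sum(abs(c1[c] / n1 - c2[c] / n2) for c in classes)


def example_distance(a, b, classes, counts):
    """Distance between two examples: sum of the feature-wise value distances."""
    return sum(value_distance(counts[f], a[f], b[f], classes) for f in range(len(a)))


def knn_predict(train_X, train_y, x, k, classes, counts):
    """Assign the majority sense among the k training examples closest to x."""
    ranked = sorted(
        range(len(train_X)),
        key=lambda i: example_distance(train_X[i], x, classes, counts),
    )
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def select_k(X, y, candidate_ks, n_folds=10):
    """Return the candidate k with the fewest cross-validation errors on the training set."""
    errors = {k: 0 for k in candidate_ks}
    for fold in range(n_folds):
        held_out = [i for i in range(len(X)) if i % n_folds == fold]
        kept = [i for i in range(len(X)) if i % n_folds != fold]
        if not held_out or not kept:
            continue  # skip empty folds on very small training sets
        tr_X = [X[i] for i in kept]
        tr_y = [y[i] for i in kept]
        classes, counts = fit_value_tables(tr_X, tr_y)
        for i in held_out:
            for k in candidate_ks:
                if knn_predict(tr_X, tr_y, X[i], k, classes, counts) != y[i]:
                    errors[k] += 1
    # Ties are broken in favour of the earliest candidate in the list.
    return min(candidate_ks, key=lambda k: errors[k])


# Hypothetical toy data: (previous word, next word) around "interest".
X = [("bank", "rate"), ("low", "rate"), ("bank", "payment"),
     ("great", "in"), ("keen", "in"), ("great", "shown")]
y = ["money", "money", "money", "attention", "attention", "attention"]

classes, counts = fit_value_tables(X, y)
print(knn_predict(X, y, ("low", "payment"), 3, classes, counts))
print(select_k(X, y, candidate_ks=[1, 3], n_folds=3))
```

Because the value distance depends on sense distributions and not on string identity, two different words that always signal the same sense (here "bank" and "low") are at distance 0, which is what separates this metric from a plain overlap count.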
Experimental results
Research questions
- RQ1: Can increasing the number of nearest neighbors ($k$) in an exemplar-based classifier improve word sense disambiguation accuracy?
- RQ2: Does 10-fold cross validation for $k$ selection lead to better performance than fixed $k$ values, such as $k=1$?
- RQ3: Can an exemplar-based approach achieve accuracy comparable to the Naive-Bayes algorithm, which was previously reported as the top-performing method on the same dataset?
- RQ4: Why does the exemplar-based method require larger $k$ values when it fails to outperform the most frequent class baseline?
- RQ5: How does the choice of distance metric in Pebls compare to the Hamming distance used in other nearest-neighbor WSD systems?
Key findings
- Using $k=20$ in the exemplar-based classifier Pebls achieves disambiguation accuracy comparable to the Naive-Bayes algorithm on the same corpus.
- Selecting $k$ by 10-fold cross validation yields accuracy slightly higher than the Naive-Bayes algorithm, indicating that automatic hyperparameter tuning enhances exemplar-based learning.
- For 13 out of 191 words, the best $k$ value found via cross validation was 85 or higher, suggesting that for words where the classifier struggles to beat the most frequent class baseline, a very large $k$ makes it behave much like that baseline.
- The performance of Pebls with $k=1$ is significantly worse than Naive-Bayes, but increasing $k$ to 20 closes the performance gap substantially.
- The study shows that feature pruning, as used in prior work, can be detrimental, as it removes useful collocation features that improve accuracy.
- The results confirm that exemplar-based learning is a viable and competitive approach for word sense disambiguation when properly tuned, especially with cross-validated $k$ selection.
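The link between very large $k$ and majority-class-like behavior can be seen in a small sketch. It is an illustration only: it uses a simple overlap (Hamming-style) distance in place of the Pebls metric, and the function names and toy examples are hypothetical. Once $k$ reaches the size of the training set, every training example votes, so the prediction is the most frequent sense whatever the query looks like.

```python
from collections import Counter


def overlap_distance(a, b):
    """Hamming-style distance: number of feature positions that differ."""
    return sum(u != v for u, v in zip(a, b))


def knn_majority(train, x, k):
    """train is a list of (features, sense) pairs; return the majority sense of the k nearest."""
    nearest = sorted(train, key=lambda ex: overlap_distance(ex[0], x))[:k]
    return Counter(sense for _, sense in nearest).most_common(1)[0][0]


# Hypothetical toy data: three "money" examples and one "attention" example.
train = [
    (("bank", "rate"), "money"),
    (("low", "rate"), "money"),
    (("high", "payment"), "money"),
    (("keen", "in"), "attention"),
]
query = ("keen", "in")

small_k = knn_majority(train, query, 1)           # only the closest example votes
large_k = knn_majority(train, query, len(train))  # every training example votes
most_frequent = Counter(sense for _, sense in train).most_common(1)[0][0]

print(small_k, large_k, most_frequent)
```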
This review was created by AI and reviewed by human editors.