QUICK REVIEW

[Paper Review] Automatic Bug Triage using Semi-Supervised Text Classification

Jifeng Xuan, He Jiang | Software Engineering and Knowledge Engineering | Apr 16, 2017
Software Engineering Research | References: 22 | Citations: 85
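One-Line Summary

A semi-supervised text classification approach combines Naive Bayes with expectation-maximization to triage bugs using both labeled and unlabeled bug reports; through developer-weighted training and iterative labeling, it improves accuracy over conventional supervised methods.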

ABSTRACT

In this paper, we propose a semi-supervised text classification approach for bug triage to avoid the deficiency of labeled bug reports in existing supervised approaches. This new approach combines naive Bayes classifier and expectation-maximization to take advantage of both labeled and unlabeled bug reports. This approach trains a classifier with a fraction of labeled bug reports. Then the approach iteratively labels numerous unlabeled bug reports and trains a new classifier with labels of all the bug reports. We also employ a weighted recommendation list to boost the performance by imposing the weights of multiple developers in training the classifier. Experimental results on bug reports of Eclipse show that our new approach outperforms existing supervised approaches in terms of classification accuracy.

Motivation and Objectives

  • Address the shortage of labeled bug reports for effective bug triage
  • Develop a semi-supervised learning method that utilizes both labeled and unlabeled bug reports
  • Improve triage accuracy over traditional supervised approaches
  • Incorporate developer influence through weighted training signals
  • Demonstrate effectiveness on real-world bug repositories (Eclipse)

Proposed Method

  • Combine Naive Bayes classifier with expectation-maximization to leverage unlabeled bug reports
  • Train initial classifier with a fraction of labeled reports
  • Iteratively label unlabeled bug reports and retrain using labels from all reports
  • Incorporate a weighted recommendation list that imposes developer weights during training
  • Evaluate on Eclipse bug reports and compare to existing supervised methods
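
The train, label, and retrain loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a multinomial Naive Bayes model with Laplace smoothing, treats each developer as a class, and uses soft (probabilistic) labels for unlabeled reports in the EM step. The function names (`train_nb`, `posterior`, `em_naive_bayes`, `recommend`), the toy reports, and the developer names are all hypothetical. The paper's weighted recommendation list is not reproduced here, since this review does not describe how its weights are defined.

```python
import math
from collections import Counter


def train_nb(docs, soft_labels, classes, vocab, alpha=1.0):
    """M-step: fit priors and word likelihoods from (possibly fractional) labels."""
    class_mass = {c: alpha for c in classes}
    word_counts = {c: Counter() for c in classes}
    for tokens, weights in zip(docs, soft_labels):
        for c in classes:
            w = weights.get(c, 0.0)
            if w == 0.0:
                continue
            class_mass[c] += w
            for t in tokens:
                word_counts[c][t] += w
    total = sum(class_mass.values())
    log_prior = {c: math.log(class_mass[c] / total) for c in classes}
    log_like = {}
    for c in classes:
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((word_counts[c][t] + alpha) / denom) for t in vocab}
    return log_prior, log_like


def posterior(tokens, model, classes):
    """E-step for one report: P(developer | tokens) under the current model."""
    log_prior, log_like = model
    scores = {
        c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in log_like[c])
        for c in classes
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def em_naive_bayes(labeled, unlabeled, n_iter=10):
    """Train on labeled reports, then alternate labeling and retraining."""
    unlabeled = list(unlabeled)
    classes = sorted({dev for _, dev in labeled})
    vocab = {t for toks, _ in labeled for t in toks} | {t for toks in unlabeled for t in toks}
    lab_docs = [toks for toks, _ in labeled]
    lab_soft = [{dev: 1.0} for _, dev in labeled]
    model = train_nb(lab_docs, lab_soft, classes, vocab)  # initial classifier
    for _ in range(n_iter):
        unl_soft = [posterior(toks, model, classes) for toks in unlabeled]  # label
        model = train_nb(lab_docs + unlabeled, lab_soft + unl_soft, classes, vocab)  # retrain
    return model, classes


def recommend(tokens, model, classes, k=3):
    """Rank developers for a new report by posterior probability."""
    post = posterior(tokens, model, classes)
    return sorted(classes, key=lambda c: post[c], reverse=True)[:k]


# Toy data (hypothetical): tokenized bug reports and the developers who fixed them.
labeled = [
    (["ui", "button", "crash"], "alice"),
    (["database", "query", "timeout"], "bob"),
]
unlabeled = [
    ["ui", "button", "layout"],
    ["database", "query", "index"],
    ["layout", "render"],
    ["index", "slow"],
]
model, classes = em_naive_bayes(labeled, unlabeled)
print(recommend(["layout", "render"], model, classes))
```

In this toy setup the word "layout" never appears in a labeled report, so a classifier trained on the two labeled reports alone cannot separate the developers for a "layout" report. The unlabeled reports link "layout" to "ui" and "button", which is the kind of signal the EM iterations are meant to pick up.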

Experimental Results

Research Questions

  • RQ1: Can semi-supervised text classification improve bug triage accuracy with limited labeled data?
  • RQ2: How does integrating unlabeled data via EM affect classifier performance in bug triage?
  • RQ3: Does incorporating developer-weighted training improve triage results?
  • RQ4: How does the proposed method compare to standard supervised approaches on real-world datasets (Eclipse)?

Key Findings

  • The semi-supervised approach with EM and NB outperforms existing supervised methods in classification accuracy on Eclipse bug reports.

This review was generated by AI and checked by a human editor.