QUICK REVIEW

[Paper Review] A deep tree-based model for software defect prediction

Hoa Khanh Dam, Trang Pham | arXiv (Cornell University) | 2018-02-03

Software Engineering Research | References: 30 | Citations: 100

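One-Line Summary

This paper introduces a tree-structured LSTM that operates directly on Abstract Syntax Trees to learn code representations for defect prediction, and reports strong within-project and cross-project performance on the Samsung and PROMISE datasets.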

ABSTRACT

Defects are common in software systems and can potentially cause various problems to software users. Different methods have been developed to quickly predict the most likely locations of defects in large code bases. Most of them focus on designing features (e.g. complexity metrics) that correlate with potentially defective code. Those approaches however do not sufficiently capture the syntax and different levels of semantics of source code, an important capability for building accurate prediction models. In this paper, we develop a novel prediction model which is capable of automatically learning features for representing source code and using them for defect prediction. Our prediction system is built upon the powerful deep learning, tree-structured Long Short Term Memory network which directly matches with the Abstract Syntax Tree representation of source code. An evaluation on two datasets, one from open source projects contributed by Samsung and the other from the public PROMISE repository, demonstrates the effectiveness of our approach for both within-project and cross-project predictions.

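Motivation and Objectives

Motivates defect prediction as essential for prioritizing testing and maintenance effort in large codebases.
Proposes a deep tree-structured LSTM whose structure matches the AST of the source code, preserving its syntax and semantics.
Learns representations directly from ASTs, removing the need for manual feature engineering.
Evaluates the approach on real open-source Samsung projects and the PROMISE repository, for both within-project and cross-project prediction.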

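Proposed Method

Parse source files into Abstract Syntax Trees (ASTs).
Embed AST node labels into fixed-size vectors through an embedding matrix (ast2vec).
Apply a tree-structured LSTM (Tree-LSTM) that aggregates child representations bottom-up to produce a code representation at the root.
Train the Tree-LSTM in an unsupervised manner by predicting each parent node's label from its children (softmax over parent labels).
Feed the learned root representation to a conventional classifier, such as logistic regression or random forest, for defect prediction.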
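
The bottom-up encoding step can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: it uses Python's built-in `ast` module as a stand-in parser (the paper's datasets are not Python), it assumes the Child-Sum Tree-LSTM formulation, and its weights are random and untrained, so the unsupervised parent-label training step is omitted. The names `ChildSumTreeLSTM` and `encode_source` are hypothetical.

```python
import ast
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class ChildSumTreeLSTM:
    """Untrained Child-Sum Tree-LSTM over Python AST nodes (illustrative only)."""

    GATES = "ifou"  # input, forget, output, candidate update

    def __init__(self, dim=8, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.embed = {}  # node label -> vector; plays the role of ast2vec
        self.W = {g: self._matrix() for g in self.GATES}  # applied to the label embedding
        self.U = {g: self._matrix() for g in self.GATES}  # applied to child hidden states
        self.b = {g: [0.0] * dim for g in self.GATES}

    def _vector(self):
        return [self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)]

    def _matrix(self):
        return [self._vector() for _ in range(self.dim)]

    def _affine(self, gate, x, h):
        """Compute W[gate] @ x + U[gate] @ h + b[gate] with plain lists."""
        W, U, b = self.W[gate], self.U[gate], self.b[gate]
        return [
            sum(w * xj for w, xj in zip(W[k], x))
            + sum(u * hj for u, hj in zip(U[k], h))
            + b[k]
            for k in range(self.dim)
        ]

    def encode(self, node):
        """Return (hidden, cell) for the subtree rooted at `node`."""
        label = type(node).__name__  # e.g. "FunctionDef", "Return", "BinOp"
        if label not in self.embed:
            self.embed[label] = self._vector()
        x = self.embed[label]

        # Encode children first, then combine them at the parent.
        children = [self.encode(child) for child in ast.iter_child_nodes(node)]
        h_sum = [sum(h[k] for h, _ in children) for k in range(self.dim)]

        i = [sigmoid(v) for v in self._affine("i", x, h_sum)]
        o = [sigmoid(v) for v in self._affine("o", x, h_sum)]
        u = [math.tanh(v) for v in self._affine("u", x, h_sum)]

        c = [i[k] * u[k] for k in range(self.dim)]
        for h_child, c_child in children:
            # One forget gate per child, computed from that child's hidden state.
            f = [sigmoid(v) for v in self._affine("f", x, h_child)]
            c = [c[k] + f[k] * c_child[k] for k in range(self.dim)]

        h = [o[k] * math.tanh(c[k]) for k in range(self.dim)]
        return h, c


def encode_source(model, source):
    """Parse `source` and return the root hidden state as the file-level vector."""
    root_hidden, _ = model.encode(ast.parse(source))
    return root_hidden
```

Calling `encode_source(ChildSumTreeLSTM(dim=8), "def f(a, b):\n    return a + b\n")` yields an 8-dimensional vector. In the pipeline described above, a vector of this kind, taken from a trained model, would be the input to the downstream classifier.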

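Experimental Results

Research Questions

RQ1: Can a tree-structured LSTM over ASTs effectively capture syntactic and semantic information for defect prediction?
RQ2: Does AST-based representation learning improve within-project and cross-project defect prediction compared with existing feature-based approaches?
RQ3: How does the choice of classifier (Logistic Regression vs. Random Forest) affect prediction performance across datasets?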

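Key Results

On the Samsung dataset (within-project), Random Forest with Tree-LSTM features exceeded 0.9 on F-measure, Precision, Recall, and AUC, with AUC reaching about 0.98.
On the PROMISE dataset (within-project), the approach achieved an average AUC of 0.60 and a high Recall of 0.86, but its Precision and F-measure were lower than some baselines.
Cross-project prediction across 22 project pairs showed high average Recall, with AUC consistently above 0.5, suggesting the approach is effective overall.
The study shows that the model can learn from raw ASTs, enabling defect prediction without manual feature engineering, and offers potential defect localization through attention.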
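
For readers less familiar with the reported metrics, the sketch below shows how Precision, Recall, F-measure, and AUC are commonly computed for a binary defect classifier. It uses made-up toy labels and scores, not data from the paper, and the function names are hypothetical. It also illustrates how a model can reach high Recall with lower Precision, the pattern reported for PROMISE.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F-measure for binary labels (1 = defective)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """AUC as the fraction of (defective, clean) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: every defective file is flagged, along with two clean ones.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.7, 0.55, 0.2, 0.1]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, round(f1, 3), auc(y_true, scores))
```

In this toy case Recall is 1.0 while Precision is only 0.5, because flagging more files catches every defect at the cost of false alarms.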


This review was generated by AI and reviewed by a human editor.