QUICK REVIEW

[Paper Review] ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations

Rishabh Tiwari, Udbhav Bamba|arXiv (Cornell University)|2021. 02. 14.

Advanced Neural Network Applications | References: 35 | Citations: 31

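One-Line Summary

ChipNet is a deterministic, budget-aware structured pruning method that uses a continuous Heaviside projection and a crispness loss to extract a highly sparse pruned network from a pretrained dense model, outperforming state-of-the-art baselines across multiple budgets and datasets.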

ABSTRACT

Structured pruning methods are among the effective strategies for extracting small resource-efficient convolutional neural networks from their dense counterparts with minimal loss in accuracy. However, most existing methods still suffer from one or more limitations, that include 1) the need for training the dense model from scratch with pruning-related parameters embedded in the architecture, 2) requiring model-specific hyperparameter settings, 3) inability to include budget-related constraint in the training process, and 4) instability under scenarios of extreme pruning. In this paper, we present ChipNet, a deterministic pruning strategy that employs continuous Heaviside function and a novel crispness loss to identify a highly sparse network out of an existing dense network. Our choice of continuous Heaviside function is inspired by the field of design optimization, where the material distribution task is posed as a continuous optimization problem, but only discrete values (0 or 1) are practically feasible and expected as final outcomes. Our approach's flexible design facilitates its use with different choices of budget constraints while maintaining stability for very low target budgets. Experimental results show that ChipNet outperforms state-of-the-art structured pruning methods by remarkable margins of up to 16.1% in terms of accuracy. Further, we show that the masks obtained with ChipNet are transferable across datasets. For certain cases, it was observed that masks transferred from a model trained on feature-rich teacher dataset provide better performance on the student dataset than those obtained by directly pruning on the student data itself.

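Motivation and Objectives

Enable robust structured pruning that can apply arbitrary budget constraints without retraining from scratch.
Develop a pruning mechanism that uses gradient-based optimization to produce near-discrete channel masks from a dense network.
Remain stable at very low budgets and demonstrate that learned masks transfer across datasets.
Show that ChipNet achieves higher accuracy than existing pruning methods under a range of budgets.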

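Proposed Method

Learn sparsity masks over the channels of a pretrained dense CNN.
Combine a continuous Heaviside projection with a logistic proxy to push mask values toward 0 or 1.
Introduce a crispness loss that penalizes intermediate mask values, encouraging discrete masks.
Impose the budget constraint through a budget loss that can accommodate channel, activation-volume, parameter, or FLOPs budgets.
Train with a joint loss comprising cross-entropy, crispness, and budget terms, performing soft pruning first and then hard pruning via binary masking.
Allow the budget function V to be any one of several options (channels, volume, parameters, FLOPs), and compute its estimate from the logistic-projected version of the mask during budgeting.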
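
The masking steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. The review does not state the formulas, so the functional forms here are assumptions: a logistic proxy 1 / (1 + exp(-beta * psi)), a continuous Heaviside of the form 1 - exp(-gamma * z) + z * exp(-gamma), a crispness loss taken as the squared distance between the two projections, and a channel budget measured as the mean mask value. The function names, the 0.5 threshold, and the default values of beta, gamma, and the loss weights are all hypothetical.

```python
import math

def logistic(psi, beta=2.0):
    """Logistic proxy: maps an unconstrained score psi into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-beta * psi))

def continuous_heaviside(z, gamma=4.0):
    """Smooth step on [0, 1] (assumed form): maps 0 to 0 and 1 to 1,
    and lifts intermediate values toward 1 more sharply as gamma grows."""
    return 1.0 - math.exp(-gamma * z) + z * math.exp(-gamma)

def soft_masks(psis, beta=2.0, gamma=4.0):
    """Return (z, z_tilde): logistic-projected and Heaviside-projected masks."""
    z = [logistic(p, beta) for p in psis]
    z_tilde = [continuous_heaviside(v, gamma) for v in z]
    return z, z_tilde

def crispness_loss(z, z_tilde):
    """Squared distance between the two projections (assumed form).
    It vanishes only when every mask value sits at 0 or 1."""
    return sum((a - b) ** 2 for a, b in zip(z_tilde, z))

def channel_budget_loss(z, target_fraction):
    """Squared gap between the fraction of channels kept and the target.
    Computed on the logistic-projected mask z, as the review describes."""
    used = sum(z) / len(z)
    return (used - target_fraction) ** 2

def joint_loss(cross_entropy, crisp, budget, alpha_crisp=1.0, alpha_budget=1.0):
    """Weighted sum of the three terms; the weights are illustrative."""
    return cross_entropy + alpha_crisp * crisp + alpha_budget * budget

def hard_mask(z_tilde, threshold=0.5):
    """Binarize the soft mask for hard pruning (the threshold is an assumption)."""
    return [1 if v >= threshold else 0 for v in z_tilde]
```

In a real training loop the scores psi would be learnable per-channel parameters updated by gradient descent inside a framework with automatic differentiation; the sketch only shows how the projections and loss terms relate to each other.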

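Experimental Results

Research Questions

RQ1: Can a deterministic, budget-aware pruning method produce highly sparse networks without retraining from scratch?
RQ2: Do the continuous Heaviside projection and the crispness loss yield near-discrete masks that satisfy various budget constraints?
RQ3: Are the learned pruning masks transferable across datasets and task domains?
RQ4: How does ChipNet compare with state-of-the-art budget-aware pruning methods across different budget types?
RQ5: Is ChipNet stable and effective even at very low resource budgets (e.g., extreme pruning)?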

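Key Findings

ChipNet consistently outperforms state-of-the-art pruning baselines across different budget types and datasets.
The method remains stable and effective at very low budgets, including extreme pruning scenarios.
Masks learned with ChipNet transfer across datasets and sometimes outperform masks learned directly on the target data.
Across several experiments, ChipNet achieves substantial accuracy gains over baselines at comparable budgets; the abstract reports margins of up to 16.1%.
Transferring masks from a feature-rich teacher dataset can give better student performance than pruning directly on the student data.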


This review was generated by AI and reviewed by a human editor.