QUICK REVIEW

[Paper Review] Path-Level Network Transformation for Efficient Architecture Search

Han Cai, Jiacheng Yang|arXiv (Cornell University)|2018. 06. 07.

Advanced Neural Network Applications | References: 32 | Citations: 118

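One-Line Summary

The paper introduces path-level, function-preserving network transformations that allow a network's topology to change, and pairs them with a bidirectional tree-structured RL meta-controller to search an expressive tree-structured architecture space, achieving strong results on CIFAR-10 and in the ImageNet mobile setting with limited compute.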

ABSTRACT

We introduce a new function-preserving transformation for efficient neural architecture search. This network transformation allows reusing previously trained networks and existing successful architectures that improves sample efficiency. We aim to address the limitation of current network transformation operations that can only perform layer-level architecture modifications, such as adding (pruning) filters or inserting (removing) a layer, which fails to change the topology of connection paths. Our proposed path-level transformation operations enable the meta-controller to modify the path topology of the given network while keeping the merits of reusing weights, and thus allow efficiently designing effective structures with complex path topologies like Inception models. We further propose a bidirectional tree-structured reinforcement learning meta-controller to explore a simple yet highly expressive tree-structured architecture space that can be viewed as a generalization of multi-branch architectures. We experimented on the image classification datasets with limited computational resources (about 200 GPU-hours), where we observed improved parameter efficiency and better test results (97.70% test accuracy on CIFAR-10 with 14.3M parameters and 74.6% top-1 accuracy on ImageNet in the mobile setting), demonstrating the effectiveness and transferability of our designed architectures.

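Motivation and Objectives

Motivate and enable architecture search that goes beyond layer-level edits by modifying path topology while preserving the network's function.
Propose path-level transformation operations that reuse weights and can explore complex path topologies such as those of Inception models.
Define a tree-structured architecture space and a bidirectional Tree-LSTM-based RL meta-controller to explore it.
Demonstrate sample-efficient search on CIFAR-10 under a limited GPU-hour budget and show transferability to the ImageNet mobile setting.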

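Proposed Method

Define path-level network transformation operations that replace a single layer with a multi-branch motif while preserving the overall function.
Apply Net2Net-style deeper and wider transformations inside the branches to generate diverse path topologies.
Construct a tree-structured architecture space from allocation schemes (replication/split) and merge schemes (add/concatenation).
Map architectures to transformation decisions with a bidirectional tree-structured reinforcement learning meta-controller that combines a bottom-up Tree-LSTM and a top-down Tree-LSTM.
Train the meta-controller with REINFORCE, using a reward derived from validation accuracy and a baseline for variance reduction.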
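
To make the function-preserving idea concrete, the sketch below replaces one layer with a multi-branch motif under the allocation and merge schemes listed above, then checks that the output is unchanged. It is an illustration under stated assumptions, not the authors' implementation: a toy dense layer stands in for a convolution, a per-channel scaling stands in for a channel-separable layer, the add merge is assumed to average the branch outputs (scaling by 1/branches), and all function names (dense, replicate_add, replicate_concat, split_concat) are hypothetical.

```python
import random


def dense(weights, x):
    """Toy layer y = W x, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def replicate_add(weights, x, branches=2):
    """Replication allocation + add merge (assumed to average the branches)."""
    # Every branch holds its own copy of the layer and sees the full input.
    branch_weights = [[row[:] for row in weights] for _ in range(branches)]
    outputs = [dense(w, x) for w in branch_weights]
    # Scaling the sum by 1/branches reproduces the single-layer output.
    return [sum(vals) / branches for vals in zip(*outputs)]


def replicate_concat(weights, x, branches=2):
    """Replication allocation + concatenation merge."""
    # Every branch sees the full input but keeps only a slice of the output units.
    step = -(-len(weights) // branches)  # ceiling division
    merged = []
    for i in range(0, len(weights), step):
        merged.extend(dense(weights[i:i + step], x))
    return merged


def split_concat(scales, x, branches=2):
    """Split allocation + concatenation merge for a channel-wise layer."""
    # Each branch receives a slice of the input channels and the matching weights.
    step = -(-len(x) // branches)  # ceiling division
    merged = []
    for i in range(0, len(x), step):
        merged.extend(s * v for s, v in zip(scales[i:i + step], x[i:i + step]))
    return merged


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(p - q) for p, q in zip(a, b))


random.seed(0)
x = [random.uniform(-1, 1) for _ in range(6)]
W = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(4)]
scales = [random.uniform(-1, 1) for _ in range(6)]

reference = dense(W, x)
channelwise_reference = [s * v for s, v in zip(scales, x)]

# Each printed difference should be zero up to floating-point rounding.
print(max_abs_diff(replicate_add(W, x, branches=3), reference))
print(max_abs_diff(replicate_concat(W, x), reference))
print(max_abs_diff(split_concat(scales, x, branches=3), channelwise_reference))
```

Because every motif starts out computing the same function as the layer it replaces, the transformed network can inherit the trained weights. Changing the layers inside individual branches afterwards, as in the Net2Net-style step above, is what makes the branches diverge.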

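Experimental Results

Research Questions

RQ1: Can path-level function-preserving transformations enable efficient search over networks with richer path topologies than layer-level edits allow?
RQ2: Does the tree-structured RL controller improve search efficiency and the discovered architectures compared with chain/flat encodings?
RQ3: How well do the learned tree-structured cells transfer to larger base networks and to ImageNet in the mobile setting?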

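Key Results

The best cell reaches 3.64% test error on CIFAR-10 with a DenseNet-based backbone (3.2M parameters) and 3.14% with regularization, outperforming several baselines with far fewer parameters.
With regularization (DropPath/Cutout), the best CIFAR-10 cell reaches 2.30% test error at 14.3M parameters and 2.49% at 5.7M parameters.
Embedding TreeCell-A in DenseNet/PyramidNet on CIFAR-10 yields competitive test results with far fewer parameters than hand-designed or prior NAS models.
In the ImageNet mobile setting with CondenseNet, TreeCell-A achieves 25.5% top-1 and 8.0% top-5 error, and TreeCell-B achieves 25.4% top-1 error; this is comparable to or better than NASNet-A at similar FLOPs while using less search compute (about 200 GPU-hours).
The approach demonstrates transferability of the learned tree cells across architectures (DenseNet and PyramidNet) and across datasets under a limited compute budget.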

This review was generated by AI and reviewed by a human editor.