QUICK REVIEW

[Paper Review] Specialized Foundation Models Struggle to Beat Supervised Baselines

Zongzhe Xu, Ritvik Gupta|arXiv (Cornell University)|2024. 11. 05.

Grouting, Rheology, and Soil Mechanics | Citations: 5

One-Line Summary

This paper shows that in genomics, satellite imaging, and time series, well-tuned supervised models often match or outperform specialized foundation models (FMs), despite the FMs' large-scale pretraining data.

ABSTRACT

Following its success for vision and text, the "foundation model" (FM) paradigm -- pretraining large models on massive data, then fine-tuning on target tasks -- has rapidly expanded to domains in the sciences, engineering, healthcare, and beyond. Has this achieved what the original FMs accomplished, i.e. the supplanting of traditional supervised learning in their domains? To answer we look at three modalities -- genomics, satellite imaging, and time series -- with multiple recent FMs and compare them to a standard supervised learning workflow: model development, hyperparameter tuning, and training, all using only data from the target task. Across these three specialized domains, we find that it is consistently possible to train simple supervised models -- no more complicated than a lightly modified wide ResNet or UNet -- that match or even outperform the latest foundation models. Our work demonstrates that the benefits of large-scale pretraining have yet to be realized in many specialized areas, reinforces the need to compare new FMs to strong, well-tuned baselines, and introduces two new, easy-to-use, open-source, and automated workflows for doing so.

Research Motivation and Goals

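Evaluate whether specialized foundation models (FMs) outperform traditional supervised learning on domain-specific tasks.
Compare supervised pipelines restricted to target-task data against FM-based transfer learning workflows.
Develop automated pipelines that train strong supervised models fairly and efficiently across multiple tasks and domains.
Demonstrate the importance of robust, domain-aware baselines and of efficient, scalable AutoML approaches.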

Proposed Method

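Compare the FM workflow (pretraining on large-scale domain data, then fine-tuning) with a supervised workflow that uses only target-task data.
Use DASH to automatically adapt a CNN backbone by tuning kernel sizes and dilation rates (architecture search).
Use ASHA to tune the training configuration of the discovered architecture.
For time series, introduce Auto-AR, a simple workflow that tunes the lookback window, differencing, and AR components on GPU.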
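The Auto-AR item above can be pictured with a minimal sketch. This is not the paper's implementation: it runs in plain Python on the CPU rather than on GPU, and the names `auto_ar`, `fit_ar`, `demo_series`, the candidate grids, the small ridge term, and selection by validation one-step MSE are all assumptions made for illustration. It fits a least-squares AR model for each combination of lookback and differencing order and keeps the best one.

```python
import math
import random

def difference(series, d):
    """Apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_ar(series, p, ridge=1e-8):
    """Least-squares AR(p) with intercept: y_t = w0 + sum_i w_i * y_{t-i}."""
    rows = [[1.0] + [series[t - i] for i in range(1, p + 1)]
            for t in range(p, len(series))]
    targets = [series[t] for t in range(p, len(series))]
    k = p + 1
    # Normal equations (X^T X + ridge * I) w = X^T y; the ridge term is an
    # assumption added here for numerical stability.
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(A, b)

def one_step_mse(series, weights, p, start):
    """Mean squared error of one-step-ahead predictions for t >= start."""
    errs = []
    for t in range(max(start, p), len(series)):
        pred = weights[0] + sum(weights[i] * series[t - i] for i in range(1, p + 1))
        errs.append((series[t] - pred) ** 2)
    return sum(errs) / len(errs)

def auto_ar(series, lookbacks=(1, 2, 4, 8), diffs=(0, 1), val_frac=0.25):
    """Grid-search lookback and differencing order on a held-out tail."""
    split = int(len(series) * (1 - val_frac))
    best = None
    for d in diffs:
        z = difference(series, d)
        start = split - d  # z[start] lines up with series[split]
        for p in lookbacks:
            if start - p <= p + 1:  # too few training rows for p + 1 weights
                continue
            w = fit_ar(z[:start], p)
            mse = one_step_mse(z, w, p, start)
            if best is None or mse < best["mse"]:
                best = {"lookback": p, "d": d, "weights": w, "mse": mse}
    return best

def demo_series(n=200, seed=0):
    """Hypothetical toy data: linear trend + sinusoid + small Gaussian noise."""
    rng = random.Random(seed)
    return [0.5 * t + 5.0 * math.sin(0.3 * t) + rng.gauss(0.0, 0.01) for t in range(n)]

if __name__ == "__main__":
    result = auto_ar(demo_series())
    print(result["lookback"], result["d"], result["mse"])
```

One-step errors are on the same scale for every differencing order, because the differenced target differs from the original target only by already-observed past values, so the validation MSEs are directly comparable across `d`.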

Experimental Results

Research Questions

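RQ1: When evaluated against strong task-only baselines, do specialized FMs outperform traditional supervised learning on genomics, satellite imaging, and time series tasks?
RQ2: Can automated supervised learning pipelines match or exceed FM performance while using far less data and far fewer parameters?
RQ3: What role do architecture tuning (kernel size, dilation) and simple baselines (AR) play in narrowing the FM advantage?
RQ4: How do FM results vary across domains with data scale and model size?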
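As background for RQ3, kernel size and dilation together set how much input context a convolution sees. The sketch below uses the standard receptive-field formula for stride-1 convolutions, 1 + sum of dilation * (kernel_size - 1) over the layers. The function names and the candidate grid are hypothetical and are not taken from DASH or from the paper.

```python
from itertools import product

def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    layers: iterable of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += dilation * (kernel_size - 1)
    return rf

def candidate_ops(kernel_sizes=(3, 5, 7, 9), dilations=(1, 3, 7, 15)):
    """Enumerate a hypothetical (kernel_size, dilation) search grid
    together with the single-layer receptive field of each choice."""
    return [(k, d, receptive_field([(k, d)]))
            for k, d in product(kernel_sizes, dilations)]

if __name__ == "__main__":
    for k, d, rf in candidate_ops():
        print(f"kernel={k} dilation={d} receptive_field={rf}")
```

A search over this kind of grid lets one backbone cover both short-range and long-range patterns without changing its depth, which is one way to see why per-task tuning of these two settings could matter.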

Key Results

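In genomics, DASHA, the NAS-tuned CNN workflow, achieves state-of-the-art performance on the NT benchmark and often outperforms FMs without any pretraining data.
In satellite imaging, DASHA matches or is competitive with top FMs while using far fewer parameters and no pretraining.
In time series, Auto-AR achieves competitive performance across seven tasks, often outperforming several open-source FMs and beating Auto-ARIMA in median improvement.
Overall, simple supervised models (e.g., wide ResNet, UNet, AR) frequently match or outperform specialized FMs across domains.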
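The DASHA results above rely on ASHA for tuning. As a rough illustration of the underlying idea, the sketch below implements plain synchronous successive halving, the simpler scheme that ASHA runs asynchronously. It is not ASHA itself and not the paper's code; `successive_halving`, `toy_loss`, and the learning-rate grid are hypothetical.

```python
import math

def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Synchronous successive halving.

    evaluate(config, budget) returns a loss (lower is better). Each round
    keeps the best 1/eta of the configs and multiplies the budget by eta.
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]
        budget *= eta
    return survivors[0]

def toy_loss(learning_rate, budget):
    """Hypothetical stand-in for a training run: the loss is lowest near
    learning_rate = 1e-2 and shrinks as the budget (e.g. epochs) grows."""
    return (math.log10(learning_rate) + 2.0) ** 2 + 1.0 / budget

if __name__ == "__main__":
    grid = [10.0 ** e for e in range(-5, 4)]  # 9 candidate learning rates
    print(successive_halving(grid, toy_loss))
```

With nine candidates and eta=3, the loop evaluates all nine at budget 1, the best three at budget 3, and returns the winner, so most of the compute goes to the promising configurations.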

This review was generated by AI and reviewed by a human editor.