QUICK REVIEW

[Paper Review] A Survey of Resource-efficient LLM and Multimodal Foundation Models

Mengwei Xu, Wangsong Yin|arXiv (Cornell University)|Jan 16, 2024

Topic Modeling | Citations: 32

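One-line summary

A comprehensive survey of algorithmic and system-level approaches to making large language models, vision transformers, diffusion models, and multimodal foundation models resource-efficient across training, inference, and deployment, from cloud to edge.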

ABSTRACT

Large foundation models, including large language models (LLMs), vision transformers (ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine learning lifecycle, from training to deployment. However, the substantial advancements in versatility and performance these models offer come at a significant cost in terms of hardware resources. To support the growth of these large models in a scalable and environmentally sustainable way, there has been a considerable focus on developing resource-efficient strategies. This survey delves into the critical importance of such research, examining both algorithmic and systemic aspects. It offers a comprehensive analysis and valuable insights gleaned from existing literature, encompassing a broad array of topics from cutting-edge model architectures and training/serving algorithms to practical system designs and implementations. The goal of this survey is to provide an overarching understanding of how current approaches are tackling the resource challenges posed by large foundation models and to potentially inspire future breakthroughs in this field.

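Research motivation and objectives

Assess the resource challenges posed by large foundation models and the need for greater efficiency.
Survey algorithmic and system-level approaches that reduce compute, memory, energy, and bandwidth without sacrificing performance.
Categorize advances in model architectures, training and inference methods, data management, and deployment systems.
Bridge findings across language, vision, and multimodal foundation models to guide future research and practical implementation.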

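Proposed method

Organize the architectures and representative models of language, vision, and multimodal foundation models.
Analyze cost factors and efficiency challenges, including the impact of attention, FFN layers, and the KV cache.
Summarize resource-efficient architectures (e.g., efficient attention variants, Mixture of Experts, latent-space diffusion optimization) along with data and training techniques.
Outline resource-efficient algorithms for pre-training, fine-tuning, and inference (e.g., data reduction, mixed precision, progressive learning, pruning, quantization).
Describe the resource-efficient systems side, from distributed training to edge deployment and serving.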
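
As a rough illustration of why the KV cache counts as a cost factor, the sketch below estimates its size for a decoder-only transformer. It uses the standard accounting of one key and one value vector per token, per layer, per KV head. The function name `kv_cache_bytes` and the example configuration are hypothetical and are not taken from the survey.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Estimate KV-cache size in bytes for a decoder-only transformer.

    Assumes every layer caches one key and one value vector per KV head
    for every token, stored at bytes_per_elem bytes (2 for fp16/bf16).
    """
    # The leading factor 2 accounts for keys plus values.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len * batch_size


# Hypothetical configuration, chosen only for illustration.
size = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)
print(f"{size / 2**30:.1f} GiB")  # prints 2.0 GiB
```

The estimate grows linearly in sequence length and batch size, which is why reducing the number of KV heads or the bytes per element shrinks serving memory directly.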

Figure 1: The electricity consumption comparison between countries and AI. Data source: [77].

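Experimental results

Research questions

RQ1: What are the main resource bottlenecks in current language, vision, and multimodal foundation models?
RQ2: Which architectural and system-level strategies make training and deployment more efficient?
RQ3: How do design choices in pre-training, fine-tuning, and inference affect resource usage across modalities?
RQ4: What are practical guidelines for deploying resource-efficient foundation models from cloud to edge?

Key findings

Foundation models achieve high versatility but require substantial hardware and energy costs for training and serving.
Diverse efficiency techniques target attention mechanisms, data throughput, and model architecture (e.g., sparse/approximate attention, Mixture of Experts, latent-space diffusion).
Resource-aware training and inference techniques (mixed precision, data reduction, progressive learning, efficient fine-tuning) can reduce compute and memory without uniformly sacrificing performance.
System-level design choices (distributed training, federated learning, cloud versus edge deployment) have a decisive impact on practicality and energy use.
The survey integrates architecture, algorithms, and system design to guide future research toward scalable and sustainable foundation models.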
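
To make the Mixture of Experts idea mentioned in the findings more concrete, here is a minimal sketch of top-k expert routing: a gate scores every expert, but only the k best are run for a given token, so compute scales with k rather than with the total number of experts. The function name `top_k_route` is hypothetical, and renormalizing the selected gate weights is one common convention assumed here, not a detail taken from the survey.

```python
import math


def top_k_route(gate_logits, k=2):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    # Numerically stable softmax over the gate logits.
    m = max(gate_logits)
    exps = [math.exp(x - m) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep only the k most probable experts.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]

    # Renormalize so the kept weights sum to 1 (an assumed convention).
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


# Four hypothetical experts; only two would be evaluated for this token.
routes = top_k_route([2.0, 0.5, 1.0, -1.0], k=2)
print(sorted(routes))  # prints [0, 2]
```

A token's output would then be the weighted sum of the selected experts' outputs, with the remaining experts skipped entirely.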

Figure 3: The evolutionary trace of foundation models.

This review was generated by AI and reviewed by a human editor.