QUICK REVIEW

[Paper Review] AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System

Zhi‐Wei Liu, Weiran Yao|arXiv (Cornell University)|Feb 23, 2024

Multi-Agent Systems and Negotiation | Citations: 10

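One-line summary

AgentLite is a lightweight, open-source framework for prototyping and evaluating task-oriented LLM agents and multi-agent systems. It makes prompts, memory, actions, and architectures easy to customize, and the paper demonstrates its flexibility and performance through benchmarks and a range of applications.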

ABSTRACT

The booming success of LLMs initiates rapid development in LLM agents. Though the foundation of an LLM agent is the generative model, it is critical to devise the optimal reasoning strategies and agent architectures. Accordingly, LLM agent research advances from the simple chain-of-thought prompting to more complex ReAct and Reflection reasoning strategy; agent architecture also evolves from single agent generation to multi-agent conversation, as well as multi-LLM multi-agent group chat. However, with the existing intricate frameworks and libraries, creating and evaluating new reasoning strategies and agent architectures has become a complex challenge, which hinders research investigation into LLM agents. Thus, we open-source a new AI agent library, AgentLite, which simplifies this process by offering a lightweight, user-friendly platform for innovating LLM agent reasoning, architectures, and applications with ease. AgentLite is a task-oriented framework designed to enhance the ability of agents to break down tasks and facilitate the development of multi-agent systems. Furthermore, we introduce multiple practical applications developed with AgentLite to demonstrate its convenience and flexibility. Get started now at: https://github.com/SalesforceAIResearch/AgentLite

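Motivation and objectives

Motivate the need for a lightweight, research-friendly library for prototyping LLM agent reasoning strategies and architectures.
Provide a simple, task-oriented framework that facilitates multi-agent orchestration and experimentation.
Demonstrate practical applicability through benchmarks and diverse applications.
Show that AgentLite supports easy integration and evaluation across different LLM backbones and scenarios.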

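Proposed method

Introduce an Individual Agent built from four modules (PromptGen, Actions, LLM, Memory) and a Manager Agent for hierarchical multi-agent orchestration.
Define the TaskPackage (TP) as the unit of communication between the Manager and its team agents, and describe its properties.
Explain how to extend the Action module to add new reasoning types (Think, ReAct-style steps, and so on), with a code sketch of the Think action.
Explain how to implement new agent architectures such as Copilot Agent, Copilot Multi-Agent, and Multi-LLM Multi-Agent by configuring actions, team composition, and LLM backends.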
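
The design described above can be illustrated with a small, self-contained sketch. It is not the AgentLite API: every class, field, and function name below (TaskPackage fields, IndividualAgent, ManagerAgent, scripted_llm, the "ActionName: argument" reply format) is a hypothetical stand-in chosen for illustration, and the LLM is replaced by a scripted callable so the example runs offline. It shows a task package passed from a manager to a team agent, and reasoning ("Think") modeled as just another action next to "Finish".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# NOTE: all names here are hypothetical; they do not mirror the AgentLite API.
LLM = Callable[[str], str]  # prompt in, "ActionName: argument" out


@dataclass
class TaskPackage:
    """Unit of communication between a manager and its team agents."""
    instruction: str
    creator: str
    executor: str = ""
    completion: str = "active"  # "active" until an agent completes or fails it
    answer: str = ""


class Action:
    """Base class: every reasoning step or tool call is an action."""
    name = "Action"

    def __call__(self, response: str = "") -> str:
        raise NotImplementedError


class ThinkAction(Action):
    """Reasoning as an action: the thought is logged, the observation is 'OK'."""
    name = "Think"

    def __call__(self, response: str = "") -> str:
        return "OK"


class FinishAction(Action):
    """Ends the task and returns the argument as the final answer."""
    name = "Finish"

    def __call__(self, response: str = "") -> str:
        return response


class IndividualAgent:
    """Four parts: prompt generation, actions, an LLM backend, and memory."""

    def __init__(self, name: str, role: str, llm: LLM,
                 actions: List[Action], max_steps: int = 5):
        self.name, self.role, self.llm = name, role, llm
        self.actions: Dict[str, Action] = {a.name: a for a in actions}
        self.memory: List[str] = []
        self.max_steps = max_steps

    def build_prompt(self, task: TaskPackage) -> str:
        history = "\n".join(self.memory)
        return (f"Role: {self.role}\nTask: {task.instruction}\n"
                f"History:\n{history}\nNext action:")

    def __call__(self, task: TaskPackage) -> TaskPackage:
        task.executor = self.name
        for _ in range(self.max_steps):
            reply = self.llm(self.build_prompt(task))
            action_name, _, argument = reply.partition(":")
            action = self.actions.get(action_name.strip())
            if action is None:
                self.memory.append(f"Invalid action: {reply}")
                continue
            observation = action(response=argument.strip())
            self.memory.append(f"{action.name}[{argument.strip()}] -> {observation}")
            if action.name == "Finish":
                task.answer, task.completion = observation, "completed"
                return task
        task.completion = "failed"  # step budget exhausted without Finish
        return task


class ManagerAgent:
    """Picks a team agent with its own LLM and forwards a new task package."""

    def __init__(self, name: str, llm: LLM, team: List[IndividualAgent]):
        self.name, self.llm = name, llm
        self.team = {agent.name: agent for agent in team}

    def __call__(self, task: TaskPackage) -> TaskPackage:
        prompt = f"Task: {task.instruction}\nAgents: {', '.join(self.team)}\nPick one:"
        agent = self.team[self.llm(prompt).strip()]
        sub_task = agent(TaskPackage(instruction=task.instruction, creator=self.name))
        task.executor = self.name
        task.answer, task.completion = sub_task.answer, sub_task.completion
        return task


def scripted_llm(replies: List[str]) -> LLM:
    """Stand-in for a real model: returns the given replies in order."""
    iterator = iter(replies)
    return lambda prompt: next(iterator)


searcher = IndividualAgent(
    "searcher", "answers factual questions",
    scripted_llm(["Think: I should recall the capital.", "Finish: Paris"]),
    [ThinkAction(), FinishAction()],
)
manager = ManagerAgent("manager", scripted_llm(["searcher"]), [searcher])
result = manager(TaskPackage(instruction="What is the capital of France?", creator="user"))
print(result.completion, result.answer)
```

Because each agent receives its own LLM callable, giving the manager and the team agents different backends is a one-line change in this sketch, which is the idea behind a multi-LLM multi-agent setup.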

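Experimental results

Research questions

RQ1: How can a lightweight framework accelerate the development and evaluation of new LLM agent reasoning strategies and architectures?
RQ2: Can a task-oriented, hierarchical multi-agent design improve the modularity of LLM agents and the flexibility of experiments?
RQ3: How does AgentLite perform with different LLM backbones on established agent benchmarks such as HotPotQA and Webshop?
RQ4: What range of applications can be built easily to demonstrate AgentLite's versatility across domains?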

Key findings

LLM	Easy F1	Easy Accuracy	Medium F1	Medium Accuracy	Hard F1	Hard Accuracy
GPT-3.5-Turbo-16k-0613	0.410	0.35	0.330	0.25	0.283	0.20
GPT-4-0613	0.611	0.47	0.610	0.48	0.527	0.38
GPT-4-32k-0613	0.625	0.46	0.644	0.54	0.520	0.37
xLAM-v0.1	0.532	0.45	0.547	0.46	0.455	0.36

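AgentLite enables multi-agent orchestration through a hierarchical manager-agent structure.
The framework supports adding new reasoning types by extending actions, so that reasoning and tool use are handled uniformly (for example, Think as an action).
AgentLite shows competitive benchmark performance and supports multiple LLM backbones, including GPT-4 variants and xLAM-v0.1.
In the HotPotQA experiments, the GPT-4 variants outperform GPT-3.5, and GPT-4-32k-0613 achieves the highest F1 and accuracy at the medium level. xLAM-v0.1 also outperforms GPT-3.5 in this setting.
On Webshop, GPT-4-32k achieves a higher average reward, which suggests a benefit from the longer context. xLAM-v0.1 remains competitive with GPT-3.5 in this environment as well.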
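
The review does not state how the F1 and accuracy columns in the results table are computed. As a reading aid, the sketch below assumes the SQuAD-style convention commonly used for question answering: token-overlap F1 and exact match after lowercasing and stripping punctuation and articles. This is an assumption, not a confirmed description of the paper's evaluation code, and the function names are illustrative.

```python
import re
import string
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, drop punctuation and English articles, split into tokens."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall between two answers."""
    pred_tokens, gold_tokens = normalize(prediction), normalize(gold)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# A verbose but correct answer gets partial F1 credit and no exact-match credit.
partial_f1 = f1_score("the Eiffel Tower in Paris", "Eiffel Tower")
partial_em = exact_match("the Eiffel Tower in Paris", "Eiffel Tower")
print(round(partial_f1, 3), partial_em)
```

Under this convention an F1 score can sit well above accuracy for the same model, since partially correct answers earn F1 credit but count as wrong for exact match.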


This review was generated by AI and checked by a human editor.