QUICK REVIEW

[Paper Review] Reusable MLOps: Reusable Deployment, Reusable Infrastructure and Hot-Swappable Machine Learning models and services

Deven Panchal, Priyanka Verma|arXiv (Cornell University)|2024. 02. 19.

Distributed and Parallel Computing Systems | Citations: 5

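One-Line Summary

The paper presents Reusable MLOps on the Acumos AI platform: through reusable deployment, reusable infrastructure, and hot-swappable ML models, new models are continuously put into production without tearing down the existing infrastructure.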

ABSTRACT

Although Machine Learning model building has become increasingly accessible due to a plethora of tools, libraries and algorithms being available freely, easy operationalization of these models is still a problem. It requires considerable expertise in data engineering, software development, cloud and DevOps. It also requires planning, agreement, and vision of how the model is going to be used by the business applications once it is in production, how it is going to be continuously trained on fresh incoming data, and how and when a newer model would replace an existing model. This leads to developers and data scientists working in silos and making suboptimal decisions. It also leads to wasted time and effort. We introduce the Acumos AI platform we developed and we demonstrate some unique novel capabilities that the Acumos model runner possesses, that can help solve the above problems. We introduce a new sustainable concept in the field of AI/ML operations - called Reusable MLOps - where we reuse the existing deployment and infrastructure to serve new models by hot-swapping them without tearing down the infrastructure or the microservice, thus achieving reusable deployment and operations for AI/ML models while still having continuously trained models in production.

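Research Motivation and Goals

Motivates the work with the operationalization challenges of ML pipelines and the silos between data scientists and developers.
Introduces the Reusable MLOps concept, in which deployment and infrastructure are reused across different models.
Demonstrates how Acumos enables hot-swapping and continuous retraining of production models without downtime.
Highlights the governance, licensing, and marketplace aspects that promote model sharing and reuse.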

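Proposed Method

Describes the Acumos platform and its components for onboarding and deployment (Model Runner, Java client, Design Studio).
Explains how models are exported (MOJO zip, jar) and wrapped into a common Acumos microservice via the Model Runner.
Shows how Protobuf serialization enables low-latency, language-neutral data exchange and dynamic class loading in the JVM.
Details how the API surface and endpoints of the Acumos Model Runner allow a model to be replaced, its proto to be updated, and its behavior to be changed at runtime.
Illustrates deployment to Kubernetes, AWS, Azure, GCP, or standalone Docker, and the paths for reusing models across services.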
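
The hot-swap idea described above can be sketched in a few lines. This is only an illustrative in-process analogue in Python, not the Acumos Model Runner or its API: the class `HotSwappableService`, its methods `predict` and `swap_model`, and the toy models `model_v1` and `model_v2` are all hypothetical names introduced here. The sketch assumes a model is just a callable and that a swap only needs to replace a reference while the serving object stays alive.

```python
import threading
from typing import Callable, Dict, Sequence

# Assumption: a "model" is any callable mapping a feature vector to a score.
Model = Callable[[Sequence[float]], float]


class HotSwappableService:
    """Hypothetical stand-in for a long-lived model-serving microservice."""

    def __init__(self, model: Model, version: str) -> None:
        self._lock = threading.Lock()
        self._model = model
        self._version = version

    def predict(self, features: Sequence[float]) -> Dict[str, object]:
        # Snapshot the current model under the lock, then run it outside the
        # lock so a slow prediction never blocks a swap.
        with self._lock:
            model, version = self._model, self._version
        return {"version": version, "prediction": model(features)}

    def swap_model(self, new_model: Model, new_version: str) -> str:
        # Replace the model reference in place; the service object (the
        # "deployment") is never torn down. Returns the replaced version.
        with self._lock:
            old_version = self._version
            self._model = new_model
            self._version = new_version
        return old_version


def model_v1(features: Sequence[float]) -> float:
    return sum(features)


def model_v2(features: Sequence[float]) -> float:
    return sum(features) / len(features)


service = HotSwappableService(model_v1, "v1")
before = service.predict([1.0, 2.0, 3.0])
replaced = service.swap_model(model_v2, "v2")
after = service.predict([1.0, 2.0, 3.0])
```

The same `service` object answers both requests; only the model behind it changes. A real runner would additionally have to handle schema changes and artifact loading, which this sketch leaves out.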

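Experimental Results

Research Questions

RQ1: How can model deployment and infrastructure be reused across multiple ML models without tearing down the service?
RQ2: What capabilities does the Acumos Model Runner provide for hot-swapping models or changing model behavior during operation?
RQ3: How does the ML pipeline support continuous training and seamless model replacement in production?
RQ4: What governance, licensing, and federation mechanisms support the sharing and monetization of ML models in a Reusable MLOps framework?

Key Results

The Acumos Model Runner enables hot-swapping of models inside an existing microservice without downtime.
Models are wrapped into a common Acumos microservice and can be updated or replaced through a rich API surface.
Protobuf-based serialization supports low-latency, cross-language data exchange and dynamic class loading for model artifacts.
The platform supports deploying models to common cloud and container runtimes, which promotes reusable deployment and infrastructure reuse.
Design Studio and the Java client streamline onboarding of models from H2O, Java, or Spark into production-ready microservices.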
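
The dynamic class loading mentioned in the results is a JVM mechanism. As a rough analogue only, the sketch below uses Python's `importlib` to load model code at runtime from an artifact that did not exist when the process started. It is not how Acumos loads MOJO or jar artifacts. The helper `load_predict` and the source strings `ARTIFACT_V1` and `ARTIFACT_V2` are hypothetical, and the sketch assumes each artifact exposes a top-level `predict` function.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical model artifacts, represented here as Python source text.
ARTIFACT_V1 = "def predict(x):\n    return 2 * x\n"
ARTIFACT_V2 = "def predict(x):\n    return 2 * x + 1\n"


def load_predict(source: str, module_name: str) -> Callable[[float], float]:
    """Write the artifact to disk, import it at runtime, return its predict()."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / f"{module_name}.py"
        path.write_text(source)
        spec = importlib.util.spec_from_file_location(module_name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # executes the artifact's code
    # The loaded function stays usable after the temporary file is removed.
    return module.predict


predict = load_predict(ARTIFACT_V1, "artifact_v1")
result_v1 = predict(10)

# Load a newer artifact into the same running process and rebind the name.
predict = load_predict(ARTIFACT_V2, "artifact_v2")
result_v2 = predict(10)
```

Loading each artifact under its own module name keeps the old and new code separate, so callers that still hold the old function are unaffected until they pick up the new reference.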

Figure 2: A Machine Learning model being onboarded to Acumos


This review was generated by AI and reviewed by a human editor.