QUICK REVIEW

[Paper Review] Verifiable evaluations of machine learning models using zkSNARKs

Tobin South, Alexander Camuto | arXiv (Cornell University) | 2024-02-05
Neural Networks and Applications | Citations: 6
One-line summary

The paper presents a zkSNARK-based framework for generating verifiable evaluations of ML models with private weights, enabling proofs of performance and fairness on public inputs.

ABSTRACT

In a world of increasing closed-source commercial machine learning models, model evaluations from developers must be taken at face value. These benchmark results-whether over task accuracy, bias evaluations, or safety checks-are traditionally impossible to verify by a model end-user without the costly or impossible process of re-performing the benchmark on black-box model outputs. This work presents a method of verifiable model evaluation using model inference through zkSNARKs. The resulting zero-knowledge computational proofs of model outputs over datasets can be packaged into verifiable evaluation attestations showing that models with fixed private weights achieve stated performance or fairness metrics over public inputs. We present a flexible proving system that enables verifiable attestations to be performed on any standard neural network model with varying compute requirements. For the first time, we demonstrate this across a sample of real-world models and highlight key challenges and design solutions. This presents a new transparency paradigm in the verifiable evaluation of private models.

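Motivation and goals

  • Argues that performance claims need transparent verification even when the model itself is closed.
  • Proposes a general framework for generating verifiable evaluation attestations for neural networks and other models.
  • Lets end users verify model outputs and claimed metrics without re-running the benchmark against the private weights.
  • Enables challenge-based verification to check that a deployed model matches its performance claims.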

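Proposed method

  • Convert the trained model to ONNX and derive an inference circuit.
  • Set up the zkSNARK proving and verification keys from the model so that inferences can be proven.
  • For each data point (x, y), generate a witness file and a zkSNARK proof attesting to the output and the metric.
  • Aggregate the individual proofs, each bound to the model weights through the hash H(W), into a verifiable evaluation attestation.
  • Allow after-the-fact, challenge-based verification that a given input-output pair could have been produced by the model with the claimed weight hash.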
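The aggregation step can be pictured with a minimal Python sketch. It is not the ezkl API and not the authors' code: `weight_hash`, `InferenceRecord`, and `build_attestation` are hypothetical names, SHA-256 stands in for whatever commitment H(W) the real system uses, and the `proof` field is an opaque placeholder rather than a real zkSNARK proof.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def weight_hash(weights: list[float]) -> str:
    """Stand-in for H(W): SHA-256 over a canonical JSON encoding (assumption)."""
    encoded = json.dumps(weights, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass
class InferenceRecord:
    x: list[float]    # public input
    y_true: int       # public label
    y_pred: int       # output claimed by the prover
    proof: str        # opaque proof bytes as hex; a placeholder in this sketch
    weight_hash: str  # H(W) that the proof is bound to


def build_attestation(records: list[InferenceRecord], claimed_hash: str) -> dict:
    """Bundle per-example records into one attestation tied to a single H(W)."""
    if not records:
        raise ValueError("an attestation needs at least one record")
    for record in records:
        # Every proof must refer to the same committed weights.
        if record.weight_hash != claimed_hash:
            raise ValueError("record is bound to a different weight hash")
    correct = sum(r.y_pred == r.y_true for r in records)
    return {
        "weight_hash": claimed_hash,
        "n": len(records),
        "accuracy": correct / len(records),
        "records": [asdict(r) for r in records],
    }
```

In a real deployment each proof would also be checked by a zkSNARK verifier before the metric is trusted; the sketch only shows how the records and the weight hash fit together.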
Figure 1: System diagram of verifiable ML evaluation using the zkSNARK ezkl toolkit. A model can be compiled into a proving key ( $pk$ ) and verification key ( $vk$ ) which can be used to generate repeated inference proofs over a dataset ( $\pi$ ), which can then be aggregated into a verifiable eval

Experimental results

Research questions

  • RQ1: How can verifiable attestations be constructed for diverse model architectures via zkSNARKs?
  • RQ2: What are the computational costs and scalability limits when proving inferences across datasets of varying size?
  • RQ3: How can verifiable evaluations cover performance and fairness (bias) checks across different model types?

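Key results

| Model | Parameters / FLOPs / constraints / proving time (s) / verification time (s), run together in the source | Proof size | PK size | VK size |
|---|---|---|---|---|
| Linear Regression | 30620.10.01 | 13K | 715K | 1.7K |
| SVM | 306260.30.02 | 23K | 16M | 2.5K |
| Random Forest | 8036272.90.02 | 26K | 276M | 2.7K |
| MLP | 3610350019200.30.02 | 21K | 14M | 2.3K |
| Small CNN | 1987068600358443.10.03 | 15K | 390M | 1.8K |
| VAE (decoder) | 10657471258291220166321420.42 | 1.9M | 16G | 2.5K |
| LSTM | 29184950272495712350.10 | 41K | 4.1G | 2.5K |
| nanoGPT | 25062451396608939893627812.69 | 0.7M | 219G | 4.2K |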
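To compare how the key sizes spread across models, the PK and VK columns can be put on a common scale. The sketch below assumes the K/M/G suffixes are decimal multipliers of one shared unit (the source does not state the unit); `parse_size` and `spread` are hypothetical helper names.

```python
# Decimal multipliers are an assumption; the source gives only the suffixes.
UNIT = {"K": 1e3, "M": 1e6, "G": 1e9}

# PK and VK sizes as listed per model in the results table.
PK = {"Linear Regression": "715K", "SVM": "16M", "Random Forest": "276M",
      "MLP": "14M", "Small CNN": "390M", "VAE (decoder)": "16G",
      "LSTM": "4.1G", "nanoGPT": "219G"}
VK = {"Linear Regression": "1.7K", "SVM": "2.5K", "Random Forest": "2.7K",
      "MLP": "2.3K", "Small CNN": "1.8K", "VAE (decoder)": "2.5K",
      "LSTM": "2.5K", "nanoGPT": "4.2K"}


def parse_size(text: str) -> float:
    """Turn a string such as '715K' into a number in the shared base unit."""
    return float(text[:-1]) * UNIT[text[-1]]


def spread(sizes: dict[str, str]) -> float:
    """Ratio between the largest and smallest size in a column."""
    values = [parse_size(s) for s in sizes.values()]
    return max(values) / min(values)


print(f"PK spread: {spread(PK):,.0f}x")
print(f"VK spread: {spread(VK):.2f}x")
```

Under these assumptions the proving key spans roughly five orders of magnitude between the smallest and largest model, while the verification key varies by a factor of about 2.5.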
  • Verifiable attestations can be produced for a range of models from MLPs and CNNs to small transformers using the ezkl toolkit.
  • Proof size and verification time stay small as models grow, while the proving key grows with model complexity (from 715K for linear regression to 219G for nanoGPT).
  • Increasing model size raises proving time and resource requirements, but proofs and verification keys stay compact (verification keys range from 1.7K to 4.2K across the listed models).
  • A practical end-to-end workflow is demonstrated across multiple model types, including bias/fairness checks.
  • Challenging a model to prove inference consistency with a weight hash enables trustless verification without revealing private weights.
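A challenge check of this kind could be sketched as follows. The function and field names are hypothetical, and `verify_proof` is a caller-supplied stand-in for a real zkSNARK verifier; nothing here is taken from the ezkl API.

```python
from typing import Callable

# Signature assumed for illustration: (proof, x, y, weight_hash) -> bool
Verifier = Callable[[str, list, int, str], bool]


def check_challenge(attested_hash: str, response: dict, verify_proof: Verifier) -> bool:
    """Accept a challenge response only if it is tied to the attested weights.

    `response` is expected to hold the keys "x", "y", "proof", and "weight_hash".
    """
    # Step 1: the response must commit to the same weights as the attestation.
    if response["weight_hash"] != attested_hash:
        return False
    # Step 2: the proof must show that those weights map x to y.
    return bool(
        verify_proof(
            response["proof"], response["x"], response["y"], response["weight_hash"]
        )
    )
```

The verifier never sees the weights in this flow: it compares hashes and delegates the rest to the proof check.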
Figure 2: Time and storage requirements for model inference proofs for increasing model sizes across multi-layered perceptrons (MLP), convolutional neural networks (CNN), and attention-based transformers (Attn). Model requirements scale linearly with the number of constraints in the proof, which is

This review was generated by AI and reviewed by a human editor.