QUICK REVIEW

[Paper Review] Scaling Quantum Machine Learning without Tricks: High-Resolution and Diverse Image Generation

Jonas Jäger, Florian J. Kiwit|arXiv (Cornell University)|2026. 02. 27.

Quantum Computing Algorithms and Architecture | Citations: 0

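One-Line Summary

This paper trains a single end-to-end quantum Wasserstein GAN to generate full-resolution, diverse MNIST, Fashion-MNIST, and SVHN images without dimensionality reduction or patching. It relies on task-specific quantum circuit design and multimodal noise, and reports promising results even under shot noise.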

ABSTRACT

Quantum generative modeling is a rapidly evolving discipline at the intersection of quantum computing and machine learning. Contemporary quantum machine learning is generally limited to toy examples or heavily restricted datasets with few elements. This is not only due to the current limitations of available quantum hardware but also due to the absence of inductive biases arising from application-agnostic designs. Current quantum solutions must resort to tricks to scale down high-resolution images, such as relying heavily on dimensionality reduction or utilizing multiple quantum models for low-resolution image patches. Building on recent developments in classical image loading to quantum computers, we circumvent these limitations and train quantum Wasserstein GANs on the established classical MNIST and Fashion-MNIST datasets. Using the complete datasets, our system generates full-resolution images across all ten classes and establishes a new state-of-the-art performance with a single end-to-end quantum generator without tricks. As a proof-of-principle, we also demonstrate that our approach can be extended to color images, exemplified on the Street View House Numbers dataset. We analyze how the choice of variational circuit architecture introduces inductive biases, which crucially unlock this performance. Furthermore, enhanced noise input techniques enable highly diverse image generation while maintaining quality. Finally, we show promising results even under quantum shot noise conditions.

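Research Motivation and Goals

Demonstrate full-resolution, end-to-end quantum image generation on standard benchmarks without tricks such as patching or dimensionality reduction.
Show that application-specific quantum circuit design (inductive bias) enables scalable, diverse, and high-quality image generation.
Investigate how multimodal noise input and shot noise affect performance and diversity.
Provide experimental evidence that circuit architectures aligned with the task outperform generic, task-agnostic designs.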

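Proposed Method

Uses a quantum GAN, consisting of a quantum generator and a classical discriminator, within the Wasserstein GAN framework (WGAN-GP).
Encodes images with an FRQI-related representation so that full-size images can be generated without dimensionality reduction.
Introduces trainable multimodal noise inputs to produce diverse samples and avoid mode collapse.
Designs a task-specific quantum circuit ansatz tailored to the FRQI encoding, comprising layered noise uploading, entanglement of the address qubits, and color-qubit rotations.
Decodes the quantum state into an image and trains the classical discriminator to supply gradient signals through the Wasserstein loss.
Evaluates quality (FID) and diversity on MNIST, Fashion-MNIST, and SVHN (color), including shot-noise considerations.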
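
For reference, the WGAN-GP framework named above is usually written with the objectives below. This is the standard textbook formulation, not one quoted from the paper. The symbols $D$ (classical discriminator), $G$ (quantum generator followed by image decoding), $\mathbb{P}_r$ (data distribution), $\mathbb{P}_z$ (noise distribution), $\lambda$ (penalty weight), and $\alpha$ (interpolation coefficient) are notation assumed here.

$$
\begin{aligned}
\mathcal{L}_D &= \mathbb{E}_{z\sim\mathbb{P}_z}\big[D(G(z))\big] - \mathbb{E}_{x\sim\mathbb{P}_r}\big[D(x)\big] + \lambda\,\mathbb{E}_{\hat{x}}\Big[\big(\lVert\nabla_{\hat{x}} D(\hat{x})\rVert_2 - 1\big)^2\Big],\\
\mathcal{L}_G &= -\,\mathbb{E}_{z\sim\mathbb{P}_z}\big[D(G(z))\big],\\
\hat{x} &= \alpha\,x + (1-\alpha)\,G(z),\qquad \alpha\sim\mathcal{U}[0,1].
\end{aligned}
$$

The gradient penalty pushes the discriminator toward being 1-Lipschitz, which is what lets the difference of expectations approximate the Wasserstein-1 distance between real and generated images.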

Figure 1: Overview of the proposed QGAN generator and training workflow for a $4\times 4$-pixel grayscale image. (1) Noise Sampling: a multimodal latent distribution is formed by uniformly sampling a discrete mode index $m\in\{1,2\}$ and drawing Gaussian noise $\varepsilon_{a}\sim\mathcal{N}(0,1)$
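
To make the encoding step concrete for a $4\times 4$ grayscale image, here is a minimal NumPy sketch of textbook FRQI encoding and decoding. The review only says the paper uses an "FRQI-related" representation, so this is the standard FRQI form and may differ from the paper's variant. The function names `frqi_encode` and `frqi_decode` and the bit ordering (color qubit as least significant bit) are assumptions made for illustration.

```python
import numpy as np

def frqi_encode(image):
    """Map a grayscale image with values in [0, 1] to an FRQI-style state vector.

    Assumed layout: index = 2 * pixel_index + color_bit, so the color qubit
    is the least significant bit and the address qubits select the pixel.
    """
    pixels = np.asarray(image, dtype=float).ravel()
    n_pixels = pixels.size
    if n_pixels == 0 or n_pixels & (n_pixels - 1):
        raise ValueError("number of pixels must be a positive power of two")
    theta = 0.5 * np.pi * pixels        # pixel value -> angle in [0, pi/2]
    state = np.empty(2 * n_pixels)
    state[0::2] = np.cos(theta)         # amplitude on color |0>
    state[1::2] = np.sin(theta)         # amplitude on color |1>
    return state / np.sqrt(n_pixels)    # uniform weight over pixel addresses

def frqi_decode(state, shape):
    """Recover pixel values from the ratio of color amplitudes per address."""
    amps = np.abs(np.asarray(state)).reshape(-1, 2)
    theta = np.arctan2(amps[:, 1], amps[:, 0])
    return (2.0 / np.pi * theta).reshape(shape)

image = np.linspace(0.0, 1.0, 16).reshape(4, 4)
state = frqi_encode(image)              # 32 amplitudes, i.e. 5 qubits
recovered = frqi_decode(state, image.shape)
```

Because decoding uses only the ratio of the two color amplitudes at each address, it does not require the address distribution to be exactly uniform, only nonzero at every pixel.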

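Experimental Results

Research Questions

RQ1: Can full-resolution, end-to-end quantum image generation be demonstrated on standard benchmarks without tricks such as patching or dimensionality reduction?
RQ2: Do task-specific quantum circuit design and FRQI-like encoding provide the inductive bias needed for scalable quantum image generation?
RQ3: How do multimodal noise input and shot-noise conditions affect the image quality and diversity of quantum generative models?
RQ4: How much does the choice of architecture affect performance compared with earlier patch-based or task-agnostic QGAN approaches?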

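Key Results

A large QGAN (64 layers, 40 noise modes) generates all ten MNIST and Fashion-MNIST classes with high visual quality and rich intra-class diversity (FID: MNIST 118, Fashion-MNIST 91, SVHN 84).
The task-specific generator design and FRQI encoding substantially outperform task-agnostic and amplitude-based configurations, producing images with sharper edges and better-balanced saturation.
Adding trainable tuning to the multimodal noise increases intra-class variation, reduces mode mixing, and outperforms single-mode and fixed multimodal settings (the FID improvement appears in the attribute breakdown).
Using multiple noise modes per class (overmoding) improves intra-class diversity and can reveal finer subclasses (for example, boots and dresses show distinct modes).
Training under finite shot noise preserves pixel information and spreads the marginal probabilities over the address qubits more robustly and uniformly, which helps hardware scalability.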
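
The effect of finite shots on a decoded image can be illustrated with a small simulation. This is a generic sketch of sampling measurement outcomes from an FRQI-style state and estimating pixel values from counts; it is not the paper's estimator or training procedure. The function name `image_from_shots`, the amplitude layout (color bit as least significant bit), and the shot counts are assumptions.

```python
import numpy as np

def image_from_shots(state, shape, shots, rng):
    """Estimate pixel values from a finite number of simulated measurements.

    Assumed layout: index = 2 * pixel_index + color_bit.
    """
    probs = np.abs(np.asarray(state)) ** 2
    probs = probs / probs.sum()                  # guard against rounding drift
    counts = rng.multinomial(shots, probs).reshape(-1, 2).astype(float)
    total = counts.sum(axis=1)                   # shots that landed on each pixel
    # Fraction of color-|1> outcomes per pixel; pixels never sampled default to 0.
    frac_one = np.divide(counts[:, 1], total,
                         out=np.zeros_like(total), where=total > 0)
    theta = np.arcsin(np.sqrt(frac_one))         # invert p1 = sin^2(theta)
    return (2.0 / np.pi * theta).reshape(shape)

rng = np.random.default_rng(0)
pixels = np.linspace(0.0, 1.0, 16)
theta = 0.5 * np.pi * pixels
# Interleave cos/sin amplitudes and weight the 16 addresses uniformly.
state = np.stack([np.cos(theta), np.sin(theta)], axis=1).ravel() / 4.0

few = image_from_shots(state, (4, 4), shots=200, rng=rng)
many = image_from_shots(state, (4, 4), shots=2_000_000, rng=rng)
```

With uniform address probabilities, each pixel receives roughly `shots / n_pixels` samples, so the estimate for every pixel tightens at a similar rate. An address that is rarely sampled would instead yield a noisy or missing pixel, which is one way to see why a uniform address marginal matters under shot noise.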

Figure 2: Illustration of multimodal noise modeling (left to right). Quantum circuit perspective of implementing a bimodal mixture distribution via controlled rotations sampling the classical bit $m$ uniformly and $\varepsilon$ normally (unimodal). $z_{0}$ and $z_{1}$ denote the tuned noise (shifte
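
The classical side of this noise model can be sketched in a few lines: pick a mode index uniformly, draw standard Gaussian noise, then tune it per mode. The caption is cut off, so the shift-and-scale parameterization below is an assumption rather than the paper's exact definition. The names `sample_multimodal_noise`, `shifts`, and `scales` are hypothetical, modes are indexed from 0, and the per-mode parameters are fixed arrays here, whereas the review describes them as trainable.

```python
import numpy as np

def sample_multimodal_noise(n_samples, shifts, scales, rng):
    """Draw latent vectors from a uniform mixture of shifted, scaled Gaussians.

    shifts, scales: arrays of shape (n_modes, dim), one row per mode.
    Returns the sampled mode indices and the tuned noise z.
    """
    shifts = np.asarray(shifts, dtype=float)
    scales = np.asarray(scales, dtype=float)
    n_modes, dim = shifts.shape
    modes = rng.integers(0, n_modes, size=n_samples)   # uniform discrete mode index m
    eps = rng.standard_normal((n_samples, dim))        # unimodal noise, eps ~ N(0, 1)
    z = shifts[modes] + scales[modes] * eps            # per-mode tuned noise
    return modes, z

rng = np.random.default_rng(0)
modes, z = sample_multimodal_noise(
    10_000, shifts=[[-3.0], [3.0]], scales=[[0.5], [0.5]], rng=rng
)
```

In a training setup the rows of `shifts` and `scales` would be optimized together with the circuit parameters, so each mode can settle on a different region of the latent space.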


This review was generated by AI and reviewed by a human editor.