QUICK REVIEW

[Paper Review] RPIQ: Residual-Projected Multi-Collaboration Closed-Loop and Single Instance Quantization for Visually Impaired Assistance

Xuanyu Wang, Haisen Su | arXiv (Cornell University) | 2026. 01. 06.

Multimodal Machine Learning Applications | Citations: 0

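One-Line Summary

RPIQ introduces a block-wise, multi-round residual-compensation quantization framework built on single-instance, Hessian-based calibration. It enables 4-bit quantization of large models and achieves substantial memory savings while preserving performance on assistive tasks for visually impaired users.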

ABSTRACT

Visually impaired users face significant challenges in daily information access and real-time environmental perception, and there is an urgent need for intelligent assistive systems with accurate recognition capabilities. Although large-scale models provide effective solutions for perception and reasoning, their practical deployment on assistive devices is severely constrained by excessive memory consumption and high inference costs. Moreover, existing quantization strategies often ignore inter-block error accumulation, leading to degraded model stability. To address these challenges, this study proposes a novel quantization framework -- Residual-Projected Multi-Collaboration Closed-Loop and Single Instance Quantization(RPIQ), whose quantization process adopts a multi-collaborative closed-loop compensation scheme based on Single Instance Calibration and Gauss-Seidel Iterative Quantization. Experiments on various types of large-scale models, including language models such as OPT, Qwen, and LLaMA, as well as vision-language models such as CogVLM2, demonstrate that RPIQ can compress models to 4-bit representation while significantly reducing peak memory consumption (approximately 60%-75% reduction compared to original full-precision models). The method maintains performance highly close to full-precision models across multiple language and visual tasks, and exhibits excellent recognition and reasoning capabilities in key applications such as text understanding and visual question answering in complex scenarios. While verifying the effectiveness of RPIQ for deployment in real assistive systems, this study also advances the computational efficiency and reliability of large models, enabling them to provide visually impaired users with the required information accurately and rapidly.

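Research Motivation and Goals

Improve the quantization stability and accuracy of large models used for visually impaired assistance.
Mitigate the inter-block error accumulation inherent in block-wise, GPTQ-style quantization.
Reduce calibration-data dependence and memory usage during quantization.
Enable deployment of large models on resource-constrained assistive devices without retraining.
Demonstrate that the method preserves performance on language models (OPT, Qwen, LLaMA) and a vision-language model (CogVLM2).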

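Proposed Method

Adopts block-based multi-collaborative closed-loop compensation that uses residuals to mitigate inter-block error accumulation.
Quantizes in two stages: stage 1 follows GPTQ-style local optimization, guided by Hessian information, to obtain an initial quantization of each block.
Stage 2 refines the blocks through multiple rounds of Gauss-Seidel-like, residual-driven updates that use the global Hessian held in memory.
Introduces a single-instance calibration paradigm that retains the precomputed global Hessian and uses only the last calibration batch during refinement.
Provides a linear update scheme with a step size alpha to stabilize block updates.
Guides block-wise quantization with instantaneous Hessian curvature reconstruction, without reloading the calibration data.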
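
The two-stage idea above can be illustrated with a toy sketch. This is not the authors' algorithm: the review gives no equations, so everything here is an assumption chosen for illustration. It assumes a single weight row `w`, a Hessian `H = X^T X` built once from calibration inputs and then kept in memory, plain round-to-nearest as a stand-in for the GPTQ-style stage 1, and a damped coordinate sweep over column blocks as a stand-in for the Gauss-Seidel-like stage 2. The names `quantize`, `hessian`, `error`, and `refine` are hypothetical.

```python
import random

def quantize(v, scale, bits=4):
    """Round v to the nearest level of a symmetric uniform grid."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit
    level = max(-qmax - 1, min(qmax, round(v / scale)))
    return level * scale

def hessian(X):
    """H = X^T X for calibration inputs X (one sample per row)."""
    d = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(d)]
            for i in range(d)]

def error(w, q, H):
    """Output error (w - q)^T H (w - q) of the quantized row q."""
    e = [wi - qi for wi, qi in zip(w, q)]
    n = len(e)
    return sum(e[i] * H[i][j] * e[j] for i in range(n) for j in range(n))

def refine(w, q, H, scale, blocks, rounds=3, alpha=0.5):
    """Assumed stage 2: damped Gauss-Seidel-style sweeps over blocks."""
    q = list(q)
    for _ in range(rounds):
        for block in blocks:
            for i in block:
                # Residual left by all other coordinates, using their
                # latest values (the Gauss-Seidel ingredient).
                r = sum(H[i][j] * (w[j] - q[j])
                        for j in range(len(w)) if j != i)
                target = w[i] + r / H[i][i]         # exact minimiser in i
                # Damped step of size alpha, then snap back to the grid.
                proposal = quantize(q[i] + alpha * (target - q[i]), scale)
                trial = q[:i] + [proposal] + q[i + 1:]
                if error(w, trial, H) < error(w, q, H):
                    q = trial                       # keep improvements only
    return q

random.seed(0)
d, n = 8, 32
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
w = [random.uniform(-1, 1) for _ in range(d)]
scale = max(abs(v) for v in w) / 7

H = hessian(X)                                      # computed once, X not reused
q0 = [quantize(v, scale) for v in w]                # stage 1 stand-in
q1 = refine(w, q0, H, scale, blocks=[[0, 1, 2, 3], [4, 5, 6, 7]])
print(error(w, q0, H), error(w, q1, H))
```

Because a proposal is accepted only when it lowers the error, the refined row is never worse than the stage 1 result under this objective. The refinement touches only `H`, not `X`, which mirrors the point that the calibration data need not be reloaded.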

Figure 1: Block-based multi-collaborative closed-loop compensation.

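Experimental Results

Research Questions

RQ1: Can residual-based multi-collaborative compensation reduce inter-block quantization error accumulation compared with conventional one-shot block-wise quantization?
RQ2: Does single-instance calibration based on instantaneous Hessian curvature preserve global second-order information while avoiding a reload of the full calibration data?
RQ3: How well does RPIQ compress large language models and vision-language models to a 4-bit representation while maintaining performance on tasks relevant to visually impaired assistance?
RQ4: What memory and runtime benefits does the proposed approach bring on resource-constrained assistive devices?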

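Key Results

RPIQ achieves 4-bit quantization with roughly 60-75% lower peak memory than the full-precision models.
The method maintains performance very close to the full-precision models across multiple language and vision tasks.
Block-level residual collaboration effectively mitigates inter-block error accumulation in large models.
Single-instance calibration preserves global second-order information without repeatedly loading calibration data, which improves efficiency.
Gauss-Seidel-style iterative quantization provides robust and faster convergence for large models in assistive scenarios.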
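
As a rough sanity check on the reported 60-75% range, the weight-only arithmetic can be worked out directly. The numbers below are assumptions, not figures from the paper: a 7-billion-parameter model, a 16-bit baseline, and 4-bit weights with 32 bits of per-group metadata (for example a 16-bit scale plus a 16-bit zero point) for every 128 weights. The helper `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(n_params, bits, group_size=None, meta_bits=32):
    """Approximate weight storage in GiB, with optional per-group metadata."""
    total_bits = n_params * bits
    if group_size:
        total_bits += (n_params / group_size) * meta_bits
    return total_bits / 8 / 2**30

n_params = 7_000_000_000                           # assumed model size
fp16 = weight_memory_gib(n_params, 16)             # assumed 16-bit baseline
int4 = weight_memory_gib(n_params, 4, group_size=128)
saving = 1 - int4 / fp16

print(f"fp16 {fp16:.2f} GiB, int4 {int4:.2f} GiB, saving {saving:.1%}")
```

Under these assumptions the weights cost 4.25 bits each instead of 16, a saving of about 73%. Peak memory also includes activations and other runtime buffers that 4-bit weights do not shrink, which is one plausible reason the reported range extends down to 60%.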

Figure 2: Single-instance calibration paradigm.


This review was generated by AI and reviewed by a human editor.