QUICK REVIEW

[Paper Review] When Visual Privacy Protection Meets Multimodal Large Language Models

Hui, Xiaofei; Wu, Qian | arXiv (Cornell University) | 2026-03-14

Multimodal Machine Learning Applications | Citations: 0

One-Line Summary

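The paper investigates visual privacy protection for multimodal LLMs (MLLMs) in the black-box setting, where only the model's inputs and outputs are accessible. It proposes a Pareto-optimal learning objective that balances privacy against MLLM performance, together with critical-history enhanced optimization, and reports benchmark experiments that show the method is effective.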

ABSTRACT

The emergence of Multimodal Large Language Models (MLLMs) and the widespread usage of MLLM cloud services such as GPT-4V raised great concerns about privacy leakage in visual data. As these models are typically deployed in cloud services, users are required to submit their images and videos, posing serious privacy risks. However, how to tackle such privacy concerns is an under-explored problem. Thus, in this paper, we aim to conduct a new investigation to protect visual privacy when enjoying the convenience brought by MLLM services. We address the practical case where the MLLM is a "black box", i.e., we only have access to its input and output without knowing its internal model information. To tackle such a challenging yet demanding problem, we propose a novel framework, in which we carefully design the learning objective with Pareto optimality to seek a better trade-off between visual privacy and MLLM's performance, and propose critical-history enhanced optimization to effectively optimize the framework with the black-box MLLM. Our experiments show that our method is effective on different benchmarks.

Research Motivation and Goals

Highlight the privacy risks that visual data faces in MLLM cloud services.
Develop a privacy-preserving framework usable even when the MLLM is a black box.
Balance visual privacy protection with maintaining MLLM task performance.
Propose optimization techniques that are effective under limited model access (black-box).
Demonstrate effectiveness across multiple benchmarks.

Proposed Method

Design a Pareto-optimal learning objective to trade off privacy and MLLM performance.
Introduce critical-history enhanced optimization to optimize the framework with black-box MLLMs.
Apply the framework to visual data privacy protection in MLLM scenarios.
Evaluate on diverse benchmarks to show generalizability.
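
To make the two ideas above more concrete, here is a minimal sketch of a derivative-free search that queries a black-box scorer and keeps a history of non-dominated candidates. It is not the authors' critical-history enhanced optimization, whose details this review does not give; it is a generic Pareto-archive random search swapped in for illustration. The names `dominates`, `update_archive`, `black_box_pareto_search`, and `toy_query` are hypothetical, and `toy_query` is a stand-in for an MLLM service that would return a privacy score and a task-utility score for a protected image.

```python
import random
from typing import Callable, List, Sequence, Tuple

Scores = Tuple[float, float]  # (privacy, utility); both are higher-is-better
Entry = Tuple[List[float], Scores]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def update_archive(archive: List[Entry], candidate: Entry) -> List[Entry]:
    """Add candidate to the history of non-dominated points, pruning any it dominates."""
    cand_scores = candidate[1]
    if any(dominates(s, cand_scores) or s == cand_scores for _, s in archive):
        return archive  # candidate adds nothing new to the front
    kept = [(p, s) for p, s in archive if not dominates(cand_scores, s)]
    kept.append(candidate)
    return kept


def black_box_pareto_search(
    query: Callable[[Sequence[float]], Scores],
    init: Sequence[float],
    steps: int = 200,
    sigma: float = 0.1,
    seed: int = 0,
) -> List[Entry]:
    """Random-perturbation search that only uses input/output access to `query`."""
    rng = random.Random(seed)
    archive: List[Entry] = [(list(init), query(init))]
    for _ in range(steps):
        parent, _ = rng.choice(archive)  # restart from a remembered good point
        child = [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in parent]
        archive = update_archive(archive, (child, query(child)))
    return archive


def toy_query(params: Sequence[float]) -> Scores:
    """Hypothetical stand-in for the black box (assumption, not a real MLLM API)."""
    privacy = params[0]  # stronger obfuscation means more privacy
    utility = 1.0 - params[0] ** 2 - (params[1] - 0.5) ** 2  # but lower task utility
    return privacy, utility


if __name__ == "__main__":
    front = black_box_pareto_search(toy_query, [0.0, 0.0])
    for params, (privacy, utility) in sorted(front, key=lambda e: e[1]):
        print(f"privacy={privacy:.3f} utility={utility:.3f}")
```

The archive plays the role of a simple optimization history: because no gradients are available from the black box, the search restarts from previously found trade-off points instead of following a derivative.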

Experimental Results

Research Questions

RQ1: Can visual privacy protection be effectively achieved for MLLMs when model internals are inaccessible (black-box setting)?
RQ2: What is the trade-off between privacy protection strength and MLLM task performance under Pareto optimization?
RQ3: Does incorporating critical-history information improve optimization convergence and protection quality in black-box MLLMs?
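
For readers less familiar with the term, the trade-off in the second question can be stated with the standard definition of Pareto optimality for two objectives. The symbols are assumptions introduced here, not notation from the paper: $\theta$ denotes the parameters of the protection scheme, $P(\theta)$ a privacy score, and $U(\theta)$ the MLLM task utility, both higher-is-better.

```latex
\theta \succ \theta' \;\iff\;
  P(\theta) \ge P(\theta') \;\text{and}\; U(\theta) \ge U(\theta')
  \;\text{and}\; \bigl( P(\theta) > P(\theta') \;\text{or}\; U(\theta) > U(\theta') \bigr)

\theta^{*} \text{ is Pareto-optimal} \;\iff\; \nexists\, \theta \;:\; \theta \succ \theta^{*}
```

Under this definition, a Pareto-optimal protection setting is one where privacy cannot be increased without giving up some task utility, and vice versa.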

Key Results

The proposed approach achieves effective privacy protection while maintaining MLLM performance on benchmarks.
The Pareto-optimal objective helps balance the competing goals of privacy and utility.
Critical-history enhanced optimization improves optimization outcomes in black-box settings.
The method demonstrates robustness across different benchmark scenarios.

This review was generated by AI and reviewed by a human editor.