QUICK REVIEW

[Paper Review] When Visual Privacy Protection Meets Multimodal Large Language Models

Hui, Xiaofei, Wu, Qian|arXiv (Cornell University)|Mar 14, 2026

Multimodal Machine Learning Applications | Cited by 0

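One-Sentence Summary

This paper studies how to protect visual privacy when using Multimodal Large Language Models (MLLMs) in the black-box setting. It proposes a learning objective based on Pareto optimality and a critical-history enhanced optimization to balance privacy against MLLM performance, and experiments on benchmark datasets show the method is effective.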

ABSTRACT

The emergence of Multimodal Large Language Models (MLLMs) and the widespread usage of MLLM cloud services such as GPT-4V raised great concerns about privacy leakage in visual data. As these models are typically deployed in cloud services, users are required to submit their images and videos, posing serious privacy risks. However, how to tackle such privacy concerns is an under-explored problem. Thus, in this paper, we aim to conduct a new investigation to protect visual privacy when enjoying the convenience brought by MLLM services. We address the practical case where the MLLM is a "black box", i.e., we only have access to its input and output without knowing its internal model information. To tackle such a challenging yet demanding problem, we propose a novel framework, in which we carefully design the learning objective with Pareto optimality to seek a better trade-off between visual privacy and MLLM's performance, and propose critical-history enhanced optimization to effectively optimize the framework with the black-box MLLM. Our experiments show that our method is effective on different benchmarks.

Research Motivation and Objectives

Highlight the privacy risks that images and videos face when users submit them to MLLM cloud services.
Develop a privacy-preserving framework usable even when the MLLM is a black box.
Balance visual privacy protection with maintaining MLLM task performance.
Propose optimization techniques that are effective under limited model access (black-box).
Demonstrate effectiveness across multiple benchmarks.

Proposed Method

Design a Pareto-optimal learning objective to trade off privacy and MLLM performance.
Introduce critical-history enhanced optimization to optimize the framework with black-box MLLMs.
Apply the framework to visual data privacy protection in MLLM scenarios.
Evaluate on diverse benchmarks to show generalizability.
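
To make the notion of a Pareto trade-off concrete, the following is a minimal sketch of Pareto dominance over two scores. It is not the authors' learning objective, which this summary does not specify. The names `dominates` and `pareto_front` and the (privacy, utility) score pairs are hypothetical, and both scores are assumed to be "higher is better".

```python
from typing import List, Tuple

# Hypothetical candidate: (privacy_score, utility_score), higher is better for both.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is at least as good as b on both scores and strictly better on one."""
    at_least_as_good = a[0] >= b[0] and a[1] >= b[1]
    strictly_better = a[0] > b[0] or a[1] > b[1]
    return at_least_as_good and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Made-up scores for five hypothetical protection settings.
candidates = [(0.9, 0.2), (0.7, 0.6), (0.5, 0.5), (0.3, 0.9), (0.2, 0.4)]
front = pareto_front(candidates)
print(front)
```

Points on the front cannot improve privacy without giving up utility, or the reverse. Choosing among them is the trade-off the paper's objective is meant to navigate.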

Experimental Results

Research Questions

RQ1: Can visual privacy protection be effectively achieved for MLLMs when model internals are inaccessible (black-box setting)?
RQ2: What is the trade-off between privacy protection strength and MLLM task performance under Pareto optimization?
RQ3: Does incorporating critical-history information improve optimization convergence and protection quality in black-box MLLMs?

Key Findings

The proposed approach achieves effective privacy protection while maintaining MLLM performance on benchmarks.
The Pareto-optimal objective helps balance the competing goals of privacy and utility.
Critical-history enhanced optimization improves optimization outcomes in black-box settings.
The method demonstrates robustness across different benchmark scenarios.
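
The black-box constraint means only input/output queries are available, so gradients cannot be taken through the model. As a rough illustration of query-only optimization that reuses past evaluations, here is a generic random search with a small history of the best candidates seen so far. It is a swapped-in textbook technique, not the paper's critical-history enhanced optimization, whose details this summary does not give. `toy_black_box`, `history_guided_search`, and all parameters are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Scored = Tuple[float, List[float]]  # (score, candidate vector)


def toy_black_box(x: List[float]) -> float:
    """Hypothetical stand-in for a query-only score (higher is better)."""
    target = [0.5, -0.25, 0.75]
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def history_guided_search(
    score_fn: Callable[[List[float]], float],
    dim: int,
    steps: int = 200,
    keep: int = 5,
    sigma: float = 0.2,
    seed: int = 0,
) -> List[Scored]:
    """Random search that proposes new candidates near the best ones seen so far."""
    rng = random.Random(seed)
    start = [0.0] * dim
    history: List[Scored] = [(score_fn(start), start)]
    for _ in range(steps):
        # Pick a parent from the retained history and perturb it with Gaussian noise.
        _, parent = rng.choice(history)
        child = [p + rng.gauss(0.0, sigma) for p in parent]
        # One query to the black box per proposal; no gradients are used.
        history.append((score_fn(child), child))
        # Retain only the `keep` highest-scoring candidates.
        history.sort(key=lambda item: item[0], reverse=True)
        del history[keep:]
    return history


history = history_guided_search(toy_black_box, dim=3)
best_score, best_x = history[0]
initial_score = toy_black_box([0.0, 0.0, 0.0])
print(best_score, best_x)
```

In a real setting each call to the score function would be a paid, rate-limited query to the MLLM service, which is why reusing information from earlier queries matters.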

This review was generated by AI and checked by a human editor.