QUICK REVIEW

[Paper Review] U-MASK: User-adaptive Spatio-Temporal Masking for Personalized Mobile AI Applications

Shiyuan Zhang, Yilai Liu|arXiv (Cornell University)|Jan 11, 2026
Recommender Systems and Techniques|Cited by 0
One-Sentence Summary

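In brief: U-MASK is a user- and task-adaptive spatio-temporal masking framework for personalized mobile AI. It uses U-SCOPE to build a semantic user representation and a diffusion-based backbone to perform mask-guided conditional completion across short-term, long-horizon, and cold-start tasks.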

ABSTRACT

Personalized mobile artificial intelligence applications are widely deployed, yet they are expected to infer user behavior from sparse and irregular histories under a continuously evolving spatio-temporal context. This setting induces a fundamental tension among three requirements, i.e., immediacy to adapt to recent behavior, stability to resist transient noise, and generalization to support long-horizon prediction and cold-start users. Most existing approaches satisfy at most two of these requirements, resulting in an inherent impossibility triangle in data-scarce, non-stationary personalization. To address this challenge, we model mobile behavior as a partially observed spatio-temporal tensor and unify short-term adaptation, long-horizon forecasting, and cold-start recommendation as a conditional completion problem, where a user- and task-specific mask specifies which coordinates are treated as evidence. We propose U-MASK, a user-adaptive spatio-temporal masking method that allocates evidence budgets based on user reliability and task sensitivity. To enable mask generation under sparse observations, U-MASK learns a compact, task-agnostic user representation from app and location histories via U-SCOPE, which serves as the sole semantic conditioning signal. A shared diffusion transformer then performs mask-guided generative completion while preserving observed evidence, so personalization and task differentiation are governed entirely by the mask and the user representation. Experiments on real-world mobile datasets demonstrate consistent improvements over state-of-the-art methods across short-term prediction, long-horizon forecasting, and cold-start settings, with the largest gains under severe data sparsity. The code and dataset will be available at https://github.com/NICE-HKU/U-MASK.

Motivation and Objectives

  • Resolve the impossibility triangle in personalized mobile AI: achieving immediacy, stability, and generalization together under sparse histories.
  • Formulate mobile user behavior as a conditional completion problem guided by a learnable mask.
  • Develop a compact, task-agnostic user representation to condition masking.
  • Integrate a shared diffusion-based backbone to perform mask-guided generative completion.
  • Demonstrate improvements over state-of-the-art methods on real-world mobile datasets with sparse data.
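
The conditional completion framing in the objectives above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `recent_evidence_mask` and `complete`, the "keep the most recent observations" budget rule, and the toy values are all assumptions made for illustration, and `x_gen` stands in for whatever a generative backbone would produce. The only property taken from the source is that cells marked as evidence by the mask are preserved while the remaining cells are filled in.

```python
from typing import List

Matrix = List[List[float]]
Mask = List[List[int]]


def recent_evidence_mask(observed: Mask, budget: int) -> Mask:
    """Mark at most `budget` of the most recent observed steps per row as evidence.

    Hypothetical budget rule, used only to make the example concrete.
    """
    mask = []
    for row in observed:
        kept = 0
        row_mask = [0] * len(row)
        # Walk backwards in time so the newest observations are kept first.
        for t in range(len(row) - 1, -1, -1):
            if row[t] == 1 and kept < budget:
                row_mask[t] = 1
                kept += 1
        mask.append(row_mask)
    return mask


def complete(x_obs: Matrix, x_gen: Matrix, mask: Mask) -> Matrix:
    """Copy evidence cells from x_obs and take every other cell from x_gen."""
    return [
        [o if m == 1 else g for o, g, m in zip(obs_row, gen_row, mask_row)]
        for obs_row, gen_row, mask_row in zip(x_obs, x_gen, mask)
    ]


# Toy tensor slice: 2 rows (e.g. apps) x 4 time steps. All values are made up.
x_obs = [[0.2, 0.0, 0.7, 0.9], [0.0, 0.5, 0.0, 0.0]]
observed = [[1, 0, 1, 1], [0, 1, 0, 0]]  # 1 = the cell was actually observed
x_gen = [[0.1, 0.3, 0.6, 0.8], [0.4, 0.4, 0.2, 0.1]]  # stand-in for model output

mask = recent_evidence_mask(observed, budget=2)
completed = complete(x_obs, x_gen, mask)
print(mask)
print(completed)
```

Changing `budget` changes which coordinates count as evidence without touching the completion step, which mirrors the idea that the mask, rather than a task-specific head, decides what the model conditions on.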

Proposed Method

  • Introduce U-MASK, a masking mechanism that defines inference via evidence regions conditioned on user and task.
  • Develop U-SCOPE to produce a compact, task-agnostic user embedding from sparse app-location histories.
  • Instantiate a conditional diffusion transformer backbone (DiT) that reconstructs missing regions while preserving observed evidence.
  • Compute a task- and user-specific mask by balancing observation budgets, temporal emphasis, and spatial affinity using a hierarchical latent representation.
  • Train end-to-end with a reconstruction loss and an InfoNCE-based consistency term to align short-term and long-term representations.
  • Allow task differentiation and personalization to be governed entirely by the mask and user representation rather than task-specific heads.
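
The InfoNCE-based consistency term mentioned in the list above can be sketched as follows. This is a generic InfoNCE loss over paired embeddings, not the paper's exact objective: the function names, the use of cosine similarity, the temperature value, and the toy vectors are assumptions. Each short-term embedding is treated as the query, the long-term embedding of the same user as the positive, and the other users' long-term embeddings as negatives.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (assumes neither is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(short: List[Sequence[float]],
             long: List[Sequence[float]],
             temperature: float = 0.1) -> float:
    """Mean InfoNCE loss where short[i] and long[i] form the positive pair."""
    n = len(short)
    total = 0.0
    for i in range(n):
        logits = [cosine(short[i], long[j]) / temperature for j in range(n)]
        # Log-sum-exp with the max subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings for two users. All values are made up.
short_term = [[1.0, 0.0], [0.0, 1.0]]
long_aligned = [[0.9, 0.1], [0.1, 0.9]]
long_swapped = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce(short_term, long_aligned)
loss_swapped = info_nce(short_term, long_swapped)
print(loss_aligned, loss_swapped)
```

The loss is low when each user's short-term and long-term embeddings agree and high when they are mismatched, which is the behavior a consistency term of this kind relies on.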

Experimental Results

Research Questions

  • RQ1: How can we unify short-term prediction, long-horizon forecasting, and cold-start recommendation under sparse mobile histories?
  • RQ2: Can a learnable, user-adaptive masking mechanism define inference regions that adapt to task goals and user behavior?
  • RQ3: Does a diffusion-based generative backbone effectively complete partially observed spatio-temporal mobile data conditioned on masks and user semantics?
  • RQ4: What is the impact of the semantic user representation (U-SCOPE) on masking quality and personalization under data sparsity?

Key Findings

  • U-MASK achieves consistent improvements over state-of-the-art methods across short-term, long-horizon, and cold-start scenarios.
  • Performance gains are largest under severe data sparsity and dynamic contexts.
  • A diffusion transformer backbone preserves observed evidence while generating missing regions conditioned on the mask and user semantics.
  • U-SCOPE provides robust, task-agnostic user representations that enable effective masking in sparse telemetry.
  • The approach is evaluated on real-world mobile datasets, and the authors plan to release the code and dataset.


This review was generated by AI and reviewed by human editors.