QUICK REVIEW

[Paper Review] The Capacity of Private Information Retrieval with Partially Known Private Side Information

Yi-Peng Wei, Karim Banawan|arXiv (Cornell University)|Oct 2, 2017

Cryptography and Data Security|References: 33|Cited by: 21

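One-Sentence Summary

This paper studies private information retrieval (PIR) with partially known private side information: a user caches full messages from $N$ non-colluding databases, then retrieves a desired message without revealing to any database the identity of that message or the identities of the cached messages that database did not supply. The authors derive the exact PIR capacity, $ C = \left(1 + \frac{1}{N} + \cdots + \frac{1}{N^{K-M-1}}\right)^{-1} = \frac{1 - \frac{1}{N}}{1 - \left(\frac{1}{N}\right)^{K-M}} $, which shows that using the same databases for both the prefetching and retrieval phases incurs no loss in performance.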
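
As a quick check on the closed form above, the sum is a finite geometric series with ratio $1/N$ and $K-M$ terms. The numeric instance ($N=2$, $K=3$, $M=1$) is chosen here purely for illustration and is not taken from the paper.

```latex
\begin{align*}
\sum_{i=0}^{K-M-1} \frac{1}{N^{i}}
  &= \frac{1 - \left(\frac{1}{N}\right)^{K-M}}{1 - \frac{1}{N}}
  \quad\Longrightarrow\quad
  C = \frac{1 - \frac{1}{N}}{1 - \left(\frac{1}{N}\right)^{K-M}}, \\
N = 2,\ K = 3,\ M = 1: \quad
C &= \left(1 + \tfrac{1}{2}\right)^{-1}
   = \frac{1 - \tfrac{1}{2}}{1 - \tfrac{1}{4}}
   = \frac{2}{3}.
\end{align*}
```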

ABSTRACT

We consider the problem of private information retrieval (PIR) of a single message out of $K$ messages from $N$ replicated and non-colluding databases where a cache-enabled user (retriever) of cache-size $M$ possesses side information in the form of full messages that are partially known to the databases. In this model, the user and the databases engage in a two-phase scheme, namely, the prefetching phase where the user acquires side information and the retrieval phase where the user downloads desired information. In the prefetching phase, the user receives $m_n$ full messages from the $n$th database, under the cache memory size constraint $\sum_{n=1}^N m_n \leq M$. In the retrieval phase, the user wishes to retrieve a message such that no individual database learns anything about the identity of the desired message. In addition, the identities of the side information messages that the user did not prefetch from a database must remain private against that database. Since the side information provided by each database in the prefetching phase is known by the providing database and the side information must be kept private against the remaining databases, we coin this model as \textit{partially known private side information}. We characterize the capacity of the PIR with partially known private side information to be $C=\left(1+\frac{1}{N}+\cdots+\frac{1}{N^{K-M-1}}\right)^{-1}=\frac{1-\frac{1}{N}}{1-(\frac{1}{N})^{K-M}}$. Interestingly, this result is the same if none of the databases knows any of the prefetched side information, i.e., when the side information is obtained externally, a problem posed by Kadhe et al. and settled by Chen-Wang-Jafar recently. Thus, our result implies that there is no loss in using the same databases for both prefetching and retrieval phases.

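Motivation and Objectives

Establish and analyze a PIR model for a two-phase system in which the user prefetches side information from the databases and later retrieves a desired message privately.
Protect, against each individual database, both the identity of the desired message and the identities of the side information messages not prefetched from that database.
Characterize the information-theoretic capacity of PIR when the databases have partial knowledge of the user's cached content.
Determine whether using the same databases for prefetching and retrieval causes a capacity loss compared with external caching.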

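Proposed Method

Formalize the two-phase PIR model: a prefetching phase (the user caches $ m_n $ messages from database $ n $, subject to $ \sum_{n=1}^N m_n \leq M $) and a retrieval phase (the user privately retrieves one message).
Impose the privacy constraints: no database learns the identity of the desired message, nor the identities of the $ M - m_n $ cached messages that were not prefetched from it.
Propose an achievable scheme based on MDS-coded queries that exploits the side information to minimize the download cost while preserving privacy.
Derive the normalized download cost from a combinatorial query design, with $ p = \frac{1}{N-1}(N^{K-m} - 1) $, $ q = \frac{1}{N-1}(N^{(N-1)m} - 1) $, and $ L = N^{K-m} $ for the desired message.
Establish a converse proof, using induction and information-theoretic inequalities, that matches the achievable rate.
Compare field size and subpacketization requirements with prior work, showing that the proposed scheme has an advantage.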
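
The privacy constraint in the model above can be illustrated with a small bookkeeping sketch. It only splits the cache into what each database already knows and what must stay hidden from it; it is not the paper's retrieval scheme. The function name `side_info_views`, the dictionary layout, and the assumption that every cached message comes from exactly one database are all hypothetical choices made for this illustration.

```python
from typing import Dict, List, Set


def side_info_views(
    prefetch: Dict[int, Set[int]], cache_size: int
) -> Dict[int, Dict[str, List[int]]]:
    """Split the cached message indices per database.

    prefetch maps a database index n to the set of message indices the user
    prefetched from it (so m_n = len(prefetch[n])).
    Assumption for this sketch: each cached message comes from one database.
    """
    cached: Set[int] = set()
    for msgs in prefetch.values():
        if cached & msgs:
            raise ValueError("a message was prefetched from more than one database")
        cached |= msgs

    # Cache memory constraint: sum over n of m_n must not exceed M.
    if len(cached) > cache_size:
        raise ValueError("cache constraint sum(m_n) <= M is violated")

    return {
        db: {
            "known": sorted(msgs),            # supplied by this database
            "hidden": sorted(cached - msgs),  # must stay private against it
        }
        for db, msgs in prefetch.items()
    }


if __name__ == "__main__":
    # Hypothetical allocation: N = 3 databases, cache size M = 3.
    views = side_info_views({1: {0, 1}, 2: {2}, 3: set()}, cache_size=3)
    for db, view in views.items():
        print(db, view)
```

In this hypothetical allocation, database 1 supplied messages 0 and 1, so only the identity of message 2 must be hidden from it, while database 3 supplied nothing and must learn none of the three cached identities. When the cache is full, the hidden list for database $n$ has $M - m_n$ entries, matching the constraint stated above.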

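Results

Research Questions

RQ1: What is the capacity of PIR when the databases have partial knowledge of the user's cached side information?
RQ2: Compared with external caching, does keeping the side information not prefetched from a database private against that database incur a performance loss?
RQ3: Can the same databases serve both prefetching and retrieval without any loss of PIR capacity?
RQ4: How does the structure of the partially known side information affect the achievable download cost?
RQ5: Is there a fundamental difference in capacity among models where the databases know, do not know, or partially know the user's side information?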

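Key Findings

The capacity of PIR with partially known private side information is $ C = \left(1 + \frac{1}{N} + \cdots + \frac{1}{N^{K-M-1}}\right)^{-1} = \frac{1 - \frac{1}{N}}{1 - \left(\frac{1}{N}\right)^{K-M}} $, which coincides with the capacity of PIR when the side information is completely unknown to the databases.
This capacity is achievable whenever the cache memory constraint is met with equality, $ \sum_{n=1}^N m_n = M $, regardless of the specific prefetching strategy.
The proposed scheme achieves the same capacity as prior work that relies on external caching, so using the same databases for prefetching and retrieval causes no performance loss.
A new MDS-coded retrieval scheme is proposed whose field size and subpacketization requirements are smaller than those of prior schemes, with the advantage most pronounced under uniform prefetching.
The databases' knowledge of part of the side information does not reduce the PIR capacity, which makes it feasible to share the databases between caching and retrieval.
The converse proof, which uses induction and information-theoretic bounds such as Han's inequality, confirms the optimality of the derived capacity.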
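
The capacity expression in the findings above can be evaluated exactly with rational arithmetic. The sketch below computes both the series form and the closed form so that they can be compared; the function names and the parameter values in the demo are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def _check(n_databases: int, n_messages: int, cache_size: int) -> int:
    """Validate the parameters and return the number of series terms, K - M."""
    if n_databases < 2:
        raise ValueError("need N >= 2 databases")
    if not 0 <= cache_size < n_messages:
        raise ValueError("need 0 <= M < K")
    return n_messages - cache_size


def capacity_series(n_databases: int, n_messages: int, cache_size: int) -> Fraction:
    """C = (1 + 1/N + ... + 1/N^(K-M-1))^(-1), summed term by term."""
    terms = _check(n_databases, n_messages, cache_size)
    total = sum(Fraction(1, n_databases**i) for i in range(terms))
    return 1 / total


def capacity_closed_form(n_databases: int, n_messages: int, cache_size: int) -> Fraction:
    """C = (1 - 1/N) / (1 - (1/N)^(K-M))."""
    terms = _check(n_databases, n_messages, cache_size)
    ratio = Fraction(1, n_databases)
    return (1 - ratio) / (1 - ratio**terms)


if __name__ == "__main__":
    # Hypothetical setting: N = 2 databases, K = 3 messages, growing cache size M.
    for m in range(3):
        print(m, capacity_series(2, 3, m), capacity_closed_form(2, 3, m))
```

With $N = 2$ and $K = 3$, both forms give $4/7$ for $M = 0$, $2/3$ for $M = 1$, and $1$ for $M = 2$, so each additional cached message raises the rate in this small instance.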

This review was generated by AI and reviewed by human editors.