QUICK REVIEW

[Paper Review] Querying a Matrix Through Matrix-Vector Products

Xiaoming Sun, David P. Woodruff|arXiv (Cornell University)|Jan 1, 2019

Stochastic Gradient Optimization Techniques | References: 37 | Cited by: 3

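One-Sentence Summary

This paper introduces a computational model in which an algorithm accesses an unknown matrix M only through matrix-vector products M·v_i and analyzes the number of queries needed to solve fundamental problems in linear algebra, statistics, and graph theory; it establishes tight bounds (for example, Ω(n/log n) queries for connectivity and for triangle detection on adjacency matrices), shows that adaptivity and the restriction to one-sided queries significantly affect query complexity, and exhibits separations that depend on the underlying field and on the matrix representation of a graph.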

ABSTRACT

We consider algorithms with access to an unknown matrix $M\in\mathbb{F}^{n \times d}$ via matrix-vector products, namely, the algorithm chooses vectors $\mathbf{v}^1, \ldots, \mathbf{v}^q$, and observes $M\mathbf{v}^1,\ldots, M\mathbf{v}^q$. Here the $\mathbf{v}^i$ can be randomized as well as chosen adaptively as a function of $M\mathbf{v}^1,\ldots,M\mathbf{v}^{i-1}$. Motivated by applications of sketching in distributed computation, linear algebra, and streaming models, as well as connections to areas such as communication complexity and property testing, we initiate the study of the number $q$ of queries needed to solve various fundamental problems. We study problems in three broad categories, including linear algebra, statistics problems, and graph problems. For example, we consider the number of queries required to approximate the rank, trace, maximum eigenvalue, and norms of a matrix $M$; to compute the AND/OR/Parity of each column or row of $M$, to decide whether there are identical columns or rows in $M$ or whether $M$ is symmetric, diagonal, or unitary; or to compute whether a graph defined by $M$ is connected or triangle-free. We also show separations for algorithms that are allowed to obtain matrix-vector products only by querying vectors on the right, versus algorithms that can query vectors on both the left and the right. We also show separations depending on the underlying field the matrix-vector product occurs in. For graph problems, we show separations depending on the form of the matrix (bipartite adjacency versus signed edge-vertex incidence matrix) to represent the graph. Surprisingly, this fundamental model does not appear to have been studied on its own, and we believe a thorough investigation of problems in this model would be beneficial to a number of different application areas.

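Research Motivation and Goals

Determine the minimum number of matrix-vector queries needed to solve fundamental problems when the matrix is accessible only through M·v queries.
Understand how adaptivity, the choice of field, and the query direction (left or right) affect query complexity.
Establish separations between models: one-sided versus two-sided queries, and different matrix representations of a graph.
Connect the model to applications such as streaming, sketching, compressed sensing, and communication complexity.
Provide tight upper and lower bounds for key problems such as rank approximation, matrix norms, and graph connectivity.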

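Proposed Method

The model allows adaptive, randomized queries v_i; the algorithm sees M only through the products M·v_i, where M is an unknown matrix over a field F.
For lower bounds, the paper uses reductions involving two-party communication complexity problems (for example, set disjointness and triangle counting).
For upper bounds, it uses known sketching and sparsification results (for example, from [21]) to build efficient query strategies.
Different matrix representations of a graph are analyzed: the bipartite adjacency matrix and the signed edge-vertex incidence matrix.
The theoretical tools include linear algebra, spectral graph theory, and properties of graph Laplacians and sparsifiers.
Separations are proved by comparing query complexity across models (one-sided versus two-sided queries, and different fields).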
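
To make the query model concrete, here is a minimal sketch in plain Python. The class name `MatVecOracle` and the helper `recover_matrix` are hypothetical names introduced for illustration and do not come from the paper. The oracle hides M, answers right-sided queries M·v, and counts them. The helper shows the trivial upper bound: d queries with the standard basis vectors reveal every column of an n×d matrix, so any problem in this model can be solved with at most d queries.

```python
class MatVecOracle:
    """Hides a matrix M (list of rows) and exposes only products M·v."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]
        self.shape = (len(self._rows), len(self._rows[0]))
        self.queries = 0  # number of matrix-vector queries issued so far

    def query(self, v):
        """Return M·v and count one query."""
        self.queries += 1
        return [sum(a * x for a, x in zip(row, v)) for row in self._rows]


def recover_matrix(oracle):
    """Trivial strategy: query e_1, ..., e_d; the j-th answer is column j of M."""
    n, d = oracle.shape
    columns = []
    for j in range(d):
        e_j = [1 if i == j else 0 for i in range(d)]
        columns.append(oracle.query(e_j))
    # Transpose the list of columns back into a list of rows.
    return [list(row) for row in zip(*columns)]


M = [[1, 0, 2],
     [0, 3, 0]]
oracle = MatVecOracle(M)
recovered = recover_matrix(oracle)
```

An adaptive algorithm would simply choose each new `v` after inspecting the earlier answers; the interesting question studied in the paper is when far fewer than d queries suffice.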

Results

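Research Questions

RQ1: What is the minimum number of matrix-vector queries needed to approximate the rank of an n×n matrix M to within a factor t?
RQ2: How many queries are needed to decide whether a graph represented by a matrix is connected or contains a triangle?
RQ3: Are one-sided queries (right multiplication only) significantly weaker than two-sided queries (left and right multiplication)?
RQ4: How does the underlying field (for example, the reals versus a finite field) affect the query complexity of matrix problems?
RQ5: What is the query complexity of computing matrix norms, such as Schatten-p norms, in this model?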
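
As an illustration of the kind of question RQ1 asks, the following sketch uses a standard fact over the reals: if g_1, ..., g_q are independent Gaussian vectors, the answers M·g_1, ..., M·g_q span a space of dimension min(rank(M), q) with probability 1. So q queries can tell whether the rank is at least q, but nothing beyond that. This is not claimed to be the paper's algorithm or its bound; the function name `rank_up_to`, the seeds, and the tolerance are assumptions chosen for the example.

```python
import numpy as np


def rank_up_to(matvec, d, q, seed=0):
    """Return the rank of [M g_1, ..., M g_q] for q random Gaussian queries.

    Over the reals this equals min(rank(M), q) with probability 1.
    `matvec` is any callable v -> M @ v; the matrix itself is never read.
    """
    rng = np.random.default_rng(seed)
    answers = [matvec(rng.standard_normal(d)) for _ in range(q)]
    # tol=1e-8 is an assumed numerical threshold for treating a singular value as zero.
    return int(np.linalg.matrix_rank(np.column_stack(answers), tol=1e-8))


# A 20x20 matrix of rank 3, built as a product of thin Gaussian factors.
rng = np.random.default_rng(1)
M = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 20))

low = rank_up_to(lambda v: M @ v, d=20, q=2)   # too few queries: capped at 2
full = rank_up_to(lambda v: M @ v, d=20, q=5)  # enough queries: reveals rank 3
```

With two queries the estimate is capped at 2; with five queries the true rank 3 becomes visible.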

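Key Findings

Approximating the rank of an n×n matrix M to within a factor t requires exactly n/t + 1 queries, and this bound is tight for both randomized and adaptive algorithms.
Deciding graph connectivity from the bipartite adjacency matrix requires Ω(n/log n) queries, even for randomized algorithms with constant success probability.
Detecting a triangle from the adjacency matrix requires Ω(n/log n) queries, matching the lower bound for triangle counting in two-party communication complexity.
For the same graph, when the signed edge-vertex incidence matrix is used instead, connectivity can be decided with only polylog(n) non-adaptive queries, a strong representation-dependent separation.
Query complexity differs significantly depending on whether queries are restricted to the right (M·v) or allowed on both sides (u^T·M and M·v), which gives an explicit separation.
The model shows that the choice of field (for example, the reals versus a finite field) and the matrix representation (bipartite adjacency versus incidence matrix) can change query complexity by polynomial or even exponential factors.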
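
The effect of query direction can be seen in a toy example, sketched here under stated assumptions; it does not reproduce the paper's separation proofs. Over F_2, a single left query with the all-ones vector returns the parity of every column at once, whereas the same vector used as a right query returns row parities instead. The helper names `left_query_f2` and `right_query_f2` are hypothetical.

```python
def left_query_f2(M, u):
    """Compute u^T·M over F_2 (all arithmetic reduced mod 2)."""
    n, d = len(M), len(M[0])
    return [sum(u[i] * M[i][j] for i in range(n)) % 2 for j in range(d)]


def right_query_f2(M, v):
    """Compute M·v over F_2 (all arithmetic reduced mod 2)."""
    n, d = len(M), len(M[0])
    return [sum(M[i][j] * v[j] for j in range(d)) % 2 for i in range(n)]


M = [[1, 1, 1],
     [1, 0, 0],
     [0, 0, 0]]
ones = [1, 1, 1]

col_parity = left_query_f2(M, ones)   # parity of each column, one left query
row_parity = right_query_f2(M, ones)  # parity of each row, one right query
```

Here the column sums are 2, 1, 1 and the row sums are 3, 1, 0, so the two queries return different information about the same matrix.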


This review was generated by AI and reviewed by a human editor.