QUICK REVIEW

[Paper Review] In-place fast polynomial modular remainder

Jean‐Guillaume Dumas, Bruno Grenet|arXiv (Cornell University)|Feb 27, 2023

Polynomial and algebraic computation | Cited by 1

One-Sentence Summary

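This paper presents the first fast, in-place algorithms for the polynomial modular remainder and related operations, using a novel relaxation of the in-place model that allows the inputs to be modified temporarily provided they are restored afterwards. Using Toeplitz matrix representations and recursive reductions, the authors compute the remainder of A (degree n) modulo B (degree m ≤ n) in O((n/m) M(m) log m) operations when M is quasi-linear, and in O((n/m) M(m)) operations, matching the standard bound, when M(n) = Θ(n^{1+ε}) with ε > 0. The method applies to in-place modular multiplication and accumulated modular operations in polynomial extensions.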

ABSTRACT

We consider the simultaneously fast and in-place computation of the Euclidean polynomial modular remainder $R(X) \equiv A(X) \bmod B(X)$ with $A$ and $B$ of respective degrees $n$ and $m \le n$. But fast algorithms for this usually come at the expense of (potentially large) extra temporary space. To remain in-place a further issue is to avoid the storage of the whole quotient $Q(X)$ such that $A=BQ+R$. If the multiplication of two polynomials of degree $k$ can be performed with $M(k)$ operations and $O(k)$ extra space, and if it is allowed to use the input space of $A$ or $B$ for intermediate computations, but putting $A$ and $B$ back to their initial states after the completion of the remainder computation, we here propose an in-place algorithm (that is with its extra required space reduced to $O(1)$ only) using at most $O(\frac{n}{m} M(m)\log(m))$ arithmetic operations, if $M(m)$ is quasi-linear, or $O(\frac{n}{m} M(m))$ otherwise. We also propose variants that compute -- still in-place and with the same kind of complexity bounds -- the over-place remainder $A(X) \equiv A(X) \bmod B(X)$, the accumulated remainder $R(X) \mathrel{+}= A(X) \bmod B(X)$ and the accumulated modular multiplication $R(X) \mathrel{+}= A(X)C(X) \bmod B(X)$. To achieve this, we develop techniques for Toeplitz matrix operations whose output is also part of the input. Fast and in-place accumulating versions are obtained for the latter, and thus for convolutions, and then used for polynomial remaindering. This is realized via further reductions to accumulated polynomial multiplication, for which fast in-place algorithms have recently been developed.
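The setting and the bounds stated in the abstract can be written out compactly as follows. This only restates the abstract's claims; the name $T(n,m)$ for the operation count is an assumption introduced here for readability and does not come from the paper.

```latex
% Euclidean division of A (degree n) by B (degree m <= n):
A(X) = B(X)\,Q(X) + R(X), \qquad \deg R < m, \qquad \deg Q = n - m .

% Operation count of the in-place remainder, with O(1) extra space
% (T(n,m) is a label assumed here, not the paper's notation):
T(n,m) =
\begin{cases}
  O\!\left(\dfrac{n}{m}\, M(m) \log m\right) & \text{if } M \text{ is quasi-linear},\\[2ex]
  O\!\left(\dfrac{n}{m}\, M(m)\right)        & \text{otherwise}.
\end{cases}
```

The quotient $Q$ has $n-m+1$ coefficients, which is why storing it in full is incompatible with $O(1)$ extra space when $n$ is much larger than $m$.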

Motivation and Objectives

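Design fast, in-place algorithms for polynomial and matrix operations in which the output variables are also used as inputs, overcoming the usual trade-off between space and time in fast algorithms.
Compute the Euclidean polynomial remainder R(X) ≡ A(X) mod B(X) in place; conventional methods need O(n) extra space to store the quotient.
Extend the in-place model so that inputs may be modified temporarily and restored after the computation, enabling new algorithms that are both fast and space-efficient.
Provide in-place variants of fast polynomial multiplication, convolution, and modular operations, including accumulating and over-place operations.
Achieve complexity bounds comparable to those of the standard non-in-place algorithms when M(n) = Θ(n^{1+ε}) with ε > 0.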

Proposed Method

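Relax the in-place model: inputs may be modified temporarily as long as they are restored to their original state once the computation completes.
Express polynomial operations as products of Toeplitz or circulant matrices with vectors, and use recursive decomposition to compute them efficiently in place.
Use recursive blocking and generalized convolutions to reduce polynomial multiplication and remainder computation to in-place operations on smaller subproblems.
Apply an "undo" technique: perform intermediate computations in place, then restore the original inputs by reversing the operations, which preserves correctness.
Reduce the modular remainder to short products and formal power series division, through recursive splitting and overwriting of the quotient.
Use fast matrix algorithms (such as Strassen-like algorithms) and in-place linear algebra routines (such as TRMM and TRSM) as building blocks for polynomial operations.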
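The relaxed model and the "undo" idea listed above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it uses quadratic-time schoolbook division instead of fast multiplication, and the function name `remainder_with_restore`, the coefficient order (lowest degree first), and the choice of the prime field GF(p) are all assumptions made for this illustration. It shows only the principle: A is overwritten during the computation, the remainder is copied out, and the same steps are then reversed so that A ends in its initial state, with a constant number of extra scalars.

```python
def remainder_with_restore(a, b, r, p):
    """Write A mod B into r over GF(p), then restore A (hypothetical helper).

    a: coefficients of A, lowest degree first, entries in [0, p), length n+1
    b: coefficients of B, lowest degree first, length m+1, b[m] invertible mod p
    r: output list of length m
    Quadratic-time illustration only; extra space is a few scalars.
    """
    n, m = len(a) - 1, len(b) - 1
    inv_lead = pow(b[m], -1, p)  # modular inverse (Python 3.8+)

    # Forward pass: schoolbook division, from the top coefficient down.
    # Each quotient coefficient overwrites the coefficient of A it cancels.
    for i in range(n - m, -1, -1):
        q = a[i + m] * inv_lead % p
        for j in range(m):
            a[i + j] = (a[i + j] - q * b[j]) % p
        a[i + m] = q

    # The low m coefficients of a now hold the remainder.
    for j in range(m):
        r[j] = a[j]

    # Undo pass: replay the steps in reverse order to rebuild A.
    for i in range(n - m + 1):
        q = a[i + m]
        for j in range(m):
            a[i + j] = (a[i + j] + q * b[j]) % p
        a[i + m] = q * b[m] % p


a = [1, 2, 3, 4, 5]              # 1 + 2x + 3x^2 + 4x^3 + 5x^4 over GF(7)
r = [0, 0]
remainder_with_restore(a, [1, 0, 1], r, 7)   # modulus x^2 + 1
print(r)                         # [3, 5], i.e. 3 + 5x
print(a)                         # [1, 2, 3, 4, 5], A is back to its input state
```

In this sketch the quotient is stored temporarily inside A's own space. The paper's contribution is to obtain the same effect with fast multiplication, which this example does not attempt.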

Experimental Results

Research Questions

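RQ1: Can a fast polynomial modular remainder be computed in place, with only O(1) extra space and without storing the full quotient?
RQ2: Can in-place fast polynomial multiplication algorithms be designed with complexity comparable to that of their non-in-place counterparts?
RQ3: Can a relaxed in-place model with an input-restoration mechanism break the inherent trade-off between speed and space in fast algorithms?
RQ4: What is the minimal complexity overhead of computing in place the remainder (rather than the quotient) of a polynomial division?
RQ5: Can in-place algorithms be extended to accumulating operations such as R += A·C mod B without extra space?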

Key Findings

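The proposed in-place remainder algorithm performs O((n/m) M(m) log m) arithmetic operations when M is quasi-linear, a logarithmic factor more than the non-in-place version.
When M(n) = Θ(n^{1+ε}) with ε > 0, the bound for the in-place algorithm matches that of the non-in-place algorithm, O((n/m) M(m)).
An in-place variant of Karatsuba multiplication was implemented and shown to perform close to the state-of-the-art NTL library.
The authors present the first in-place algorithm that computes only the remainder of a polynomial division, removing the need to store the quotient.
The technique supports in-place accumulation of modular multiplications, including R += A·C mod B, with the same complexity bounds.
Over-place variants of Toeplitz and circulant matrix operations are derived, supporting efficient in-place solution of linear systems and matrix-vector products.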
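The link between Toeplitz matrix-vector products and accumulated polynomial multiplication can be checked on a small example. The Python sketch below is a quadratic-time illustration under stated assumptions, not the fast algorithm from the paper: the helper names `toeplitz_from_poly`, `matvec`, and `accumulate_product` are hypothetical, and coefficients are listed lowest degree first. It shows that multiplying the banded Toeplitz matrix built from A by the coefficient vector of C gives the coefficients of A·C, and that the same result can be accumulated directly into an existing array R without a temporary product.

```python
def toeplitz_from_poly(a, k):
    """(len(a)+k-1) x k Toeplitz matrix T with T[i][j] = a[i-j], 0 outside A."""
    rows = len(a) + k - 1
    return [[a[i - j] if 0 <= i - j < len(a) else 0 for j in range(k)]
            for i in range(rows)]


def matvec(t, v):
    """Plain matrix-vector product, used only as a reference."""
    return [sum(x * y for x, y in zip(row, v)) for row in t]


def accumulate_product(r, a, c):
    """r += a * c as polynomials, written straight into r (no temporary)."""
    for i, ai in enumerate(a):
        for j, cj in enumerate(c):
            r[i + j] += ai * cj


a = [1, 2, 3]                    # 1 + 2x + 3x^2
c = [4, 5]                       # 4 + 5x
print(matvec(toeplitz_from_poly(a, len(c)), c))   # [4, 13, 22, 15]

r = [10, 0, 0, 0]                # existing content of the output array
accumulate_product(r, a, c)
print(r)                         # [14, 13, 22, 15]
```

The explicit matrix is built here only to verify the correspondence. An in-place method would never materialize it, since that alone costs more than O(1) extra space.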

This review was generated by AI and checked by a human editor.