QUICK REVIEW

[Paper Review] A more accurate rational non-commutative algorithm for multiplying 4x4 matrices using 48 multiplications

Jean‐Guillaume Dumas, Clément Pernet|arXiv (Cornell University)|2026. 03. 19.

Tensor decomposition and applications|Citations: 0

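One-Line Summary

The paper proposes a more accurate variant of the 4x4x4:48 matrix multiplication algorithm over any ring containing an inverse of 2; its error bound exponent in the max-norm setting is about 2.386, and its accuracy improves in practice.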

ABSTRACT

We propose a more accurate variant of an algorithm for multiplying 4x4 matrices using 48 multiplications over any ring containing an inverse of 2. This algorithm has an error bound exponent of only $\log_4 \gamma_{\infty,2} \approx 2.386$. It also reaches a better accuracy w.r.t. max-norm in practice, when compared to previously known such fast algorithms. Furthermore, we propose a straight line program of this algorithm, giving a leading constant in its complexity bound of $\frac{387}{32} n^{2+\log_4 3} + o\left(n^{2+\log_4 3}\right)$ operations over any ring containing an inverse of 2.

Introduction: An algorithm to multiply two 4x4 complex-valued matrices requiring only 48 non-commutative multiplications was introduced in [16] using a pipeline of large language models orchestrated by an evolutionary coding agent. A matrix multiplication algorithm with that many non-commutative multiplications is denoted by $\langle 4 \times 4 \times 4 : 48 \rangle$ in the sequel. An equivalent variant of the associated tensor decomposition defining this algorithm, but over the rationals (more precisely over any ring containing an inverse of 2), was then given in [8]. Most error analyses of sub-cubic time matrix multiplication algorithms [3, 4, 2, 1, 17] are given in the max-norm setting: bounding the largest output error as a function of the max-norm product of the vectors of input matrix coefficients. In this setting, Strassen's algorithm has shown the best accuracy bound (proven minimal under some assumptions in [2]). In [6, 8], the authors relaxed this setting by shifting the focus to the 2-norm for input and/or output; that allowed them to propose a $\langle 2 \times 2 \times 2 : 7 \rangle$ variant with an improved accuracy bound. Experiments show that this variant performs best even when measuring the max-norm of the error bound. We present in this note a variant of the recent $\langle 4 \times 4 \times 4 : 48 \rangle$ algorithm over the rationals (again in the same orbit under De Groot isotropies [10]) that is more numerically accurate w.r.t. max-norm in practice. In particular, our new variant improves on the error bound exponent, from $\log_2 \gamma_{\infty,2} \approx 2.577$ to $\log_4 \gamma_{\infty,2} \approx 2.386$.

Consider the product of an $M \times K$ matrix $A$ by a $K \times N$ matrix $B$. It is computed by a $\langle m, k, n \rangle$ algorithm represented by the matrices $L$, $R$, $P$ applied recursively on $\ell$ recursive levels, and the resulting $m_0 \times k_0$ by $k_0 \times n_0$ products are performed using an algorithm $\beta$. Here $M = m_0 m^{\ell}$, $K = k_0 k^{\ell}$ and $N = n_0 n^{\ell}$. The accuracy bound below uses any (possibly different) p-norms and q-norms for its left-hand side, $\|\bullet\|_p$, and right-hand side, $\|\bullet\|_q$. The associated dual norms are denoted by $\|\bullet\|_{p^\star}$ and $\|\bullet\|_{q^\star}$ respectively. Note that these are vector norms, hence $\|A\|_p$ for a matrix $A$ in $\mathbb{R}^{m \times n}$ denotes $\|\mathrm{Vect}(A)\|_p$ and is the p-norm of the $mn$-dimensional vector of its coefficients, and not a matrix norm.
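
The vector-norm convention in the abstract can be illustrated with a minimal sketch, assuming NumPy. The helper names `vect_norm` and `dual_exponent` are hypothetical and do not come from the paper. The dual exponent uses the standard relation 1/p + 1/p* = 1.

```python
import numpy as np

def vect_norm(A, p):
    """p-norm of the flattened coefficient vector of A (not an induced matrix norm)."""
    return np.linalg.norm(np.ravel(A), ord=p)

def dual_exponent(p):
    """Exponent p* of the dual norm, defined by 1/p + 1/p* = 1."""
    if p == 1:
        return np.inf
    if p == np.inf:
        return 1.0
    return p / (p - 1)

A = np.array([[1.0, 2.0], [3.0, 4.0]])

# Vector max-norm: largest absolute coefficient.
print(vect_norm(A, np.inf))       # 4.0
# Induced matrix infinity-norm: largest absolute row sum, a different quantity.
print(np.linalg.norm(A, np.inf))  # 7.0
# The 2-norm is its own dual.
print(dual_exponent(2))           # 2.0
```

For the same matrix the two notions differ (4.0 versus 7.0), which is why the abstract spells out which one is meant.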

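Research Motivation and Goals

Motivate sub-cubic matrix multiplication with improved numerical stability in the max-norm setting.
Develop a more accurate variant of the 4x4x4:48 algorithm over rings containing 1/2.
Provide a concrete LRP representation and a straight-line program (SLP) for implementation.
Compare the theoretical error bounds with existing algorithms and demonstrate practical accuracy.
Present complexity bounds and discuss alternative-basis variants.

Proposed Method

Represent the bilinear map of 4x4 matrix multiplication using an (L, R, P) triple.
Introduce a new variant of the 4x4x4:48 algorithm, with specific L, R, P, that reduces the growth factor γ_{p,q} for (p,q) = (2,2) and (∞,2).
Derive explicit straight-line programs for L, R, P (Table 1) and optimized arithmetic, including Hadamard products.
Analyze the forward error using the growth factor γ_{p,q} and provide norm-dependent bounds f_{p,q}.
Present a detailed complexity bound and give an alternative-basis variant in Appendix A.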
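
As a minimal sketch of how an (L, R, P) triple drives a recursive product, the example below uses Strassen's classical 2x2x2:7 algorithm as a stand-in. It is not the paper's 4x4x4:48 triple, whose matrices are not reproduced in this review. The function name `lrp_multiply`, the row-major block ordering, and the orientation of P are assumptions made for illustration; the paper's conventions may differ.

```python
import numpy as np

# Strassen's 2x2x2:7 algorithm as an (L, R, P) triple (illustrative stand-in).
# Blocks are ordered row-major: [x11, x12, x21, x22].
L = np.array([[ 1, 0, 0,  1],
              [ 0, 0, 1,  1],
              [ 1, 0, 0,  0],
              [ 0, 0, 0,  1],
              [ 1, 1, 0,  0],
              [-1, 0, 1,  0],
              [ 0, 1, 0, -1]])
R = np.array([[ 1, 0, 0,  1],
              [ 1, 0, 0,  0],
              [ 0, 1, 0, -1],
              [-1, 0, 1,  0],
              [ 0, 0, 0,  1],
              [ 1, 1, 0,  0],
              [ 0, 0, 1,  1]])
P = np.array([[1,  0, 0, 1, -1, 0, 1],
              [0,  0, 1, 0,  1, 0, 0],
              [0,  1, 0, 1,  0, 0, 0],
              [1, -1, 1, 0,  0, 1, 0]])

def lrp_multiply(A, B, levels):
    """Multiply square matrices using `levels` recursive applications of (L, R, P).

    The matrix size must be divisible by 2**levels.
    """
    if levels == 0:
        return A @ B  # base-case algorithm: classical product
    h = A.shape[0] // 2
    a = [A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]]
    b = [B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]]
    products = []
    for l_row, r_row in zip(L, R):
        # Pre-additions: one linear combination of blocks per row of L and R.
        left = sum(c * blk for c, blk in zip(l_row, a))
        right = sum(c * blk for c, blk in zip(r_row, b))
        # One non-commutative block product per row (left factor stays on the left).
        products.append(lrp_multiply(left, right, levels - 1))
    # Post-additions: each output block is a combination of the 7 products.
    c = [sum(coef * m for coef, m in zip(p_row, products)) for p_row in P]
    return np.block([[c[0], c[1]], [c[2], c[3]]])

rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(4, 4))
B = rng.integers(-5, 6, size=(4, 4))
assert np.array_equal(lrp_multiply(A, B, levels=2), A @ B)
```

Integer inputs keep the check exact. With floating-point inputs, the difference between this result and the classical product is the kind of forward error that the growth factor is meant to bound.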

Experimental Results

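Research Questions

RQ1: What is the maximum achievable accuracy improvement for a rational non-commutative 4x4 matrix multiplication algorithm using 48 multiplications over a ring containing 1/2?
RQ2: How does the new variant affect the forward error bound in the (p,q) norms, in particular for (2,2) and (∞,2)?
RQ3: What are the explicit L, R, P representations and the resulting straight-line program for implementation?
RQ4: How does the new variant compare with existing 4x4x4:48 schemes in terms of theoretical growth factor and practical max-norm accuracy?
RQ5: What are the complexity implications of adopting an alternative basis?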

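Key Results

The new variant achieves a forward error bound with exponent log_4 γ_{∞,2} ≈ 2.386.
The growth factor γ_{2,2} is reduced to (1+sqrt(2))·64, improving the order of accuracy relative to previous 4x4x4:48 schemes.
The leading constant of the complexity bound is 387/32, and the refined count is approximately 12.09375 n^{2+log_4 3} - 11.09375 n^2 operations (over a ring containing 1/2).
The variant comes with explicit straight-line programs (SLP) for L, R, P, enabling practical implementation and benchmarking.
Appendix A presents an alternative-basis variant with similar accuracy and a lower leading constant in the bound (8n^{2+log_4 3}+o(...)).
Experiments indicate that the new 4x4x4:48 variant is more accurate in max-norm output error than the previous 2x2x2:7 and 4x4x4:48 schemes.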
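
The constants quoted above can be cross-checked with a short arithmetic sketch. The function names `op_count` and `alt_basis_leading_term` are hypothetical, and the formulas are only the ones quoted in this review, not an independent derivation.

```python
import math
from fractions import Fraction

# Exponent of the operation count: 48 multiplications on a 4x4 block split
# give log_4(48) = 2 + log_4(3).
omega = 2 + math.log(3, 4)

# Leading constant quoted in the review; the quoted n^2 coefficient is one less.
lead = Fraction(387, 32)
quad = lead - 1

def op_count(n):
    """Evaluate 387/32 * n^omega - 355/32 * n^2 as quoted in the results."""
    return float(lead) * n**omega - float(quad) * n**2

def alt_basis_leading_term(n):
    """Leading term 8 * n^omega quoted for the alternative-basis variant."""
    return 8 * n**omega

print(float(lead), float(quad))  # 12.09375 11.09375
print(round(omega, 4))           # 2.7925
print(op_count(1))               # 1.0
```

The value at n = 1 is a useful sanity check: the two quoted coefficients differ by exactly 1, so the formula counts a single operation for a 1x1 product.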

This review was generated by AI and reviewed by a human editor.