[Paper Explained] 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data
TL;DR: Introduces SE(3)-equivariant 3D Steerable CNNs that operate on scalar, vector, and tensor fields. The paper derives an analytic steerable kernel basis that guarantees rotational and translational equivariance, and reports improved performance on amino acid environment prediction and protein structure classification.
We present a convolutional network that is equivariant to rigid body motions. The model uses scalar-, vector-, and tensor fields over 3D Euclidean space to represent data, and equivariant convolutions to map between such representations. These SE(3)-equivariant convolutions utilize kernels which are parameterized as a linear combination of a complete steerable kernel basis, which is derived analytically in this paper. We prove that equivariant convolutions are the most general equivariant linear maps between fields over R^3. Our experimental results confirm the effectiveness of 3D Steerable CNNs for the problem of amino acid propensity prediction and protein structure classification, both of which have inherent SE(3) symmetry.
Research Motivation and Objectives
- Motivate and formalize the need for models respecting SE(3) symmetry in volumetric data.
- Develop a theory and practical implementation of SE(3)-equivariant convolutions using steerable kernel bases.
- Show data-efficient improvements on molecular and protein structure tasks.
- Provide discretization strategies and nonlinearities that preserve equivariance.
- Demonstrate that the approach is implementable with minor code changes from standard 3D CNNs.
Proposed Method
- Represent data as fields over R^3 comprising scalars, vectors, and tensors.
- Parameterize convolution kernels as linear combinations of steerable basis kernels derived from SE(3) constraints.
- Derive basis kernels from irreducible SO(3) representations and spherical harmonics with radial basis functions.
- Use cross-correlation with rotation-steerable kernels to achieve equivariant linear maps between feature spaces.
- Introduce a gated nonlinearity to preserve equivariance for non-scalar features.
- Provide discretization with angular frequency cutoffs and radial Gaussians to mitigate aliasing; apply low-pass filtering before downsampling to improve performance.
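The kernel construction above can be illustrated with the simplest non-trivial case: a kernel that maps a scalar field to a vector field. The sketch below assumes the steerability constraint has the form K(Rx) = rho_out(R) K(x) rho_in(R)^-1, with a trivial input representation and the ordinary 3x3 rotation matrix as the output representation. It writes the order-1 spherical harmonic in Cartesian coordinates as x/|x| and uses a single Gaussian radial shell. The function names, the shell center and width, and the test rotation are illustrative choices, not taken from the paper's code.

```python
import math

def rotation_z(theta):
    """3x3 rotation about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(theta):
    """3x3 rotation about the x axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def radial_gaussian(r, center=1.0, sigma=0.6):
    """Gaussian shell around radius `center` (values are assumptions)."""
    return math.exp(-0.5 * ((r - center) / sigma) ** 2)

def scalar_to_vector_kernel(x, center=1.0, sigma=0.6):
    """K(x) = phi(|x|) * x / |x|: radial profile times an order-1 angular part."""
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        return [0.0, 0.0, 0.0]  # the angular part is undefined at the origin
    w = radial_gaussian(r, center, sigma)
    return [w * c / r for c in x]

# Check the constraint K(Rx) = R K(x) for one rotation and one point.
R = matmul(rotation_z(0.7), rotation_x(-1.2))
x = [0.3, -1.1, 0.8]
lhs = scalar_to_vector_kernel(matvec(R, x))
rhs = matvec(R, scalar_to_vector_kernel(x))
max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_err)
```

Because the radial factor depends only on |x|, which rotations preserve, all of the rotation dependence sits in the angular part. A learned kernel would then be a weighted sum of several such basis kernels with different radial shells and angular orders.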
Experimental Results
Research Questions
- RQ1: Can SE(3)-equivariant convolutions with steerable kernel bases provide the most general equivariant linear maps between R^3 fields?
- RQ2: Do SE(3)-equivariant networks improve performance and data efficiency on tasks with inherent rotational symmetry, such as amino acid environment prediction and protein structure classification?
- RQ3: How can one discretize and implement SE(3) steerable kernels in practical 3D CNNs while controlling aliasing?
- RQ4: What nonlinearities preserve equivariance for higher-order (non-scalar) features in steerable networks?
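The question about nonlinearities can be made concrete with a small sketch of a gated nonlinearity acting on a single vector feature. It assumes the gate is a rotation-invariant scalar passed through a sigmoid and multiplied into the vector, and contrasts this with a component-wise ReLU, which does not commute with rotations. The helper names and the numeric values are hypothetical and serve only as illustration.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def rotate_z(v, theta):
    """Rotate a 3-vector about the z axis by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1], v[2]]

def gated_nonlinearity(vector, gate_scalar):
    """Scale the whole vector by sigmoid(gate); its direction is untouched."""
    g = sigmoid(gate_scalar)
    return [g * c for c in vector]

def componentwise_relu(vector):
    """ReLU per component; this depends on the choice of coordinate axes."""
    return [max(0.0, c) for c in vector]

def max_abs_diff(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

v = [1.0, -2.0, 0.5]   # a vector feature at one voxel
gate = 0.3             # a scalar feature; unchanged by rotation
theta = math.pi / 2

# Equivariance check: f(R v) should equal R f(v).
gate_err = max_abs_diff(
    gated_nonlinearity(rotate_z(v, theta), gate),
    rotate_z(gated_nonlinearity(v, gate), theta),
)
relu_err = max_abs_diff(
    componentwise_relu(rotate_z(v, theta)),
    rotate_z(componentwise_relu(v), theta),
)
print(gate_err, relu_err)
```

The gate commutes with the rotation because it multiplies every component by the same invariant number, whereas the component-wise ReLU clips along fixed coordinate axes and so gives a different result after rotation.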
Key Findings
- 3D Steerable CNNs achieve SE(3) equivariance by expressing convolutions as cross-correlations with rotation-steerable kernels built from irreducible SO(3) representations.
- On amino acid environment prediction, the steerable model attains substantially better accuracy (0.58 test accuracy) than a conventional CNN under the same setup, while using fewer parameters in some configurations.
- On SHREC17 3D shape classification, the approach performs comparably to the state of the art with fewer parameters, using voxel-based inputs.
- On the CATH protein architecture task, the steerable network achieves higher accuracy with 100x fewer parameters than a strong 3D CNN baseline, and maintains advantage as training data size is reduced.
- A Tetris-like 3D experiment demonstrates near-ideal rotation generalization (99±2% accuracy) versus a baseline CNN’s 27±7% when rotated, confirming equivariance in practice.
- The paper provides a complete framework showing equivariant kernels, downsampling with anti-aliasing, and a practical path to convert to a standard 3D CNN after training.
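The role of low-pass filtering before downsampling can be seen in a one-dimensional toy example. This is a minimal sketch under simplifying assumptions: it uses a 1D periodic signal instead of a 3D volume, and a binomial kernel (0.25, 0.5, 0.25) as a stand-in for a Gaussian low-pass filter. None of the names or values come from the paper's implementation.

```python
def blur(signal, kernel=(0.25, 0.5, 0.25)):
    """Circular convolution with a small symmetric low-pass kernel."""
    n = len(signal)
    half = len(kernel) // 2
    return [
        sum(kernel[j] * signal[(i + j - half) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def downsample(signal, stride=2):
    """Keep every `stride`-th sample."""
    return signal[::stride]

# Highest representable frequency: +1, -1, +1, -1, ...
signal = [(-1.0) ** i for i in range(16)]

naive = downsample(signal)           # aliases to a constant signal
filtered = downsample(blur(signal))  # the high frequency is removed first

print(naive)
print(filtered)
```

Without the filter, the alternating signal is misread as a constant after striding, and shifting the input by one sample would flip its sign. With the filter, the frequency that the coarser grid cannot represent is suppressed before subsampling.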
This summary was generated by AI and reviewed by a human editor.