[Paper Review] Deep Hyperspherical Learning
SphereNet replaces inner-product convolution with SphereConv on hyperspheres and uses angular GA-Softmax losses, improving training stability, convergence speed, and accuracy across networks.
Convolution as inner product has been the founding basis of convolutional neural networks (CNNs) and the key to end-to-end visual representation learning. Benefiting from deeper architectures, recent CNNs have demonstrated increasingly strong representation abilities. Despite such improvement, the increased depth and larger parameter space have also led to challenges in properly training a network. In light of such challenges, we propose hyperspherical convolution (SphereConv), a novel learning framework that gives angular representations on hyperspheres. We introduce SphereNet, deep hyperspherical convolution networks that are distinct from conventional inner product based convolutional networks. In particular, SphereNet adopts SphereConv as its basic convolution operator and is supervised by generalized angular softmax loss - a natural loss formulation under SphereConv. We show that SphereNet can effectively encode discriminative representation and alleviate training difficulty, leading to easier optimization, faster convergence and comparable (even better) classification accuracy over convolutional counterparts. We also provide some theoretical insights for the advantages of learning on hyperspheres. In addition, we introduce the learnable SphereConv, i.e., a natural improvement over prefixed SphereConv, and SphereNorm, i.e., hyperspherical learning as a normalization method. Experiments have verified our conclusions.
Motivation & Objective
- Address training difficulties in deep CNNs caused by depth and large parameter spaces.
- Propose hyperspherical convolution (SphereConv) and angular supervision to improve optimization and generalization.
- Develop SphereNet variants including learnable SphereConv and SphereNorm.
- Demonstrate faster convergence and comparable or better accuracy on CIFAR and on large-scale datasets such as ImageNet.
Proposed method
- Define SphereConv as an angular similarity between a kernel and an input patch on the unit hypersphere, with three fixed instances (linear, cosine, and sigmoid) plus a learnable variant.
- Replace standard convolution with SphereConv and supervise with generalized angular softmax (GA-Softmax) losses (including W-Softmax as a special case).
- Provide theoretical insights suggesting that learning on hyperspheres improves the conditioning of the optimization problem and avoids sensitivity to weight norms.
- Extend SphereConv to fully connected layers and existing architectures (e.g., VGG, GoogLeNet, ResNet) with SphereNorm as a complementary normalization.
- Discuss training strategies, back-propagation for SphereConv, and regularization via approximate orthogonality of kernels.
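The operators and loss listed above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the function names (`angle`, `g_linear`, `g_cosine`, `g_sigmoid`, `sphere_conv`, `ga_softmax_loss`), the default `k=0.3`, and the exact closed forms are assumptions chosen so that each operator maps an angle in [0, π] to a response in [-1, 1]. The loss is shown without any angular margin, which is also an assumption. The formulas should be checked against the paper before being relied on.

```python
import math

def angle(w, x):
    """Angle in [0, pi] between a flattened kernel w and an input patch x."""
    dot = sum(a * b for a, b in zip(w, x))
    norm_w = math.sqrt(sum(a * a for a in w))
    norm_x = math.sqrt(sum(b * b for b in x))
    cos_theta = dot / (norm_w * norm_x)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))

def g_linear(theta):
    # Assumed form: a straight line from g(0) = 1 down to g(pi) = -1.
    return 1.0 - 2.0 * theta / math.pi

def g_cosine(theta):
    return math.cos(theta)

def g_sigmoid(theta, k=0.3):
    # Assumed form: a rescaled sigmoid centred at pi/2, written with tanh
    # for numerical stability. k controls how sharp the transition is.
    return math.tanh((math.pi / 2.0 - theta) / (2.0 * k)) / math.tanh(math.pi / (4.0 * k))

def sphere_conv(w, x, g=g_cosine):
    """Response at one location: depends only on the angle, not on the norms."""
    return g(angle(w, x))

def ga_softmax_loss(class_weights, feature, label, g=g_cosine):
    """Assumed margin-free angular softmax: logits are ||x|| * g(theta_j).
    With g = cosine this reduces to a weight-normalized softmax."""
    norm_x = math.sqrt(sum(v * v for v in feature))
    logits = [norm_x * g(angle(w, feature)) for w in class_weights]
    m = max(logits)  # subtract the max for a stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

if __name__ == "__main__":
    w, x = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    for g in (g_linear, g_cosine, g_sigmoid):
        print(g.__name__, sphere_conv(w, x, g))
    # Rescaling the kernel leaves the response unchanged.
    print(sphere_conv([10.0 * v for v in w], x))
```

Because the response is a function of the angle alone, multiplying a kernel by any positive constant does not change the output, which is one way to read the claim about avoiding weight-norm sensitivity.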
Experimental results
Research questions
- RQ1: Does learning on hyperspheres improve conditioning and optimization speed in deep networks?
- RQ2: Do SphereConv and angular losses consistently outperform traditional inner-product convolutions across architectures and datasets?
- RQ3: How do different SphereConv variants (linear, cosine, sigmoid) and GA-Softmax losses compare in accuracy and training stability?
- RQ4: Can SphereConv function effectively as normalization (SphereNorm) and enable learnable parameters for further gains?
Key findings
- SphereConv operators consistently outperform original convolution across architectures and loss choices.
- Sigmoid SphereConv with a suitably chosen parameter often yields the best accuracy among tested variants.
- SphereNet achieves faster convergence and greater stability, enabling training of very deep plain networks without residual shortcuts.
- Learnable SphereConv further boosts performance, indicating layer-wise adaptation of angular parameters is beneficial.
- SphereNorm complements BatchNorm and can enhance performance when used together.
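To make the role of the sigmoid operator's parameter more concrete, the short sketch below evaluates one assumed closed form of the sigmoid SphereConv at a fixed angle for several parameter values. The name `g_sigmoid`, the parameter name `k`, and the formula itself are assumptions for illustration and are not taken from this review. Under this form, small `k` gives a step-like response and large `k` approaches the linear operator, which suggests why the choice of parameter, or learning it per layer, could matter.

```python
import math

def g_sigmoid(theta, k):
    # Assumed form: a rescaled sigmoid centred at pi/2 that maps
    # theta in [0, pi] to [-1, 1]; k sets the sharpness of the transition.
    return math.tanh((math.pi / 2.0 - theta) / (2.0 * k)) / math.tanh(math.pi / (4.0 * k))

def g_linear(theta):
    # Assumed linear operator, shown for comparison.
    return 1.0 - 2.0 * theta / math.pi

if __name__ == "__main__":
    theta = math.pi / 4.0
    for k in (0.05, 0.3, 1.0, 10.0):
        print(f"k={k:<5} sigmoid={g_sigmoid(theta, k):.4f} linear={g_linear(theta):.4f}")
```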
This review was created by AI and reviewed by human editors.