QUICK REVIEW

[Paper Review] Rethinking the Harmonic Loss via Non-Euclidean Distance Layers

Maxwell Miller-Golub, Collin Coil | arXiv (Cornell University) | March 10, 2026

Explainable Artificial Intelligence (XAI) | Citations: 0

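One-Line Summary

This paper extends the harmonic loss by replacing the Euclidean distance in the final classification layer with a spectrum of non-Euclidean distances, and evaluates performance, interpretability, and sustainability across vision and language tasks.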

ABSTRACT

Cross-entropy loss has long been the standard choice for training deep neural networks, yet it suffers from interpretability limitations, unbounded weight growth, and inefficiencies that can contribute to costly training dynamics. The harmonic loss is a distance-based alternative grounded in Euclidean geometry that improves interpretability and mitigates phenomena such as grokking, or delayed generalization on the test set. However, the study of harmonic loss remains narrow: only Euclidean distance is explored, and no systematic evaluation of computational efficiency or sustainability was conducted. We extend harmonic loss by systematically investigating a broad spectrum of distance metrics as replacements for the Euclidean distance. We comprehensively evaluate distance-tailored harmonic losses on both vision backbones and large language models. Our analysis is framed around a three-way evaluation of model performance, interpretability, and sustainability. On vision tasks, cosine distances provide the most favorable trade-off, consistently improving accuracy while lowering carbon emissions, whereas Bray-Curtis and Mahalanobis further enhance interpretability at varying efficiency costs. On language models, cosine-based harmonic losses improve gradient and learning stability, strengthen representation structure, and reduce emissions relative to cross-entropy and Euclidean heads. Our code is available at: https://anonymous.4open.science/r/rethinking-harmonic-loss-5BAB/.

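Research Motivation and Objectives

Motivates alternative distance metrics beyond the Euclidean distance in the harmonic loss.
Systematically evaluates a broad range of distance measures within the harmonic loss framework.
Assesses model performance, representation interpretability, and energy efficiency across multiple domains.
Provides theoretical insights into the geometric meaning of the different distances and their implications for convergence.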

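Proposed Method

Replaces the Euclidean distance in the harmonic loss with a distance d(·,·) chosen from a set that includes Manhattan, Chebyshev, Minkowski, cosine, Hamming, Canberra, Bray-Curtis, and Mahalanobis.
Optimizes the classification head as a drop-in component together with the backbone, learning class prototypes (weight vectors).
Evaluates on vision (MNIST, CIFAR-10/100, MarathiSignLanguage, TinyImageNet) and language (OpenWebText with GPT/BERT/Qwen-style decoders) tasks under a consistent training protocol.
Analyzes three aspects: model performance (accuracy/F1, perplexity), interpretability (embedding geometry via PCA metrics), and sustainability (training time, GFLOPs, emissions).
Provides theoretical results showing scale invariance and finite minima for 1-homogeneous distances, along with PAC-Bayes generalization bounds.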
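
The distance-swapping idea above can be illustrated with a minimal sketch. It assumes the commonly cited form of the harmonic loss, in which the class probability is proportional to an inverse power of the distance between a feature vector and each class prototype; this review does not state the formula, so the exponent `n`, the `eps` guard, and all function names here are illustrative assumptions rather than the authors' implementation. Only four of the listed distances are shown, and the code uses plain Python lists instead of a tensor library.

```python
import math

# Candidate distances d(x, w) between a feature vector x and a class prototype w.
def euclidean(x, w):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, w)))

def manhattan(x, w):
    return sum(abs(a - b) for a, b in zip(x, w))

def chebyshev(x, w):
    return max(abs(a - b) for a, b in zip(x, w))

def cosine_distance(x, w):
    dot = sum(a * b for a, b in zip(x, w))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_w = math.sqrt(sum(b * b for b in w))
    return 1.0 - dot / (norm_x * norm_w)

def harmonic_probs(x, prototypes, distance, n=2.0, eps=1e-12):
    """Assumed form: p_i = d_i^(-n) / sum_j d_j^(-n); eps avoids division by zero."""
    inv = [(distance(x, w) + eps) ** (-n) for w in prototypes]
    total = sum(inv)
    return [v / total for v in inv]

def harmonic_loss(x, prototypes, label, distance, n=2.0):
    """Negative log-probability of the true class under the assumed form."""
    return -math.log(harmonic_probs(x, prototypes, distance, n)[label])

# Toy example: one 2-D feature vector and three class prototypes (made-up values).
x = [1.0, 0.0]
prototypes = [[1.0, 0.2], [0.0, 1.0], [-1.0, 0.0]]

for dist in (euclidean, manhattan, chebyshev, cosine_distance):
    print(dist.__name__, harmonic_loss(x, prototypes, label=0, distance=dist))

# Scaling x and all prototypes by the same factor multiplies a 1-homogeneous
# distance by that factor, which cancels in the ratio above.
scaled_x = [3.0 * a for a in x]
scaled_prototypes = [[3.0 * b for b in w] for w in prototypes]
print(harmonic_probs(x, prototypes, euclidean))
print(harmonic_probs(scaled_x, scaled_prototypes, euclidean))
```

Under this assumed form, the last two printed probability vectors agree up to the tiny `eps` term, which is one way to see the scale invariance mentioned for 1-homogeneous distances. Cosine distance is not 1-homogeneous, but it is unchanged by rescaling, so its probabilities are also unaffected in this toy setup.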

Figure 7. Vision: Emissions Averaged Across Seeds and Aggregated Over all 12 Model Backbones.

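Experimental Results

Research Questions

RQ1: Do non-Euclidean harmonic losses achieve higher accuracy or faster convergence than cross-entropy and Euclidean harmonic loss?
RQ2: Do these losses produce more interpretable representations than cross-entropy?
RQ3: Do the performance gains come with higher computational cost, or can they be achieved with comparable or lower energy consumption?
RQ4: How do different distance choices affect the geometry of the learned prototypes and feature space?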

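Key Findings

Cosine-based harmonic loss delivers the most stable overall performance across vision tasks, with competitive accuracy and emissions that are reduced or neutral.
Bray-Curtis and Chebyshev distances increase embedding structure and reduce the number of dimensions needed to explain 90% of the variance, improving interpretability.
Mahalanobis distance shows strong representation clarity but has higher computational cost and is sometimes less stable to optimize on complex data.
On language models, cosine-based and Minkowski distances tend to improve gradient stability and representation structure, and their sustainability profiles are also favorable.
Overall, non-Euclidean distances offer a three-way trade-off among accuracy, interpretability, and sustainability, with cosine generally providing the best balance.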
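
The interpretability measure mentioned in the findings above, the number of dimensions needed to explain 90% of the variance, can be sketched as follows. This assumes the inputs are per-component variances from a PCA of the embeddings (for example, covariance eigenvalues); the review does not give the paper's exact definition, so the function name `dims_for_variance` and the sample values are hypothetical.

```python
def dims_for_variance(component_variances, threshold=0.90):
    """Smallest number of principal components whose cumulative share of
    total variance reaches `threshold` (assumed reading of the PCA metric)."""
    ordered = sorted(component_variances, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total variance must be positive")
    cumulative = 0.0
    for k, variance in enumerate(ordered, start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return k
    return len(ordered)

# Made-up spectra: variance concentrated in few directions vs. spread evenly.
concentrated = [6.0, 3.5, 0.3, 0.2]
spread_out = [1.0, 1.0, 1.0, 1.0]
print(dims_for_variance(concentrated))
print(dims_for_variance(spread_out))
```

A lower count means the embedding variance concentrates in fewer directions, which is the sense in which the findings treat it as a sign of more structured representations.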

Figure 8. Loss convergence behavior with PVT and ResNet50: Training and Validation loss across all datasets with different non-Euclidean harmonic losses.


This review was generated by AI and reviewed by a human editor.