QUICK REVIEW

[Paper Review] Nethira: A Heterogeneity-aware Hierarchical Pre-trained Model for Network Traffic Classification

Chungang Lin, Weiyao Zhang|arXiv (Cornell University)|Jan 30, 2026

Internet Traffic Analysis and Secure E-voting|Cited by 0

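One-Sentence Summary

Nethira is a heterogeneity-aware hierarchical pre-trained model that combines multi-level reconstruction with consistency-regularized fine-tuning to achieve strong classification performance when labeled data is limited.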

ABSTRACT

Network traffic classification is vital for network security and management. The pre-training technology has shown promise by learning general traffic representations from raw byte sequences, thereby reducing reliance on labeled data. However, existing pre-trained models struggle with the gap between traffic heterogeneity (i.e., hierarchical traffic structures) and input homogeneity (i.e., flattened byte sequences). To address this gap, we propose Nethira, a heterogeneity-aware pre-trained model based on hierarchical reconstruction and augmentation. In pre-training, Nethira introduces hierarchical reconstruction at multiple levels (byte, protocol, and packet), capturing comprehensive traffic structural information. During fine-tuning, Nethira proposes a consistency-regularized strategy with hierarchical traffic augmentation to reduce label dependence. Experiments on four public datasets demonstrate that Nethira outperforms seven existing pre-trained models, achieving an average F1-score improvement of 9.11%, and reaching comparable performance with only 1% labeled data on high-heterogeneity network tasks.

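Research Motivation and Goals

Improve network traffic classification in settings where traffic is heterogeneous but the model input is homogeneous (flattened byte sequences).
Develop a pre-training task that captures hierarchical traffic structure (byte, protocol, packet).
Propose a fine-tuning strategy that combines hierarchical augmentation with consistency regularization.
Show that hierarchical pre-training combined with augmentation performs better across multiple datasets and labeled-data regimes.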

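Proposed Method

Convert raw traffic into flattened byte sequences as model input.
Pre-train a Transformer encoder-decoder with hierarchical reconstruction at the byte, protocol, and packet levels.
Guide representation learning with three reconstruction losses, one per level: L_byte, L_protocol, and L_packet.
The pre-training objective is the sum of the three reconstruction losses: L_P = L_byte + L_protocol + L_packet.
Fine-tune with multi-level traffic augmentation (protocol level and packet level) under consistency regularization, so that predictions stay stable on heterogeneous inputs; the fine-tuning objective is L_sup + lambda * L_cons.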
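The two objectives above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the summary gives the pre-training objective as an unweighted sum and the fine-tuning objective as L_sup + lambda * L_cons, but it does not say how L_cons is computed. The sketch assumes a KL divergence between the predictions on the original and the augmented sample, and every function name here is hypothetical rather than taken from the paper.

```python
import math


def pretraining_loss(l_byte, l_protocol, l_packet):
    """L_P = L_byte + L_protocol + L_packet (unweighted sum, as stated in the summary)."""
    return l_byte + l_protocol + l_packet


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Supervised loss L_sup for a single sample with an integer class label."""
    return -math.log(probs[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def finetuning_loss(logits_orig, logits_aug, label, lam=1.0):
    """L_sup + lambda * L_cons for one sample and its augmented view.

    ASSUMPTION: L_cons is modeled as a KL divergence between the two
    predicted distributions; the summary does not specify its form.
    """
    p_orig = softmax(logits_orig)
    p_aug = softmax(logits_aug)
    l_sup = cross_entropy(p_orig, label)
    l_cons = kl_divergence(p_orig, p_aug)
    return l_sup + lam * l_cons


if __name__ == "__main__":
    # Identical predictions incur no consistency penalty; diverging ones do.
    print(finetuning_loss([2.0, 0.0], [2.0, 0.0], label=0))
    print(finetuning_loss([2.0, 0.0], [0.0, 2.0], label=0))
```

The consistency term only penalizes disagreement between the two views, so it needs no label of its own, which fits the summary's claim that the strategy reduces label dependence.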

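Experimental Results

Research Questions

RQ1: Does hierarchical reconstruction capture traffic heterogeneity beyond what a flat byte representation provides?
RQ2: Does hierarchical augmentation with consistency regularization improve generalization on heterogeneous traffic tasks?
RQ3: How does Nethira compare with existing pre-trained models on multiple public datasets?
RQ4: How much data efficiency is gained when labeled data is limited (e.g., 1% to 10%)?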

Key Findings

Method	ISCX-VPN(App) PR	ISCX-VPN(App) RC	ISCX-VPN(App) F1	ISCX-VPN(Service) PR	ISCX-VPN(Service) RC	ISCX-VPN(Service) F1	USTC-TFC PR	USTC-TFC RC	USTC-TFC F1	CIC-IoT PR	CIC-IoT RC	CIC-IoT F1	Avg. F1
FlowPrint	59.04	43.04	44.94	70.21	66.62	64.51	69.76	70.16	68.81	14.73	20.46	15.70	48.49
AppScanner	72.89	53.61	58.03	85.99	75.67	79.13	75.58	57.72	62.77	35.27	23.86	25.45	56.35
FS-Net	49.90	39.96	40.60	71.61	63.63	64.18	90.74	89.66	89.39	37.24	35.39	32.61	56.70
EBSNN	66.07	61.53	62.05	89.84	89.69	89.53	93.48	91.29	90.10	88.92	87.29	85.37	81.76
TFE-GNN	67.20	60.60	61.80	85.97	80.95	82.14	95.91	95.68	95.63	67.05	66.90	64.29	75.97
NetMamba	67.17	58.05	60.32	86.01	78.31	80.27	95.85	94.90	94.83	68.18	70.39	67.55	75.74
YaTC	70.03	58.73	62.33	81.06	78.37	78.06	95.77	94.96	94.87	74.28	75.07	72.36	76.91
PERT	72.16	70.26	70.80	91.42	90.43	90.86	93.24	93.00	92.95	89.58	89.47	88.23	85.71
NetGPT	69.86	71.48	69.40	91.94	92.20	91.92	96.16	95.98	96.00	90.48	90.19	89.08	86.60
ET-BERT	72.00	70.36	70.94	91.40	91.58	91.47	95.21	95.20	95.18	91.29	89.93	88.91	86.63
TraGe	71.38	71.10	70.93	91.75	91.72	91.68	95.94	95.90	95.91	89.02	90.04	88.61	86.78
TrafficFormer	72.32	71.56	71.69	92.15	91.94	91.97	95.17	94.98	95.01	91.25	90.10	89.12	86.95
Nethira	77.33	74.58	75.55	92.35	92.44	92.34	96.62	96.42	96.40	97.26	97.40	97.29	90.40

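Nethira achieves an average F1 improvement of 9.11% over seven pre-trained baselines.
Across the four datasets, Nethira improves F1 over the baselines by 11.49% (App), 5.36% (Service), 1.52% (USTC-TFC), and 18.05% (CIC-IoT).
With only 1% of the labeled data, Nethira reaches an F1 of 0.9452 on CIC-IoT, matching or exceeding some models trained on the full labeled set.
Ablations show a 4.78% performance drop when pre-training without hierarchical reconstruction, a 1.71% drop when using only L_byte, and a 7.84% drop when fine-tuning without augmentation.
Under limited labels, the gain is most pronounced on CIC-IoT, which is attributed to its high packet-level heterogeneity (an effect related to ANPF).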
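The summary does not say how the improvement percentages are derived from the table. One reading that appears to reproduce them is the mean, over the seven pre-trained baselines, of Nethira's relative F1 gain against each baseline. The sketch below checks that reading against the table's F1 columns. Two things are assumptions rather than statements from the source: that the seven pre-trained baselines are the seven rows directly above Nethira (NetMamba through TrafficFormer), and the averaging scheme itself.

```python
# F1 scores (%) copied from the results table, one value per dataset.
DATASETS = ["ISCX-VPN(App)", "ISCX-VPN(Service)", "USTC-TFC", "CIC-IoT"]

# ASSUMPTION: these seven rows are the "seven pre-trained baselines".
BASELINE_F1 = {
    "NetMamba":      [60.32, 80.27, 94.83, 67.55],
    "YaTC":          [62.33, 78.06, 94.87, 72.36],
    "PERT":          [70.80, 90.86, 92.95, 88.23],
    "NetGPT":        [69.40, 91.92, 96.00, 89.08],
    "ET-BERT":       [70.94, 91.47, 95.18, 88.91],
    "TraGe":         [70.93, 91.68, 95.91, 88.61],
    "TrafficFormer": [71.69, 91.97, 95.01, 89.12],
}
NETHIRA_F1 = [75.55, 92.34, 96.40, 97.29]


def mean_relative_improvement(ours, baselines):
    """Average of (ours - b) / b over all baselines, in percent."""
    gains = [(ours - b) / b * 100.0 for b in baselines]
    return sum(gains) / len(gains)


# Per-dataset gain: average Nethira's relative gain over every baseline.
per_dataset = {
    name: mean_relative_improvement(
        NETHIRA_F1[i], [scores[i] for scores in BASELINE_F1.values()]
    )
    for i, name in enumerate(DATASETS)
}

# Overall gain: plain mean of the four per-dataset gains.
overall = sum(per_dataset.values()) / len(per_dataset)

for name, gain in per_dataset.items():
    print(f"{name}: {gain:.2f}%")
print(f"Average over datasets: {overall:.2f}%")
```

Under this reading the per-dataset figures agree with the reported 11.49%, 5.36%, 1.52%, and 18.05% to within rounding, and the overall average lands close to the reported 9.11%. These are relative gains, not percentage-point differences: on CIC-IoT, for example, the gap to the strongest baseline F1 in the table is about 8 points.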


This review was generated by AI and reviewed by a human editor.