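[Paper Digest] Hybrid Feature Learning with Time Series Embeddings for Equipment Anomaly Prediction
The paper proposes a hybrid framework that fuses 64-dimensional Granite TinyTimeMixer time-series embeddings (fine-tuned with LoRA) with 28 domain-informed statistical features and feeds the result to LightGBM for multi-horizon HVAC anomaly prediction, reaching a near-perfect ROC-AUC with very few false alarms.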
In predictive maintenance of equipment, deep learning-based time series anomaly detection has garnered significant attention; however, pure deep learning approaches often fail to achieve sufficient accuracy on real-world data. This study proposes a hybrid approach that integrates 64-dimensional time series embeddings from Granite TinyTimeMixer with 28-dimensional statistical features based on domain knowledge for HVAC equipment anomaly prediction tasks. Specifically, we combine time series embeddings extracted from a Granite TinyTimeMixer encoder fine-tuned with LoRA (Low-Rank Adaptation) and 28 types of statistical features including trend, volatility, and drawdown indicators, which are then learned using a LightGBM gradient boosting classifier. In experiments using 64 equipment units and 51,564 samples, we achieved Precision of 91–95% and ROC-AUC of 0.995 for anomaly prediction at 30-day, 60-day, and 90-day horizons. Furthermore, we achieved production-ready performance with a false positive rate of 1.1% or less and a detection rate of 88–94%, demonstrating the effectiveness of the system for predictive maintenance applications. This work demonstrates that practical anomaly detection systems can be realized by leveraging the complementary strengths between deep learning's representation learning capabilities and statistical feature engineering.
Research Motivation and Objectives
- Identify why pure deep learning struggles on real-world HVAC anomaly data.
- Propose a hybrid architecture combining time-series embeddings with statistical features.
- Show that gradient-boosted trees trained on hybrid features yield production-ready performance.
- Demonstrate multi-horizon anomaly prediction at 30/60/90 days.
Proposed Method
- Use Granite TinyTimeMixer encoder to extract 64-dim time-series embeddings from 90-day windows (LoRA fine-tuned with rank r=16).
- Compute 28 domain-informed statistical features (trend, volatility, drawdown, basic statistics).
- Concatenate embeddings and statistics to form a 92-dim hybrid feature vector.
- Train three LightGBM classifiers (one per horizon) with early stopping and class weighting, on the 92-dim features.
- Evaluate using precision, recall, F1, ROC-AUC, and false positive rate across 30/60/90-day horizons.
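The feature-construction steps above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the particular seven features, and the synthetic 90-day window are assumptions, and the 64-dimensional embedding is a zero placeholder standing in for the Granite TinyTimeMixer encoder output.

```python
import statistics


def trend_slope(values):
    """Least-squares slope of the values against their index (per time step)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = statistics.fmean(values)
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def volatility(values):
    """Sample standard deviation of step-to-step differences."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return statistics.stdev(diffs)


def max_drawdown(values):
    """Largest drop from a running peak, in the units of the signal."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, peak - v)
    return worst


def statistical_features(values):
    """A small illustrative subset; the paper uses 28 such features."""
    return [
        statistics.fmean(values),
        statistics.stdev(values),
        min(values),
        max(values),
        trend_slope(values),
        volatility(values),
        max_drawdown(values),
    ]


def hybrid_vector(embedding, stats):
    """Concatenate the embedding and the statistical features."""
    return list(embedding) + list(stats)


# Hypothetical 90-day window of one sensor reading (synthetic, linear drift).
window = [20.0 + 0.05 * day for day in range(90)]
embedding = [0.0] * 64  # placeholder for the 64-dim encoder embedding
stats = statistical_features(window)
hybrid = hybrid_vector(embedding, stats)
```

With all 28 statistical features in place of this subset, the concatenated vector would have the 92 dimensions described above, and that vector is what each per-horizon LightGBM classifier would be trained on.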
Experimental Results
Research Questions
- RQ1: Can a hybrid of time-series embeddings and statistical features outperform pure deep learning and pure statistical baselines for industrial anomaly detection?
- RQ2: How does the hybrid approach perform across multiple prediction horizons (30, 60, 90 days) in terms of precision, recall, and ROC-AUC?
- RQ3: Is the system production-ready in terms of false positives and detection rate for HVAC equipment?
- RQ4: What is the relative importance of embedding versus statistical features in the final decision?
Key Findings
| Model | Horizon | Precision | Recall | F1-Score | ROC-AUC | FPR (%) |
|---|---|---|---|---|---|---|
| Granite TS (v1.0) | 30d | 0.10 | 0.77 | 0.18 | 0.54 | 98.9 |
| Granite TS (v1.0) | 60d | 0.09 | 0.95 | 0.17 | 0.48 | 99.1 |
| Granite TS (v1.0) | 90d | 0.11 | 0.47 | 0.18 | 0.52 | 98.9 |
| LightGBM (statistical) | 30d | 0.79 | 0.85 | 0.82 | 0.987 | 2.6 |
| LightGBM (statistical) | 60d | 0.81 | 0.85 | 0.83 | 0.989 | 2.4 |
| LightGBM (statistical) | 90d | 0.87 | 0.78 | 0.82 | 0.987 | 1.9 |
| Hybrid (proposed) | 30d | 0.91 | 0.94 | 0.92 | 0.995 | 0.6 |
| Hybrid (proposed) | 60d | 0.93 | 0.94 | 0.93 | 0.995 | 0.5 |
| Hybrid (proposed) | 90d | 0.95 | 0.88 | 0.91 | 0.995 | 1.1 |
- Hybrid model delivers 91–95% precision across horizons with 0.5–1.1% FPR and ROC-AUC of 0.995.
- Pure Granite TinyTimeMixer baseline achieves only 9–11% precision with very high FPR (~99%).
- Statistical features alone reach 79–87% precision and FPR ≤ 2.6%.
- Hybrid approach outperforms both baselines by large margins (≈810–933% relative improvement in precision over pure DL).
- Feature importance analysis attributes ~48.5% of predictive power to the time-series embeddings and ~29.3% to volatility features across horizons; volatility features gain relative importance at longer horizons.
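As a rough check on how the reported metrics and the relative-improvement figure relate, the sketch below defines the standard confusion-matrix metrics and recomputes the precision gain of the hybrid model over the pure Granite baseline from the table values. The function names and the confusion counts in the final example are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called detection rate
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)  # false positive rate
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


def relative_improvement_pct(new, baseline):
    """Percentage gain of `new` over `baseline`."""
    return (new - baseline) / baseline * 100.0


# Precision values copied from the results table.
pure_dl_precision = {"30d": 0.10, "60d": 0.09, "90d": 0.11}
hybrid_precision = {"30d": 0.91, "60d": 0.93, "90d": 0.95}

gains = {
    horizon: relative_improvement_pct(hybrid_precision[horizon], baseline)
    for horizon, baseline in pure_dl_precision.items()
}

# Hypothetical counts, chosen only to illustrate the formulas.
example = classification_metrics(tp=94, fp=9, tn=891, fn=6)
```

Under these definitions, the 30-day and 60-day precision values give gains of about 810% and 933%, which matches the quoted range; the 90-day values give about 764%.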
This digest was generated by AI and reviewed by a human editor.