[Paper Review] Rethinking 1D-CNN for Time Series Classification: A Stronger Baseline.
This paper proposes Omni-Scale 1D-CNN (OS-CNN), an architecture that learns which kernel sizes matter for time series classification during training instead of relying on grid search. By configuring a small set of kernel sizes to cover a wide range of receptive fields, OS-CNN outperforms existing 1D-CNN baselines across 85 UCR datasets in average accuracy, win counts, and the critical difference diagram.
For time series classification with 1D-CNNs, the choice of kernel size is critical: it determines whether the model can capture salient signals at the right scale in a long time series. Most existing work on 1D-CNNs treats the kernel size as a hyperparameter and searches for a suitable value by grid search, which is time-consuming and inefficient. This paper theoretically analyses how kernel size impacts the performance of 1D-CNNs. Given the importance of kernel size, we propose a novel Omni-Scale 1D-CNN (OS-CNN) architecture that captures the proper kernel size during model training. A specific kernel-size configuration is developed that lets a small number of kernel-size options represent a larger range of receptive fields. The proposed OS-CNN is evaluated on the UCR archive with 85 datasets. The experimental results demonstrate that our method is a stronger baseline on multiple performance indicators, including the critical difference diagram, counts of wins, and average accuracy. We also published the experimental source code on GitHub (this https URL).
Motivation & Objective
- To address the inefficiency and suboptimal performance of grid search for kernel size selection in 1D-CNNs for time series classification.
- To theoretically analyze the impact of kernel size on 1D-CNN performance in capturing salient temporal patterns.
- To develop a novel architecture that enables 1D-CNN to learn the optimal receptive field scale during training, reducing reliance on manual hyperparameter tuning.
- To establish a stronger baseline for 1D-CNN in time series classification by improving model capacity through intelligent kernel size configuration.
Proposed method
- Propose a novel Omni-Scale 1D-CNN (OS-CNN) architecture that integrates multiple kernel sizes in a single, unified convolutional block.
- Design a specific kernel size configuration strategy using a minimal set of carefully selected kernel sizes to cover a wide range of receptive fields.
- Integrate the multi-scale kernels into a single convolutional layer to enable end-to-end learning of the most relevant scale during training.
- Use a shared feature map across different kernel sizes to maintain computational efficiency while enhancing representational capacity.
- Train the model end-to-end with standard optimization to allow the network to automatically learn the most effective kernel size for each time series pattern.
- Evaluate the model on the full UCR Time Series Archive (85 datasets) using standard metrics including accuracy, critical difference diagrams, and win counts.
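The multi-kernel idea in the list above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes the kernels run in parallel on the same input and that their outputs are stacked as separate channels. The function names `conv1d_same` and `omni_scale_layer` and the fixed weights are hypothetical; in a real model the weights would be learned.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate a 1-D signal with one kernel, zero-padding to keep the length."""
    k = len(kernel)
    left = (k - 1) // 2          # padding before the signal
    right = k - 1 - left         # padding after the signal
    padded = [0.0] * left + list(signal) + [0.0] * right
    return [
        sum(w * padded[t + j] for j, w in enumerate(kernel))
        for t in range(len(signal))
    ]


def omni_scale_layer(signal, kernels):
    """Apply every kernel to the same input; return one output channel per kernel."""
    return [conv1d_same(signal, kernel) for kernel in kernels]


# Hypothetical fixed weights (assumption: chosen only to make the output easy to read).
kernels = [
    [1.0],                    # size 1: passes the signal through unchanged
    [0.5, 0.5],               # size 2: average of a sample and its successor
    [1 / 3, 1 / 3, 1 / 3],    # size 3: centered moving average
]
signal = [0.0, 0.0, 3.0, 0.0, 0.0]
channels = omni_scale_layer(signal, kernels)

for kernel, channel in zip(kernels, channels):
    print(len(kernel), channel)
```

Each channel sees the same spike at a different scale. Subsequent layers can then weight the channels, which is one way a network can favor a scale without a grid search over kernel sizes.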
Experimental results
Research questions
- RQ1: How does kernel size selection impact the performance of 1D-CNN in time series classification tasks?
- RQ2: Can a 1D-CNN model learn the optimal kernel size dynamically during training without relying on grid search?
- RQ3: What is the minimal set of kernel sizes needed to effectively cover diverse receptive fields in time series data?
- RQ4: How does the proposed OS-CNN architecture compare to standard 1D-CNN baselines in terms of accuracy and robustness across diverse time series datasets?
- RQ5: Does the proposed method serve as a stronger baseline than existing 1D-CNN approaches in time series classification?
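RQ3 can be made concrete with a small calculation. For stacked stride-1, undilated convolutions, the receptive field is 1 plus the sum of (kernel size - 1) over the layers. The sketch below enumerates which receptive fields are reachable when each layer offers a few kernel sizes. The per-layer options used here (small primes plus 1, then {1, 2}) are an illustrative assumption, not a configuration taken from this review, and the helper names are hypothetical.

```python
from itertools import product


def receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1, dilation-1 convolution layers."""
    return 1 + sum(k - 1 for k in kernel_sizes)


def reachable_receptive_fields(options_per_layer):
    """All receptive fields obtainable by picking one kernel size per layer."""
    return sorted({receptive_field(choice) for choice in product(*options_per_layer)})


# Assumed kernel-size options per layer (illustrative only).
options = [
    [1, 2, 3, 5, 7],
    [1, 2, 3, 5, 7],
    [1, 2],
]
fields = reachable_receptive_fields(options)
n_options = sum(len(layer) for layer in options)

print(n_options, "kernel-size options cover receptive fields", fields)
```

Under these assumptions, 12 kernel-size options across three layers reach every receptive field from 1 to 14, which illustrates how a small set of options can stand in for a much larger grid.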
Key findings
- OS-CNN achieves higher average accuracy than existing 1D-CNN baselines across the 85 UCR datasets.
- The proposed method records more wins than baseline models in pairwise comparisons across the UCR benchmark.
- OS-CNN ranks ahead of standard 1D-CNN baselines in the critical difference diagram, which compares models by their average rank across datasets.
- The model's performance is robust across diverse time series data, confirming the effectiveness of the designed kernel size configuration.
- The method reduces the need for a time-consuming grid search over kernel sizes by learning the relevant scales end-to-end.
- The source code is publicly available on GitHub, enabling reproducibility and further benchmarking.
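The three indicators named in the findings (average accuracy, win counts, and the average ranks that a critical difference diagram is drawn from) can be computed as sketched below. The accuracy values are made-up toy numbers for three hypothetical models on four datasets, not results from the paper, and the significance test behind a critical difference diagram is omitted. Ties count as a win for every tied model and receive the average of the tied ranks, which is an assumption about the convention.

```python
def average_accuracy(acc):
    """Mean accuracy per model over all datasets."""
    return {m: sum(v) / len(v) for m, v in acc.items()}


def win_counts(acc):
    """Number of datasets on which each model reaches the best accuracy (ties count)."""
    n = len(next(iter(acc.values())))
    wins = {m: 0 for m in acc}
    for d in range(n):
        best = max(v[d] for v in acc.values())
        for m, v in acc.items():
            if v[d] == best:
                wins[m] += 1
    return wins


def mean_ranks(acc):
    """Average rank per model (1 = best); tied models share the average rank."""
    n = len(next(iter(acc.values())))
    totals = {m: 0.0 for m in acc}
    for d in range(n):
        for m in acc:
            better = sum(1 for o in acc if acc[o][d] > acc[m][d])
            tied = sum(1 for o in acc if acc[o][d] == acc[m][d])  # includes m itself
            totals[m] += better + (tied + 1) / 2
    return {m: total / n for m, total in totals.items()}


# Toy accuracies (fabricated for illustration only; one entry per dataset).
toy = {
    "model_a": [0.90, 0.80, 0.70, 0.95],
    "model_b": [0.85, 0.80, 0.75, 0.90],
    "model_c": [0.80, 0.70, 0.60, 0.85],
}
avg = average_accuracy(toy)
wins = win_counts(toy)
ranks = mean_ranks(toy)
print(avg, wins, ranks)
```

A lower mean rank is better. A full critical difference analysis would additionally test whether the rank differences are statistically significant.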
This review was created by AI and reviewed by human editors.