[Paper Review] Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast
Pangu-Weather presents a 3D Earth-Specific Transformer (3DEST) and hierarchical temporal aggregation to deliver fast, high-resolution global weather forecasts that surpass traditional NWP methods in accuracy.
In this paper, we present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast. For this purpose, we establish a data-driven environment by downloading $43$ years of hourly global weather data from the 5th generation of ECMWF reanalysis (ERA5) data and train a few deep neural networks with about $256$ million parameters in total. The spatial resolution of forecast is $0.25^\circ\times0.25^\circ$, comparable to the ECMWF Integrated Forecast Systems (IFS). More importantly, for the first time, an AI-based method outperforms state-of-the-art numerical weather prediction (NWP) methods in terms of accuracy (latitude-weighted RMSE and ACC) of all factors (e.g., geopotential, specific humidity, wind speed, temperature, etc.) and in all time ranges (from one hour to one week). There are two key strategies to improve the prediction accuracy: (i) designing a 3D Earth Specific Transformer (3DEST) architecture that formulates the height (pressure level) information into cubic data, and (ii) applying a hierarchical temporal aggregation algorithm to alleviate cumulative forecast errors. In deterministic forecast, Pangu-Weather shows great advantages for short to medium-range forecast (i.e., forecast time ranges from one hour to one week). Pangu-Weather supports a wide range of downstream forecast scenarios, including extreme weather forecast (e.g., tropical cyclone tracking) and large-member ensemble forecast in real-time. Pangu-Weather not only ends the debate on whether AI-based methods can surpass conventional NWP methods, but also reveals novel directions for improving deep learning weather forecast systems.
Motivation & Objective
- Establish a data-driven global weather forecasting framework using ERA5 reanalysis data.
- Develop a 3D model that integrates height information (pressure levels) into inputs and outputs.
- Improve medium-range forecast accuracy with a hierarchical temporal aggregation strategy.
- Demonstrate applicability to extreme weather forecasting and large-member ensembles.
- Show that AI-based forecasting can outperform conventional NWP at high spatial resolution.
Proposed method
- Use ERA5 data (1979–2017 for training, 2019 for validation, 2018/2020/2021 for testing) at 0.25°×0.25° resolution, with 5 upper-air variables on 13 pressure levels and 4 surface variables.
- Introduce the 3D Earth-Specific Transformer (3DEST) that processes 3D weather states and uses patch embedding/recovery.
- Incorporate Earth-specific positional bias to account for latitude and height variations in attention mechanisms.
- Apply Swin Transformer-inspired shifted-window attention to manage computational costs while modeling 3D data.
- Implement hierarchical temporal aggregation by training separate models with lead times of 1, 3, 6, and 24 hours, so that a long forecast needs fewer iterations and accumulates less error.
- Train and deploy on a Huawei Cloud GPU cluster with 192 NVIDIA Tesla-V100 GPUs, reporting inference times around 1,400 ms per forecast on a single GPU.
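The hierarchical temporal aggregation step in the list above can be illustrated with a greedy decomposition. This is a minimal sketch, not the authors' code: `decompose_lead_time`, `run_forecast`, and the `models` dictionary are hypothetical names, and the models are stand-in callables rather than trained networks. It assumes the four lead times listed above (1, 3, 6, and 24 hours) and always takes the largest step that still fits.

```python
def decompose_lead_time(hours, available=(24, 6, 3, 1)):
    """Split a lead time (in hours) into steps, largest available step first."""
    if hours < 0:
        raise ValueError("lead time must be non-negative")
    steps = []
    remaining = hours
    for lead in sorted(available, reverse=True):
        count, remaining = divmod(remaining, lead)
        steps.extend([lead] * count)
    if remaining:
        raise ValueError("lead time cannot be built from the available steps")
    return steps


def run_forecast(state, hours, models):
    """Apply one hypothetical model per step; `models` maps lead hours to a callable."""
    for lead in decompose_lead_time(hours, tuple(models)):
        state = models[lead](state)
    return state


if __name__ == "__main__":
    # Stand-in "models" that only advance a clock, to show the call pattern.
    toy_models = {h: (lambda t, h=h: t + h) for h in (1, 3, 6, 24)}
    print(decompose_lead_time(56))          # [24, 24, 6, 1, 1]
    print(run_forecast(0, 56, toy_models))  # 56
```

With these four lead times, a 56-hour forecast takes 5 model calls instead of the 56 calls a 1-hour model alone would need, which is the mechanism by which fewer iterations limit error accumulation.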
Experimental results
Research questions
- RQ1: Can AI-based methods surpass state-of-the-art NWP in accuracy across multiple forecast lead times and factors (geopotential, humidity, wind, temperature, etc.)?
- RQ2: Does incorporating 3D (height) information and Earth-aware biases improve AI-based weather forecasts compared to 2D approaches?
- RQ3: Can hierarchical temporal aggregation reduce cumulative forecast errors in medium-range forecasts and enable reliable extreme weather and ensemble forecasting?
- RQ4: What are the practical computation costs and scalability of large 3D transformer-based weather models on multi-GPU clusters?
Key findings
- Pangu-Weather achieves higher accuracy than operational IFS and FourCastNet for all factors and lead times from 1 hour to 1 week (e.g., a 5-day Z500 RMSE of 296.7 for a single forecast).
- Inference cost is about 1,400 ms on a single GPU, orders of magnitude faster than traditional NWP systems.
- A 0.25°×0.25° spatial resolution is attained, with 5 upper-air variables on 13 pressure levels and 4 surface variables, enabling high-fidelity global forecasts.
- The 3D Earth-Specific Transformer (3DEST) with Earth-specific positional bias effectively captures height and latitude-related patterns, improving forecast accuracy.
- Hierarchical temporal aggregation reduces cumulative forecast errors by using models with lead times of 1h, 3h, 6h, and 24h, improving medium-range forecast reliability.
- Pangu-Weather demonstrates transferability to extreme weather forecasting and large-member ensemble scenarios with fast inference.
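The two accuracy metrics behind these findings, latitude-weighted RMSE and ACC, can be sketched for a single 2D field as follows. This is an illustrative sketch under stated assumptions, not the paper's evaluation code: weights are taken proportional to the cosine of latitude and normalized to mean 1 (a common convention for ERA5-based benchmarks), the grid is a regular latitude-longitude grid, the climatology is supplied by the caller, and the paper's averaging over time is not reproduced. All function names are hypothetical.

```python
import numpy as np


def latitude_weights(lat_deg):
    """Cosine-of-latitude weights, normalized so that they average to 1."""
    w = np.cos(np.deg2rad(lat_deg))
    return w / w.mean()


def lat_weighted_rmse(forecast, truth, lat_deg):
    """RMSE over a (lat, lon) field, with each row weighted by its latitude."""
    w = latitude_weights(lat_deg)[:, None]  # broadcast across longitude
    return float(np.sqrt(np.mean(w * (forecast - truth) ** 2)))


def lat_weighted_acc(forecast, truth, climatology, lat_deg):
    """Anomaly correlation coefficient relative to a given climatology."""
    w = latitude_weights(lat_deg)[:, None]
    f_anom = forecast - climatology
    t_anom = truth - climatology
    numerator = np.sum(w * f_anom * t_anom)
    denominator = np.sqrt(np.sum(w * f_anom**2) * np.sum(w * t_anom**2))
    return float(numerator / denominator)


if __name__ == "__main__":
    # Small synthetic grid; the real 0.25-degree grid is far larger.
    rng = np.random.default_rng(0)
    lat = np.linspace(90.0, -90.0, 7)
    truth = rng.normal(size=(7, 8))
    climatology = np.zeros_like(truth)
    forecast = truth + 2.0  # constant bias of 2 everywhere
    print(lat_weighted_rmse(forecast, truth, lat))
    print(lat_weighted_acc(truth, truth, climatology, lat))
```

The cosine weighting compensates for the fact that grid cells on a regular latitude-longitude grid shrink toward the poles, so errors there would otherwise be over-counted.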
This review was created by AI and reviewed by human editors.