Verification of Simulations

We regularly and extensively verify our own and other simulation models against actual measurement and observation data. In this way, we ensure that our services deliver top-quality (and continuously improving) weather data, both historic and forecast. meteoblue was the first commercial weather service to regularly publish verification data on its company website, which it has done since 2010, along with daily local accuracy updates.

Why do we publish our verifications?

  1. We are transparent: weather is not "chaos", and our customers should know what they receive.
  2. We deliver quality: our accuracy is so high that it is worth showing.
  3. We are realistic: you should know what to expect from a forecast - and what not to expect.
  4. We are competitive: if someone believes we are not good enough - show us how to do better.

What does meteoblue simulation quality mean? This page and subpages show some of the most important studies.

Forecast accuracy has improved over the last decades

Numerical weather forecast models have been continuously improved over the last decades. Around 1980, the 24-hour-ahead forecast of the air temperature was calculated with an accuracy of around 70 %. By 2018, the accuracy of the 24 h forecast had increased to around 90 %, and the 72 h forecast is now as good as the 24 h forecast was 40 years ago. In numerical weather forecast models, the accuracy of the 500 hPa geopotential height is even higher than the accuracy of the 2 m air temperature simulation. The evolution of the model accuracies over time can be seen in the following figure (Source: ECMWF).

Evolution of the forecast skill [%] of the 500 hPa geopotential height from 1980 to 2013. (Source: ECMWF, 2013)

Three main factors are responsible for the increasing model accuracy during the last 40 years:

  1. The initial conditions of the numerical weather forecast model are estimated significantly better than 40 years ago. New meteorological measurement techniques (e.g. satellite observations) and more accurate measurements are responsible for this improvement.
  2. Finer horizontal (and vertical) resolution of the numerical weather forecast models due to more computational power.
  3. Better sub-grid parametrizations in the numerical models than 40 years ago.

The accuracy of a weather simulation model depends significantly on the chosen meteorological variable. Variables like the 2 m air temperature, surface pressure or the 500 hPa geopotential height are typically calculated with high accuracy, whereas other variables (e.g. precipitation or wind gusts) have a lower accuracy, typically because of small-scale spatial variations that are not resolved in weather models.

meteoblue verification for historical and forecast data

In the following, we show the meteoblue model accuracy for different meteorological variables and the model skill of meteoblue multimodels, MOS, reanalysis models and stand-alone ("raw") numerical weather forecast models.

The verification of numerical weather forecast models is highly relevant for all stakeholders in order to show that weather forecast models have a larger model skill than simple climatological forecasts or persistence forecasts ("Weather tomorrow is the same as today"). 
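The comparison against a persistence forecast can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the meteoblue study: the MAE-based skill score (1 minus the ratio of model MAE to reference MAE), the function names and the synthetic temperature series are all assumptions made for this example.

```python
import numpy as np

def mae(forecast, observed):
    """Mean absolute error between two equally long series."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean(np.abs(forecast - observed)))

def skill_vs_persistence(forecast, observed, lead=24):
    """MAE-based skill relative to persistence (assumed definition).

    1 means a perfect forecast, 0 means no better than persistence,
    negative values mean worse than persistence.
    """
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Persistence: the value observed `lead` hours earlier is the forecast.
    persistence = observed[:-lead]
    # Verify both forecasts on the same hours (the first `lead` hours are dropped).
    target = observed[lead:]
    mae_model = mae(forecast[lead:], target)
    mae_reference = mae(persistence, target)
    return 1.0 - mae_model / mae_reference

# Synthetic hourly temperatures: a daily cycle plus a slow warming trend.
hours = np.arange(96)
observed = 10.0 + 5.0 * np.sin(2.0 * np.pi * hours / 24.0) + 0.05 * hours
# Hypothetical model forecast with a constant 0.3 K warm bias.
forecast = observed + 0.3

skill = skill_vs_persistence(forecast, observed, lead=24)
print(round(skill, 2))  # approximately 0.75 for this synthetic series
```

In this synthetic case the persistence error comes only from the trend (0.05 K per hour over 24 hours, i.e. 1.2 K), so a model with a 0.3 K error scores about 0.75. A climatological reference could be swapped in the same way by replacing the persistence series.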

Scope

Four meteorological variables (air temperature, wind speed, precipitation and dewpoint temperature) were verified at more than 10'000 meteorological stations worldwide for the year 2017, by analysing the model accuracy of several raw ("stand-alone") weather forecast models, satellite observations and reanalysis models. Additionally, the model accuracy of different multimodel approaches was tested and compared against raw ("stand-alone") models and a 24-hour-ahead forecast from model output statistics (MOS).
We distinguish between historical data sets and forecast data sets, based on the availability of the model data.

Summary

The following table shows the mean absolute error (MAE), determined on an hourly basis (on a yearly basis for annual precipitation) for each method and variable at more than 10'000 weather stations globally for the year 2017. The MAE is given in K for temperatures, in m s-1 for wind speed and in mm for annual precipitation.

Comparison of the mean absolute error (MAE) for four different meteorological parameters for more than 10'000 weather stations worldwide. The analysis was conducted based on measurements recorded in 2017.

          Model approach                 Air temperature  Wind speed       Annual precipitation  Dewpoint temperature
Forecast  meteoblue learning multimodel  1.2 K            -                170 mm                -
          MOS                            1.5 K            1.2 m s-1        -                     1.7 K
          Weather forecast models        1.7 - 2.2 K      1.5 - 1.7 m s-1  220 - 230 mm          1.9 - 2.4 K
History   Real-time updates (NEMS30)     2.1 K            1.7 m s-1        220 mm                2.2 K
          Reanalysis model               1.5 K            1.5 m s-1        120 - 180 mm          1.6 K
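For reference, the MAE values in the table above follow the standard definition of the mean absolute error. The symbols below are chosen for this illustration and are not taken from the study: f_i is the simulated value, o_i the measured value, and N the number of forecast-observation pairs (for example, all hourly pairs of one station in 2017). How the study aggregates the per-station values into one global number is not stated here.

```latex
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| f_i - o_i \right|
```

The MAE carries the unit of the variable itself, which is why the table lists K, m s-1 and mm side by side.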

Recommendations

From an accuracy perspective, we recommend the following sources for spatial (worldwide) weather data:

                              Reanalysis                           History                         Forecast
Air temperature               ERA5                                 NEMS local, NEMS30              meteoblue learning multi-model (MLM)
Wind speed                    ERA5                                 NEMS local, NEMS30              meteoblue MOS and meteoblue model mix
Precipitation (daily events)  ERA5 (all precipitation events),     NEMS local, NEMS30              meteoblue learning multi-model (MLM)
                              CMORPH (heavy precipitation events)
Precipitation (annual sums)   historical meteoblue model mix       historical meteoblue model mix  meteoblue learning multi-model (MLM)
Dewpoint temperature          ERA5                                 NEMS local, NEMS30              meteoblue MOS and meteoblue model mix

For historical analysis, reanalysis models offer the highest accuracy, but they are only available with a time lag of 2-5 days (CMORPH) to 2-3 months (ERA5), and do not yet cover 20 years. For applications which require real-time updates, consistency, a time extension over 30 years and multiple variables, NEMS30 is the only solution currently available.

More information about the verification of historical and forecast data can be found below:

Meteorological variables

Air temperature

The 2 m air temperature is best calculated by the meteoblue learning multimodel (MLM) with values of MAE = 1.2 K. The MOS air temperature forecast gives the same accuracy as the reanalysis model ERA5 (MAE = 1.5 K), which is recommended for historical data sets. The ‘stand-alone’ (RAW) global weather forecast models perform in the range between 1.7 and 2.2 K. Hence, the 6-day forecast of the meteoblue multi-model is as good as the 1-day forecast of a ‘stand-alone’ (RAW) numerical weather forecast model. 

Wind speed

The MAE of the forecast 10 m wind speed is 1.5 - 1.7 m s-1 for 'stand-alone' weather forecast models. For historical data, the reanalysis model ERA5 reaches 1.5 m s-1. With MOS, the model error could be reduced to 1.2 m s-1.

 

Radiation

meteoblue calculates radiation for the land and sea surface and for atmospheric layers, both as incoming direct and indirect sunlight and as radiation reflected from clouds or the surface. The meteoblue simulations of global surface radiation are consistent across continents and reach a monthly mean absolute error of 1-15 % in 95 % of all places.

Precipitation

The model skill for daily precipitation events decreases with increasing precipitation intensity. Numerical weather forecast models are the best source for detecting small precipitation events. For heavy precipitation events, the model skill of satellite observations is larger than that of numerical weather forecast models. For daily precipitation events, mixing two (or more) models did not increase the model skill.
For historical data, annual precipitation sums are calculated best by using satellite observations from CHIRPS2, which are bias corrected with the same measurement data set used for verification in this study. The model accuracy of CHIRPS2 in regions without measurement stations is therefore expected to be significantly lower and to a certain extent unknown. 
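The dependence of daily precipitation skill on intensity can be illustrated with threshold-based event scores. This is a sketch under stated assumptions: the skill metric used in the study is not named on this page, so probability of detection (POD), false alarm ratio (FAR) and the Heidke Skill Score (HSS) are used here as common choices, and the daily values and the function name are invented for the example.

```python
import numpy as np

def event_scores(forecast_mm, observed_mm, threshold_mm):
    """Contingency-table scores for events of at least `threshold_mm` per day."""
    forecast_event = np.asarray(forecast_mm, dtype=float) >= threshold_mm
    observed_event = np.asarray(observed_mm, dtype=float) >= threshold_mm

    hits = int(np.sum(forecast_event & observed_event))
    misses = int(np.sum(~forecast_event & observed_event))
    false_alarms = int(np.sum(forecast_event & ~observed_event))
    correct_negatives = int(np.sum(~forecast_event & ~observed_event))
    n = hits + misses + false_alarms + correct_negatives

    nan = float("nan")
    pod = hits / (hits + misses) if hits + misses else nan
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else nan

    # Number of correct forecasts expected by chance, given the event frequencies.
    expected = ((hits + misses) * (hits + false_alarms)
                + (correct_negatives + misses) * (correct_negatives + false_alarms)) / n
    hss = (hits + correct_negatives - expected) / (n - expected) if n != expected else nan

    return {"hits": hits, "misses": misses, "false_alarms": false_alarms,
            "correct_negatives": correct_negatives, "pod": pod, "far": far, "hss": hss}

# Invented daily precipitation sums in mm (10 days) at a single station.
observed = [0.0, 0.5, 2.0, 12.0, 0.0, 25.0, 1.0, 0.0, 8.0, 30.0]
forecast = [0.0, 1.5, 3.0, 6.0, 0.2, 9.0, 0.0, 0.0, 11.0, 12.0]

light = event_scores(forecast, observed, threshold_mm=1.0)
heavy = event_scores(forecast, observed, threshold_mm=10.0)
print(light)
print(heavy)
```

In this invented sample, the scores for the 10 mm threshold are lower than for the 1 mm threshold, which mirrors the qualitative statement above. Real verification would use far longer series and many stations.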

Dewpoint temperature

The model accuracy for the dewpoint temperature is slightly lower than the model accuracy for the air temperature. MAE values are between 1.9 and 2.4 K for numerical weather forecast models and 1.6 K for a reanalysis model. The accuracy of model simulations with MOS (MAE = 1.7 K) is in a similar range to that of the reanalysis model.

Verification studies

A comprehensive verification study of air temperature, wind speed, precipitation and dewpoint temperature conducted over more than 10'000 meteorological stations worldwide for the year 2017 can be downloaded below:

meteoblue verification global Summary 2017 EN 20181113z10.pdf (4.23 MB)

A verification study was done for Europe in 2011, comparing the model accuracy of weather models with 40, 12 and 3 km spatial resolution, for the air temperature and wind speed by using MOS and raw models. The study can be downloaded here:

meteoblue_NMM_validation_mueller_2011.pdf (1.60 MB)