Verification of Simulations

We regularly and extensively verify our own and third-party simulation models by comparing them with actual measurement and observation data. This ensures that our services deliver top-quality, continuously improving weather data, both historical and forecast.
meteoblue is the first commercial weather service to publish verification data regularly on its company website (since 2010), as well as daily local accuracy updates.
Why do we publish our verifications?

  1. We are transparent: weather is not "chaos", and our customers should know what they receive.
  2. We deliver quality: our accuracy is so high that it is worth showing.
  3. We are realistic: you should know what to expect from a forecast, and what not to expect.
  4. We are competitive: if someone believes we are not good enough, show us how to do better.

What does meteoblue simulation quality mean? This page and subpages show some of the most important studies.

Verification for forecast and historical weather data

Numerical weather forecast models have improved continuously over recent decades. Around 1990, the 24-hour-ahead (24h) forecast of the air temperature was calculated with an accuracy of around 70%. By 2018, the accuracy of the 24h forecast had increased to around 90%, and the 72h forecast is now as good as the 24h forecast was 30 years ago.
In numerical weather forecast models, the accuracy of the 500 hPa geopotential height is even higher than the accuracy of the 2 m air temperature simulation. The evolution of the model accuracies over time can be seen in the following figure (Source: ECMWF).

Evolution of the forecast skill [%] of the 500 hPa geopotential height from 1980 to 2013. (Source: ECMWF, 2013)

Three main factors are responsible for the increasing model accuracy during the last 40 years:

  1. Initial conditions: the initial conditions of the numerical weather forecast model are estimated significantly better than 40 years ago. New meteorological measurement techniques (e.g. satellite observations) and more accurate measurements are responsible for this improvement.
  2. Finer horizontal (and vertical) resolution of the numerical weather forecast models due to more computational power.
  3. Better sub-grid parametrizations in the numerical models than 40 years ago.

The accuracy of a weather simulation model depends significantly on the chosen meteorological variable. Variables such as the 2 m air temperature, surface pressure or the 500 hPa geopotential height are typically calculated with high accuracy, whereas others (e.g. precipitation and wind gusts) have lower accuracy, typically because of small-scale spatial variations that weather models do not resolve.

meteoblue model accuracy

In the following, we show the meteoblue model accuracy for different meteorological variables and the model skill of meteoblue multi-models, MOS, reanalysis models and stand-alone ("raw") numerical weather forecast models.

Verification of numerical weather forecast models is highly relevant for all stakeholders, because it demonstrates that the models have more skill than simple climatological forecasts or persistence forecasts ("Weather tomorrow is the same as today").
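The comparison against a persistence forecast can be illustrated with a minimal sketch. It assumes the common skill-score form 1 - MAE_model / MAE_reference, which is not necessarily the measure used in the meteoblue studies. The function names and the temperature values are hypothetical and invented purely for illustration.

```python
def mean_absolute_error(forecast, observed):
    """Average of |forecast - observed| over paired values."""
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(forecast)


def persistence_forecast(observed):
    """'Weather tomorrow is the same as today': the forecast for day t
    is the observation of day t-1, so the series is one element shorter."""
    return observed[:-1]


def skill_score(mae_model, mae_reference):
    """Assumed skill-score form: 1 is perfect, 0 is no better than the
    reference, negative is worse than the reference."""
    return 1.0 - mae_model / mae_reference


# Hypothetical daily mean temperatures in deg C (not real data).
observed = [10.0, 12.0, 11.0, 15.0, 14.0]
# Hypothetical model forecasts for days 2 to 5.
model = [11.0, 11.5, 14.0, 14.5]

target = observed[1:]  # days that both forecasts can be scored on
mae_model = mean_absolute_error(model, target)
mae_persistence = mean_absolute_error(persistence_forecast(observed), target)
skill = skill_score(mae_model, mae_persistence)

print(f"MAE model: {mae_model:.2f} K")
print(f"MAE persistence: {mae_persistence:.2f} K")
print(f"Skill vs. persistence: {skill:.3f}")
```

A positive score means the model beats persistence on these days. The same function can be applied with a climatological mean as the reference forecast.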

Global study in 2017

Scope

Four meteorological variables (air temperature, wind speed, precipitation and dewpoint temperature) were verified at more than 10'000 meteorological stations worldwide for the year 2017 by analysing the accuracy of several raw ("stand-alone") weather forecast models, satellite observations and reanalysis models. Additionally, the accuracy of different multi-model approaches was tested and compared against raw ("stand-alone") models and a 24-hour-ahead (24h) forecast from model output statistics (MOS).
We distinguish between historical data sets and forecast data sets, based on the availability of the model data.

Summary

The following table shows the mean absolute error (MAE) on an hourly basis (on a yearly basis for annual precipitation), determined for each method and variable at 10'000 weather stations globally for the year 2017.

Comparison of the mean absolute error (MAE) for four different meteorological parameters.

Data set | Model approach          | Air temperature | Wind speed      | Annual precipitation | Dewpoint temperature
Forecast | meteoblue multi-model   | 1.2 K           | -               | 170 mm               | -
Forecast | MOS                     | 1.5 K           | 1.2 m s-1       | -                    | 1.7 K
Forecast | Weather forecast models | 1.7 - 2.2 K     | 1.5 - 1.7 m s-1 | 220 - 230 mm         | 1.9 - 2.4 K
History  | Reanalysis model        | 1.5 K           | 1.5 m s-1       | 120 - 180 mm         | 1.6 K
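
For reference, the MAE reported in the table above is conventionally defined as the average absolute difference between forecast and observed values. The symbols f_i (simulated value), o_i (measured value) and N (number of paired values, e.g. hours in the year at one station) are notation introduced here, not taken from the study:

```latex
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| f_i - o_i \right|
```

Because the errors are not squared, the MAE carries the same unit as the variable itself (K, m s-1 or mm), which is why the table entries can be read directly as typical deviations.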

Air temperature

The 2 m air temperature is best calculated by the meteoblue multi-model, with an MAE of 1.2 K. The MOS air temperature forecast gives the same accuracy as the reanalysis model ERA5 (MAE = 1.5 K), which is recommended for historical data sets. The ‘stand-alone’ (RAW) global weather forecast models perform in the range between 1.7 and 2.2 K. Hence, the 6-day forecast of the meteoblue multi-model is as good as the 1-day forecast of a ‘stand-alone’ (RAW) numerical weather forecast model.

Wind speed

The model uncertainty of the forecast 10 m wind speed is 1.5 - 1.7 m s-1 for ‘stand-alone’ weather forecast models and 1.5 m s-1 for historical data from the reanalysis model ERA5. With MOS, the model error is reduced to 1.2 m s-1.

Radiation

meteoblue calculates radiation for the land and sea surface and for atmospheric layers, as incoming direct and indirect sunlight as well as radiation reflected from clouds or the surface. The meteoblue simulation of global surface radiation is consistent across continents and reaches a monthly mean absolute error of 1-15% in 95% of all places.

Precipitation

The model skill for daily precipitation events decreases with increasing precipitation intensity. Numerical weather forecast models are the best source for detecting small precipitation events. For heavy precipitation events, the skill of satellite observations is higher than that of numerical weather forecast models. For daily precipitation events, the model skill could not be increased by mixing two (or more) models.
For historical data, annual precipitation sums are calculated best by using satellite observations from CHIRPS2, which are bias-corrected with the same measurement data set used for verification in this study. The accuracy of CHIRPS2 in regions without measurement stations is therefore expected to be significantly lower and is, to a certain extent, unknown.
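
How event skill can depend on precipitation intensity may be sketched with a simple contingency table evaluated at two thresholds. The text above does not state which skill measure the study uses. The probability of detection (POD) and false alarm ratio (FAR) below are common categorical verification measures chosen here as an assumption, and the thresholds and daily values are hypothetical.

```python
def contingency_counts(forecast_mm, observed_mm, threshold_mm):
    """Count hits, misses, false alarms and correct negatives for the
    event 'daily precipitation >= threshold_mm'."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast_mm, observed_mm):
        forecast_event = f >= threshold_mm
        observed_event = o >= threshold_mm
        if forecast_event and observed_event:
            hits += 1
        elif observed_event:
            misses += 1
        elif forecast_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def probability_of_detection(hits, misses):
    """Fraction of observed events that were forecast."""
    total = hits + misses
    return hits / total if total else float("nan")


def false_alarm_ratio(hits, false_alarms):
    """Fraction of forecast events that did not occur."""
    total = hits + false_alarms
    return false_alarms / total if total else float("nan")


# Hypothetical daily precipitation sums in mm (not real data).
observed_mm = [0.0, 2.0, 15.0, 0.5, 30.0, 0.0, 8.0, 25.0]
forecast_mm = [0.2, 3.0, 9.0, 1.5, 22.0, 0.0, 12.0, 6.0]

# Illustrative thresholds: 1 mm (any rain) and 20 mm (heavy rain).
for threshold_mm in (1.0, 20.0):
    hits, misses, false_alarms, _ = contingency_counts(
        forecast_mm, observed_mm, threshold_mm
    )
    pod = probability_of_detection(hits, misses)
    far = false_alarm_ratio(hits, false_alarms)
    print(f">= {threshold_mm} mm: POD = {pod:.2f}, FAR = {far:.2f}")
```

In this invented sample, every event of at least 1 mm is detected, but only one of the two events of at least 20 mm is. This mirrors the qualitative pattern of decreasing skill with increasing intensity.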

Dewpoint temperature

The model accuracy for the dewpoint temperature is slightly lower than the model accuracy for the air temperature. MAE values are between 1.9 and 2.4 K for numerical weather forecast models and 1.6 K for a reanalysis model. The accuracy of model simulations with MOS is in a similar range to that of the reanalysis model.

Recommendations

From an accuracy perspective, we recommend the following sources for spatial (worldwide) weather data:

Recommendations for historical analysis and operational forecast configuration

Variable                     | History                                                              | Forecast
Air temperature              | ERA5                                                                 | meteoblue learning multi-model (MLM)
Wind speed                   | ERA5                                                                 | meteoblue MOS and meteoblue model mix
Precipitation (daily events) | ERA5 (all precipitation events); CMORPH (heavy precipitation events) | meteoblue learning multi-model (MLM)
Precipitation (annual sums)  | meteoblue historical model mix                                       | meteoblue learning multi-model (MLM)
Dewpoint temperature         | ERA5                                                                 | meteoblue MOS and meteoblue model mix

A comprehensive verification study of air temperature, wind speed, precipitation and dewpoint temperature conducted over more than 10'000 meteorological stations worldwide for the year 2017 can be downloaded below:

meteoblue verification global Summary 2017 EN 20181113z10.pdf (4.23 MB)

Verification study for Europe (2011)

A verification study was carried out for Europe in 2011. It compared the accuracy of weather models with 40, 12 and 3 km spatial resolution for air temperature and wind speed, using both MOS and raw models. The study can be downloaded here:

meteoblue_NMM_validation_mueller_2011.pdf (1.60 MB)
