Verification of Simulations
Spatial and temporal verification
We regularly and extensively verify our own and other simulation models by comparing them to actual measurement and observation data. This ensures that our services deliver top-quality (and continuously improving) weather data, both historical and forecast. meteoblue is the first commercial weather service to have regularly published verification data on the company website (since 2010), as well as daily local accuracy updates.
Why do we publish our verifications?
- We are transparent: weather is not "chaos", and our customers should know what they receive.
- We deliver quality: our accuracy is so high that it is worth showing.
- We are realistic: you should know what to expect from a forecast - and what not to expect.
- We are competitive: if someone believes we are not good enough - show us how to do better.
What does meteoblue simulation quality mean? This page and subpages show some of the most important studies.
Forecast accuracy has improved over the last decades
Numerical weather forecast models have been continuously improved over the last decades. Around 1980, the 24-hour-ahead forecast of the air temperature was calculated with an accuracy of around 70%. By 2018, the accuracy of the 24 h forecast had increased to around 90%, and the 72 h forecast is now as good as the 24 h forecast was 40 years ago. In numerical weather forecast models, the accuracy of the 500 hPa geopotential height is even higher than the accuracy of the 2 m air temperature simulation. The evolution of the model accuracies over time can be seen in the following figure (Source: ECMWF).
Three main factors are responsible for the increasing model accuracy during the last 40 years:
- The initial conditions of the numerical weather forecast model are estimated significantly better than 40 years ago. New meteorological measurement techniques (e.g. satellite observations) and more accurate measurements are responsible for this improvement.
- Finer horizontal (and vertical) resolution of the numerical weather forecast models due to more computational power.
- Better sub-grid parametrizations in the numerical models than 40 years ago.
The accuracy of a weather simulation model depends strongly on the chosen meteorological variable. Variables like the 2 m air temperature, surface pressure or the 500 hPa geopotential height are typically calculated with high accuracy, whereas other variables (e.g. precipitation, wind gusts) have a lower accuracy, typically caused by small-scale spatial variations which are not resolved in weather models.
meteoblue verification for historical and forecast data
In the following, we show the meteoblue model accuracy for different meteorological variables and the model skill of meteoblue multimodels, MOS, reanalysis models and stand-alone ("raw") numerical weather forecast models.
The verification of numerical weather forecast models is highly relevant for all stakeholders in order to show that weather forecast models have a larger model skill than simple climatological forecasts or persistence forecasts ("Weather tomorrow is the same as today").
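The comparison against a persistence forecast can be sketched as a skill score. The example below is a minimal illustration, not the meteoblue method: the score definition (1 minus the ratio of the model MAE to the persistence MAE), the function names and the temperature values are all assumptions made for this sketch.

```python
def mean_absolute_error(forecast, observed):
    """Mean absolute error between two equally long sequences."""
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)


def persistence_forecast(observed, lead=1):
    """Persistence: the forecast for step t is the observation at step t - lead."""
    return observed[:-lead]


def mae_skill_score(model_forecast, observed, lead=1):
    """Skill relative to persistence: 1 - MAE_model / MAE_persistence.

    Positive values mean the model beats persistence, 0 means no gain.
    The first `lead` steps are dropped because persistence has no value there.
    Assumes the persistence MAE is non-zero.
    """
    target = observed[lead:]
    mae_model = mean_absolute_error(model_forecast[lead:], target)
    mae_persistence = mean_absolute_error(persistence_forecast(observed, lead), target)
    return 1.0 - mae_model / mae_persistence


# Hypothetical daily mean temperatures in degrees Celsius (illustrative values only)
observed = [10.0, 12.0, 9.0, 11.0, 14.0]
model = [10.5, 11.5, 9.5, 11.5, 13.0]
skill = mae_skill_score(model, observed)
print(f"MAE skill score vs. persistence: {skill:.2f}")
```

A climatological reference could be handled the same way by replacing the persistence series with long-term mean values for each day.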
Four meteorological variables (air temperature, wind speed, precipitation and dewpoint temperature) were verified at more than 10'000 meteorological stations worldwide for the year 2017, by analysing the model accuracy of several raw ("stand-alone") weather forecast models, satellite observations and reanalysis models. Additionally, the model accuracy of different multimodel approaches was tested and compared against raw ("stand-alone") models and a 24-hour-ahead forecast from model output statistics (MOS).
We distinguish between historical data sets and forecast data sets, based on the availability of the model data.
The following table shows the MAE (mean absolute error) on an hourly basis (and on a yearly basis for annual precipitation), determined for each method and variable at 10'000 weather stations globally for the year 2017. Values are given in K for temperatures, m s-1 for wind speed and mm for precipitation.
|Data set||Model approach||Air temperature||Wind speed||Annual precipitation||Dewpoint temperature|
|Forecast||meteoblue learning multimodel||1.2 K||-||170 mm||-|
|Forecast||MOS||1.5 K||1.2 m s-1||-||1.7 K|
|Forecast||Weather forecast models||1.7 - 2.2 K||1.5 - 1.7 m s-1||220 - 230 mm||1.9 - 2.4 K|
|History||Real-time updates (NEMS30)||2.1 K||1.7 m s-1||220 mm||2.2 K|
|History||Reanalysis model||1.5 K||1.5 m s-1||120 - 180 mm||1.6 K|
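As a rough illustration of how such a network-wide MAE can be obtained, the sketch below computes the MAE per station from hourly (model, observation) pairs and then averages over stations. The aggregation order (per station first, then across stations), the handling of missing observations, and all names and values are assumptions for this example; the study may aggregate differently.

```python
from statistics import mean


def station_mae(pairs):
    """MAE over (model, observation) pairs, skipping hours with a missing value."""
    errors = [abs(m - o) for m, o in pairs if m is not None and o is not None]
    return mean(errors) if errors else None


def network_mae(stations):
    """Average the per-station MAE over all stations that have valid data."""
    per_station = {name: station_mae(pairs) for name, pairs in stations.items()}
    valid = [value for value in per_station.values() if value is not None]
    return per_station, mean(valid)


# Hypothetical hourly 2 m air temperature in K: (model, observation)
stations = {
    "station_a": [(280.0, 281.0), (282.0, 281.0), (283.0, None)],
    "station_b": [(290.0, 293.0), (291.0, 292.0)],
}
per_station, overall = network_mae(stations)
print(per_station, overall)
```

Averaging per station first gives every station the same weight, regardless of how many valid hours it reports.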
From an accuracy perspective, we recommend the following sources for spatial (worldwide) weather data:
|Air temperature||ERA5||NEMS local, NEMS30||meteoblue learning multi-model (MLM)|
|Wind Speed||ERA5||NEMS local, NEMS30||meteoblue MOS and meteoblue model mix|
|Precipitation (daily events)||ERA5 (all precipitation events), CMORPH (heavy precipitation events)||NEMS local, NEMS30||meteoblue learning multi-model (MLM)|
|Precipitation (annual sums)||historical meteoblue model mix||historical meteoblue model mix||meteoblue learning multi-model (MLM)|
|Dewpoint temperature||ERA5||NEMS local, NEMS30||meteoblue MOS and meteoblue model mix|
For historical analysis, reanalysis models offer the highest accuracy, but they are only available with a time lag of 2-5 days (CMORPH) to 2-3 months (ERA5), and do not yet cover 20 years. For applications which require real-time updates, consistency, a time span of over 30 years and multiple variables, NEMS30 is the only solution currently available.
More information about the verification of historical and forecast data can be found below:
The 2 m air temperature is best calculated by the meteoblue learning multimodel (MLM), with an MAE of 1.2 K. The MOS air temperature forecast gives the same accuracy as the reanalysis model ERA5 (MAE = 1.5 K), which is recommended for historical data sets. The ‘stand-alone’ (RAW) global weather forecast models perform in the range between 1.7 and 2.2 K. The 6-day forecast of the meteoblue multi-model is as good as the 1-day forecast of a ‘stand-alone’ (RAW) numerical weather forecast model.
The MAE of the forecast 10 m wind speed is 1.5 - 1.7 m s-1 for ‘stand-alone’ weather forecast models, and 1.5 m s-1 for historical data from the reanalysis model ERA5. With MOS, the model error could be reduced to 1.2 m s-1.
meteoblue calculates radiation for the land and sea surface and for atmospheric layers, both as incoming direct and indirect sunlight, and as radiation reflected from clouds or the surface. The meteoblue simulations of global surface radiation are consistent across continents and reach a monthly mean absolute error of 1-15% in 95% of all places.
The model skill for daily precipitation events decreases with increasing precipitation intensity. Numerical weather forecast models are the best source for detecting small precipitation events. For heavy precipitation events, the model skill of satellite observations is larger than that of numerical weather forecast models. The model skill for daily precipitation events could not be increased by mixing two (or more) models.
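Skill for daily precipitation events is commonly evaluated per intensity threshold with a contingency table. The sketch below is one possible way to do this and is not taken from the study: the choice of the Heidke skill score, the thresholds of 1 mm and 10 mm, and the data values are assumptions made for illustration.

```python
def contingency_table(forecast_mm, observed_mm, threshold_mm):
    """Count hits, false alarms, misses and correct negatives for one threshold."""
    hits = false_alarms = misses = correct_negatives = 0
    for f, o in zip(forecast_mm, observed_mm):
        forecast_event = f >= threshold_mm
        observed_event = o >= threshold_mm
        if forecast_event and observed_event:
            hits += 1
        elif forecast_event:
            false_alarms += 1
        elif observed_event:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """Heidke skill score: 1 is a perfect forecast, 0 is no better than chance."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    if denominator == 0:
        # Undefined, e.g. when no day reaches the threshold in either series
        return float("nan")
    return 2.0 * (a * d - b * c) / denominator


# Hypothetical daily precipitation sums in mm (illustrative values only)
forecast = [0.0, 2.0, 12.0, 0.5, 8.0, 0.0, 6.0, 1.0]
observed = [0.0, 3.0, 8.0, 0.0, 30.0, 1.5, 0.2, 0.0]

for threshold in (1.0, 10.0):
    table = contingency_table(forecast, observed, threshold)
    print(threshold, table, round(heidke_skill_score(*table), 3))
```

Evaluating the same series at several thresholds makes it possible to see how the score changes from light to heavy events.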
For historical data, annual precipitation sums are calculated best by using satellite observations from CHIRPS2, which are bias corrected with the same measurement data set used for verification in this study. The model accuracy of CHIRPS2 in regions without measurement stations is therefore expected to be significantly lower and to a certain extent unknown.
The model accuracy for the dewpoint temperature is slightly lower than the model accuracy for the air temperature. MAE values are between 1.9 and 2.4 K for numerical weather forecast models and 1.6 K for a reanalysis model. The accuracy of model simulations with MOS is in a similar range to that of the reanalysis model.
A comprehensive verification study of air temperature, wind speed, precipitation and dewpoint temperature conducted over more than 10'000 meteorological stations worldwide for the year 2017 can be downloaded below:
A verification study was done for Europe in 2011, comparing the model accuracy of weather models with 40, 12 and 3 km spatial resolution, for the air temperature and wind speed by using MOS and raw models. The study can be downloaded here: