Historical Data Accuracy

Bias and error validation of Solcast historical data against surface measurements

These validation analyses help users evaluate the accuracy of Solcast’s historical time series data and estimate the expected accuracy for their own region(s). The studies compare Solcast data against quality-controlled surface measurements from across the world. Users focused on live and forecast data should refer to our separate live and forecast data analysis. For users interested in TMY data, these historical analyses are the relevant resource, since TMY data is sourced from the historical time series. Further information on Solcast’s TMY methodology is also available here.

The most recent study, the DNV bankability report, was completed by DNV in 2023 and published in early 2024. This study reviewed methodology and validated GHI and DNI against measurements from 207 sites globally.

An earlier study was completed by Solcast in 2022 on 70 sites. For transparency, a summary of that validation is included at the bottom of this page. The bias and error statistics of the DNV and Solcast studies are very similar.

A selection of commonly used statistics from each study is included here. For the full range of statistics, including site-level results, or to request the raw data from the Solcast report, please contact the Solcast team.

DNV Bankability Report

Executive Summary

The full report was published by DNV in February 2024. The study continued validation work that DNV started before acquiring Solcast in 2023. DNV reviewed Solcast’s irradiance and TMY methodology and conducted a global validation analysis of the Solcast historical time series (HTS) GHI and DNI data against surface irradiance measurements. Data was included from 207 sites, of which 53 are considered highest quality based on cleaning schedules, data quality and sensor classification.

The data used in the validation analysis was subject to an initial and secondary quality assurance process, and sites have been classified and grouped by continent, according to the World Bank approved administrative boundaries, and by Köppen-Geiger-Photovoltaic (KGPV) Climate Zones.

The study considers three main metrics per site, and these are provided for all sites in the full report: Mean Bias Difference/Error (MBD), Mean Absolute Difference/Error (MAD) and Root Mean Square Difference/Error (RMSD). The key findings for bias were:

| Data | Mean Bias | Standard Deviation |
|---|---|---|
| GHI - Highest Quality Sites | +0.05% | ±2.05% |
| GHI - All sites | +0.33% | ±2.47% |
| DNI - All sites | +1.5% | ±5.75% |

The report concluded as follows: “DNV finds the Solcast methodology to be consistent with industry best practices. DNV finds the results of the validation study to be within expectations and that the Solcast data is suitable for use to accurately predict project energy yield in energy assessment and for use in energy assessment for project financing purposes. The Solcast bias is considered low and there are limited variations between different regions and climatic zones for both GHI and DNI.”

Review of Solcast Methodology

DNV reviewed the Solcast irradiance and TMY methodology and found that it is consistent with industry best practices. The study assessed the varying inputs into the Solcast algorithm including our sources of geostationary weather satellite imagery and weather model data. Models were reviewed and validated as separate components:

  • The cloud model: Detects and characterises clouds from satellite imagery
  • The clear sky model: Calculates irradiance under clear sky conditions
  • The separation model: Decomposes global horizontal irradiance into diffuse and direct components
  • The transposition model: Converts irradiance on a horizontal surface to plane of array irradiance (GTI was not validated in this study)
  • Terrain shading model: Accounts for beam blocking and reduced sky view due to horizon terrain
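
The components that the separation model works with are linked by the standard closure relation GHI = DHI + DNI × cos(solar zenith angle). The short sketch below shows only that textbook identity, not Solcast’s separation model; the function name and sample values are illustrative assumptions.

```python
import math

def ghi_from_components(dni, dhi, zenith_deg):
    """Closure relation: GHI = DHI + DNI * cos(solar zenith angle).

    Irradiance in W/m^2, angle in degrees. This is the textbook identity
    only; it is not Solcast's separation model.
    """
    # With the sun below the horizon there is no beam contribution.
    cos_zenith = max(math.cos(math.radians(zenith_deg)), 0.0)
    return dhi + dni * cos_zenith

# Illustrative values, not measurements
print(round(ghi_from_components(dni=800.0, dhi=100.0, zenith_deg=60.0), 1))  # 500.0
```

A separation model runs this relation in the other direction: starting from GHI, it estimates how much is diffuse, and the direct component then follows from the identity.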

Review of Solcast’s TMY methodology was also included along with the model reviews.

Validation Methodology

Data used in the study came from publicly available data sources, measurement stations used for project development and measurements from operational assets. Site selection focussed on achieving the best possible global coverage, specifically including most major solar markets. All sites provided data at a resolution of 1 hour or finer.

Sites were classified by region according to World Bank administrative boundaries, and by climate zone using the Köppen-Geiger-Photovoltaic (KGPV) classification. These classifications were used to provide regional and climate zone breakdowns of the validation data, as the accuracy of solar irradiance estimates is expected to vary between regions and with climatic conditions. These breakdowns are provided in the full report.

Every site included in the study passed two rounds of quality assurance and additional exclusion criteria. The quality assurance processes excluded data that exhibited unphysical qualities, sensor drift or calibration errors, or was otherwise unsuitable. The exclusion criteria removed sites that were at a very high elevation, at polar latitudes, did not provide sufficient time coverage or fell outside satellite coverage boundaries. This process ensures that the sites used are both accurate and representative of typical solar asset locations.
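
The four exclusion criteria described above can be pictured as a simple site filter. This is a minimal sketch only: the thresholds, field names and example sites are hypothetical placeholders, since the limits DNV actually applied are not stated on this page.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; the limits actually used
# in the DNV study are not stated on this page.
MAX_ELEVATION_M = 3000.0
MAX_ABS_LATITUDE_DEG = 60.0
MIN_MONTHS_OF_DATA = 12

@dataclass
class Site:
    name: str
    latitude_deg: float
    elevation_m: float
    months_of_data: int
    in_satellite_coverage: bool

def passes_exclusion_criteria(site: Site) -> bool:
    """Return True if the site survives all four exclusion checks."""
    return (
        site.elevation_m <= MAX_ELEVATION_M
        and abs(site.latitude_deg) <= MAX_ABS_LATITUDE_DEG
        and site.months_of_data >= MIN_MONTHS_OF_DATA
        and site.in_satellite_coverage
    )

# Made-up example sites
sites = [
    Site("example-desert", 24.5, 350.0, 36, True),
    Site("example-polar", 78.2, 20.0, 48, True),
    Site("example-short-record", -33.9, 100.0, 4, True),
]
kept = [s.name for s in sites if passes_exclusion_criteria(s)]
print(kept)  # ['example-desert']
```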

A total of 207 sites were used for the validation study. These sites passed both stages of quality assurance and the exclusion criteria. GHI measurements were available for all sites, shown below. DNI measurements are less frequently available and were available for 117 sites, also shown below.

[Figure: map of sites with DNI measurements (DNI.png)]

DNV identified a total of 53 highest quality sites that were confirmed to have Class A pyranometers that were cleaned a minimum of every 2 weeks. The measurements at these locations are expected to be more accurate and validation performed at these locations more indicative of the performance of Solcast’s estimates than for other locations.

[Figure: map of highest quality GHI sites (GHI-High Quality.png)]

Metrics were calculated against hourly values of irradiance for all locations. Equations for the metrics used can be found in the full report.
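
As a rough illustration of the three metrics, the sketch below computes MBD, MAD and RMSD from paired hourly values and expresses each as a percentage of the mean measured irradiance. That normalisation choice, the function name and the sample values are assumptions made for this example; the equations in the full report are the authoritative definitions.

```python
import math

def validation_metrics(modelled, measured):
    """Return (nMBD, nMAD, nRMSD) in percent for paired hourly values.

    Assumption: each metric is normalised by the mean of the measurements.
    """
    if len(modelled) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    diffs = [mod - obs for mod, obs in zip(modelled, measured)]
    mean_measured = sum(measured) / n
    mbd = sum(diffs) / n                              # mean bias difference
    mad = sum(abs(d) for d in diffs) / n              # mean absolute difference
    rmsd = math.sqrt(sum(d * d for d in diffs) / n)   # root mean square difference
    return tuple(100.0 * m / mean_measured for m in (mbd, mad, rmsd))

# Illustrative hourly GHI values in W/m^2 (not real data)
measured = [100.0, 300.0, 500.0, 700.0]
modelled = [110.0, 290.0, 520.0, 680.0]
nmbd, nmad, nrmsd = validation_metrics(modelled, measured)
print(f"nMBD={nmbd:.2f}%  nMAD={nmad:.2f}%  nRMSD={nrmsd:.2f}%")
```

In this made-up sample the positive and negative differences cancel, so the bias is zero while MAD and RMSD are not. This is why bias is reported alongside the other two metrics.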

Key Results and Findings

The key results and findings are shown below. Confidence intervals, regional breakdowns and climate zone breakdowns are available in the full report.

| Metric | Highest Quality GHI Sites | All GHI Sites | All DNI Sites |
|---|---|---|---|
| No. of sites | 53 | 207 | 117 |
| Mean Bias | +0.05% | +0.33% | +1.50% |
| Bias Std. Dev. | ±2.05% | ±2.47% | ±5.75% |
| 80% CI Bias (10% to 90%) | -2.57% to 2.67% | -2.84% to 3.50% | -5.87% to 8.86% |
| 90% CI Bias (5% to 95%) | -3.31% to 3.41% | -3.74% to 4.40% | -7.96% to 10.95% |
| Mean nMAD (nMAE) | 10.37% | 10.33% | 19.97% |
| Std. Dev. nMAD (nMAE) | ±5.03% | ±3.72% | ±5.94% |
| Mean nRMSD (nRMSE) | 16.16% | 15.99% | 31.51% |
| Std. Dev. nRMSD (nRMSE) | ±7.94% | ±5.74% | ±9.99% |
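
The confidence interval rows are close to what a normal approximation gives from the mean bias and its standard deviation (mean ± z × standard deviation). Whether the report derived them exactly this way is an assumption. The sketch below applies that approximation to the published "All GHI Sites" figures and lands within about 0.01 percentage points of the table values.

```python
from statistics import NormalDist

def normal_interval(mean, std_dev, coverage):
    """Central interval of a normal distribution with the given coverage."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return mean - z * std_dev, mean + z * std_dev

# Mean bias and bias standard deviation for all GHI sites, taken from the table
low80, high80 = normal_interval(0.33, 2.47, 0.80)
low90, high90 = normal_interval(0.33, 2.47, 0.90)
print(f"80%: {low80:+.2f}% to {high80:+.2f}%")
print(f"90%: {low90:+.2f}% to {high90:+.2f}%")
```

Read this way, the 80% interval suggests that roughly four out of five sites comparable to those in the study would be expected to show a GHI bias inside that range.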

To understand how these results might reflect accuracy for your region and assets, use the interactive map below to see KGPV climate zones globally and the distribution of validation sites. You can select a climate zone and region to see aggregate results and representative sites. The source data for each site is also available.

Locations of some sites have been obfuscated for confidentiality reasons. The indicated locations are representative and lie in the same climate zone.


Solcast error statistics from DNV Bankability Report 2023

Source Data

See how these results compare to both free and paid alternatives in the market here.

Solcast 2022 Validation Study

Executive Summary

A global accuracy verification analysis of Solcast Historical Data was performed in 2022, using surface measurements from 70 sites spanning a range of climate types and latitude zones. The measurements span 2007 to 2021. Results were compared to the commonly used ERA5 reanalysis dataset (publicly available from the European Centre for Medium-Range Weather Forecasts, ECMWF).

The mean bias in estimated actual GHI across all the measurement sites is -0.1%, compared with +2.4% for ERA5. The hourly-average MAPE of the Solcast GHI estimates across all sites is 10.7%, compared with 18.4% for ERA5. For DNI, the bias across all sites is +1.3%, compared with +23.3% for ERA5, and the hourly-average MAPE across all sites is 23.6%, compared with 46.9% for ERA5.

| Metric | Bias: mean (10/90 percentile) | Bias: standard deviation | nRMSE: mean (10/90 percentile) | MAPE: mean (10/90 percentile) |
|---|---|---|---|---|
| Solcast GHI (all 70 global sites) | -0.1% (-2.4% to +2.2%) | 2.2% | 17.5% (10.7% to 23.9%) | 10.7% (6.5% to 15.0%) |
| Solcast DNI (global average, all 70 global sites) | +1.3% (-7.9% to +10.5%) | 6.9% | 40.4% (24.3% to 58.4%) | 23.6% (15.0% to 32.9%) |
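
Each figure in this summary combines per-site results: a mean across sites, a standard deviation across sites, and the 10th and 90th percentiles of the site-level values. The sketch below shows one way such a summary could be produced. The per-site values are made up, and the use of a sample standard deviation and linear percentile interpolation are assumptions, not details confirmed by the study.

```python
from statistics import mean, quantiles, stdev

def summarise_across_sites(per_site_values):
    """Return mean, sample standard deviation, and 10th/90th percentiles."""
    # n=10 yields the nine decile cut points; the first and last are the
    # 10th and 90th percentiles (linear interpolation, endpoints included).
    deciles = quantiles(per_site_values, n=10, method="inclusive")
    return {
        "mean": mean(per_site_values),
        "std_dev": stdev(per_site_values),
        "p10": deciles[0],
        "p90": deciles[-1],
    }

# Hypothetical per-site GHI bias values in percent (not Solcast results)
site_bias = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
summary = summarise_across_sites(site_bias)
print({k: round(v, 2) for k, v in summary.items()})
```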

Measurement site selection

This analysis uses research-grade, quality-controlled surface measurements from a variety of sources, including but not limited to the Baseline Surface Radiation Network (BSRN; global), the Surface Radiation Budget Network (SURFRAD; US) and the enerMENA Meteo Network (Middle East and North Africa). The sites used were selected for (1) quality, characterised by robust calibration and maintenance standards and quality control of data - this is challenging for irradiance measurements, where poor calibration and maintenance often cause large errors; (2) recency, i.e. data is available for recent periods so that the analysis can use the recent Solcast algorithmic configuration, and for relevance to newer customer sites; (3) availability, i.e. as far as possible the measurements should be non-private and readily available to users who may want to replicate the results; and (4) broad geographic and climate-type coverage, so that users can estimate accuracy for their own sites.

Map of measurement sites included in the analysis. Of the 70 included sites, a total of 35 are designated Tropical/Subtropical (21 Humid, 5 Semi-Arid, 9 Arid). A total of 35 are designated Temperate (13 Humid, 16 Semi-Arid, and 6 Arid).

Accuracy verification results

GHI ESTIMATED ACTUALS ACCURACY

The following table shows statistics for the normalised bias, Mean Absolute Percentage Error (MAPE) and normalised Root Mean Square Error (nRMSE) of GHI, as defined in the NREL 2013 analysis.

Errors in Estimated Actuals for GHI

Data: hourly average, nocturnal zeros excluded

| Site type | Estimate | Bias: mean (10/90 percentile) | Bias: standard deviation | nRMSE: mean (10/90 percentile) | MAPE: mean (10/90 percentile) |
|---|---|---|---|---|---|
| Tropical/Sub-Tropical, Arid & Semi-Arid (14 sites) | Solcast | +0.1% (-4.2% to +3.5%) | 3.2% | 11.5% (6.9% to 16.5%) | 7.3% (4.1% to 10.1%) |
| Tropical/Sub-Tropical, Arid & Semi-Arid (14 sites) | ERA5 | +1.6% (-1.8% to +5.7%) | 3.3% | 19.0% (12.2% to 27.7%) | 11.5% (6.4% to 18.2%) |
| Tropical/Sub-Tropical, Humid (21 sites) | Solcast | +0.5% (-2.0% to +3.6%) | 2.3% | 17.9% (13.4% to 22.5%) | 11.3% (8.3% to 15.0%) |
| Tropical/Sub-Tropical, Humid (21 sites) | ERA5 | +2.9% (-1.6% to +7.4%) | 4.9% | 32.3% (24.7% to 38.8%) | 21.2% (16.0% to 26.0%) |
| Temperate, Arid & Semi-Arid (22 sites) | Solcast | -0.7% (-2.3% to +1.4%) | 1.5% | 19.7% (12.7% to 27.3%) | 11.8% (7.0% to 16.0%) |
| Temperate, Arid & Semi-Arid (22 sites) | ERA5 | +2.9% (-2.7% to +8.6%) | 5.1% | 31.1% (21.0% to 41.2%) | 18.6% (12.6% to 23.5%) |
| Temperate, Humid (13 sites) | Solcast | -0.1% (-2.3% to +1.4%) | 1.6% | 19.6% (13.7% to 26.0%) | 11.4% (7.4% to 15.0%) |
| Temperate, Humid (13 sites) | ERA5 | +1.4% (-1.6% to +5.0%) | 3.2% | 34.4% (28.1% to 41.3%) | 21.2% (15.6% to 26.1%) |
| Global average, all 70 sites | Solcast | -0.1% (-2.4% to +2.2%) | 2.2% | 17.5% (10.7% to 23.9%) | 10.7% (6.5% to 15.0%) |
| Global average, all 70 sites | ERA5 | +2.4% (-2.0% to +7.5%) | 4.4% | 29.7% (18.1% to 40.0%) | 18.4% (9.5% to 25.4%) |

DNI ESTIMATED ACTUALS ACCURACY

The following table shows statistics for the normalised bias, Mean Absolute Percentage Error (MAPE) and normalised Root Mean Square Error (nRMSE) of DNI, as defined in the NREL 2013 analysis.

Errors in Estimated Actuals for DNI

Data: hourly average, nocturnal zeros excluded

| Site type | Estimate | Bias: mean (10/90 percentile) | Bias: standard deviation | nRMSE: mean (10/90 percentile) | MAPE: mean (10/90 percentile) |
|---|---|---|---|---|---|
| Tropical/Sub-Tropical, Arid & Semi-Arid (14 sites) | Solcast | -2.1% (-12.5% to +7.3%) | 8.0% | 25.8% (18.2% to 35.0%) | 17.0% (11.2% to 23.5%) |
| Tropical/Sub-Tropical, Arid & Semi-Arid (14 sites) | ERA5 | +13.8% (+1.6% to +29.2%) | 13.4% | 43.1% (28.6% to 66.1%) | 31.7% (18.3% to 50.7%) |
| Tropical/Sub-Tropical, Humid (21 sites) | Solcast | +4.4% (-6.2% to +13.7%) | 7.7% | 41.2% (27.3% to 57.1%) | 25.6% (18.4% to 35.6%) |
| Tropical/Sub-Tropical, Humid (21 sites) | ERA5 | +32.4% (+13.2% to +58.8%) | 22.9% | 74.1% (45.4% to 98.1%) | 55.1% (33.8% to 75.9%) |
| Temperate, Arid & Semi-Arid (22 sites) | Solcast | -0.8% (-6.6% to +5.0%) | 4.9% | 45.8% (24.8% to 62.9%) | 25.2% (14.9% to 32.3%) |
| Temperate, Arid & Semi-Arid (22 sites) | ERA5 | +21.8% (+8.0% to +26.6%) | 21.4% | 69.4% (42.7% to 94.7%) | 46.8% (30.6% to 61.7%) |
| Temperate, Humid (13 sites) | Solcast | +3.4% (-1.5% to +9.7%) | 4.9% | 45.8% (29.2% to 58.9%) | 24.5% (15.4% to 31.8%) |
| Temperate, Humid (13 sites) | ERA5 | +21.6% (+13.1% to +37.2%) | 11.3% | 73.8% (54.5% to 99.2%) | 50.1% (34.3% to 70.1%) |
| Global average, all 70 sites | Solcast | +1.3% (-7.9% to +10.5%) | 6.9% | 40.4% (24.3% to 58.4%) | 23.6% (15.0% to 32.9%) |
| Global average, all 70 sites | ERA5 | +23.3% (+6.5% to +47.6%) | 19.8% | 66.4% (37.2% to 95.1%) | 46.9% (25.2% to 69.8%) |

The purpose of this validation analysis is to enable users to estimate Solcast’s historical timeseries accuracy for their site(s) prior to subscription or integration effort. It is based on data across 15 years, compared to surface measurements from high-quality measurement sites. Also included are error statistics, and benchmarks against a common alternative.

Commonly Asked Questions about Historical Solar Irradiance Data Accuracy

The industry standard instrument for measuring solar radiation is a Class A pyranometer. However, physical sensors require regular maintenance and calibration, and can experience downtime that leads to data gaps. DNV’s bankability study, completed in 2023, validated Solcast’s satellite-based irradiance data against measurements from Class A pyranometers cleaned at least every two weeks. Combining measurements with a high-quality satellite irradiance data source, like Solcast, provides more accurate and reliable solar irradiance data for your assets, especially when using measurements for a fleet of assets.

Validation is crucial for ensuring the accuracy and reliability of historical solar irradiance data. It helps identify and correct biases, improving the accuracy of solar irradiance estimates while reducing uncertainty in solar project planning and performance predictions.

Historical solar data accuracy can be evaluated by comparing the data against high-quality ground-based measurements from weather stations and solar monitoring sites or against other solar data providers. Our team has consolidated guidelines on how to evaluate satellite-derived irradiance. Comparing bankable satellite-derived irradiance to measurements from ground sensors requires the highest quality sensor data for the comparison to be accurate.

Bankable solar data meets several criteria: independent validation across multiple globally distributed sites, coverage of various geographic and climatological regions, public availability of all details, and a thorough validation report that considers methodology and models for historical and TMY data. Solcast historical data shows a bias of only 0.33% for all GHI sites and just 0.05% against the highest quality measurements, making it bankable for solar resource assessment and planning.

In solar design and financial modeling, bankability refers to the reliability and credibility of solar data, predicted performance and asset technology or management to secure financing from investors and financiers. Bankable projects are those that are likely to perform as expected and generate expected financial returns. To achieve this, you can access bankable, accurate historical solar irradiance data via the Solcast API toolkit.