February 15, 2023

New inputs and algorithms drive large improvements to Solcast’s Historical Time Series data

Historical irradiance data is essential for many use cases. Predicting future yield, validating models and understanding trends are all impossible without a historical data set you have confidence in.

Solcast develops proprietary weather and ML algorithms to track clouds globally using geostationary satellite data. By combining this with aerosol data and other “clear sky” inputs, we’re able to build an accurate picture of solar irradiance around the world and back through time. Because these models are validated against research-grade, quality-controlled surface measurements, continuous model improvements can be trained and tested.

Read more below about these improvements, which are now live in Solcast’s Historical Time Series (HTS) data, as well as in our Live and Forecast data. Solcast customers using the HTS data set are already benefiting from them. To check them out for yourself, you can access 3 time series for free through our Solcast Toolkit.

Large improvements: Spread of Solcast bias reduced by 15%

We quantified our improvements by comparing our satellite-derived estimates with quality-controlled measurements of irradiance from 72 research-grade stations located around the world across a variety of climate types and latitude zones. This almost doubles the number of measurement sites compared to our previous analyses, and covers over 400 site-years of GHI and DNI measurements.

Although we focused our improvements only on the “clear-sky” parts of our models (i.e. the irradiance estimates before the impacts of any clouds), we still made a significant positive impact on our “all-sky” statistics as you can see in the table below.

One of the results we’re happiest with is the spread of our site biases, which we call “std(bias %)”. Whilst our average bias is near zero, this statistic indicates how far from this zero average a user can expect our bias to be at any given site. That matters for those planning and financing solar plants, since it ties to the risk of the plant’s production projections being badly wrong. In relative terms, we’ve reduced this measure by 15%.

All-sky Statistics across all sites below 2500m elevation

mean(bias %): "On average, what should the difference in estimated and actual irradiance be?"

std(bias %): "How much might I expect any given site to deviate from that bias?"

mean(RMSE %): "On average, what error should I expect?"

mean(MAE %): "As above, but don't square the error."

For full validation results and more about our methodology, see our validation and accuracy document.
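
As a rough illustration of how the four statistics in the table relate to each other, here is a minimal Python sketch. It is not Solcast's validation code. The function names (site_metrics, summarise), the choice to normalise by mean measured irradiance, the use of a sample standard deviation, and the toy numbers are all assumptions made for the example.

```python
import math
from statistics import mean, stdev


def site_metrics(estimated, measured):
    """Bias, RMSE and MAE for one site, as a % of mean measured irradiance."""
    if len(estimated) != len(measured) or len(measured) == 0:
        raise ValueError("need two non-empty series of equal length")
    errors = [e - m for e, m in zip(estimated, measured)]
    mean_measured = mean(measured)  # assumed normalisation basis
    return {
        "bias_pct": 100 * mean(errors) / mean_measured,
        "rmse_pct": 100 * math.sqrt(mean([err ** 2 for err in errors])) / mean_measured,
        "mae_pct": 100 * mean([abs(err) for err in errors]) / mean_measured,
    }


def summarise(sites):
    """Aggregate per-site metrics into the four headline statistics."""
    per_site = [site_metrics(est, meas) for est, meas in sites.values()]
    biases = [m["bias_pct"] for m in per_site]
    return {
        "mean(bias %)": mean(biases),
        "std(bias %)": stdev(biases),  # sample std dev; needs at least two sites
        "mean(RMSE %)": mean([m["rmse_pct"] for m in per_site]),
        "mean(MAE %)": mean([m["mae_pct"] for m in per_site]),
    }


# Made-up toy values in W/m^2, for illustration only (not Solcast data).
toy_sites = {
    "site_a": ([101.0, 101.0], [100.0, 100.0]),  # (estimated, measured)
    "site_b": ([198.0, 198.0], [200.0, 200.0]),
}
print(summarise(toy_sites))
```

In this toy case one site is biased high and the other low, so the mean bias is zero while std(bias %) is not. That spread is the figure reported above as reduced by 15% in relative terms, meaning the new value is 0.85 times the old one.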

Improved local aerosol estimates

It’s not just clouds that affect solar radiation. Creating high quality global solar irradiance data requires a detailed model that considers aerosols in the atmosphere such as dust, salt, smoke and ash. Aerosols are especially important for solar energy, since so much solar PV is being installed in deserts or semi-deserts (where aerosols can trump clouds in importance), and in highly industrial regions (where aerosol loadings can be high).

Solcast continues to use the best global models for our aerosol inputs (NASA MERRA2 for historical data and ECMWF CAMS for real-time and forecast data). However, we have made major improvements to how we use these global models to represent local aerosol conditions. We’ve validated our improvements at hundreds of AERONET sites globally. For many of these sites, the new aerosol inputs we’re using are drastically better, sometimes as much as 5 to 10 times better.


AERONET locations worldwide. Source: AERONET

NASA project MODIS helps to improve albedo inputs

Albedo (the reflectivity of the earth’s surface) matters for accurate irradiance estimation, not only to estimate the reflected component of tilted irradiance at the local site itself, but also to estimate the amount of irradiance reflected and re-scattered in the broader region surrounding the site, which can significantly boost or diminish the downwelling diffuse irradiance.
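
To give a feel for the first of these effects, the reflected component at the local site, here is a minimal Python sketch using the isotropic ground-reflection approximation that is common in PV modelling. This is a textbook simplification chosen for illustration, not a description of Solcast's own model, and it does not cover the regional re-scattering effect. The function name and the example values are hypothetical.

```python
import math


def ground_reflected_irradiance(ghi, albedo, tilt_deg):
    """Ground-reflected irradiance (W/m^2) on a tilted surface.

    Isotropic assumption: the ground reflects GHI uniformly, and a surface
    tilted by tilt_deg from horizontal sees a fraction (1 - cos(tilt)) / 2
    of the ground hemisphere.
    """
    if not 0.0 <= albedo <= 1.0:
        raise ValueError("albedo must be between 0 and 1")
    ground_view_factor = (1 - math.cos(math.radians(tilt_deg))) / 2
    return ghi * albedo * ground_view_factor


# Hypothetical example: same sky, same tilt, two different surface albedos.
for albedo in (0.2, 0.6):
    reflected = ground_reflected_irradiance(ghi=800.0, albedo=albedo, tilt_deg=30.0)
    print(f"albedo={albedo}: {reflected:.1f} W/m^2 reflected onto the panel")
```

Under this approximation the reflected component scales linearly with albedo, so an error in the albedo input carries straight through to the tilted irradiance estimate.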

Switching to finer-resolution NASA MODIS satellite imagery products has helped us significantly improve albedo calculations. The higher resolution images captured by the satellite allow for a more accurate estimation of albedo, compared to the NASA MERRA2 albedo data previously used. The additional spatial resolution in this imagery represents a step-change in our ability to estimate albedo at a global scale.

As you can see in this MODIS image of Tenerife in the Canary Islands, the fine-detail resolution allows us to see very local changes in the land surface, which lead to improved albedo estimates.




Harry Jack

Data Scientist


Harry is Solcast's Lead Modeller and forecast systems engineer, leading our modelling team and responsible for data quality and data value across Solcast. Harry holds a BEng (Hons I) in Chemical Engineering from the University of Sydney, and a Grad. Dip. in Meteorology from the Australian Bureau of Meteorology (BoM). He has worked as an operational Meteorologist and as a forecast systems scientist at the BoM.