Irradiance and Weather Data

How Solcast generates irradiance and weather data

Irradiance and Weather Methodology

This page outlines the data inputs, models and algorithms used in the production of Solcast’s global-coverage Irradiance and Weather data, across Historic (2007 to -7 days), Live (-7 days to present moment), and Forecast (present moment to +14 days) time periods. PV power models and algorithms are outlined in separate pages. For further product and accuracy information, users may refer to other Solcast documentation, including Product Guides, Product Specifications, and Accuracy and Validation Reports.

Irradiance Models

This section details the models used by Solcast to produce irradiance data. The following sections of this page detail the inputs to those models. The Solcast data parameters determined using these models are Global Horizontal Irradiance (GHI), Direct Normal Irradiance (DNI), Diffuse Horizontal Irradiance (DHI), and Global Tilted Irradiance (GTI). The parameters indirectly determined using these models are Snow Soiling Loss – Rooftop, and Snow Soiling Loss – Ground Mounted (in conjunction with temperature and precipitation data). Solcast irradiance data, along with this snow soiling data, is also used to determine the PV parameters Advanced PV Power Output and Rooftop PV Power Output; for these, please refer to the PV modelling methodology documents.

Overview

The Solcast method for estimating solar irradiance from geostationary weather satellites and weather model data consists of four major modelling steps, depicted in grey boxes in the following schematic diagram. The first step (Solcast Cloud Model) is the detection and characterisation of clouds from satellite imagery.

Clear Sky Model

The purpose of a clear sky model is to determine the available solar irradiance under clear skies (i.e. before clouds are considered), including the effects of aerosols (dust, salt, etc.), water vapour, air pressure, elevation and surface reflectivity (albedo).

The clear sky model used by Solcast is the REST2v5 model (Gueymard 2008), with minor proprietary configurations and improvements. The REST2 model, which is widely used and heavily validated, breaks down solar irradiance attenuation into spectral regions, to represent the impact of aerosols (including dust, salt, smoke, pollution etc.), ozone, and water vapour.

Solcast runs REST2 using time-dynamic and location-specific inputs for aerosol, albedo, water vapour and ozone. In addition, global elevation data at 150 m resolution is used as a time-static, location-specific input.

Solcast Separation Model

A separation model, otherwise known as a diffuse model, determines the decomposition of global irradiance into its direct (beam) and diffuse components, providing values for DNI and DHI.

The Solcast Separation Model is a proprietary model based on machine learning methods, which uses a range of inputs including the REST2 output, and Solcast’s own cloud detection and tracking data.
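
Whatever method performs the separation, the three components remain linked by the standard closure relation GHI = DNI x cos(solar zenith) + DHI. The following minimal Python sketch illustrates that relation only; it is not the proprietary Solcast Separation Model, and the function and parameter names are illustrative assumptions.

```python
import math


def dhi_from_closure(ghi: float, dni: float, zenith_deg: float) -> float:
    """Return the DHI (W/m^2) implied by GHI = DNI * cos(zenith) + DHI.

    Illustrative helper only; names and clamping choices are assumptions.
    """
    # When the sun is below the horizon there is no beam contribution.
    cos_zenith = max(math.cos(math.radians(zenith_deg)), 0.0)
    # Clamp at zero so inconsistent inputs cannot yield negative diffuse.
    return max(ghi - dni * cos_zenith, 0.0)


# Example: GHI 800 W/m^2 and DNI 700 W/m^2 at a 60 degree solar zenith.
example_dhi = dhi_from_closure(800.0, 700.0, 60.0)
```

A check of this kind is useful for confirming that any set of GHI, DNI and DHI values is internally consistent.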

Transposition Model (Hay, Reindl)

A transposition model transposes irradiance components to plane-of-array, enabling GTI to be calculated, which is typically the final step before PV modelling.

For transposition to plane of array from diffuse and direct components, Solcast uses the Hay model (Hay & Davies, 1980) for Advanced PV Model sites, and the Reindl model (Reindl et al. 1990) for Rooftop PV Model sites.
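
As an illustration, the textbook forms of the Hay-Davies and Reindl sky diffuse models can be sketched in Python as below. This is a sketch under stated assumptions, not Solcast's implementation: the function name, the isotropic ground-reflection term, the default albedo of 0.2, and the floor on cos(zenith) are all choices made for this example.

```python
import math


def tilted_irradiance(dni, dhi, ghi, dni_extra, zenith_deg, aoi_deg, tilt_deg,
                      albedo=0.2, model="hay"):
    """Estimate GTI (W/m^2) as beam + sky diffuse + ground-reflected irradiance.

    dni_extra is extraterrestrial normal irradiance; aoi_deg is the angle of
    incidence on the panel. All names and defaults here are illustrative.
    """
    if model not in ("hay", "reindl"):
        raise ValueError("model must be 'hay' or 'reindl'")

    # Floor cos(zenith) so the beam ratio stays finite near sunrise and sunset.
    cos_zenith = max(math.cos(math.radians(zenith_deg)), 0.01745)
    cos_aoi = max(math.cos(math.radians(aoi_deg)), 0.0)
    tilt = math.radians(tilt_deg)

    anisotropy = dni / dni_extra        # share of diffuse treated as circumsolar
    beam_ratio = cos_aoi / cos_zenith   # tilted-to-horizontal beam ratio
    isotropic = (1.0 - anisotropy) * (1.0 + math.cos(tilt)) / 2.0
    if model == "reindl" and ghi > 0.0:
        # Reindl adds a horizon-brightening factor to the isotropic term.
        beam_horizontal = dni * cos_zenith
        isotropic *= 1.0 + math.sqrt(beam_horizontal / ghi) * math.sin(tilt / 2.0) ** 3

    beam = dni * cos_aoi
    sky_diffuse = dhi * (anisotropy * beam_ratio + isotropic)
    ground_reflected = ghi * albedo * (1.0 - math.cos(tilt)) / 2.0
    return beam + sky_diffuse + ground_reflected
```

For a horizontal surface (tilt of 0, angle of incidence equal to the solar zenith) both variants reduce to GHI, which is a convenient sanity check.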

Solcast cloud model and inputs

This section details the inputs and algorithms used within the Solcast Cloud Model, which is used to produce irradiance data (and later PV power data). The Solcast Cloud Model also determines the Cloud Opacity data parameter.

Cloud model input data usage across Historic, Live and Forecast time periods

The Solcast Cloud Model uses two primary classes of input data: Geostationary Meteorological Satellite (GMS) data, and Numerical Weather Prediction (NWP) data. The GMS data are used for Historic (2007 to 7 days before present time) and Live (7 days before present time to the present time) data, and the first 4 hours of Forecast data. The NWP data are used for longer horizon Forecasts as far as 14 days ahead. These data inputs, their characteristics and usage are summarised in the following table.

| Solcast API time period | Historic (2007 to -7 days) | Live (-7 days to present time) | Forecast (present time to +14 days) | Forecast (present time to +14 days) | Forecast (present time to +14 days) |
| --- | --- | --- | --- | --- | --- |
| Cloud model data source | GMS Satellite archive | Live GMS Satellite data | Nowcast (GMS Satellite extrapolation) | Nowcast to NWP blend | Numerical Weather Prediction (NWP) |
| Timespan of input data usage | 2007 to -7 days before present time | -7 days to present time | Present time to +2 hours ahead | +2 to +4 hours ahead | +4 hours ahead to +14 days ahead |
| Spatial resolution | Modelled irradiance: 150m²; Modelled cloud: 2km; Source data: 0.5 - 5km | Modelled irradiance: 150m²; Modelled cloud: 2km; Source data: 0.5 - 5km | Modelled irradiance: 150m²; Modelled cloud: 2km; Source data: 0.5 - 5km | Blend | Modelled irradiance: 150m²; Modelled cloud: 2km; Source data: 11 - 27km |
| Time resolution | Modelled: 5 min; Source data: 5 - 60 min | Modelled: 5 min; Source data: 5 - 60 min | Modelled: 5 min; Source data: 5 - 60 min | Blend | Modelled: 5 min; Source data: 60 - 180 min |
| Data sources | NOAA GOES; EUMETSAT Meteosat; JMA MTSAT & Himawari | GOES 16 and 17; Meteosat-08 and 11; Himawari 8 and 9 | GOES 16 and 17; Meteosat-08 and 11; Himawari 8 and 9 | Blend | NOAA GFS; ECMWF IFS; BOM ACCESS-G (UKMO); WRF |

Satellite cloud detection algorithms for Historic and Live data

For the Historic and Live periods, the Solcast Cloud Model uses high-resolution raw imagery from a range of geostationary meteorological satellites, from which cloud properties are diagnosed using a combination of peer-reviewed, industry-standard models, and proprietary algorithms developed at Solcast from 2016 to present. As described in the above section, the core input we use for detecting clouds from space is GMS satellite imagery. The present generation of satellites from NOAA, EUMETSAT and JMA (satellites in 5 different positions) produce scans of the Earth every 5 to 15 minutes at resolutions as fine as 500m. These satellites provide global coverage except for polar regions, and Solcast utilises this coverage over all continents and surrounding islands, except Antarctica.

In pre-processing steps, raw satellite imagery is georeferenced and projected onto a regular grid. The channels are converted to regular units. Visible channel data is converted from observed radiance to (bi-directional) reflectance, the fraction of reflected incoming radiation, which is independent of solar zenith during daytime periods. Infrared channel data is converted to brightness temperature, which is closely correlated to cloud-top temperature where there are clouds, and surface skin temperature elsewhere. Automated quality control is applied to catch imagery artefacts, such as swathes of empty data and striping, before they can corrupt downstream processing.
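
The radiance-to-reflectance step can be illustrated with the standard top-of-atmosphere reflectance factor, which normalises the observed radiance by the incoming solar irradiance projected onto the surface. The Python sketch below shows that generic formula only; the function name, parameter names and units are assumptions for this example, and Solcast's actual calibration chain is not described in this document.

```python
import math


def toa_reflectance(radiance, band_solar_irradiance, solar_zenith_deg,
                    earth_sun_distance_au=1.0):
    """Convert observed band radiance to a dimensionless reflectance factor.

    radiance: W m^-2 sr^-1 um^-1 (assumed units)
    band_solar_irradiance: mean solar irradiance in the band, W m^-2 um^-1
    Returns NaN at night, when reflectance is undefined.
    """
    cos_zenith = math.cos(math.radians(solar_zenith_deg))
    if cos_zenith <= 0.0:
        return float("nan")
    # Dividing by cos(zenith) removes the dependence on sun elevation.
    return (math.pi * radiance * earth_sun_distance_au ** 2) / (
        band_solar_irradiance * cos_zenith
    )
```

For an idealised diffuse (Lambertian) surface, this normalisation returns the same reflectance at any daytime solar zenith, which is the property described above.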

The standardised imagery is then converted into an uncalibrated initial estimate of cloud opacity, based on analysis of recent day imagery. The visible channel reflectance is differenced with a pre-computed estimate of the clear-sky surface reflectance (based on a synthesis of recent past satellite imagery and other surface analyses, including for snow presence and depth). This estimate is then further processed using information from a combination of IR imagery and NWP forecast data to account for confounding phenomena such as sun glint over tropical oceans, snow, salt pans, and other artefacts. Data are gap-filled across periods of satellite outages or bad satellite scans, using same-day satellite data and weather reanalyses in a mix that depends on the duration of the gap.
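
The differencing step resembles the cloud index used in published Heliosat-style methods, where the observed reflectance is scaled between a clear-sky value and an overcast value. The Python sketch below shows that generic technique as an analogy; it is not the Solcast algorithm, and the function name and the overcast reflectance of 0.8 are arbitrary assumptions for this example.

```python
import numpy as np


def initial_cloud_index(reflectance, clear_sky_reflectance, overcast_reflectance=0.8):
    """Scale observed reflectance between clear-sky and overcast values.

    Returns values in [0, 1]: 0 means no cloud signal above the clear-sky
    background, 1 means reflectance at or above the assumed overcast level.
    """
    observed = np.asarray(reflectance, dtype=float)
    clear = np.asarray(clear_sky_reflectance, dtype=float)
    # Guard against a zero or negative range over very bright surfaces.
    dynamic_range = np.maximum(overcast_reflectance - clear, 1e-6)
    return np.clip((observed - clear) / dynamic_range, 0.0, 1.0)


# Three pixels over a dark surface (clear-sky reflectance 0.1).
index = initial_cloud_index([0.10, 0.45, 0.90], [0.10, 0.10, 0.10])
```

Bright backgrounds such as snow or salt pans shrink the usable range between clear and overcast values, which is one reason a simple index of this kind needs the additional IR and NWP processing described above.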

Solcast purposefully avoids using or providing data for dates prior to 2007, for the following reasons:
(i) the poorer quality of satellite data from earlier sensing platforms, including poor temporal and spatial resolution, poor geolocation, and more frequent artefacts;
(ii) the lack of surface measurement data from earlier periods; and
(iii) rapid global climate change, which can alter local cloud climatology, thereby offsetting the potential benefits of a longer data period for estimating interannual variability.

Cloud tracking and nowcasting algorithms for short-range Forecast data

For the earliest part of the Forecast data period, typically 0 to 4 hours ahead, the Solcast Cloud Model spatially extrapolates the satellite cloud detections forward in time using analysis of their recent movement. This approach improves the accuracy of short-range forecasts, and allows major cloud-driven ramp events to be anticipated. This extrapolation is the sole driver of Forecast irradiance and PV power data for the first two hours, and is then blended progressively through the third and fourth hours to the NWP-driven forecast described in the following section.
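
The handover from satellite extrapolation to NWP can be pictured as a lead-time-dependent weight. The Python sketch below assumes a linear ramp between +2 and +4 hours; the text above states only that the blend is progressive, so the linear shape and the function names are assumptions made for illustration.

```python
def nowcast_weight(lead_hours: float) -> float:
    """Weight given to the satellite nowcast at a given forecast lead time."""
    if lead_hours <= 2.0:
        return 1.0  # first two hours: nowcast is the sole driver
    if lead_hours >= 4.0:
        return 0.0  # beyond four hours: NWP-driven forecast only
    return (4.0 - lead_hours) / 2.0  # assumed linear ramp in between


def blended_forecast(nowcast_value: float, nwp_value: float, lead_hours: float) -> float:
    """Blend nowcast and NWP values using the lead-time weight."""
    weight = nowcast_weight(lead_hours)
    return weight * nowcast_value + (1.0 - weight) * nwp_value


# At +3 hours the two sources contribute equally under this assumption.
ghi_at_3h = blended_forecast(600.0, 400.0, 3.0)
```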

The spatial extrapolation of detected cloud uses flow fields derived with the assistance of proprietary computer vision algorithms, trained on a sequence of recent satellite image data. This forecast process is updated with every new satellite scan, and operates on a probabilistic framework, which is used to generate the 10th and 90th percentile scenarios that are included in all Solcast irradiance and PV power forecast data. The probabilistic data is based on actual cloud variability and movement rather than long-term statistics, which allows for a sharper and more dynamic probabilistic forecast envelope. Our approach is particularly effective in capturing fast-moving, fast-changing cloud cover conditions. It is also effective in predicting cloud cover advection under weather conditions where cloud layers at different heights are moving in different directions and at different speeds.
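
To illustrate how a percentile envelope can be read from a set of forecast scenarios, the Python sketch below takes the 10th and 90th percentiles across scenarios at each time step. How Solcast generates and calibrates its scenarios is proprietary; the array layout and function name here are assumptions for this example.

```python
import numpy as np


def forecast_envelope(scenarios):
    """Return (p10, p90) across scenarios for each forecast time step.

    scenarios: array-like of shape (n_scenarios, n_time_steps), e.g. GHI in W/m^2.
    """
    values = np.asarray(scenarios, dtype=float)
    p10, p90 = np.percentile(values, [10, 90], axis=0)
    return p10, p90


# Five hypothetical GHI scenarios for two time steps.
p10, p90 = forecast_envelope([[500, 200], [520, 260], [540, 320], [560, 380], [580, 440]])
```

A narrow gap between the two percentiles indicates that the scenarios agree, as on a stable clear day; a wide gap indicates variable cloud.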

Cloud medium range forecast algorithms for longer horizon Forecast data

For the remainder of the Forecast data period, i.e. from 4 hours ahead to 14 days ahead, the Solcast Cloud Model uses gridded weather model forecasts (called NWP), from the models listed in the table in the section "Cloud model input data usage across Historic, Live and Forecast time periods" above. Forecast data from these models are post-processed with machine learning methods that use power and weather measurements and our satellite-derived cloud measurements. Probability measures are based on the full set of NWP scenarios, calibrated for reliability using past measurements. The probability measures are output as 10th and 90th percentile probabilistic forecast bands. As part of the machine learning framework, NWP models are dynamically weighted according to their recent performance, with preference given to the current best performers.
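
One simple way to picture performance-based weighting is inverse-error weighting, where each model's weight is proportional to the reciprocal of its recent error. The Python sketch below shows that generic scheme only; Solcast's machine learning framework is not specified in this document, and the function names, error metric and numbers are hypothetical.

```python
import numpy as np


def performance_weights(recent_errors, epsilon=1e-6):
    """Map each model name to a weight proportional to 1 / recent error.

    recent_errors: dict of model name -> recent mean absolute error (> 0).
    """
    names = list(recent_errors)
    inverse = np.array([1.0 / (recent_errors[name] + epsilon) for name in names])
    weights = inverse / inverse.sum()  # normalise so the weights sum to 1
    return dict(zip(names, weights.tolist()))


def combine_forecasts(forecasts, weights):
    """Weighted average of per-model forecast values."""
    return sum(weights[name] * forecasts[name] for name in forecasts)


# Hypothetical recent errors (W/m^2) for two of the NWP models listed above.
weights = performance_weights({"GFS": 30.0, "IFS": 10.0})
combined = combine_forecasts({"GFS": 400.0, "IFS": 480.0}, weights)
```

Recomputing the weights over a rolling window of recent errors would let the combination follow whichever model is currently performing best.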

Aerosol, Albedo and Ozone data inputs

This section details the aerosol, albedo and ozone data inputs used by Solcast. These inputs are used in the production of irradiance data (and later PV power data), as described in the Clear Sky Model section above. The albedo inputs are also used more directly to generate the Albedo Daily data parameter, an interpolated (smoothed) value of surface reflectivity that does not capture any diurnal angular dependence of reflectivity.

Data sources for aerosol, albedo and ozone data

Solcast uses two sources of global, time-dynamic, observationally-bound gridded data as inputs to produce its aerosol and albedo data. Firstly, the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) atmospheric reanalysis from NASA is used for all time periods where it is available, which is for historical data up to 2 months ago. Secondly, the Copernicus Atmosphere Monitoring Service (CAMS) global analysis and forecast model from the European Centre for Medium-Range Weather Forecasts (ECMWF) is used for all later periods, from 2 months before the present time to +14 days ahead. Each of these sources has components that assimilate a wide range of measured satellite retrievals, and also components that propagate and evolve data forward from and between measurements. Solcast performs spatial and temporal interpolation on these input data, and performs bias corrections in order to maintain consistency and minimise bias in the resulting irradiance data.

Solcast API time periods: Historic (2007 to -7 days), Live (-7 days to present time), Forecast (present time to +14 days)

| Data source | NASA MERRA-2 | ECMWF CAMS |
| --- | --- | --- |
| Timespan of input data usage | 2007 to -2 months before present time | -2 months before present time to +14 days ahead |
| Spatial resolution | Modelled: 0.025° x 0.025°; Source data: 0.5° (latitude) x 0.625° (longitude) | Modelled: 0.025° x 0.025°; Source data: 0.4° x 0.4° |
| Time resolution | Modelled: 5 minutes; Source data: 3 hours | Modelled: 5 minutes; Source data: 60 minutes |
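
The temporal interpolation from a 3-hour or 60-minute source grid to 5-minute output can be illustrated with simple linear interpolation. The Python sketch below assumes a linear method and uses hypothetical aerosol optical depth values; the interpolation scheme Solcast actually applies is not specified here.

```python
import numpy as np


def to_five_minute(source_minutes, source_values):
    """Linearly interpolate source samples onto a 5-minute time grid.

    source_minutes: increasing sample times in minutes from a reference time.
    Returns (target_minutes, interpolated_values).
    """
    times = np.asarray(source_minutes, dtype=float)
    values = np.asarray(source_values, dtype=float)
    target = np.arange(times[0], times[-1] + 1.0, 5.0)
    return target, np.interp(target, times, values)


# Two hypothetical 3-hourly aerosol optical depth samples.
minutes, aod = to_five_minute([0, 180], [0.10, 0.28])
```

In this example, two source samples become 37 values at 5-minute spacing, including both endpoints.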

Other weather parameters

This section details data inputs used by Solcast in the production of the remaining weather parameters, which include the Solcast data parameters: Air Temperature, Wind Speed (10 & 100m), Wind Direction (10 & 100m), Relative Humidity, Surface Pressure, Precipitable Water, Precipitation Rate, Dewpoint Temperature, and Snow Depth Water Equivalent.

Surface Pressure and Precipitable Water are also used as inputs to the Clear Sky Model, which is used in the production of irradiance data (and later PV power data), as described above.

Air Temperature and Precipitation Rate data are also used in the production of Snow Soiling Loss – Rooftop, and Snow Soiling Loss – Ground Mounted data parameters, in conjunction with the irradiance models described above.

Data sources for other weather parameters

Similar to the approach taken for aerosol and albedo data, Solcast uses two sources of global, time-dynamic, observationally-bound gridded data as inputs to produce its other weather data. Firstly, the ERA-Interim atmospheric reanalysis from ECMWF is used for Historic data up to August 2018. Secondly, analyses and forecasts from the Global Forecast System (GFS) NWP model from NOAA are used for all other time periods, from May 2019 to +14 days ahead of present time. Each of these sources has components that assimilate a wide range of surface observations and measured satellite retrievals, and components that propagate and evolve data forward from and between measurements. Solcast performs spatial and temporal interpolation on these input data.

Solcast API time periods: Historic (2007 to -7 days), Live (-7 days to present time), Forecast (present time to +14 days)

| Data source | ECMWF ERA-Interim | NOAA GFS |
| --- | --- | --- |
| Timespan of input data usage | 2007 to May 2019 | May 2019 to +14 days ahead |
| Spatial resolution | Modelled: 0.025° x 0.025°; Source data: 0.7° x 0.7° | Modelled: 0.025° x 0.025°; Source data: 0.25° x 0.25° |
| Time resolution | Modelled: 5 minutes; Source data: 6 hours | Modelled: 5 minutes; Source data: 60 minutes |