Evaluating accuracy III: PV power data

22 December 2023

The main application of irradiance data is PV power modelling, whether for planning and design or for operational use. When assessing the accuracy of a given model, it’s important to remember that both the input data (i.e. irradiance parameters) and the power model are sources of bias and error. If you directly compare a power model’s output to power measurements, the result reflects the combined error of the irradiance data and the power model. Understanding how to minimise these accuracy gaps will help you make the best decision when comparing multiple data vendors and power models.

Which kind of model?

When you’re looking at different PV power models, it’s important to start with an understanding of the different approaches to modelling how assets generate power from irradiance.

Physical Models: Because there are so many different factors that can impact power generation, the best power models treat physical phenomena explicitly. This means calculating things like snow soiling losses, thermal efficiency, and inverter clipping separately and applying those factors to a base model that calculates power generation from irradiance. Complex and accurate physical models require additional effort. They either need to be manually programmed by the user, modelling individual modules and strings through to inverters, or manually tuned by the model provider based on measurements. More basic physical models can still achieve a high level of accuracy based on some simple asset parameters like capacity, efficiency, azimuth, and tilt.
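To make the idea of treating each loss explicitly more concrete, here is a minimal sketch of a basic physical model in Python. It is not Solcast’s model. The class name, field names and default loss values are all assumptions chosen for illustration, and it takes plane-of-array irradiance as an input, so the transposition step that would use azimuth and tilt is left out.

```python
from dataclasses import dataclass


@dataclass
class SimplePlant:
    """Hypothetical minimal plant description (names are illustrative)."""
    dc_capacity_kw: float               # nameplate DC capacity at 1000 W/m^2, 25 °C
    ac_capacity_kw: float               # inverter AC limit
    temp_coeff_per_c: float = -0.004    # assumed power temperature coefficient
    inverter_efficiency: float = 0.97   # assumed flat inverter efficiency
    soiling_loss: float = 0.02          # assumed fraction lost to soiling


def ac_power_kw(plant: SimplePlant, poa_wm2: float, cell_temp_c: float) -> float:
    """Estimate AC power from plane-of-array irradiance and cell temperature."""
    if poa_wm2 <= 0:
        return 0.0
    # Base model: DC power scales linearly with plane-of-array irradiance.
    dc = plant.dc_capacity_kw * poa_wm2 / 1000.0
    # Thermal efficiency: output falls as cells heat above 25 °C.
    dc *= 1.0 + plant.temp_coeff_per_c * (cell_temp_c - 25.0)
    # Soiling: a fixed fractional loss here; real models vary it over time.
    dc *= 1.0 - plant.soiling_loss
    # Inverter: convert to AC, then clip at the inverter's AC limit.
    ac = dc * plant.inverter_efficiency
    return max(0.0, min(ac, plant.ac_capacity_kw))
```

With a 1200 kW DC, 1000 kW AC plant, `ac_power_kw(SimplePlant(1200, 1000), 1000, 25)` is clipped to the 1000 kW AC limit, which shows why clipping is worth modelling as its own step rather than leaving it for a model to infer.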

Machine Learning Models: AI and ML models can be good at modelling complicated systems where the external factors aren’t well understood, but are often poor at modelling factors with outsized impacts. Factors like eclipses, snow soiling, dust soiling and inverter clipping are often handled badly by machine learning models when they aren’t treated explicitly. In theory, empirical or machine learning models should be capable of detecting and predicting events like snow soiling or inverter clipping. In practice, this requires a long training period, which often isn’t feasible, and requires the model to have access to the appropriate input data and features.

For forecasting, Solcast’s approach to PV modelling is primarily physical. By knowing the PV plant/system specifications, we can instantly get “quite close” with a physical approach, and this allows our customers to specify their own physical specifications and have their own Advanced PV Power model running in minutes!

We don’t use machine learning as the primary approach because PV power measurement histories are typically short, and are affected by unspecified curtailments and outages. However, in our “Accuracy Assist” add-on to the Advanced PV power model, our team carefully inspects and cleans your measurement history using your own irradiance measurements and our satellite irradiance estimated actuals, then runs an ML model that further increases forecast accuracy. When this is done, the PV model output typically looks almost identical to your real measurements.

For resource assessment modelling, we recommend DNV’s SolarFarmer model, a more complex and explicit 3D physical modelling tool.

The next step is to start reviewing the available information from the vendors you’re assessing.

Reviewing the Vendor

Review the vendors’ documentation and published accuracy information

Data vendors should make information about their models and accuracy validation available and easy to find. This documentation should include a description of what kind of model you are looking at, and the site metadata that is required. You don’t want to select a model and then realise that you don’t have the site detail required as inputs into the model.

To make sure you end up with an accurate model, make sure the model information explains:

  • Input parameters for site metadata
  • Outputs for irradiance as well as power
  • How the model explicitly treats soiling, inverter clipping and other losses

Look for information about how the models are built, what level of academic rigour has been applied and where it has been validated. Solcast publishes information about our models for the Rooftop PV model and Advanced PV Power model on our website. On those pages you can see our models are based on models built and published through university research projects, including our work through the Australian National University in a $2.6m ARENA industry research project.

Review customer feedback, references and commercial applications

Running your own trial is time intensive, especially for forecast data. Our experience working with hundreds of organisations to run trials is that a forecast trial takes 1-2 Data Scientists or Engineers 4-6 weeks of work to draft a full report, plus additional time for the duration of the forecast trial itself. You can cut down on much of this effort by reviewing existing commercial applications of the vendor’s data. Look for other organisations that share your use case, and ask the vendor how the data and models are being used in each case.

Solcast’s data and PV power model are in operational use globally by a range of Grid Operators, Utilities and load forecasters in Australia, Taiwan, Korea, US, UK and Germany. Each deployment of the model has included local calibration and validation in concert with users. Our team are happy to provide references and details of how our model is used in each case, without sharing commercial-in-confidence insights from our existing customers.

Once you’ve reviewed these materials from each vendor, you might choose to proceed to running your own assessment. These can be lengthy and time consuming, so in operational use cases, many organisations choose to operate a trial rather than a full comparative accuracy assessment. If you proceed to an assessment, we’ve outlined some steps to follow to ensure your analysis is as smooth and easy as possible.

Running your own assessment

See our previous articles on evaluating accuracy if you think this is right for you. As noted in those articles there are some key considerations when thinking about performing your own accuracy validation. These are summarised in the key points below:

  • Refresh yourself on the vendor’s methodology. Depending on the vendors you plan to assess, their methodology may affect how you need to perform the assessment.
  • Understand the scope of the assessment. An accuracy assessment is a significant undertaking, so it is important to consider a number of different factors to ensure that you get the insights you are looking for. Note that the scope increases significantly with the additional complexity of a forecast trial.
  • Define assessment parameters. Once you have established the scope required, it is important to establish the criteria under which you will assess potential vendors.

Dealing with PV Power

Unlike irradiance, PV power modelling depends on site-specific information. For purely machine learning models, a history of uncurtailed measurements is vital, whereas physical models like Solcast’s only require information about the site configuration to achieve a sensible result, with measurements an optional extra to further tune the model. Any mismatch between the actual on-site configuration and the configuration provided to the PV power model will result in an unrepresentative assessment of a vendor’s accuracy.

In the case of the Solcast model, some basic site configuration parameters need to be provided: latitude and longitude to locate the site, AC and DC capacities, grid export limit, tracking type, and install date, which is used to estimate age-based derating. To get even better estimates from our PV model, you can also provide additional information about the modules and inverters, the panel or array geometries, and site terrain and soiling. The full list of parameters that can be provided to the Solcast model is available here.
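As an illustration of why the install date matters, the sketch below estimates an age-based derating factor using compound annual degradation. This is an assumed approach for illustration only, not a description of how the Solcast model calculates derating. The function name is hypothetical and the 0.5% per year default is an assumption you would replace with a value suited to your modules.

```python
from datetime import date


def age_derating_factor(install_date: date, on_date: date,
                        annual_degradation: float = 0.005) -> float:
    """Fraction of original output remaining after compound annual degradation."""
    # Clamp to zero so dates before installation do not inflate output.
    years = max(0.0, (on_date - install_date).days / 365.25)
    return (1.0 - annual_degradation) ** years
```

Under these assumptions a plant installed ten years ago retains roughly 95% of its original output, so an install date that is wrong by several years produces a bias of a percent or more before any weather is considered.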

Setting Basic Site Configuration:

Eval AccuracyIII_Tracking.png

In some cases the actual values for some of these parameters may not be known. Sometimes the information was lost or isn’t available, or the as-built site differs from the original plan. This is where historical production actuals can be used to help refine or review the site configuration by comparing modelled against actual production. As there may be non-meteorological impacts on generation, it is also helpful to have measurements of irradiance and flags marking periods such as curtailment during this process.
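One simple way to compare modelled against actual production is to fit a single scale factor between the two while skipping flagged intervals. The sketch below is a hypothetical helper, not part of any Solcast tooling, and it assumes the three inputs are already aligned lists of equal length.

```python
def fit_scale_factor(modelled_kw, actual_kw, curtailed):
    """Least-squares scale k minimising sum((actual - k * modelled)^2)."""
    num = 0.0
    den = 0.0
    for m, a, flagged in zip(modelled_kw, actual_kw, curtailed):
        # Skip flagged intervals, missing actuals and night-time zeros.
        if flagged or a is None or m <= 0:
            continue
        num += m * a
        den += m * m
    if den == 0:
        raise ValueError("no usable intervals to fit against")
    return num / den
```

A factor well below 1 suggests the configured capacity or loss assumptions overstate what the site really produces, while a factor near 1 suggests the configuration is broadly right. A single factor cannot tell you which parameter is wrong, so treat it as a prompt to review the configuration, not as a correction to apply blindly.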

Rather than doing this yourself, Solcast is able to provide support from our data science team, so speak to a sales representative about PV power ‘Accuracy Assist’ if you are planning on running an accuracy assessment.

Tuning Specific Site Configurations:

Eval AccuracyIII_Tuning.png

For operational forecasting you may want to capture these non-meteorological conditions, so Solcast allows you to provide some options at query time to override the site specification and apply deviations from normal operating conditions. This allows you to account for situations such as reduced inverter availability or curtailment. The full list can also be found on our Advanced PV Power model page or in our API docs.
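To show the effect of such deviations on a power estimate, here is a minimal sketch. The function and parameter names are hypothetical and do not correspond to Solcast API options. It assumes identical inverters with equal DC loading, so that output scales in proportion to the fraction of inverters online.

```python
from typing import Optional


def apply_operating_conditions(modelled_kw: float,
                               inverter_availability: float = 1.0,
                               export_limit_kw: Optional[float] = None) -> float:
    """Adjust a normal-conditions power estimate for known site deviations."""
    if not 0.0 <= inverter_availability <= 1.0:
        raise ValueError("inverter_availability must be between 0 and 1")
    # Assumes identical inverters with equal DC loading, so output scales
    # in proportion to the fraction of inverters online.
    power = modelled_kw * inverter_availability
    # A curtailment instruction acts as a temporary cap on export.
    if export_limit_kw is not None:
        power = min(power, export_limit_kw)
    return power
```

If deviations like these are present in your measurements but not in the model inputs, the resulting error will be attributed to the vendor even though it has nothing to do with the weather or the model.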

Assessing Accuracy

Once you have accounted for the additional complexities that come with assessing PV power accuracy, it is important not to forget the other key steps in ensuring an accurate assessment, as discussed in the earlier articles in this series on assessing historical irradiance data and assessing forecast data.

Aligning Timeseries:

This ensures that each data point represents the same period by matching timezones and temporal resolution.

Eval AccuracyIII_Resamples.png
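A minimal sketch of both alignment steps, using only the Python standard library, might look like the following. It assumes your measurements carry naive local timestamps with a fixed UTC offset (so no daylight saving changes) and that each timestamp marks the end of its period. Check both assumptions against your logger and your vendor’s data before relying on the result. The function names are illustrative.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def to_utc(naive_local: datetime, utc_offset_hours: float) -> datetime:
    """Attach a fixed UTC offset to a naive local timestamp and convert to UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return naive_local.replace(tzinfo=local_tz).astimezone(timezone.utc)


def resample_period_end_mean(series, period_minutes=30):
    """Average (period_end_utc, value) samples into coarser period-ending bins."""
    period = timedelta(minutes=period_minutes)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    bins = defaultdict(list)
    for stamp, value in series:
        # A sample ending exactly on a boundary belongs to the bin ending there,
        # so round the elapsed time up to the next multiple of the period.
        steps = -((epoch - stamp) // period)
        bins[epoch + steps * period].append(value)
    return {end: sum(vals) / len(vals) for end, vals in sorted(bins.items())}
```

Getting the period-end versus period-start convention wrong shifts every value by one interval, which is enough to inflate error metrics noticeably even when the underlying data is good.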

Applying Quality Control:

Removing periods of invalid measurements that would corrupt your assessment. This commonly needs to be done on measurement data that has missing or corrupted data points.

Eval AccuracyIII_QC.png
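As a rough sketch, basic quality control can be as simple as dropping pairs where the measurement is missing or physically implausible. The checks and the 5% tolerance above capacity are assumptions for illustration. A real assessment would usually add further checks, such as flagged curtailment or stuck sensor values.

```python
import math


def quality_control(pairs, capacity_kw, tolerance=1.05):
    """Keep (measured, modelled) pairs whose measurement passes basic checks."""
    kept = []
    for measured, modelled in pairs:
        if measured is None or math.isnan(measured):
            continue  # missing data point
        if measured < 0 or measured > capacity_kw * tolerance:
            continue  # physically implausible reading
        kept.append((measured, modelled))
    return kept
```

Filtering measured and modelled values together as pairs keeps the two series aligned after bad points are removed.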

Evaluating Against Assessment Criteria:

Returning to the assessment criteria you defined earlier and performing qualitative and/or quantitative assessment to ensure that a vendor meets your requirements.

Metric   Raw       Aligned   QC'd
bias     11.00%    10.52%    3.36%
nRMSE    112.04%   38.23%    13.21%
nMAE     73.27%    12.61%    5.56%
corr     0.60      0.96      0.99
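
For reference, metrics of this kind can be computed along the following lines. This is a sketch under stated assumptions, not necessarily how the figures above were produced. In particular, it normalises by the mean of the measurements, whereas normalising by plant capacity is another common choice and gives different percentages. Values are returned as fractions, so multiply by 100 for percentages.

```python
import math


def accuracy_metrics(modelled, measured):
    """Return bias, nRMSE, nMAE (normalised by mean measurement) and correlation."""
    n = len(measured)
    errors = [m - o for m, o in zip(modelled, measured)]
    mean_obs = sum(measured) / n
    mean_mod = sum(modelled) / n
    bias = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Pearson correlation between modelled and measured values.
    cov = sum((m - mean_mod) * (o - mean_obs) for m, o in zip(modelled, measured))
    var_mod = sum((m - mean_mod) ** 2 for m in modelled)
    var_obs = sum((o - mean_obs) ** 2 for o in measured)
    return {
        "bias": bias / mean_obs,
        "nRMSE": rmse / mean_obs,
        "nMAE": mae / mean_obs,
        "corr": cov / math.sqrt(var_mod * var_obs),
    }
```

Whichever normalisation you choose, apply the same one to every vendor and state it in your report so the results are comparable.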

Dr. Hugh Cutcher

Data Scientist • Author

Hugh is a Data Scientist at Solcast. He holds a Bachelor of Engineering (Hons. I) in Mechanical Engineering and a PhD in Combustion from University of Sydney. Hugh believes that renewable energy is critical to ensuring a cleaner and safer world going forward and is excited to play a part in helping fulfil that potential.