Canary Testing: The Key to Safely Rolling Out New Controls

Authored by:

Last Updated: November 13, 2024

Every utility is familiar with the idea of piloting a new idea before deploying it more widely. But pilots typically happen only once, when a project is getting started. What if several projects that might interact are rolling out at the same time? Or the customer landscape is changing so fast that the pilot isn’t representative anymore? The big, one-time pilot makes it hard to adapt the technology as a project scales up.

When the grid is changing fast, how do you safely test a new approach, or update existing technology, without starting over?

Before implementing any DER control changes on a utility’s grid, our team uses an approach known as “canary testing” to verify the changes at a very small scale.

What is canary testing?

Named after the canary in the coal mine, canary testing is the practice of making changes in a series of staged releases. The developer rolls out a software update to one device first, then to a small portion of customers or devices, testing and monitoring for issues at each stage. Once the change is sufficiently validated in a real-world environment, the update is rolled out to the rest of the customers or devices. With the focused monitoring of a small-scale change, operators can catch issues early and remedy them before they balloon.

This helps prevent scenarios in which software changes cause large numbers of customer devices to simultaneously crash, lose connectivity, or stop working properly in other unanticipated ways. It also helps to catch unexpected interactions between different systems, scenarios not covered in lab testing, and problems which only show up at scale. This end-to-end system testing approach allows engineers to adapt to the unique requirements of the production environment with minimal risk, which enables critical changes (such as security updates) to be rolled out quickly and safely.
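To make the staged release idea concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular production system: the function names (`canary_stages`, `staged_rollout`), the 5% default for the small group, and the `apply_update` and `is_healthy` callbacks are all assumptions made for the example.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def canary_stages(
    devices: Sequence[str], fractions: Iterable[float] = (0.05,)
) -> List[List[str]]:
    """Split devices into stages: one device, then small groups, then the rest.

    `fractions` are cumulative shares of the fleet (an assumed convention).
    """
    total = len(devices)
    stages = [list(devices[:1])]  # stage 1: a single canary device
    done = 1
    for fraction in fractions:
        # Each stage must add at least one device and never exceed the fleet.
        target = min(max(done + 1, int(total * fraction)), total)
        if target > done:
            stages.append(list(devices[done:target]))
            done = target
    if done < total:
        stages.append(list(devices[done:]))  # final stage: everything else
    return [stage for stage in stages if stage]


def staged_rollout(
    devices: Sequence[str],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    fractions: Iterable[float] = (0.05,),
) -> Tuple[bool, List[str]]:
    """Update one stage at a time and halt as soon as a stage looks unhealthy."""
    updated: List[str] = []
    for stage in canary_stages(devices, fractions):
        for device in stage:
            apply_update(device)
            updated.append(device)
        # Monitor the stage before widening the rollout.
        if not all(is_healthy(device) for device in stage):
            return False, updated
    return True, updated
```

In this sketch, a problem that shows up in the second stage stops the rollout after only a handful of devices have been touched, which is the point of canarying. A real deployment would also need a soak period between stages and a rollback path, both omitted here.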

For years, canary testing has been used by large technology companies like Google and Facebook to safely and reliably make changes in the control infrastructure used to manage live, large-scale systems.

What happens when you don’t canary?

Examples of botched changes to critical systems exist across every industry. A well-known example is the failed launch of Healthcare.gov’s website in 2013.

When the site was launched on October 1st, high website demand (250,000 users, 5x more than expected) caused the site to crash within 2 hours. Drop-down menus were incomplete and insurance companies received forms with missing data.

In addition, the website’s login feature (the first step to accessing the site) could handle even less traffic than the main website, creating a huge bottleneck. Website technicians relied on the exact same log-in method, making it extremely difficult for them to troubleshoot problems.

A total of 6 users completed and submitted their applications and selected a health insurance plan on the first day.

With an embarrassingly public debacle at hand, government leaders brought in a team of software experts from the broader tech world (including two Camus team members) to revamp the site. By the end of December, 1.2 million customers had signed up for a healthcare plan via the website.

Ultimately, inadequate testing and preparation proved to be an expensive mistake. The cost of the Healthcare.gov rollout increased from the $94 million budget to a final cost of ~$1.7 billion. If the Healthcare.gov team had tested the integrated system prior to launch, it could have saved substantial time, money, and trust.

Lessons for utilities

Put simply: disaster is preventable. Canary testing is both affordable and easy to implement. For utility leaders responsible for maintaining reliable service, it should be an essential step when rolling out changes to any critical operational capabilities.

Traditional pilot structures work well to manage risk in a system where only a few things are changing at a time, but when many things are changing at once, a single testing cycle isn’t enough.

Every change needs to be verified at the system level, even after it goes through an initial qualification process. But with well-designed canary test procedures that allow operators and engineers to notice and quickly address problems, moving fast can be safer than moving slowly.

Canary testing in practice at Holy Cross Energy

We’ve put canarying into practice with each of our customers.

During an early phase of our work with Holy Cross Energy (HCE), we established a standard canary testing protocol that qualifies each type of DER before implementing new controls.

The protocol uses utility-owned batteries, solar arrays, and EV chargers on the HCE campus as the initial testing ground for every code change. The steps include verifying connectivity with each device, confirming that commands can be sent to and successfully executed by the device, and then testing the new control on the device. Once we’re confident that changes are working as expected in the canary environment, we can roll them out to manage utility and customer devices across the whole grid.
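The three qualification steps described above (connectivity, command execution, then the new control) can be sketched as a simple ordered check. This is a hypothetical illustration, not the actual protocol code: `SimulatedDevice`, `qualify_device`, the `probe_kw` test setpoint, and the example control are all names and values invented for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SimulatedDevice:
    """Hypothetical stand-in for a campus battery, solar array, or EV charger."""

    name: str
    online: bool = True
    setpoint_kw: float = 0.0

    def ping(self) -> bool:
        return self.online

    def send_command(self, setpoint_kw: float) -> bool:
        if not self.online:
            return False
        self.setpoint_kw = setpoint_kw
        return True


def qualify_device(
    device: SimulatedDevice,
    new_control: Callable[[SimulatedDevice], bool],
    probe_kw: float = 1.0,
) -> Tuple[bool, List[str]]:
    """Run the checks in order and stop at the first failure."""
    log: List[str] = []

    # Step 1: verify connectivity with the device.
    if not device.ping():
        log.append("connectivity: FAIL")
        return False, log
    log.append("connectivity: ok")

    # Step 2: confirm a command is both accepted and actually executed.
    if not (device.send_command(probe_kw) and device.setpoint_kw == probe_kw):
        log.append("command: FAIL")
        return False, log
    log.append("command: ok")

    # Step 3: only now exercise the new control on the device.
    if not new_control(device):
        log.append("new control: FAIL")
        return False, log
    log.append("new control: ok")
    return True, log


def curtail_to_half_kw(device: SimulatedDevice) -> bool:
    """Example control under test (assumed): curtail the device to 0.5 kW."""
    return device.send_command(0.5) and device.setpoint_kw == 0.5
```

Running the checks in this order means a failure in the new control is only ever reported for a device that is already known to be reachable and responsive, which makes the cause of a failed canary easier to pin down.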

Together, our teams use canary testing on a regular basis, typically every few weeks, to safely launch new features and make system changes. This lets us build, test and deploy new changes in months instead of years.

Bringing the canary to your organization

For utilities that see changes coming for their teams and systems, we’d recommend creating your own canary testing protocols and processes. Need help? Check out these great resources or reach out to me at steven@camus.energy, and I'll connect you with our team's experts.

Resources:

If you’d like to learn more about how we’re leveraging technology best practices to support community utilities, subscribe to our blog.