Release the Chaos Monkey – CRAHs down!

In the final stages of commissioning LC2, I decided to test the resilience of our systems with various networking tricks, the most destructive being the notorious network broadcast storm. If you are unfamiliar with network storms, you can learn more here: https://www.techopedia.com/definition/6270/broadcast-storm

Following this experiment, we now mandate that all vendors perform an actual broadcast storm test and document the results. Should a unit fail to remain operational during the storm (i.e., it shuts down or becomes unresponsive from the front panel), we will not proceed with installation. Thankfully, there are remedies for such issues. In the case of the unit tested, we integrated a small firewall and separated the user interface network from the control network, ensuring that the storm could run without causing any actual failures.
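For anyone who wants to document such a test in a consistent way, here is a minimal sketch of the pass/fail rule described above. It is only an illustration, not our actual test tooling: the names `StormObservation`, `StormTestResult`, and `evaluate_storm_test` are made up for this example, and it assumes the observations are recorded by a person (or an out-of-band sensor) standing at the unit, since the network itself is flooded during the storm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StormObservation:
    """One check made while the broadcast storm is running (hypothetical record)."""
    elapsed_s: float        # seconds since the storm started
    unit_running: bool      # the unit has not shut down
    panel_responsive: bool  # the front panel still reacts to input


@dataclass(frozen=True)
class StormTestResult:
    passed: bool
    first_failure_s: Optional[float]  # time of the first failed check, if any
    reason: str


def evaluate_storm_test(observations: List[StormObservation]) -> StormTestResult:
    """Pass only if every check found the unit running and the panel responsive."""
    if not observations:
        # An undocumented test cannot count as a pass.
        return StormTestResult(False, None, "no observations recorded")
    for obs in sorted(observations, key=lambda o: o.elapsed_s):
        if not obs.unit_running:
            return StormTestResult(False, obs.elapsed_s, "unit shut down")
        if not obs.panel_responsive:
            return StormTestResult(False, obs.elapsed_s, "front panel unresponsive")
    return StormTestResult(True, None, "unit stayed operational")


if __name__ == "__main__":
    # Example log: the panel stops responding 30 seconds into the storm.
    log = [
        StormObservation(0, True, True),
        StormObservation(30, True, False),
        StormObservation(60, False, False),
    ]
    print(evaluate_storm_test(log))
```

The rule is deliberately strict: a single failed check fails the whole test, which matches the requirement that the unit must stay operational for the entire storm.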

We also evaluated our security system and discovered that all the cameras went offline during the network storm. Network storms can be extremely disruptive! In the past, we relied on ring topologies, but due to the risk of ring collapse, we now always employ a double star topology for added stability and redundancy. Rigorously test your equipment and design your networks with these considerations in mind. Have you ever encountered a network storm? Share your experiences in the comments!

Video shows the CRAH shutting off!
Video shows the touch display on the CRAH unit becoming unresponsive!
