It was the Blue Screen of Death felt around the world.
As if you could forget: the CrowdStrike software update back in July caused major disruptions, including the cancellation of more than 1,000 flights, interruptions in healthcare services, and failures in the 911 emergency system.
Fortunately, this was not a cyberattack. The outage occurred due to a content update that caused a repetitive reboot cycle on affected machines. “Windows endpoints running CrowdStrike software anywhere in the world were at risk of being impacted,” explains Ranjan Singh, chief product officer for Kaseya.
Now that the dust has settled, Singh says, MSPs can reflect on the experience and use it to focus on three security best practices for self-improvement.
1. Seek Vendor Partners That Prioritize Quality
“As a provider of IT and security software, it behooves us that we take every step and measure to ensure that defects and bugs do not materialize in the field,” Singh says.
This includes thorough oversight, from architecture to product development to release processes. MSPs should seek to partner with vendors that prioritize quality assurance. Software companies such as Kaseya utilize comprehensive testing methodologies to identify potential issues before they impact your clients. Although perfect software is an ideal rather than a reality, minimizing defects through thorough testing can significantly reduce the risk of major disruptions.
“Disruptions are a matter of when, not if,” Singh adds.
2. Make Sure Your Tools And Support Systems Are Ready For Crisis
Stay ready by anticipating disruptions and locking in strong tools, support systems, and procedures.
The scale of the CrowdStrike outage revealed the challenges MSPs face when managing many endpoints during a crisis. Kaseya’s response—which included issuing a bulletin, exploring automation options for affected partners, and providing disaster recovery solutions—underscores the importance of having effective automation and support strategies to manage widespread disruptions and maintain operational continuity.
As MSPs, you need to build scalable processes to manage your operations and your clients’ needs, ensuring quick recovery from disruptions. Regularly test your systems to ensure they can handle major incidents and provide the support your clients need when it matters most.
3. Practice Your Disaster Recovery Plans
BCDR (business continuity and disaster recovery) testing proved to be a vital component in managing the CrowdStrike outage.
MSPs should conduct backup and recovery tabletop exercises once or twice a year at a minimum, and more often if there are significant changes in systems, infrastructure, or personnel. Regular exercises help you identify gaps, improve response times, and ensure that you can handle large-scale incidents and recover swiftly for your clients.
The significance of BCDR cannot be overstated. “It is the first dollar spent and it is the last line of defense,” Singh adds.
Embrace The Moment
Following these security best practices for MSPs will help you be prepared when any kind of outage occurs, whether it’s caused by a cyberattack, a natural disaster, or a software update. “The outage showed to the world just how important technology is to small businesses and highlighted the importance of the MSP,” Singh says. “You are the knight in shining armor to your clients … never forget it.”