CrowdStrike outage was caused by a bug in quality control process

The cybersecurity firm CrowdStrike says a bug in its quality control process let through a botched update that caused the 19th July outage of Microsoft Windows systems, crippling operations at companies, airports and banks worldwide.

Services from banking to aviation were disrupted as significant losses mounted. Initial airport delays escalated into widespread flight cancellations, translating into broader economic repercussions.

Beyond disrupting travel plans, the airline system breakdown impacted global supply chains heavily reliant on air cargo, exposing the intricate interdependence of modern IT systems.

Numerous TV and radio stations also experienced broadcast interruptions, while essential services, like banking and supermarket operations, were knocked offline.

Sectors That Were Hit Hard by the Outage

The travel sector was particularly hard hit, with airports in major cities like Tokyo, Amsterdam, Berlin, and across Spain experiencing system failures and subsequent delays. Major airlines, including Ryanair, faced booking system disruptions and operational challenges.

The impact extended beyond travel, with medical systems, broadcasting, and financial services experiencing outages. Banks in Australia, India, and South Africa, as well as the London Stock Exchange Group (LSEG), reported service disruptions.

The outage also affected Ugandan banks. Two leading financial institutions reported system failures at their branches and on their online banking platforms. The resulting delays hurt productivity and revenue, as many leading banks in Uganda use CrowdStrike software in their daily operations.

No sector was immune to the global tech outage: several online casinos and gambling operators also reported technical issues on 19th July as the flawed update caused widespread disruption.

Microsoft’s Swift Response

Although the error was not primarily Microsoft’s fault, the problem became theirs to solve, and the company quickly classified the incident as a "sev0", the most critical emergency level for incidents that directly affect its products and services.

This rare designation triggered an immediate and all-hands-on-deck response, with engineers mobilized to diagnose and fix the issue.

Microsoft reported that about 8.5 million Windows devices were affected by the CrowdStrike update. The tech giant also said it was not responsible for the glitch and that it had since worked around the clock with CrowdStrike to restore normal service.

Meanwhile, the U.S. House of Representatives Homeland Security Committee sent a letter to CrowdStrike CEO George Kurtz asking him to testify about the cause of the outage. Kurtz has said that the issue was not a security incident or cyberattack.

CrowdStrike Reveals the Cause of the Global Outage

Computers running Microsoft’s Windows operating system crashed and displayed the infamous "Blue Screen of Death" due to a flaw in CrowdStrike's Falcon, an advanced virus scanner that secures systems against hackers and malicious software.

In its statement, CrowdStrike said, "Due to a bug in the Content Validator, one of the two Template Instances passed validation despite containing problematic content data."

In non-technical terms, the faulty internal quality control mechanism allowed problematic data to bypass the company’s security checks.

The faulty update that the firm pushed to its Falcon sensor software meant that affected devices could not start up correctly.

When deployed automatically to millions of PCs around the world, the update inadvertently disrupted the Windows boot sequence, leaving the machines stuck in a perpetual recovery boot loop. CrowdStrike did not specify the exact nature of the problem with the content data involved.

To prevent such a disaster from occurring again, the company said it had implemented an additional quality control measure, which it termed a “new check”.

Experts are concerned, however, that many organizations lack robust contingency plans for critical system failures. The incident highlighted this vulnerability and the potential consequences of relying on single points of failure within IT infrastructure.

The Outage Has Been Classified as the Biggest in History

Judged by the number of businesses, service centers, and airports brought to a standstill, this event has surpassed all previous hacks and outages.

The May 2017 WannaCry cyberattack, which is said to have affected about 300,000 computers across 150 countries, is the closest comparison to the CrowdStrike outage. A month after WannaCry came another costly and disruptive attack, known as NotPetya.

In 2021, Meta, the company that owns Facebook, Instagram, and WhatsApp, also experienced a significant six-hour outage; however, it was mainly limited to the company's social media platforms and a few affiliated partners.

About CrowdStrike

Established in 2011, CrowdStrike is a publicly traded cybersecurity firm renowned for its role in high-profile investigations, including those involving Sony Pictures and the Democratic National Committee.

The company’s main objective is to stop breaches and drive business. It provides threat intelligence and cyberattack defense to many major corporations, including Microsoft and numerous major airlines.

The company, which is headquartered in Austin, Texas, also specializes in endpoint security solutions, with its Falcon endpoint and identity protection platform designed to protect Windows servers and machines from cyberattacks.

The platform contains various modules, the Falcon Sensor being one of them, and an update to this module triggered the widespread IT outage. It remains to be seen whether Microsoft will limit CrowdStrike's access to the Windows operating system in the wake of the outage.
