The Root Cause Behind Global System Failures Unveiled by CrowdStrike

Aug 07, 2024Ravie LakshmananSecurity / Response to Incidents

Cybersecurity firm CrowdStrike has published its root cause analysis of the faulty Falcon Sensor software update that affected millions of Windows devices worldwide.

The “Channel File 291” incident, as previously outlined in the company's Preliminary Post Incident Review (PIR), has now been traced to a content validation issue that arose after it introduced a new Template Type. The new type was meant to improve visibility into, and detection of, novel attack techniques that abuse named pipes and other Windows interprocess communication (IPC) mechanisms.

More specifically, the problem stemmed from a flawed content update deployed via the cloud. The company described the crash as the result of several deficiencies acting together, the most significant being a mismatch between the 21 inputs passed to the Content Validator via the IPC Template Type and the 20 inputs supplied to the Content Interpreter.

CrowdStrike said the parameter mismatch went undetected during “various stages” of the testing process, in part because wildcard matching criteria were used for the 21st input both during testing and in the initial IPC Template Instances delivered between March and April 2024.

In other words, the new version of Channel File 291 pushed on July 19, 2024, was the first IPC Template Instance to make use of the 21st input parameter field. Because there was no specific test case for non-wildcard matching criteria in the 21st field, the issue was not flagged until after the Rapid Response Content had been shipped to the sensors.

“Sensors that received the new version of Channel File 291 containing the flawed content faced an underlying out-of-bounds read problem in the Content Interpreter,” the company explained.

“Upon receiving the next IPC notification from the OS, the new IPC Template Instances were evaluated, performing a comparison against the 21st input value. The Content Interpreter anticipated only 20 values. As a result, the attempt to access the 21st value triggered an out-of-bounds memory read beyond the input data array’s end, causing a system crash.”
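The mismatch described above, and the way wildcard criteria kept it hidden, can be illustrated with a small Python sketch. This is not CrowdStrike's code: the `matches` function, the `WILDCARD` marker, and the sample values are all hypothetical, and Python raises an `IndexError` where the kernel-mode sensor performed an actual out-of-bounds memory read.

```python
WILDCARD = "*"  # hypothetical marker meaning "match anything"


def matches(criteria, inputs):
    """Naively compare each non-wildcard criterion to the input at the same index."""
    for index, criterion in enumerate(criteria):
        if criterion == WILDCARD:
            continue  # wildcard: the input at this index is never read
        if inputs[index] != criterion:  # unchecked access, no length check
            return False
    return True


# The interpreter is given 20 input values, but the template defines 21 fields.
inputs = [f"value{i}" for i in range(20)]

wildcard_instance = [WILDCARD] * 21                 # 21st field is a wildcard
specific_instance = [WILDCARD] * 20 + ["example"]   # 21st field is a real value

# The wildcard instance evaluates without ever touching the missing 21st input.
print(matches(wildcard_instance, inputs))

# The non-wildcard instance reads index 20 of a 20-element list.
try:
    matches(specific_instance, inputs)
except IndexError as exc:
    print("out-of-range access:", exc)
```

In this sketch, every instance that uses a wildcard in the 21st field passes, which is consistent with the article's account of why earlier tests and deployments did not surface the problem.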

In addition to fixing the issue by validating the number of input fields in the Template Type at sensor compile time, CrowdStrike said it added runtime input array bounds checks to the Content Interpreter to prevent out-of-bounds memory reads, and corrected the number of inputs provided by the IPC Template Type.

“The additional bounds check prevents the Content Interpreter from executing an out-of-bounds access of the input array and crashing the system,” it mentioned. “The extra check introduces an additional layer of runtime validation to ensure the input array’s size aligns with the inputs expected by the Rapid Response Content.”
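A runtime bounds check of the kind described in the quote might look like the following minimal sketch. The function name `matches_checked` and the choice to treat a missing input as "no match" are assumptions made for illustration; the source does not say how the sensor responds when the check fails.

```python
WILDCARD = "*"  # hypothetical marker meaning "match anything"


def matches_checked(criteria, inputs):
    """Evaluate criteria against inputs, never reading past the end of inputs."""
    for index, criterion in enumerate(criteria):
        if criterion == WILDCARD:
            continue
        if index >= len(inputs):
            # The criterion refers to a field that was not supplied.
            # Assumption: report "no match" instead of reading out of bounds.
            return False
        if inputs[index] != criterion:
            return False
    return True


inputs = ["value"] * 20
print(matches_checked([WILDCARD] * 20 + ["example"], inputs))  # False
print(matches_checked([WILDCARD] * 21, inputs))                # True
```

The check costs one comparison per evaluated field and turns a malformed instance into a failed match rather than a crash.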

CrowdStrike also said it plans to increase test coverage during Template Type development to include test cases for non-wildcard matching criteria on every field in all future Template Types.

Some of the sensor updates are also expected to address the following gaps:

  • The Content Validator is being modified to add new checks ensuring that the content in Template Instances does not include matching criteria spanning more fields than are provided as input to the Content Interpreter.
  • The Content Validator is being updated to allow only wildcard matching criteria in the 21st field, which prevents the out-of-bounds access in sensors that provide only 20 inputs.
  • The Content Configuration System has been updated with new test procedures to ensure that every new Template Instance is tested, regardless of whether the initial Template Instance was tested when the Template Type was created.
  • The Content Configuration System now has additional deployment layers and acceptance checks.
  • The Falcon platform has been updated to give customers more control over the delivery of Rapid Response Content.
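The two Content Validator changes in the list above amount to a pre-deployment check that could be sketched as follows. The names `ValidationResult` and `validate_instance`, and the list-of-strings representation of a Template Instance, are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ValidationResult:
    ok: bool
    reason: str = ""


def validate_instance(criteria, interpreter_input_count, wildcard="*"):
    """Reject instances whose non-wildcard criteria fall outside the supplied inputs."""
    for index, criterion in enumerate(criteria):
        if index >= interpreter_input_count and criterion != wildcard:
            return ValidationResult(
                False,
                f"field {index + 1} uses a non-wildcard criterion, but the "
                f"interpreter receives only {interpreter_input_count} inputs",
            )
    return ValidationResult(True)


# A sensor supplying 20 inputs accepts a wildcard in the 21st field only.
print(validate_instance(["*"] * 21, 20).ok)              # True
print(validate_instance(["*"] * 20 + ["example"], 20))   # rejected with a reason
```

Under these assumptions, the instance that triggered the crash would have been rejected before it was deployed to any sensor.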

CrowdStrike also said it has engaged two independent third-party software security vendors to carry out an extensive review of the Falcon sensor code for both security and quality assurance. In addition, it is conducting an independent review of its end-to-end quality process, from development through deployment.

CrowdStrike further pledged to work with Microsoft as the latter introduces new ways of performing security functions in user space rather than relying on a kernel driver in Windows.

“CrowdStrike’s kernel driver loads during an early phase of system boot to enable the sensor to monitor and defend against malware that launches before user mode processes start,” it emphasized.

“Providing up-to-date security content, such as CrowdStrike’s Rapid Response Content, to these kernel functionalities empowers the sensor to protect systems against a dynamic threat landscape without altering kernel code. Rapid Response Content constitutes configuration data; it is neither code nor a kernel driver.”

The release of the root cause analysis comes as Delta Air Lines said it has no choice but to seek damages from CrowdStrike and Microsoft over the widespread disruption, which it estimates cost $500 million in lost revenue and additional expenses tied to thousands of canceled flights.

Both CrowdStrike and Microsoft have since responded to the criticism, stating that they are not to blame for the extended outage and that Delta declined their offers of on-site assistance, which they say suggests the airline's problems may run deeper than its Windows machines going down because of the faulty security update.
