Upgrading Spatial Safety for hundreds of millions of lines of C++ code

Authored by Alex Rebert and Max Shavrick, Security Foundations, and Kinuko Yasada, Core Developer

Attackers regularly exploit spatial memory safety vulnerabilities, which occur when code accesses memory outside its intended bounds, to compromise systems and sensitive data. These vulnerabilities pose a major security risk to users.

Based on an analysis of in-the-wild exploits tracked by Google’s Project Zero, spatial safety vulnerabilities account for 40% of the memory safety exploits observed over the past decade:

Overview of memory safety CVEs exploited in live attacks categorized by vulnerability type

Google is taking a comprehensive approach to memory safety. A key element of our strategy is Secure Programming and the use of memory-safe languages in new code. This significantly reduces memory safety vulnerabilities and quickly improves the overall security posture of a codebase, as shown in our blog post on Android’s progress towards memory safety.

However, this transition will take multiple years as we adapt our development practices and infrastructure. Protecting our very large user base requires that we go further: we are also applying secure-by-design principles to our existing C++ codebase wherever possible.

To that end, we are working to bring spatial memory safety to as many of our C++ codebases as possible, including Chrome and the large codebase that powers our services.

We have begun by enabling hardened libc++, which adds bounds checking to standard C++ data structures and eliminates a significant class of spatial safety bugs. While C++ will not become fully memory-safe, these improvements reduce risk, as discussed in more detail in our perspective on memory safety, and lead to more reliable and secure software.

This post explains how we are deploying hardened libc++ across our codebases and shows the positive impact it is already having, including preventing exploits, reducing crashes, and improving code correctness.

Bounds-checked data structures: The foundation for spatial safety

One of our primary strategies for improving spatial safety in C++ is to implement bounds checking for common data structures, starting with hardening the C++ standard library (in our case, LLVM’s libc++). Hardened libc++, recently added by open source contributors, introduces a set of security checks designed to catch vulnerabilities such as out-of-bounds accesses in production.

For example, hardened libc++ ensures that every access to an element of a std::vector stays within its allocated bounds, preventing attempts to read or write beyond the valid memory region. Similarly, hardened libc++ checks that a std::optional is not empty before allowing access, preventing access to uninitialized memory.
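To make these two checks concrete, here is a minimal sketch of the preconditions involved. The helper names `checked_index` and `checked_value` are hypothetical and exist only for this illustration; they are not libc++ APIs. Hardened libc++ performs equivalent checks inside the standard accessors themselves and terminates the program on failure, whereas this sketch throws so that the failure can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not a libc++ API): the bounds precondition that a
// hardened std::vector::operator[] enforces, written as a free function.
template <typename T>
const T& checked_index(const std::vector<T>& v, std::size_t i) {
  if (i >= v.size()) {
    throw std::out_of_range("vector index out of bounds");
  }
  return v[i];
}

// Hypothetical helper: the "holds a value" precondition that is checked
// before a std::optional is dereferenced.
template <typename T>
const T& checked_value(const std::optional<T>& o) {
  if (!o.has_value()) {
    throw std::logic_error("dereferencing an empty optional");
  }
  return *o;
}

// Usage example: an in-bounds access succeeds, one-past-the-end is rejected.
bool rejects_out_of_bounds() {
  const std::vector<int> v{10, 20, 30};
  assert(checked_index(v, 2) == 30);
  try {
    (void)checked_index(v, 3);  // index == size(): out of bounds
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}
```

The important design point is that the check sits inside the accessor, so existing call sites gain it without being rewritten.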

This approach mirrors what is already standard practice in many modern programming languages such as Java, Python, Go, and Rust. They all perform bounds checking by default, recognizing its crucial role in preventing memory errors. C++ has been a notable exception, but efforts like hardened libc++ aim to close this gap in our infrastructure. It’s also worth noting that similar hardening is available in other C++ standard libraries, such as libstdc++.

Raising the security baseline across the board

Building on the successful deployment of hardened libc++ in Chrome in 2022, we have now made it the default across our server-side production systems. This improves spatial memory safety across our services, including performance-critical components of products such as Search, Gmail, Drive, YouTube, and Maps. While a small number of components remain opted out, we are actively working to reduce this number and raise the bar for security across the board, even in applications with lower exploitation risk.

The performance impact of these changes was surprisingly low, even though Google’s modern C++ codebase makes heavy use of libc++. Hardening libc++ resulted in a typical performance impact of 0.30% across our services (yes, only a third of a percent).

This is due both to the compiler’s ability to eliminate redundant checks during optimization and to the efficient design of hardened libc++. While a handful of performance-critical code paths still require targeted use of explicitly unsafe accesses, these instances are carefully reviewed for safety. Techniques like profile-guided optimization improved performance significantly, but the cost of bounds checks remains small even without those advanced techniques.
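As an illustration of why many checks end up costing nothing, consider a loop whose condition already establishes the bound. The sketch below writes the check out by hand in a hypothetical `checked_at` helper that stands in for a hardened `operator[]`. In `sum_indexed`, the loop condition `i < v.size()` implies the check always passes, so an optimizing compiler can typically prove it redundant and drop it. Whether a particular compiler does so depends on its version and optimization flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a hardened operator[]: abort on out-of-bounds.
inline int checked_at(const std::vector<int>& v, std::size_t i) {
  if (i >= v.size()) std::abort();
  return v[i];
}

// The loop condition already guarantees i < v.size(), so the check inside
// checked_at() can never fail here and is a candidate for elimination.
long sum_indexed(const std::vector<int>& v) {
  long total = 0;
  for (std::size_t i = 0; i < v.size(); ++i) total += checked_at(v, i);
  return total;
}

// A range-based for loop avoids indexing altogether.
long sum_ranged(const std::vector<int>& v) {
  long total = 0;
  for (int x : v) total += x;
  return total;
}

// Usage example: both loops compute the same sum.
void check_sums() {
  const std::vector<int> v{1, 2, 3, 4};
  assert(sum_indexed(v) == 10);
  assert(sum_ranged(v) == 10);
}
```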

We actively monitor the performance impact of these checks and work to minimize any unnecessary overhead. For example, we identified and fixed an unnecessary check, which led to a roughly 15% reduction in overhead (from 0.35% to 0.3%), and contributed the fix back to the LLVM project to share the benefits with the broader C++ community.

While the overhead of hardened libc++ is negligible for most individual applications, deploying it at Google’s scale required a substantial commitment of computing resources. This investment underscores our dedication to improving the safety and security of our products.

From tests to production

Enabling hardened libc++ wasn’t as simple as flipping a switch. Rather, it required a staged rollout to avoid accidentally disrupting users or causing an outage:

Testing: We first enabled hardened libc++ in our tests over a year ago. This allowed us to identify and fix numerous previously undetected bugs in our code and tests.
Baking: We let the hardened runtime bake in our testing and pre-production environments, giving developers time to adapt and fix any new issues that surfaced. We also conducted extensive performance evaluations to ensure minimal impact on user experience.
Gradual Production Rollout: We then rolled out hardened libc++ to production over several months, starting with a small set of services and gradually expanding to our entire infrastructure. We closely monitored the rollout and promptly addressed any crashes or performance regressions.

Measurable outcome

In just a few months since enabling hardened libc++ by default, we’ve already seen benefits.

Preventing exploits: Hardened libc++ has already disrupted an internal red team exercise and would have prevented another one that took place before we enabled hardening, demonstrating its effectiveness in thwarting exploits. The safety checks have uncovered over 1,000 bugs and would prevent 1,000 to 2,000 new bugs per year at our current rate of C++ development.

Improved reliability and correctness: The process of identifying and fixing bugs uncovered by hardened libc++ led to a 30% reduction in our baseline segmentation fault rate across production, indicating improved code reliability and quality. Beyond crashes, the checks also caught errors that would otherwise have manifested as unpredictable behavior or data corruption.

Moving average of segfaults across our fleet over time, before and after enablement.

Easier debugging: Hardened libc++ enabled us to identify and fix multiple bugs that had been lurking in our code for more than a decade. The checks turn many difficult-to-diagnose memory corruptions into immediate and easily debuggable errors, saving developers valuable time and effort.
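The difference in debuggability can be seen with a classic off-by-one bug. In this sketch, the hypothetical function `fill_squares` loops one element too far. With an unchecked `operator[]`, the stray write is undefined behavior that may silently corrupt adjacent memory and surface much later. With a checked access, the failure is reported at the faulting index. The sketch uses `std::vector::at()`, which throws `std::out_of_range`, as an observable stand-in for a hardened `operator[]`, which would terminate the process instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical buggy routine: `<=` should be `<`, so the final iteration
// touches index out.size(). at() stands in for a hardened operator[].
void fill_squares(std::vector<int>& out) {
  for (std::size_t i = 0; i <= out.size(); ++i) {  // BUG: off by one
    out.at(i) = static_cast<int>(i * i);
  }
}

// Returns true if the bug was caught at the point of the bad access.
bool bug_is_caught_immediately() {
  std::vector<int> squares(4);
  try {
    fill_squares(squares);
  } catch (const std::out_of_range&) {
    // Every in-bounds write completed; the out-of-bounds one never happened.
    assert(squares[3] == 9);
    return true;
  }
  return false;
}
```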

Bridging the gap with memory-safe languages

While libc++ hardening provides immediate benefits by adding bounds checking to standard data structures, it’s only one piece of the puzzle when it comes to spatial safety.

We are expanding bounds checking to other libraries and working to migrate our codebase to Safe Buffers, which requires all accesses to be bounds checked. Spatial safety needs both: hardened data structures, including their iterators, and Safe Buffers.
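As a rough illustration of what such a migration looks like, Clang’s Safe Buffers model (enabled with the `-Wunsafe-buffer-usage` warning) flags raw pointer indexing and arithmetic and steers code toward bounds-aware containers and views. The function names below are hypothetical. The sketch uses `std::vector` to stay within C++17; `std::span` plays the same role for non-owning views in C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: a raw pointer plus a separate length. Nothing ties `n` to the
// real size of the buffer, and p[i] is never bounds checked.
int sum_raw(const int* p, std::size_t n) {
  int total = 0;
  for (std::size_t i = 0; i < n; ++i) total += p[i];
  return total;
}

// After: the container carries its own size, so every v[i] is an access
// that a hardened standard library is able to check.
int sum_safe(const std::vector<int>& v) {
  int total = 0;
  for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
  return total;
}

// Usage example: same result, but only one version relies on the caller
// passing a correct length.
void compare_versions() {
  const std::vector<int> data{1, 2, 3};
  assert(sum_raw(data.data(), data.size()) == 6);
  assert(sum_safe(data) == 6);
}
```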

Beyond improving the safety of our C++ code, we are also focused on making it easier to interoperate with memory-safe languages. Migrating our C++ code to Safe Buffers narrows the gap between the languages, which simplifies interoperability and could eventually pave the way for automated translation.

Building a safer C++ ecosystem

Hardened libc++ is a practical and effective way to improve the safety, reliability, and debuggability of C++ code with minimal overhead. For this reason, we strongly encourage organizations that use C++ to enable their standard library’s hardened mode universally by default.
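As a starting point, the sketch below reports whether a translation unit was built with standard library hardening. It assumes a recent libc++, where hardening is selected with the `_LIBCPP_HARDENING_MODE` macro (for example `-D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_FAST`), or libstdc++ built with `-D_GLIBCXX_ASSERTIONS`. Older releases used different macros, so check your library’s documentation. The function name `stdlib_hardening_enabled` is hypothetical.

```cpp
#include <cstddef>  // any standard header pulls in the library's config macros

// Hypothetical helper: true if a known hardening macro is active.
constexpr bool stdlib_hardening_enabled() {
#if defined(_LIBCPP_HARDENING_MODE) && defined(_LIBCPP_HARDENING_MODE_NONE)
  // libc++: any mode other than "none" enables at least the fast checks.
  return _LIBCPP_HARDENING_MODE != _LIBCPP_HARDENING_MODE_NONE;
#elif defined(_GLIBCXX_ASSERTIONS)
  // libstdc++: precondition assertions, including bounds checks, are on.
  return true;
#else
  return false;  // unknown library, or hardening not enabled
#endif
}
```

A build that wants to enforce the policy could add `static_assert(stdlib_hardening_enabled(), "enable standard library hardening");` to a commonly included header.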

At Google, enabling hardened libc++ is only the first step in our journey towards a spatially safe C++ codebase. By expanding bounds checking, migrating to Safe Buffers, and actively collaborating with the broader C++ community, we aim to create a future where spatial safety is the norm.

Acknowledgements

We’d like to thank Emilia Kasper, Chandler Carruth, Duygu Isler, Matthew Riley, and Jeff Vander Stoep for their valuable feedback.