This past week, Sophos participated in the Windows Endpoint Security Ecosystem Summit hosted by Microsoft. Following the recent CrowdStrike incident, in which a kernel-driver update crashed millions of devices worldwide, representatives from industry and government met to discuss kernel architecture, update deployment processes, and, most importantly, how this previously opaque security ecosystem can move forward transparently and with full community involvement to protect the world. It was a preliminary conversation rather than a formal session, yet several significant themes emerged.
One topic was how the Windows platform could evolve to reduce the need for security vendors to use kernel drivers, user-space hooking, or other techniques to interact with the system seamlessly and actively, while still preventing adversaries from gaining a foothold in the system’s core. Input from multiple industries, along with practices that have worked well in the past, is vital to making this vision a reality. Another key focus was deployment: the process of safely shipping software and updates to very large numbers of users with minimal disruption.
During the discussion, Microsoft highlighted us as an exemplar of best practices and outcomes. In this article, we will elaborate on the methods and reasons behind Sophos’ current integration with the Windows platform, and touch on potential ways in which the Windows system could adapt to rebalance the techniques and access required for third-party security providers to collaborate effectively. We will also delve into the realm of Safe Deployment Practices (SDP), an area of interest that both Microsoft and Sophos explored during the summit. To conclude, we will share three instances of managing foundational changes for Mac and Linux products, serving as potential insights for broader industry discussions.
This piece does not serve as a roadmap but rather as a guide, offering context and general insights about the terrain. The delineation of precise needs for achieving such extensive resilience and security objectives falls outside the purview of this write-up, but an overview of the landscape itself proves valuable in this era of reflective conversations. Stay tuned.
What prompts Sophos to utilize kernel drivers?
Like other cybersecurity vendors, Sophos integrates with the underlying Windows platform through a mix of techniques, some of which reach deep into system internals: kernel drivers, user-space hooking, and other approaches. Each security vendor does this in its own way. We have previously published details about our own approach, but broadly speaking, the system access provided by kernel drivers is crucial to delivering the security functionality that users expect from a modern cybersecurity product. This functionality includes:
Visibility
- Offering precise and nearly real-time insights into system behavior
Defensive Measures
- Equipping the ability to prevent hazardous or non-compliant actions before they occur, rather than just observing them
- Facilitating swift responses to detected malicious or non-compliant behavior and rectifying or reversing it
Anti-tampering Measures
- Instilling confidence that the security product operates as configured, even when segments of the operating system itself are compromised
Stability / Compatibility
- Instilling assurance that installing the security product does not compromise the stability of the Windows system or third-party software and hardware
Efficiency
- Providing the aforementioned functionalities with a predictable and tolerable impact on overall system performance
Energy Efficiency and Modern Standby
- Extending the aforementioned capabilities into low-power modes; in essence, ensuring that the security product continues to offer insights and protection whenever any other activity is running
* Other Windows platform capabilities should operate effectively and resolve dependencies dynamically to prevent lockups during low-power modes
Current Sophos Windows drivers
Sophos currently has five Windows kernel drivers: an ELAM (Early Launch Anti-Malware) driver, two drivers that intercept file and process operations, and two drivers that intercept network operations. These kernel drivers have previously been described in detail here, so we provide only a brief summary in this post. To recap:
- The ELAM driver is obligatory for Windows; security vendors must supply an ELAM driver to register as an endpoint-security product (referred to as an AV, following the historical “antivirus” terminology) and deactivate Windows Defender on end-user devices
- The two file drivers offer extensive process monitoring and event recording capabilities not presently available in a Windows API, along with anti-tampering features, process interception, and ransomware deterrent mechanisms
- The two network drivers facilitate web security, packet analysis for intrusion prevention, DNS security, and rerouting of network streams for zero-trust network access
In the subsequent section, we shall briefly touch on how Sophos manages the injection of DLLs into processes in the kernel and user space. For now, we will outline the functions of each of the five drivers, while suggesting that interested readers refer to the aforementioned post for more insights.
SophosEL.sys
SophosEL.sys is the ELAM driver. Like every cybersecurity vendor working with Microsoft Windows, Sophos must supply an ELAM driver in order to launch AM-PPL (Anti-Malware Protected Process Light) services and processes. Only AM-PPL processes are eligible to register as an AV, which in turn deactivates Windows Defender on user devices. AM-PPL processes also benefit from built-in protections, such as being immune to termination via the user interface. SophosEL.sys prevents blocked drivers from being loaded by the Windows kernel in the early stages of the boot process. It also contains “fingerprints” of Sophos-specific code-signing certificates, which allow Sophos to run AM-PPL processes and services.
SophosED.sys
This is the first of the two file-system drivers and the primary Sophos anti-malware driver; “ED” stands for Endpoint Defense. SophosED.sys supplies events to the Sophos System Protection service (SSPService.exe) through a combination of synchronous callbacks (SophosED.sys halts the activity until SSPService.exe returns a decision) and asynchronous events (SophosED.sys adds a serialized version of the event and its relevant parameters to a queue for asynchronous notification). Additional functionality handled by this driver includes:
- Managing a comprehensive process/thread/module tracking system along with pertinent context
- Logging low-level system activity events to the Sophos event logs for forensics and analysis
- Safeguarding the integrity of Sophos installation and configuration processes through an autonomous authentication mechanism
- Deploying an autonomous validation mechanism for binaries shipped by Sophos
- Injecting SophosED.dll into freshly initiated processes
- Ensuring the execution of our native Sophos application when needed during boot
- Facilitating secure communications between Sophos processes, services, and drivers; ensuring file hashing consistency; and providing support for memory scanning
- For instance, Sophos monitors suspicious alterations to documents that could indicate ransomware activity. Ransomware may attempt to avoid detection by encrypting files in place or creating encrypted copies alongside the originals, then either replacing the original with the copy or rewriting the original contents. These alterations can be carried out through regular file writing or memory-mapping for writing. The proposed mechanism would need to provide adequate callbacks for thorough analysis.
- Similarly, establishing a mechanism to monitor events like Registry key creation, deletion, renaming, linking, key/value access, modification, and authorize or block these operations could be beneficial.
- Introducing a mechanism to monitor events such as new driver, hardware, or software installation and validate them during installation (also see unauthorized drivers below) may be suitable. Additionally, providing a way to observe processes connecting to driver devices and control access could be important; this would likely be intricate and could encompass visibility into the construction of device stacks, filter devices, and processes sending IOCTLs to devices.
- Upon debut, Apple’s endpoint security APIs couldn’t supplant kexts in a live context, hindering the utilization of APIs in production settings and acquiring practical experience
- In contrast to Microsoft’s Canary and Dev channels, new releases reached all Apple Insiders simultaneously
- Apple refrained from sharing detailed roadmaps, suggestions, or developer guidelines for their APIs
- Several crucial endpoint security APIs were introduced late in the beta phase, with reported flaws necessitating retests with every release to ascertain their status
- Apple refrained from providing security vendors with guidance or prior notices on the general OS release schedules for customers
- While Apple offers the option to still leverage kernel APIs, it mandates customers to disable multiple key OS security features concurrently. This motivation led customers and vendors to transition to endpoint security APIs instead of persisting with legacy kernel APIs. An alternate approach of offering a singular “switch” to enable access to those kernel APIs might not have yielded the same impact
hmpalert.sys
The HitmanPro Alert driver is another file-system driver within the collection of our five kernel drivers. It is responsible for enforcing CryptoGuard and has the capability to identify and block large-scale encryption of files by ransomware. Additionally, it inserts hmpalert.dll into newly initiated processes.
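To make the idea of spotting large-scale encryption more concrete, here is a minimal sketch in Python. It is not how CryptoGuard works; it assumes one common heuristic (a sharp rise in byte entropy when a file is rewritten), and every name in it (`shannon_entropy`, `BulkEncryptionMonitor`, the thresholds) is hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


class BulkEncryptionMonitor:
    """Hypothetical monitor: flags a process that rewrites several files
    from low-entropy content to high-entropy content."""

    def __init__(self, entropy_jump=3.0, max_suspicious_files=3):
        # Both thresholds are illustrative assumptions, not tuned values.
        self.entropy_jump = entropy_jump
        self.max_suspicious_files = max_suspicious_files
        self._suspicious = defaultdict(set)  # pid -> set of rewritten paths

    def on_write(self, pid, path, before: bytes, after: bytes) -> bool:
        """Record one file rewrite; return True if the process looks like
        it is encrypting files in bulk and should be stopped."""
        jump = shannon_entropy(after) - shannon_entropy(before)
        if jump >= self.entropy_jump:
            self._suspicious[pid].add(path)
        return len(self._suspicious[pid]) >= self.max_suspicious_files
```

A real driver would also have to cope with legitimately high-entropy formats (compressed archives, media files) and with memory-mapped writes, which this sketch ignores.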
sntp.sys
The sntp.sys network-filter driver implements the core network interception features that Sophos requires for network filtering; “sntp” stands for Sophos Network Threat Protection. This driver can:
- Filter HTTP and HTTPS web traffic for web security, Data Leakage Prevention (DLP), and enforcement of acceptable use policies through Sophos web protection
- Extract and record HTTP or HTTPS web traffic, DNS requests and responses, and general TLS stream activity in Sophos event journals and the Sophos Central data lake
- Intercept and insert L2 packets to implement Sophos’ IPS (Intrusion Prevention System)
- Delay outgoing flows for further inspection or cross-system coordination
SophosZtnaTap.sys
SophosZtnaTap.sys is the second network-filter driver, a Sophos-built OpenVPN TAP driver that Sophos uses for its ZTNA (Zero Trust Network Access) agent. This driver intercepts DNS requests for ZTNA-protected applications, responds with a tunnel IP address, and then directs IP traffic for those addresses to the applications.
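The DNS-interception flow described above can be illustrated with a small user-space model in Python. This is a sketch under stated assumptions, not the driver’s implementation: the class name `TunnelDnsResponder`, the address pool `100.64.0.0/24`, and the host names used are all hypothetical.

```python
import ipaddress


class TunnelDnsResponder:
    """Toy model: answer DNS queries for protected applications with
    tunnel IP addresses, then map tunnel traffic back to the application."""

    def __init__(self, protected_apps, tunnel_cidr="100.64.0.0/24"):
        self._protected = {name.lower().rstrip(".") for name in protected_apps}
        # Iterator over usable host addresses; exhaustion is not handled here.
        self._pool = iter(ipaddress.ip_network(tunnel_cidr).hosts())
        self._by_name = {}
        self._by_ip = {}

    def resolve(self, hostname):
        """Return a tunnel IP for a protected app, or None so that the
        query falls through to normal DNS resolution."""
        name = hostname.lower().rstrip(".")
        if name not in self._protected:
            return None
        if name not in self._by_name:
            ip = next(self._pool)
            self._by_name[name] = ip
            self._by_ip[ip] = name
        return self._by_name[name]

    def route(self, dest_ip):
        """Return the protected app behind a tunnel IP, or None if the
        destination is ordinary (non-tunnel) traffic."""
        return self._by_ip.get(ipaddress.ip_address(dest_ip))
```

Allocating one stable tunnel address per application lets later IP traffic be mapped back to the right application without inspecting payloads.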
Concerning DLL injection
Sophos injects DLLs into processes through a proprietary mechanism implemented in both SophosED.sys and hmpalert.sys. There is currently no supported method in user space or the kernel for requesting DLL injection. The injected DLLs enhance visibility and protect API calls made by applications.
Follow this direction: Steps towards a more secure operation
In the following sections, we first briefly summarize the choices Sophos has made in its update and feature release procedures, and then discuss (at a broad level) potential ways in which the Windows platform could evolve to reduce its reliance on third-party kernel drivers, a goal that the Summit discussions suggest is of wide interest.
Secure implementation: Gradual deployment and feature controls
As mentioned earlier, a key area of focus at the Summit was Safe Deployment Practices (SDP). Like Microsoft, Sophos has dedicated significant resources to enhancing our software architecture to support gradual software deployments and feature controls. We aim to ensure the safety and reliability of our products, while also providing customers with visibility and control where feasible. We expect that sharing our procedures and insights with Microsoft and industry peers will lead to a comprehensive set of shared practices for the entire Windows community.
In a previous post from earlier this year, we detailed a robust process for introducing new software and features incrementally across our customer base. This process allows a feature to be disabled quickly for individual customers, for specific software versions, or for all users globally. Sophos Central offers customers a unified view of, and control over, software updates and configurations within their organization.
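A gradual rollout with kill switches of the kind described above might be sketched as follows in Python. This is an illustration only, not Sophos’ actual mechanism: the `FeatureFlag` class, its fields, and the hash-based bucketing scheme are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """Hypothetical feature flag with a staged rollout and kill switches."""
    name: str
    rollout_percent: int = 0                 # share of devices enabled, 0..100
    disabled_globally: bool = False          # kill switch for all users
    disabled_customers: set = field(default_factory=set)
    disabled_versions: set = field(default_factory=set)

    def _bucket(self, device_id: str) -> int:
        # Stable bucket in 0..99, so a device keeps its place as the
        # rollout percentage grows.
        digest = hashlib.sha256(f"{self.name}:{device_id}".encode()).digest()
        return int.from_bytes(digest[:4], "big") % 100

    def is_enabled(self, device_id: str, customer_id: str, version: str) -> bool:
        if self.disabled_globally:
            return False
        if customer_id in self.disabled_customers:
            return False
        if version in self.disabled_versions:
            return False
        return self._bucket(device_id) < self.rollout_percent
```

Because the bucket depends only on the flag name and device identifier, raising `rollout_percent` only ever adds devices, while each kill switch takes effect regardless of the rollout stage.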
Any security product, whether utilizing its own kernel drivers or features integrated into the Windows platform, necessitates periodic updates that alter system behavior. Such changes should be rolled out gradually to ensure stability and functionality. The discussions on sharing best practices for secure deployment were a highlight at the Summit and an area where collaborative development can greatly enhance customer trust in patches and updates, leading to enhanced internet security for all.
Minimizing dependence on third-party kernel drivers
Here we outline some of the functionalities that Sophos incorporates with kernel drivers at a high level. If the Windows Platform were to advance in ways that minimize the necessity for kernel drivers, the listed functionalities could be beneficial to include.
Evolution is a continual process that will likely require active communication and input from various stakeholders; major changes take time. Implementation of changes will also require careful consideration of potential malicious exploitation. The information provided serves as a starting point for discussions.
While not exhaustive, based on our experience, we highlight eight potential evolutions in this post, presenting an initial description of certain functionalities Sophos considers advantageous. These eight points are intended to stimulate further dialogue and more detailed definitions. We anticipate collaborating with Microsoft to further refine any requirements, preferably through frequent small iterations.
API for authorizing/blocking file and directory access
Providing a supported mechanism for security vendors to review files and directories accessed by processes and control such access could be beneficial for the Windows platform. This could involve receiving notifications on file openings and managing access decisions, along with handling updates to these decisions.
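As a rough illustration of what such an interface might look like from the vendor’s side, the Python sketch below models file-open notifications, access decisions, and later updates to those decisions. No such Windows API exists today; `FileAccessBroker`, `Verdict`, and the callback signature are hypothetical names invented for this example.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"


class FileAccessBroker:
    """Toy model of a platform component that asks a security vendor
    whether a process may open a file, and caches the answer."""

    def __init__(self, decide):
        self._decide = decide   # vendor callback: (process, path) -> Verdict
        self._cache = {}        # (process, path) -> Verdict

    def on_open(self, process, path):
        """Called for each file open; consults the vendor once per pair."""
        key = (process, path)
        if key not in self._cache:
            self._cache[key] = self._decide(process, path)
        return self._cache[key]

    def update_decision(self, path, verdict=None):
        """Override cached decisions for a path, or drop them
        (verdict=None) so the vendor is consulted again."""
        for key in [k for k in self._cache if k[1] == path]:
            if verdict is None:
                del self._cache[key]
            else:
                self._cache[key] = verdict
```

The update path matters because a file judged clean at first open may later be modified or reclassified, at which point earlier decisions must not linger in the cache.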
API for authorizing/blocking registry access
Introducing a supported mechanism for security vendors to monitor registry keys and values accessed by processes and regulate such access may be advantageous.
API for managing process behavior
Supporting a mechanism for security vendors to oversee process activities on the system and take appropriate actions could be beneficial for the Windows platform. These functionalities would mirror those provided by the Windows kernel to kernel-mode drivers, with potential additional features. It’s important to note that the information below serves as preliminary guidance and is not exhaustive.
Process Activity Callbacks: A capability to track events like child process initiation, process termination, thread initiation, thread termination, thread context alteration, APC scheduling, image loading, and more, allowing security vendors to permit or block operations.
File Activity Callbacks: A way to monitor events such as attempts to create, open, adjust, or rename files/directories.
API for regulating network access
A modern endpoint protection strategy includes network protection. It might therefore prove advantageous for the Windows platform to establish a supported method for security providers to comprehensively protect networked devices. This could involve the ability to receive and authorize network flows, to analyze and potentially alter the data within a flow, and to carry out these actions before any interaction with the destination takes place.
For contemporary zero-trust deployment methods, this might also encompass the ability to intercept and redirect traffic via vendor-specific gateways, to screen and react to DNS queries, to authenticate/approve access to registered applications, and to seize or insert authentication tokens in the redirected traffic. Discussions in this context would naturally also involve measures to prevent misuse of such capabilities.
API for permitting/preventing kernel drivers
It could be beneficial for the Windows platform to introduce an approved mechanism for security providers to hinder unauthorized drivers. Kernel drivers possess the capability to end any process, including AM-PPL security processes, and this is consequently a prevalent strategy employed by malicious campaigns.
It may also be advantageous for the Windows platform to introduce a sanctioned user-space method for security providers to prevent local and domain administrators from overriding or undermining the security product’s decisions, other than through the product’s own channels, for instance by allowing the behavior, driver, or application via the security product’s API or user interface.
It could also prove beneficial for the Windows platform to establish a sanctioned mechanism for security providers to access detailed data concerning potential kernel drivers (like filename, driver size, hashes, signatures) and to govern the blocking and loading of kernel drivers.
API for connecting context with kernel objects (processes, files, Registry keys, network connections etc.)
Introducing a sanctioned mechanism for security providers on the Windows platform to uphold an unalterable context regarding kernel objects, such as files and processes, might be advantageous. The context could embrace insights on whether an object is part of Windows, part of a specific security solution, or linked with another product; details on whether the object has been examined, timing of inspection, and the conclusion reached; along with file hashes or other details connected with an object, such as a unique identifier for the object. It could be useful for this context to persist across reboots when relevant.
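The kind of context described above could be modeled as an immutable record attached to an object identifier. The Python sketch below is only a thought experiment: the field names, the `ContextStore` class, and the reboot simulation are hypothetical and do not correspond to any existing Windows interface.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ObjectContext:
    """Immutable context a vendor might attach to a file or process."""
    object_id: str                        # unique identifier for the object
    origin: str                           # e.g. "windows", "security_product", "other"
    sha256: Optional[str] = None          # file hash, if applicable
    scanned_at: Optional[datetime] = None
    verdict: Optional[str] = None         # conclusion of the last inspection
    persistent: bool = False              # should survive a reboot


class ContextStore:
    """Toy store keyed by object identifier."""

    def __init__(self):
        self._contexts = {}

    def attach(self, ctx: ObjectContext) -> None:
        # Replacing a context means attaching a new record; existing
        # records cannot be modified in place.
        self._contexts[ctx.object_id] = ctx

    def lookup(self, object_id: str) -> Optional[ObjectContext]:
        return self._contexts.get(object_id)

    def simulate_reboot(self) -> None:
        """Keep only the contexts marked as persistent."""
        self._contexts = {k: c for k, c in self._contexts.items() if c.persistent}
```

In a real design, the platform rather than the vendor process would have to enforce immutability and decide who is allowed to attach or replace a context.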
Dynamic Link Library (DLL) injection or corresponding mechanisms
It could be valuable for the Windows platform to provide a validated method for security providers to infuse DLLs and/or offer functions currently supplied by injected DLLs. Presently, injected DLLs provide both hooking and low-level protection purposes, as outlined earlier.
Hooking: Injected DLLs hook various APIs to report data about API calls from process code, including identification of malicious processes and malware injected into an otherwise legitimate process. Certain API calls also fall under Event Tracing for Windows (ETW), but the data collected via ETW lacks certain parameters necessary for competent protection.
Moreover, ETW operates asynchronously, and it might be advantageous to have a synchronous mechanism. Ideally, a security provider should have jurisdiction over the API calls, level of detail, and whether a specific event is synchronous or asynchronous. For example, it could be beneficial for the Windows platform to introduce a validated method for intercepting syscalls.
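The difference between synchronous and asynchronous delivery, and the idea of letting the security provider choose per event type, can be sketched in Python as follows. `EventDispatcher` and the event names are hypothetical; this models the concept and is not an ETW or Windows API.

```python
import queue


class EventDispatcher:
    """Toy dispatcher in which a subscriber picks, per event type,
    synchronous (blocking, can veto) or asynchronous (queued) delivery."""

    def __init__(self):
        self._sync_handlers = {}          # event type -> handler returning bool
        self._async_types = set()
        self.async_queue = queue.Queue()  # drained later by the consumer

    def subscribe_sync(self, event_type, handler):
        self._sync_handlers[event_type] = handler

    def subscribe_async(self, event_type):
        self._async_types.add(event_type)

    def emit(self, event_type, payload) -> bool:
        """Deliver an event; return True if the operation may proceed."""
        if event_type in self._async_types:
            # Asynchronous: record a copy and let the operation continue.
            self.async_queue.put((event_type, dict(payload)))
        handler = self._sync_handlers.get(event_type)
        if handler is None:
            return True
        # Synchronous: the operation waits for, and obeys, the decision.
        return bool(handler(payload))
```

Synchronous delivery is what makes prevention possible, at the cost of adding the handler’s latency to every intercepted operation; asynchronous delivery keeps the operation fast but can only observe.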
Low-level protection: Injected DLLs also deliver detection/protection mechanisms. Various instances entail shielding the hooks from unhooking (by malware), hindering hooking by malware, memory page protection beyond the standard provided by the operating system, identifying efforts to bypass APIs (e.g., using syscalls directly, accessing PEB and linked information directly).
It might also be beneficial for the Windows platform to introduce fresh Windows protection mechanisms, like Windows-endorsed integrity of its own DLLs (e.g., “PatchGuard in user mode”). Another prospect could be Windows-supplied asynchronous (akin to Microsoft Threat Intelligence Secure ETW, which is already in place) and synchronous (new) callbacks about in-process events, including memory allocations, setting thread context, and kernel exception handling — e.g., notifications about exceptions before forwarding to user mode. Undoubtedly, these or comparable mechanisms should be devised with consideration for their impact on system performance.
Guarding against tampering and AM-PPL
Introducing a sanctioned mechanism on the Windows platform to protect security processes from being deactivated, terminated, or uninstalled could prove beneficial. Currently, this duty is carried out by AM-PPL (which in turn requires an ELAM driver) and by the Sophos driver. Without ELAM drivers, security providers would need another “root of trust” from which to start protected processes.
The safeguard currently provided by AM-PPL is partial, as malevolent actors can still uninstall or tamper with the security product, unless the security product actively safeguards itself (e.g., guarding its binaries and Registry keys). It might be useful for the Windows platform to present a validated mechanism to shield a security product and its diverse components and functionalities, such as files, processes, registry keys, and IPC.
In an ideal scenario, this extra layer of protection could solely be waived by the security product itself (for updating/uninstalling purposes), with some provision for eliminating the security product via alternate methods if required.
And beyond: Mac and Linux
In this concluding segment, we’ll delve into three instances in which the progression of the Windows platform might draw inspiration from how particular issues have been managed on, respectively, Linux and macOS.
Sophos on Linux 1: XDR Insight through eBPF
eBPF is a technology that provides in-kernel observability hooks in the Linux kernel; the core of the name originally stood for Berkeley Packet Filter, an early packet-filtering technology, but no longer does. Microsoft has an experimental implementation of eBPF for Windows.
On Linux, Sophos utilizes eBPF probes to monitor process, file, and network activity. These probes accumulate data and conduct basic stateless filtering; user space processes the stream of events and examines the activity.
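The split between stateless filtering in the probe and stateful analysis in user space can be illustrated with a pure-Python simulation. Real eBPF probes are written in restricted C and loaded into the kernel; here the event format, the watched syscall set, and the threshold are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical set of syscalls the "probe" forwards to user space.
WATCHED_SYSCALLS = {"execve", "openat", "connect"}


def probe_filter(event: dict) -> bool:
    """Stateless filter, standing in for the in-kernel probe: the
    decision depends only on the event itself, never on earlier events."""
    return event["syscall"] in WATCHED_SYSCALLS


def analyze(events, connect_threshold=3):
    """Stateful user-space analysis: report each pid once, at the moment
    it reaches connect_threshold outbound connections."""
    connects = Counter()
    flagged = []
    for event in events:
        if not probe_filter(event):
            continue  # dropped before it would ever reach user space
        if event["syscall"] == "connect":
            connects[event["pid"]] += 1
            if connects[event["pid"]] == connect_threshold:
                flagged.append(event["pid"])
    return flagged
```

Keeping per-process state out of the filter is what keeps the in-kernel part simple enough to verify, while the richer correlation logic lives in user space.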
An essential security aspect of eBPF is the verification process. eBPF programs are compiled to bytecode, and that bytecode must satisfy various restrictions before it can be loaded into the kernel. For example, Linux provides no string pattern-matching functions to eBPF programs, and such functions cannot be implemented in eBPF bytecode because of limits on verifier complexity. Linux eBPF kprobes run in an atomic context and can only access non-pageable kernel memory.
These constraints might make it challenging for eBPF for Windows to underpin an “authorize/block” interface in user space of the kind outlined earlier. It could, however, serve as a solution for dynamically collecting system activity events in the kernel and transmitting them to user space for subsequent analysis.
Sophos on Linux 2: File scanning via fanotify
Since version 5.1, Linux has featured a fanotify API for intercepting file operations. Sophos initially used a Linux kernel driver (Talpa) to implement on-access file scanning, but transitioned to fanotify as an early adopter (and played a part in developing it into the standard it is today). Current Sophos Linux products use fanotify to gather file events asynchronously, scan files in the background where necessary, and initiate response actions based on the scan results.
Shifting to fanotify demanded a substantial commitment from Sophos. Linux distribution providers shipped kernels with fanotify support on different release cycles, so Sophos had to maintain both the Talpa kernel driver and the fanotify implementation. Kernel updates affecting fanotify had to work their way through the various Linux distributions before Sophos could rely on a consistent interface. The Microsoft ecosystem likewise has many operating system versions in use at the same time, and it could be crucial to factor this in when contemplating changes to the Windows platform.
Sophos on macOS: Bid Farewell to kexts? A Big Sur-prise
Apple unveiled new endpoint security APIs a year before making their use mandatory. While Sophos spent that year transitioning from kexts (kernel extensions, on macOS) to the new APIs, customers stayed on the version that used kexts and continued to receive OS and security product updates. The subsequent major macOS release removed kernel access for all vendors. Once more, the challenges of managing updates across different operating system versions, and of enabling users to update and configure security products seamlessly when they upgrade to a new OS version, should be taken into account. Additionally, we present these retrospective points in the hope that they encourage a graceful evolution of the Windows endpoint ecosystem, regardless of the route taken:
Conclusion
Change is hard. Recent cybersecurity incidents and persistent software trends have underscored that it is also not optional. The full impact of this week’s Microsoft summit might not be discernible for months or years; undoubtedly, some of the ensuing changes could be disruptive in the way that only fundamental shifts can be. We should also weigh the advantages of Windows providing an expanded set of native security interfaces for the entire endpoint security ecosystem against the monoculture risk of losing the robust diversity of proprietary innovations and controls that currently enrich that ecosystem. Nevertheless, we believe that transparency and open communication are the most effective means of accelerating improvements for defenders and customers. Let’s begin.
