Researchers Uncover Over 20 Supply Chain Weaknesses in MLOps Platforms
Cybersecurity researchers are warning about security risks in the machine learning (ML) software supply chain following the discovery of more than 20 vulnerabilities that could be exploited to target MLOps platforms.
These vulnerabilities, which fall into two categories of inherent flaws and implementation flaws, could have severe consequences, ranging from arbitrary code execution to the loading of malicious datasets.
MLOps platforms offer the ability to design and execute an ML model pipeline, with a model registry acting as a repository used to store and version trained ML models. These models can then be embedded within an application or made available for other users to query via an API (also known as model-as-a-service).
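For readers unfamiliar with the model-as-a-service pattern, querying such a hosted model is typically a plain HTTP call. The sketch below uses a hypothetical endpoint, token, and response format purely for illustration.

```python
# Minimal sketch of querying a model served behind an API ("model-as-a-service").
# The endpoint URL, token, and payload schema are hypothetical placeholders.
import requests

API_URL = "https://models.example.com/v1/sentiment/predict"  # hypothetical endpoint
API_TOKEN = "REPLACE_ME"                                      # hypothetical token

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"inputs": "The build pipeline finished without errors."},
    timeout=30,
)
response.raise_for_status()
print(response.json())  # e.g., {"label": "positive", "score": 0.98}
```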
“Inherent vulnerabilities stem from the fundamental formats and procedures utilized in the targeted technology,” stated researchers from JFrog in a detailed report.
Examples of inherent vulnerabilities include abusing ML models to run code of the attacker's choice by taking advantage of the fact that certain model formats automatically execute code when loaded (e.g., Pickle model files).
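To illustrate why loading a pickled model amounts to running code, consider the following minimal sketch (not taken from the JFrog report), in which a harmless shell command stands in for an attacker's payload.

```python
# Demonstration of why unpickling untrusted data is dangerous:
# pickle calls __reduce__ on deserialization and executes whatever it returns.
import os
import pickle


class MaliciousModel:
    def __reduce__(self):
        # Benign stand-in for an attacker-controlled payload.
        return (os.system, ("echo 'code executed during model load'",))


payload = pickle.dumps(MaliciousModel())

# Simply "loading the model" is enough to trigger execution:
pickle.loads(payload)  # runs the shell command above
```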
This behavior also extends to certain dataset formats and libraries that allow automatic code execution, potentially opening the door to malware attacks simply by loading a publicly available dataset, as shown in the sketch below.
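The dataset side follows the same pattern: loaders that permit dataset-defined code, such as the trust_remote_code option in the Hugging Face datasets library, will execute a Python script shipped alongside the data. The snippet below is a sketch with a placeholder dataset name.

```python
# Sketch only: loading a dataset whose loading script is attacker-controlled.
# The dataset name below is a hypothetical placeholder.
from datasets import load_dataset

# If the repository ships a custom loading script, enabling trust_remote_code
# means that script runs locally with the user's privileges.
ds = load_dataset("some-org/innocent-looking-dataset", trust_remote_code=True)
print(ds)
```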
Another example of inherent vulnerability involves JupyterLab (formerly Jupyter Notebook), a web-based interactive computational environment allowing users to execute code blocks and view corresponding results.
“A pertinent issue that often goes unrecognized is the handling of HTML output when executing code blocks in Jupyter,” highlighted the researchers. “The Python code output can emit HTML and [JavaScript] that gets interpreted by the browser.”
The issue arises when the JavaScript output is executed without being isolated from the parent web application, allowing that web application to run arbitrary Python code in turn.
In other words, an attacker could output malicious JavaScript that adds a new cell to the current JupyterLab notebook, injects Python code into it, and then executes it. This is particularly relevant when a cross-site scripting (XSS) vulnerability is being exploited.
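The underlying mechanism can be observed without any specific bug: a code cell can emit HTML containing script, and whether that script runs depends on how the frontend sanitizes rich output. The sketch below, which uses IPython's display utilities and a harmless console.log payload, is illustrative only.

```python
# Run inside a Jupyter environment. Whether the <script> executes depends on
# how the frontend sanitizes rich HTML output; unsanitized execution is the
# building block behind the XSS-to-code-execution chain described here.
from IPython.display import HTML, display

display(HTML("""
<b>Report rendered as HTML output</b>
<script>
  // Stand-in for an attacker-controlled payload. In the described attack,
  // script like this would add a new notebook cell containing Python code
  // and trigger its execution.
  console.log("JavaScript executed in the notebook frontend");
</script>
"""))
```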
To that end, JFrog said it identified an XSS flaw in MLflow (CVE-2024-27132, CVSS score: 7.5) that stems from a lack of sanitization when running an untrusted recipe, resulting in client-side code execution in JupyterLab.
“A key takeaway from our investigation is that all XSS vulnerabilities in ML libraries should be treated as potential avenues for arbitrary code execution, considering that data scientists might utilize these ML libraries with Jupyter Notebook,” emphasized the researchers.
The second set of weaknesses concerns implementation flaws, such as a lack of authentication in MLOps platforms, which could allow a threat actor with network access to gain code execution capabilities by abusing the ML Pipeline feature.
These threats aren't theoretical: financially motivated adversaries have already abused such loopholes, as observed in the case of unpatched Anyscale Ray (CVE-2023-48022, CVSS score: 9.8), to deploy cryptocurrency miners.
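To see why a missing authentication layer translates directly into code execution, note that job-submission APIs exist precisely to run arbitrary commands on the cluster. The sketch below targets Ray's documented Jobs API with a placeholder address and an intentionally benign entrypoint.

```python
# Sketch: anyone who can reach an unauthenticated Ray Jobs endpoint can run code
# on the cluster, because running code is exactly what the API is for.
# The address is a placeholder (documentation IP range); the entrypoint is benign.
from ray.job_submission import JobSubmissionClient

client = JobSubmissionClient("http://203.0.113.10:8265")  # exposed head node (example address)
job_id = client.submit_job(entrypoint="python -c \"print('running on the cluster')\"")
print("submitted job:", job_id)
```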
Another implementation vulnerability is a container escape targeting Seldon Core that enables attackers to go beyond code execution, move laterally across the cloud environment, and access other users' models and datasets by uploading a malicious model to the inference server.
The net result of chaining these vulnerabilities is that attackers could not only infiltrate an organization and spread within it, but also compromise its servers.
“When deploying a platform that facilitates model serving, it’s imperative to acknowledge that anyone able to serve a new model can also execute arbitrary code on that server,” cautioned the researchers. “Ensure that the environment responsible for model execution is completely isolated and fortified against container escapes.”
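As one simplified illustration of that advice, a serving container can be launched as a non-root user with a read-only filesystem and all capabilities dropped. The sketch below uses the Docker SDK for Python and a hypothetical image name; it is a single hardening layer, not a guarantee against container escapes.

```python
# Hardened-launch sketch for a model-serving container (hypothetical image name).
# This reduces the blast radius of code execution inside the container.
import docker

client = docker.from_env()
container = client.containers.run(
    "registry.example.com/inference-server:latest",  # hypothetical image
    detach=True,
    user="10001:10001",                        # run as a non-root user
    read_only=True,                            # immutable root filesystem
    cap_drop=["ALL"],                          # drop all Linux capabilities
    security_opt=["no-new-privileges:true"],   # block privilege escalation
    pids_limit=256,
    mem_limit="2g",
)
print("started:", container.short_id)
```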

This disclosure coincides with Palo Alto Networks Unit 42 outlining two now-fixed vulnerabilities in the open-source LangChain generative AI framework (CVE-2023-46229 and CVE-2023-44467) that could have facilitated the execution of arbitrary code and unauthorized access to sensitive data, respectively.
Recently, Trail of Bits also disclosed four issues in Ask Astro, an open-source chatbot application utilizing retrieval augmented generation (RAG), that could result in chatbot output manipulation, inaccurate document ingestion, and potential denial-of-service (DoS) attacks.
Just as security vulnerabilities continue to surface in AI-powered applications, techniques are also being devised to poison training datasets with the goal of tricking large language models (LLMs) into producing vulnerable code.
“Unlike recent attacks that embed malicious content in detectable or irrelevant sections of the code (e.g., comments), CodeBreaker employs LLMs (such as GPT-4) for sophisticated payload manipulation (without impacting functionalities), ensuring that both the tainted data for fine-tuning and the generated code can evade robust vulnerability scans,” said academics from the University of Connecticut.


