TL;DR

Repository secret leaks are not edge cases—65% of Forbes AI 50 companies had confirmed credential exposures on GitHub, with a median remediation time of 94 days. Most leaked secrets (71%) tie to web app infrastructure and CI/CD pipelines, creating direct attack paths. This guide provides blue team detection strategies, prevention workflows, and incident response procedures based on 2025-2026 breach data.



The Magnitude Problem: Who Is Leaking What

Wiz’s November 2025 research analyzed the Forbes AI 50, a list of well-funded private AI companies, and found that 65% had confirmed secret leaks on GitHub. These were not “potential” findings or false positives. They were confirmed exposures of API keys, tokens, and credentials.

The leaked material wasn’t limited to active repositories. Attackers routinely scan:

  • Deleted forks (material persists in Git history)
  • Gists (treated as “scratch pads” with production credentials)
  • Secondary repositories (POC/test repos with copy-pasted production configs)

What Gets Leaked

Verizon’s 2025 Data Breach Investigations Report (DBIR), covering 22,052 incidents and 12,195 confirmed breaches, breaks down scanner-detected secrets by infrastructure type:

Infrastructure Type            | Percentage | Most Common Secret Type
Web Application Infrastructure | 39%        | JWTs (66% of web app secrets)
CI/CD Pipelines                | 32%        | Service account tokens
Cloud Infrastructure           | 15%        | Google Cloud API keys (43% of cloud secrets)
Databases                      | 5%         | Connection strings
Other                          | 9%         | Mixed credentials

That’s 71% web app and CI/CD. These aren’t obscure credential types—they’re the backbone of modern development workflows.

Well-Funded Companies Aren’t Immune

This isn’t a “small company” problem. When organizations with substantial budgets and full-time AppSec engineers show a 65% leak rate, there is little reason to expect teams with fewer resources to be doing better.

If your posture is “we scan repos, so we’re fine,” this data should unsettle you. Scanning is reactive. By the time your scanner fires, the secret has been committed, pushed, and potentially harvested.


Why 94 Days Is Lethal

Verizon DBIR reports a 94-day median time to remediate leaked GitHub secrets. That’s not time-to-detection. That’s time from detection to remediation—meaning the secret is rotated, scope is validated, and access is revoked.

What Happens in 94 Days

Here’s what attackers accomplish with a single valid credential in that window:

Week 1-2: Reconnaissance

  • Validate credential scope and permissions
  • Map accessible resources (databases, S3 buckets, API endpoints)
  • Establish persistence with secondary access methods
  • Avoid obvious activity that triggers alerts

Week 3-4: Lateral Movement

  • Use compromised service account to access adjacent systems
  • Pivot through API integrations
  • Enumerate additional credentials stored in configuration

Week 5-12: Data Exfiltration

  • Slow, gradual data pulls to avoid anomaly detection
  • Stage data in attacker-controlled cloud storage (often legitimate services like Google Drive)
  • Catalog sensitive data for later monetization

Week 13+: Decision Point

  • Ransom demand (if target appears capable of paying)
  • Silent long-term access (APT-style persistence)
  • Credential resale on dark web markets

The Real Cost

ReliaQuest’s 2025 Annual Cyber-Threat Report shows that 80% of breaches involved data exfiltration. In cases with confirmed exfiltration:

  • 60% used mainstream cloud storage (Google Drive, Mega, Amazon S3)
  • 40% used C2 infrastructure

The use of legitimate cloud services is deliberate. It’s difficult to block Google Drive or S3 at the network edge without breaking legitimate workflows. Attackers know this.


Where Secrets Hide (And Attackers Look First)

Secret leaks follow predictable patterns. Attackers use automated scanners that search these locations systematically:

Primary Targets

1. Commit History

  • Secrets removed in later commits remain in Git history
  • Rewriting history doesn’t help if forks exist
  • Public repositories preserve history forever

2. Pull Request Discussions

  • Developers post configuration snippets for troubleshooting
  • API keys embedded in error messages or debug output
  • Often overlooked during security reviews

3. Issue Tracker Comments

  • Support requests include credential dumps
  • “Here’s my config file, why isn’t this working?”
  • Issue content can survive in third-party archives and search caches even after the repository is deleted

4. Gists and Snippets

  • Treated as temporary but indexed by search engines
  • No organizational oversight
  • Often contain production credentials from debugging sessions

5. GitHub Actions Logs

  • Build logs may echo environment variables
  • Secrets printed during failed deployments
  • Accessible to anyone with repository read access
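
The commit-history point is easy to demonstrate. The sketch below is a minimal illustration, not a replacement for a real scanner: it walks the text output of git log -p and reports secrets on added lines, so a key that a later commit deleted is still found. The function name scan_history_patch and the two patterns are assumptions chosen for the example.

```python
import re

# Illustrative patterns only; real scanners ship hundreds of detectors.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github_pat_classic": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan_history_patch(patch_text):
    """Return (commit, detector, match) for every added line that matches."""
    findings = []
    commit = None
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            # `git log -p` starts each commit with "commit <sha>".
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            # Added lines start with "+"; "+++" is the file header.
            for name, pattern in PATTERNS.items():
                for match in pattern.findall(line):
                    findings.append((commit, name, match))
    return findings


# The second commit removes the key, but the first commit still exposes it.
sample = """commit aaa111
+AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
commit bbb222
-AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
"""
print(scan_history_patch(sample))
```

The finding is attributed to commit aaa111 even though bbb222 removed the line, which is why deleting a secret in a follow-up commit does not end the exposure.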

Secondary Targets

Deleted Repositories

  • GitHub preserves forks even after parent is deleted
  • Forks maintain complete commit history
  • Attackers specifically search for “[original-repo]-fork” patterns

Dependency Files

  • Hardcoded credentials in package.json, requirements.txt, Gemfile
  • Base64-encoded secrets (easily decoded)
  • Environment files (.env, config.yaml) accidentally committed

Documentation

  • Setup guides with example credentials that are actually production keys
  • Architecture diagrams with credential paths
  • Runbooks with embedded service account tokens
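
Base64 offers no protection because scanners simply decode it. As a rough sketch of that step (the helper names and the single AWS pattern are assumptions for illustration), the code below pulls base64-looking tokens out of a file, decodes them, and re-runs a secret pattern on the decoded text:

```python
import base64
import binascii
import re

# Long runs of base64 alphabet characters, with optional padding.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")
AWS_KEY = re.compile(r"(?:AKIA|ASIA)[0-9A-Z]{16}")


def decode_candidates(text):
    """Yield decoded strings for base64-looking tokens that decode to UTF-8."""
    for token in B64_CANDIDATE.findall(text):
        try:
            decoded = base64.b64decode(token, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not text once decoded
        yield decoded


def find_encoded_keys(text):
    """Return AWS-style key IDs hidden inside base64-encoded values."""
    return [m for decoded in decode_candidates(text) for m in AWS_KEY.findall(decoded)]


encoded = base64.b64encode(b"aws_key=AKIAIOSFODNN7EXAMPLE").decode()
print(find_encoded_keys(f'config_blob: "{encoded}"'))
```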

Detection Engineering: Find Secrets Before Attackers Do

Scanning is necessary but insufficient. Here’s a layered detection strategy based on how attackers actually operate.

Layer 1: Pre-Commit Scanning

Block secrets before they enter Git history:

Tool: git-secrets or Talisman

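# Install git-secrets hooks into a template used by every new or cloned repo
git secrets --install ~/.git-templates/git-secrets
git config --global init.templateDir ~/.git-templates/git-secrets

# Existing clones are not covered by the template; run this inside each one
git secrets --install

# Register the built-in AWS patterns (these already cover AKIA access key IDs)
git secrets --register-aws --global

# Add custom patterns; prefer prefixed formats over generic ones
git secrets --add --global 'ghp_[A-Za-z0-9]{36}'  # GitHub classic personal access token
# Avoid bare patterns like '[0-9a-f]{40}': they also match every Git commit SHA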

Why this matters: Prevention at commit time is the only control that avoids remediation cost. Once a secret enters history, you’re in incident response mode.
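
Prefixed patterns catch known key formats, but generic secrets need a different heuristic. A bare hex pattern such as [0-9a-f]{40} also matches every Git commit SHA, so one common approach pairs a length filter with a Shannon entropy threshold. The sketch below shows the idea; the 20-character minimum and the 4.0 bits-per-character threshold are assumptions to tune, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Candidate tokens: long runs of characters typical of keys and tokens.
TOKEN = re.compile(r"[A-Za-z0-9+/_\-]{20,}")
HEX_ONLY = re.compile(r"^[0-9a-f]+$")


def shannon_entropy(s):
    """Bits of entropy per character of s."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def suspicious_tokens(line, threshold=4.0):
    """Return tokens on a line that look random enough to be secrets."""
    hits = []
    for tok in TOKEN.findall(line):
        if HEX_ONLY.match(tok):
            continue  # skip Git SHAs and checksums to limit false positives
        if shannon_entropy(tok) >= threshold:
            hits.append(tok)
    return hits


# AWS's documented example secret key is flagged; a commit SHA is not.
print(suspicious_tokens('secret = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"'))
print(suspicious_tokens("fixed in da39a3ee5e6b4b0d3255bfef95601890afd80709"))
```

Skipping hex-only tokens trades some recall for far fewer blocked commits, which matters because developers disable hooks that fire constantly.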

Layer 2: Repository Scanning

Detect secrets in existing repositories:

TruffleHog v3 Configuration

# Scan entire repository history
trufflehog github --repo https://github.com/org/repo \
  --only-verified \
  --json

# High-confidence results only, restricted to specific detectors
trufflehog github --repo https://github.com/org/repo \
  --only-verified \
  --json \
  --include-detectors="aws,github,slack,stripe"

# Scan every repository in an organization
# (confirm fork and archived-repo handling with `trufflehog github --help` for your version)
trufflehog github --org your-org \
  --only-verified \
  --json

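Critical: Use the --only-verified flag (newer TruffleHog releases express the same filter as --results=verified). It tests each detected credential against the provider’s live API, so every reported hit is a working credential rather than a pattern match. Treat unverified hits as lower priority, not as safe: a verification call can fail for reasons unrelated to validity.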

Layer 3: Attack Surface Monitoring

Monitor for leaked secrets outside your control:

What to Monitor:

  • GitHub’s public search API for your organization’s domain patterns
  • Pastebin, GitLab Snippets, Bitbucket public repos
  • Docker Hub public images (often contain embedded credentials)
  • Stack Overflow and developer forums

Detection Query Example:

# Search GitHub for potential credential patterns mentioning your org
# (GITHUB_TOKEN is an environment variable holding a personal access token)
curl -H "Authorization: Bearer $GITHUB_TOKEN" \
  "https://api.github.com/search/code?q=yourorg.com+password"

# Search for AWS keys mentioning your org
curl -H "Authorization: Bearer $GITHUB_TOKEN" \
  "https://api.github.com/search/code?q=AKIA+yourorg"

Layer 4: Behavioral Detection

Credential usage anomalies often indicate compromise:

Red Flags:

  • API key used from unexpected geographic regions
  • Service account authentication outside normal business hours
  • Sudden spike in API call volume
  • Access to resources never previously touched
  • Multiple failed authentication attempts followed by success

SIEM Query (Splunk Example):

index=cloudtrail eventName=AssumeRole awsRegion!="us-east-1" awsRegion!="eu-west-1"
| stats count by sourceIPAddress, userAgent, awsRegion
| where count > 100

This flags sources making an abnormal number of AssumeRole calls from regions outside your expected set (us-east-1 and eu-west-1 in this example), a common pattern after attackers harvest credentials from repositories. Adjust the region list and threshold to your own baseline.
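
The same logic can be prototyped outside a SIEM to tune thresholds before writing production rules. This is a minimal sketch under stated assumptions: events are already flattened into dicts with accessKeyId and awsRegion keys, the expected-region set is yours to define, and flag_anomalies is a hypothetical name.

```python
from collections import Counter

# Assumption: regions where this organization normally operates.
EXPECTED_REGIONS = {"us-east-1", "eu-west-1"}


def flag_anomalies(events, max_calls=100):
    """Flag (key, region) pairs that are off-region or unusually busy."""
    per_key_region = Counter((e["accessKeyId"], e["awsRegion"]) for e in events)
    alerts = []
    for (key_id, region), count in sorted(per_key_region.items()):
        reasons = []
        if region not in EXPECTED_REGIONS:
            reasons.append("unexpected_region")
        if count > max_calls:
            reasons.append("high_volume")
        if reasons:
            alerts.append(
                {"accessKeyId": key_id, "awsRegion": region,
                 "count": count, "reasons": reasons}
            )
    return alerts


events = (
    [{"accessKeyId": "AKIAEXAMPLE", "awsRegion": "us-east-1"}] * 3
    + [{"accessKeyId": "AKIAEXAMPLE", "awsRegion": "ap-south-1"}]
)
for alert in flag_anomalies(events, max_calls=2):
    print(alert)
```

Unlike the volume-only query, this flags a single call from an unexpected region, which matches the quiet reconnaissance behavior described earlier.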


Prevention Architecture: Stop Secrets at Commit Time

Detection is reactive. Prevention requires architectural changes to how credentials are handled.

Principle 1: Secrets Never Enter Code

Use Secret Management Services:

  • AWS Secrets Manager / Azure Key Vault / GCP Secret Manager for cloud
  • HashiCorp Vault for on-premise or hybrid
  • Doppler / Infisical for development environment management

Implementation Pattern:

# ❌ WRONG: Hardcoded credential
api_key = "sk_live_51Hx..."

# ✅ CORRECT: Runtime secret retrieval
import boto3
secrets_client = boto3.client('secretsmanager')
response = secrets_client.get_secret_value(SecretId='prod/api-key')
api_key = response['SecretString']
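
Fetching the secret on every request adds latency and API cost, so services often wrap retrieval in a short-lived cache. The sketch below is one way to do that, not the only one. It assumes the client passed in behaves like a boto3 Secrets Manager client (a get_secret_value method returning a dict with SecretString); SecretCache and the 300-second TTL are illustrative choices.

```python
import time


class SecretCache:
    """Cache secret values in memory for a short TTL."""

    def __init__(self, client, ttl_seconds=300, clock=time.monotonic):
        self._client = client      # e.g. boto3.client("secretsmanager")
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries = {}         # secret_id -> (fetched_at, value)

    def get(self, secret_id):
        now = self._clock()
        cached = self._entries.get(secret_id)
        if cached and now - cached[0] < self._ttl:
            return cached[1]
        # Cache miss or expired entry: fetch a fresh value.
        response = self._client.get_secret_value(SecretId=secret_id)
        value = response["SecretString"]
        self._entries[secret_id] = (now, value)
        return value
```

A short TTL keeps the window small after rotation: once the old value expires from the cache, the next call picks up the new secret without a redeploy.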

Principle 2: Short-Lived Credentials

Static credentials with no expiration are persistent attack vectors.

Implement:

  • AWS STS AssumeRole with 1-hour session tokens
  • GitHub Actions OIDC tokens (no long-lived secrets needed)
  • OAuth refresh tokens with 15-minute access token expiry

Example: GitHub Actions OIDC (No Secrets Required):

name: Deploy
on: push

permissions:
  id-token: write  # OIDC token generation
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActions
          aws-region: us-east-1
      # No AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY needed
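
The workflow only works if the IAM role trusts GitHub’s OIDC provider, and the trust policy is where scoping happens. As a hedged sketch, the helper below generates a trust policy that limits the role to one repository and branch. The account ID and your-org/your-repo are placeholders, and github_oidc_trust_policy is a hypothetical name; check the condition keys against current AWS and GitHub documentation before use.

```python
import json

OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id, repo, ref="refs/heads/main"):
    """Build an IAM trust policy scoped to one GitHub repo and ref."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{OIDC_HOST}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Token must be issued for AWS STS...
                    f"{OIDC_HOST}:aud": "sts.amazonaws.com",
                    # ...and come from this exact repo and branch.
                    f"{OIDC_HOST}:sub": f"repo:{repo}:ref:{ref}",
                }
            },
        }],
    }


# Placeholder values for illustration only.
print(json.dumps(github_oidc_trust_policy("123456789012", "your-org/your-repo"), indent=2))
```

Without the sub condition, any repository on GitHub that knows the role ARN could request credentials, so the scoping is not optional.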

Principle 3: Least Privilege by Default

Credential Scoping Checklist:

  • Service account can only access required resources
  • API key has narrowest possible scope (read-only when possible)
  • Database user has minimal table-level permissions
  • Cloud role uses explicit resource ARNs, not wildcards
  • CI/CD service account cannot modify production infrastructure

Example: AWS IAM Policy (Scoped):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:PutObject", "s3:GetObject"],
    "Resource": "arn:aws:s3:::specific-bucket/specific-prefix/*"
  }]
}

Compare to the overprivileged pattern (❌), which grants every S3 action on every bucket in the account:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "s3:*",
    "Resource": "*"
  }]
}
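
Wildcard checks like this are easy to automate in CI. The sketch below is a minimal lint, assuming policies are loaded as Python dicts; find_wildcards is a hypothetical helper and it only covers the two patterns discussed here, not the full IAM policy grammar.

```python
def find_wildcards(policy):
    """Return (statement_index, field, value) for wildcard Allow grants."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]  # IAM allows a single statement object
    findings = []
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        for action in actions:
            # "*" or "service:*" grants every action (in that service).
            if action == "*" or action.endswith(":*"):
                findings.append((index, "Action", action))
        for resource in resources:
            if resource == "*":
                findings.append((index, "Resource", resource))
    return findings


overprivileged = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}
print(find_wildcards(overprivileged))
```

A prefix-scoped resource such as arn:aws:s3:::specific-bucket/specific-prefix/* passes, because only a bare "*" resource is treated as a finding.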

Principle 4: Secret Rotation Automation

Manual rotation fails. Automate it.

Rotation Cadence:

  • Service accounts: 90 days maximum
  • API keys: 60 days
  • Database passwords: 30 days
  • CI/CD tokens: 14 days

Automation Tools:

  • AWS Secrets Manager has built-in rotation for RDS, Redshift, DocumentDB
  • HashiCorp Vault supports dynamic secret generation with TTLs
  • Custom rotation: Use AWS Lambda or Azure Functions triggered by scheduled events
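
A scheduled function needs some way to decide which credentials are due. As an illustrative sketch, the check below encodes the cadence above and reports anything past its limit. The inventory format (name, kind, last_rotated) is an assumption; in practice it would come from your secret manager or cloud provider APIs.

```python
from datetime import datetime, timedelta, timezone

# Maximum credential age in days, taken from the cadence above.
MAX_AGE_DAYS = {
    "service_account": 90,
    "api_key": 60,
    "database_password": 30,
    "cicd_token": 14,
}


def overdue(credentials, now=None):
    """Return (name, days_overdue) for credentials past their rotation limit."""
    now = now or datetime.now(timezone.utc)
    late = []
    for cred in credentials:
        limit = timedelta(days=MAX_AGE_DAYS[cred["kind"]])
        age = now - cred["last_rotated"]
        if age > limit:
            late.append((cred["name"], (age - limit).days))
    return late


# Hypothetical inventory entries for illustration.
inventory = [
    {"name": "deploy-token", "kind": "cicd_token",
     "last_rotated": datetime(2026, 1, 1, tzinfo=timezone.utc)},
    {"name": "orders-db", "kind": "database_password",
     "last_rotated": datetime(2026, 1, 10, tzinfo=timezone.utc)},
]
print(overdue(inventory, now=datetime(2026, 1, 31, tzinfo=timezone.utc)))
```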

Incident Response: The First 4 Hours

Despite prevention efforts, leaks happen. Speed of response determines blast radius.

Hour 1: Immediate Actions

Minute 0-15: Rotate Compromised Credential

  1. Generate new credential in secret manager
  2. Update production services to use new credential
  3. Revoke old credential immediately
  4. Do not wait to assess scope—assume compromise

Minute 15-30: Kill Active Sessions

  • AWS: Revoke STS sessions via IAM policy update
  • Google Cloud: Revoke service account keys
  • GitHub: Regenerate personal access tokens
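  • Database: Kill active connections for the affected role (PostgreSQL: SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE usename = 'compromised_role', where compromised_role is a placeholder)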

Minute 30-60: Lock Down Blast Radius

  • Identify all resources accessible via leaked credential
  • Apply temporary deny-all policies to compromised service account
  • Enable enhanced logging on affected resources
  • Notify incident response team

Hour 2-3: Scope Assessment

Query CloudTrail / Cloud Audit Logs:

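# AWS: Find API calls made with the compromised access key
# (lookup-events covers management events from the last 90 days only)
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=AccessKeyId,AttributeValue=AKIA... \
  --max-items 1000 \
  --start-time 2026-01-01T00:00:00Z

# Focus on:
# - Data access (GetObject, DescribeTable, Query)
# - Privilege escalation (AttachUserPolicy, CreateAccessKey)
# - Persistence (CreateUser, CreateRole)
# Data events such as S3 GetObject appear only if a trail logs them;
# search the trail's S3 logs or CloudTrail Lake for those.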

Critical Questions:

  • What resources were accessed?
  • Were any resources modified or created?
  • Did attacker establish persistence mechanisms?
  • Was data exfiltrated? (Look for large data transfer volumes)
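
Sorting events by hand wastes time during an incident. The sketch below groups events into the three focus categories so the critical questions can be answered in order. It assumes each event is a dict with an EventName key, as in the Events list returned by lookup-events; the category map mirrors the event names above and is far from exhaustive.

```python
# Event names from the focus list above; extend for your environment.
TRIAGE = {
    "data_access": {"GetObject", "DescribeTable", "Query"},
    "privilege_escalation": {"AttachUserPolicy", "CreateAccessKey"},
    "persistence": {"CreateUser", "CreateRole"},
}


def triage(events):
    """Group CloudTrail events by investigation category."""
    buckets = {category: [] for category in TRIAGE}
    buckets["other"] = []
    for event in events:
        name = event.get("EventName", "")
        for category, names in TRIAGE.items():
            if name in names:
                buckets[category].append(event)
                break
        else:
            # No category matched this event name.
            buckets["other"].append(event)
    return buckets


events = [
    {"EventName": "CreateUser"},
    {"EventName": "GetObject"},
    {"EventName": "ListBuckets"},
]
for category, items in triage(events).items():
    print(category, [e["EventName"] for e in items])
```

Any non-empty persistence or privilege_escalation bucket means rotation alone is not enough, because the attacker may hold credentials you have not yet found.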

Hour 3-4: Remediation

If Data Access Occurred:

  • Inventory accessed data (databases, S3 buckets, API endpoints)
  • Determine data classification (PII, PHI, financial, intellectual property)
  • Assess regulatory notification requirements (GDPR, CCPA, HIPAA)
  • Preserve logs for forensic analysis

If Persistence Detected:

  • Enumerate all IAM users/roles created by compromised credential
  • Check for new access keys, SSH keys, OAuth applications
  • Review newly created Lambda functions or automation
  • Scan for backdoored code commits if repository write access existed

Communication:

  • Notify security leadership
  • Prepare incident report for legal/compliance
  • If data breach confirmed, engage incident response retainer

The Non-Human Identity Problem

Cloud Security Alliance’s 2025 State of SaaS Security Report found that 46% of organizations struggle to monitor non-human identities. This monitoring gap is a major reason secrets stay exposed for so long.

What Are Non-Human Identities?

  • Service accounts
  • API keys
  • OAuth applications
  • CI/CD pipeline credentials
  • Bot accounts
  • Machine-to-machine authentication tokens

Why They’re Dangerous

Human accounts have clear owners, get offboarded when employees leave, and trigger MFA prompts during suspicious activity.

Non-human accounts:

  • No MFA (often)
  • No clear ownership (which team manages this service account?)
  • Long-lived or never-expiring credentials
  • Overprivileged by default (“just give it admin to make it work”)

Fixing the Gap

Implement Non-Human Identity Management:

  1. Inventory all service accounts (AWS IAM, GCP service accounts, Azure service principals)
  2. Assign ownership (which team/person is responsible for each)
  3. Enforce TTLs (no credentials older than 90 days)
  4. Require justification (why does this service account need this permission?)
  5. Audit quarterly (remove unused service accounts)

Tool: AWS IAM Access Analyzer

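# List analyzers, then inspect a finding
# (unused-access findings require an analyzer of the unused-access type)
aws accessanalyzer list-analyzers
aws accessanalyzer get-finding --analyzer-arn "arn:..." --id "..."

# Check last-used dates for all IAM users: generate the report, then decode the base64 CSV
aws iam generate-credential-report
aws iam get-credential-report --query Content --output text | base64 --decode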
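
The credential report is a CSV, so the quarterly audit can be scripted. The sketch below flags active access keys that were never used or have been idle too long. It assumes the decoded report text with the standard access_key_N_active and access_key_N_last_used_date columns and timezone-aware ISO 8601 timestamps; stale_access_keys and the 90-day idle limit are illustrative.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def stale_access_keys(report_csv, now, max_idle_days=90):
    """Return (user, key_slot, last_used) for active but idle access keys."""
    stale = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        for slot in ("1", "2"):  # IAM users can hold up to two access keys
            if row.get(f"access_key_{slot}_active") != "true":
                continue
            last_used = row.get(f"access_key_{slot}_last_used_date", "N/A")
            if last_used in ("N/A", "no_information"):
                stale.append((row["user"], slot, "never used"))
                continue
            used_at = datetime.fromisoformat(last_used.replace("Z", "+00:00"))
            if now - used_at > timedelta(days=max_idle_days):
                stale.append((row["user"], slot, last_used))
    return stale


# Trimmed, hypothetical report rows for illustration.
report = """user,access_key_1_active,access_key_1_last_used_date,access_key_2_active,access_key_2_last_used_date
svc-old,true,2025-01-01T00:00:00+00:00,false,N/A
svc-new,true,2025-12-20T00:00:00+00:00,false,N/A
svc-unused,true,N/A,false,N/A
"""
print(stale_access_keys(report, now=datetime(2026, 1, 1, tzinfo=timezone.utc)))
```

Keys that were never used are reported too, since an active credential with no legitimate consumer is pure attack surface.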

Summary

Key Findings:

  • 65% of Forbes AI 50 companies leaked credentials on GitHub—this is not a fringe problem
  • 94-day median remediation time gives attackers a full quarter to exploit leaked secrets
  • 71% of leaks involve web app infrastructure and CI/CD pipelines, creating direct production access
  • 46% of organizations cannot effectively monitor non-human identities

Defensive Actions:

  1. Pre-commit scanning (git-secrets, Talisman) blocks secrets before they enter history
  2. Repository scanning (TruffleHog, GitGuardian) detects existing exposures
  3. Secret managers (AWS Secrets Manager, Vault) eliminate hardcoded credentials
  4. Short-lived tokens (STS, OIDC) reduce blast radius of compromised credentials
  5. Automated rotation (90-day max for service accounts) limits exposure window
  6. Non-human identity governance ensures service accounts have owners and expiration

Incident Response Checklist:

  • Rotate compromised credential within 15 minutes
  • Kill active sessions immediately
  • Query cloud audit logs for attacker activity
  • Assess data access and privilege escalation
  • Remove attacker persistence mechanisms
  • Preserve logs for forensic analysis

Repository secret leaks are not inevitable. They’re a process failure. Fix the process.


Sources

  1. Wiz - State of AI in the Cloud Report (November 2025)

  2. Verizon - 2025 Data Breach Investigations Report (2025)

  3. ReliaQuest - Annual Cyber-Threat Report 2025 (2025)

  4. Cloud Security Alliance - State of SaaS Security Report 2025 (2025)

  5. OWASP - Secrets Management Cheat Sheet (2024)

  6. GitHub - Secret Scanning Documentation (2025)

  7. AWS - Secrets Manager Best Practices (2025)

  8. NIST - SP 800-63B Digital Identity Guidelines (2024)

  9. TruffleHog - Secret Detection Documentation (2025)

  10. GitGuardian - State of Secrets Sprawl 2025 (2025)

  11. HashiCorp Vault - Dynamic Secrets Guide (2025)

  12. MITRE ATT&CK - Valid Accounts (T1078) (2025)


Tools and Resources

  1. TruffleHog v3 - Secret Scanner - Open-source credential scanner with API verification

  2. git-secrets - Pre-commit Hook - AWS tool to prevent secrets in Git commits

  3. GitGuardian - Repository Scanning - Commercial secrets detection with developer remediation workflow

  4. AWS Secrets Manager - Managed secret storage and rotation service

  5. HashiCorp Vault - Self-hosted secrets management and dynamic credential generation

  6. Doppler SecretOps Platform - Developer-friendly secrets management for applications

  7. GitHub Actions OIDC Guide - Keyless authentication for CI/CD

  8. AWS IAM Access Analyzer - Identify unused credentials and overprivileged access

  9. Semgrep - Code Security Scanning - Static analysis with custom rules for secret patterns

  10. Infisical - Open Source Secrets Management - Self-hosted alternative to commercial solutions