Shadow AI — AI systems running in production that aren't formally tracked in your IT asset inventory or governed by your security policies.
Most enterprises have 5-50x more AI systems running than their security team knows about. Shadow AI represents a significant compliance and security risk:
- Compliance Risk: Ungoverned High-Risk AI applications violate the EU AI Act, DORA, and similar regulations
- Security Risk: Unvetted AI systems may process sensitive data, make critical business decisions, or exfiltrate information
- Audit Risk: You cannot prove what unscoped AI systems are doing
What Signals Indicate Shadow AI?
Shadow AI most commonly manifests as HTTPS traffic to Large Language Model (LLM) API providers from systems you didn't scope.
LLM Providers to Monitor
These are the primary cloud-based LLM vendors. Traffic from internal systems to these providers is your first detection vector:
| Provider | API Domain | Detection Confidence |
|---|---|---|
| OpenAI | api.openai.com | Very High |
| Anthropic | api.anthropic.com | Very High |
| Azure OpenAI | *.openai.azure.com | Very High |
| Google Gemini | generativelanguage.googleapis.com | High |
| Cohere | api.cohere.ai | High |
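The domains in the table above can be matched programmatically. A minimal sketch, assuming DNS query names as plain strings; the function name and list are illustrative, and `fnmatch` handles the Azure OpenAI wildcard:

```python
# Match observed DNS query names against the LLM provider domains listed
# above. The domain list comes from the table; the helper itself is a sketch.
from fnmatch import fnmatch

LLM_PROVIDER_DOMAINS = [
    "api.openai.com",
    "api.anthropic.com",
    "*.openai.azure.com",  # wildcard: any Azure OpenAI resource name
    "generativelanguage.googleapis.com",
    "api.cohere.ai",
]

def is_llm_domain(query: str) -> bool:
    """Return True if a DNS query name matches a known LLM provider domain."""
    q = query.rstrip(".").lower()  # strip trailing dot from FQDN, normalize case
    return any(fnmatch(q, pattern) for pattern in LLM_PROVIDER_DOMAINS)
```

The wildcard entry is why simple exact-match lists miss Azure OpenAI: every customer gets their own `<resource>.openai.azure.com` subdomain.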
Secondary Signals: API Behavior Patterns
Once you identify traffic to LLM providers, these patterns help confirm it's an active AI system:
- High-volume requests: 100+ requests/hour from a single IP = likely automated agent
- Off-hours consistency: Requests at 2-5 AM on schedule = scheduled agent
- Large payloads: Request bodies >10KB = sending documents/data to LLM
- Backend/database calls: LLM calls from servers, not web frontends = ungoverned business logic
- Cloud function calls: Lambda/Cloud Run making LLM calls = serverless automation
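Taken together, these signals can be combined into a simple per-source score. A sketch under stated assumptions: the record layout (`timestamp`, `bytes`) and the weights are illustrative, while the thresholds come from the list above (100+ requests/hour, >10KB payloads, 2-5 AM activity):

```python
# Illustrative scoring pass over one source IP's LLM-bound requests.
# Record layout and weights are assumptions; thresholds are from the text.
from collections import Counter
from datetime import datetime

def shadow_ai_score(requests):
    """requests: list of {"timestamp": ISO-8601 str, "bytes": int}.

    Returns a small integer score; higher = more agent-like.
    """
    score = 0
    per_hour = Counter(
        datetime.fromisoformat(r["timestamp"]).replace(minute=0, second=0, microsecond=0)
        for r in requests
    )
    if per_hour and max(per_hour.values()) >= 100:  # high-volume: 100+ req/hour
        score += 2
    if any(2 <= ts.hour < 5 for ts in per_hour):    # off-hours window (2-5 AM)
        score += 1
    if any(r["bytes"] > 10_000 for r in requests):  # large payloads: >10KB
        score += 1
    return score
```

Sources scoring 2 or higher are worth correlating against the asset inventory before any manual triage.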
Network Detection Rules
Rule 1: DNS Queries to LLM Providers (Easiest Starting Point)
What to monitor: DNS queries to LLM provider domains
Why this works: Before any HTTPS connection, the system must resolve the domain via DNS. DNS logs are easier to collect and parse than full packet capture.
SIEM Rule (Splunk)
sourcetype="dns"
| search query="api.openai.com" OR query="api.anthropic.com"
OR query="generativelanguage.googleapis.com"
| stats count, values(query) as queries by src_ip
| where count > 5
SIEM Rule (Zeek/Elasticsearch)
See documentation for Elasticsearch query DSL configuration for LLM provider detection.
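As a starting point, a hedged query DSL sketch, assuming Zeek DNS logs shipped with ECS field names (`dns.question.name`, `source.ip`, e.g. via Filebeat's Zeek module); adjust field names to your index mapping:

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "terms": { "dns.question.name": [
            "api.openai.com",
            "api.anthropic.com",
            "generativelanguage.googleapis.com"
        ] } }
      ]
    }
  },
  "aggs": {
    "sources": { "terms": { "field": "source.ip", "size": 100 } }
  }
}
```

The aggregation returns per-source hit counts, mirroring the `stats ... by src_ip` step in the Splunk rule.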
Rule 2: HTTPS Traffic with LLM-Specific Headers
What to monitor: HTTPS connections with LLM SDK User-Agent or API authentication headers
Detection Method: If you have TLS inspection or HTTP proxy logs, look for:
User-Agent contains: openai, anthropic, cohere, mistral, huggingface
OR
Header contains: X-API-Key, Authorization (Bearer pattern)
Rule 3: Asset Inventory Gap Detection
What to monitor: Outbound HTTPS traffic from internal systems to LLM providers, compared against your scoped software inventory.
Why this works: Any host generating LLM-bound traffic that is absent from the approved inventory stands out, so this finds shadow AI even when the calling system is unknown to IT.
Process
- Query firewall logs: destination = LLM provider domains
- Extract: source_ip, source_hostname, destination, timestamp
- Query asset inventory: get all approved systems
- Find gaps: hosts in logs NOT in inventory = shadow AI candidates
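The gap-finding step above reduces to a set difference. A minimal sketch; the input sources (a firewall log export and a CMDB export, each yielding hostnames) are assumptions, and the set logic is the whole technique:

```python
# Rule 3 gap detection: hosts seen calling LLM providers, minus hosts in the
# approved asset inventory. Inputs are assumed to be iterables of hostnames.
def find_shadow_ai_candidates(llm_traffic_hosts, inventory_hosts):
    """Return hostnames seen in LLM-bound traffic but absent from inventory."""
    return sorted(set(llm_traffic_hosts) - set(inventory_hosts))
```

Every hostname in the result is a shadow AI candidate to validate with IT during Week 3 of the workflow below.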
Behavioral Detection Rules
Rule 4: Volumetric Anomaly (Agent Behavior)
What to monitor: Sustained high-volume requests from a single source to an LLM provider
Why this works: Humans rarely make 1000+ API calls to an LLM in a day. Agents do.
Monitor LLM provider traffic per source IP.
Flag if:
- More than 100 requests/hour to a single LLM provider
- More than 500 requests/day
- Requests sustained over multiple hours
Rule 5: Time-Based Anomaly (Scheduled Agent)
What to monitor: Requests to LLM APIs at unusual times
Why this works: Human developers don't test at 3 AM. Scheduled agents do.
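Rules 4 and 5 can be sketched as a single flagging pass over one source's call timestamps. The thresholds are from the rules above; the event format, function name, and the "sustained" cutoff (10+ calls in each of 3+ distinct hours) are assumptions:

```python
# Combined sketch of Rule 4 (volumetric) and Rule 5 (time-based) detection
# for a single source over one day of traffic. Thresholds from the text;
# the "sustained" definition is an assumption.
from collections import Counter
from datetime import datetime

def flag_source(timestamps):
    """timestamps: ISO-8601 strings of one source's LLM API calls (one day).

    Returns a list of reasons; an empty list means no flag.
    """
    hours = Counter(datetime.fromisoformat(t).hour for t in timestamps)
    reasons = []
    if hours and max(hours.values()) > 100:              # Rule 4: >100 req/hour
        reasons.append("more than 100 requests in one hour")
    if len(timestamps) > 500:                            # Rule 4: >500 req/day
        reasons.append("more than 500 requests in one day")
    if sum(1 for n in hours.values() if n >= 10) >= 3:   # Rule 4: sustained
        reasons.append("sustained activity across multiple hours")
    if any(2 <= h < 5 for h in hours):                   # Rule 5: 2-5 AM traffic
        reasons.append("off-hours activity (2-5 AM)")
    return reasons
```

In practice you would run this per source IP against a day's worth of the LLM provider traffic collected in Week 1.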
Detection Workflow
Week 1: Network Capture
- Enable full packet capture on egress (or log all DNS/HTTPS to LLM domains)
- Set 7-day baseline
Week 2: Analysis
- Parse logs, identify all LLM API calls
- Correlate source IPs/hostnames to asset inventory
- Catalog by system, department, usage pattern
Week 3: Validation
- Cross-check with IT asset inventory
- Interview IT: "Is this system scoped?"
- Identify gaps (shadow AI candidates)
Week 4: Report & Conversation
- Prepare findings for CISO conversation
- Quantify: "We found X systems calling LLMs"
- Prepare recommendations
Implementation Roadmap
Phase 1: Quick Wins (Week 1)
- Enable DNS logging to LLM provider domains
- Add SIEM rule for DNS queries to api.openai.com, api.anthropic.com
- Create alert for any matches
Phase 2: Inventory Baseline (Week 2-3)
- Export last 30 days of HTTPS traffic to LLM providers
- Cross-reference against CMDB/asset inventory
- Document all legitimate systems that call LLM APIs
- Create whitelist for known approved systems
Phase 3: Behavioral Rules (Week 3-4)
- Deploy volumetric anomaly detection (>100 requests/hour)
- Deploy time-based anomaly detection (off-hours consistency)
- Set up alerting thresholds
Detection Complete. What's Next?
You've found shadow AI. Now you need to classify it (is it High-Risk?) and govern it (can you prove what it does?).