Hunting Sensitive Data with Veil Framework
Data discovery — the process of identifying and locating sensitive information across an environment — is a critical phase of both legitimate security assessments and real-world attacks. Understanding how attackers find sensitive data is the foundation for building detection capabilities that catch this activity before exfiltration occurs.
This page covers the techniques used for data discovery in lab environments, the telemetry that reveals this activity, and the detection strategies that defensive teams should implement.
What "Hunting Sensitive Data" Means in Practice
In a penetration test or purple team exercise, data discovery typically involves searching file shares, databases, email, and document repositories for information that would have business impact if compromised. The point is not to steal data — it is to demonstrate that the data is accessible and that the access was (or was not) detected.
Common target data types (a minimal search sketch follows the list):
- Credentials stored in files (scripts, configuration files, notes)
- Financial records and PII
- Intellectual property and trade secrets
- Infrastructure documentation (network diagrams, architecture documents)
- Backup files containing database dumps
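For lab purposes, a crude content sweep is enough to surface most of the credential-in-file cases above. The following is a minimal Python sketch, not a Veil or PowerView module; the mount point, file extensions, and keyword list are illustrative assumptions to adjust for your environment.

```python
#!/usr/bin/env python3
"""Minimal sketch of keyword-based file discovery across a mounted share.

Assumptions (not from the guide): the share is mounted at SEARCH_ROOT,
and EXTENSIONS/KEYWORDS are illustrative choices only.
"""
from pathlib import Path

SEARCH_ROOT = Path("/mnt/corp-share")                 # hypothetical mount point
EXTENSIONS = {".txt", ".ps1", ".cfg", ".ini", ".xml"}
KEYWORDS = ("password", "passwd", "secret", "connectionstring")

def interesting_files(root: Path):
    """Yield (path, matched keyword) for files whose contents mention a keyword."""
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in EXTENSIONS:
            continue
        try:
            text = path.read_text(errors="ignore").lower()
        except OSError:
            continue  # unreadable file: skip it rather than abort the sweep
        for kw in KEYWORDS:
            if kw in text:
                yield path, kw
                break

if __name__ == "__main__":
    for path, kw in interesting_files(SEARCH_ROOT):
        print(f"{path}  (matched: {kw})")
```

Even this simple sweep generates the read-at-scale telemetry discussed in the detection sections below, which is exactly why it is useful for exercising monitoring.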
Detection Opportunities
Data discovery generates telemetry at multiple levels:
File Access Monitoring
When a user or process reads files across shares at scale, several telemetry sources record the activity (a parsing sketch follows the list):
- Event ID 5145 — Network share access with object-level detail
- Event ID 4663 — File access audit events (requires object access auditing and a SACL on the monitored files)
- File integrity monitoring — Unexpected reads on sensitive directories
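If your pipeline exports Security log events as JSON lines, a short script can pull out just the share- and file-access events for review or rule prototyping. The field names used below (EventID, SubjectUserName, ShareName, RelativeTargetName, TimeCreated) are assumptions about a flattened export schema; map them to whatever your collector actually emits.

```python
#!/usr/bin/env python3
"""Sketch: filter exported Security log events down to share/file access.

Assumption: events were exported as JSON lines with flattened field names;
real export schemas vary, so adjust the keys to your pipeline's output.
"""
import json

WATCHED_EVENT_IDS = {5145, 4663}   # network share access / object access audit

def share_access_events(path):
    """Yield only the events relevant to file and share access."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            event = json.loads(line)
            if event.get("EventID") in WATCHED_EVENT_IDS:
                yield event

if __name__ == "__main__":
    for event in share_access_events("security-events.jsonl"):
        print(event.get("TimeCreated"),
              event.get("SubjectUserName"),
              event.get("ShareName") or event.get("ObjectName"),
              event.get("RelativeTargetName", ""))
```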
Behavioral Patterns
Data discovery produces distinctive behavioral patterns (a sliding-window sketch follows this list):
- A single account accessing dozens of shares in a short window
- Sequential directory traversal across network shares
- Large-volume file reads without corresponding business activity
- Keyword searches against file contents (detectable through endpoint telemetry)
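The first pattern above, one account touching many distinct shares in a short window, is straightforward to prototype once events are normalized. This sketch assumes each event is a dict with account, share, and timestamp fields; the one-hour window and 20-share threshold are illustrative starting points, not recommended values.

```python
#!/usr/bin/env python3
"""Sketch: flag accounts that touch many distinct shares in a short window.

Assumptions: `events` is an iterable of dicts with the fields used below
(account, share, timestamp as a datetime); thresholds are illustrative.
"""
from collections import defaultdict, deque
from datetime import timedelta

WINDOW = timedelta(hours=1)
MAX_DISTINCT_SHARES = 20   # illustrative threshold, tune per environment

def burst_alerts(events):
    """Yield (account, distinct share count, timestamp) when the threshold is exceeded."""
    recent = defaultdict(deque)   # account -> deque of (timestamp, share)
    for ev in sorted(events, key=lambda e: e["timestamp"]):
        q = recent[ev["account"]]
        q.append((ev["timestamp"], ev["share"]))
        # drop entries that have aged out of the window
        while q and ev["timestamp"] - q[0][0] > WINDOW:
            q.popleft()
        distinct = {share for _, share in q}
        if len(distinct) > MAX_DISTINCT_SHARES:
            yield ev["account"], len(distinct), ev["timestamp"]
```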
Network-Level Indicators
- High-volume SMB traffic from a single source to multiple file servers (see the flow-aggregation sketch after this list)
- Unusual data transfer volumes from endpoints that typically generate little network traffic
- After-hours file access patterns
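A rough way to prototype the SMB-volume indicator is to aggregate flow records by source host. The CSV column names (src_ip, dst_ip, dst_port, bytes), the byte threshold, and the server-count cutoff below are assumptions; substitute your flow exporter's schema and thresholds derived from your own baseline.

```python
#!/usr/bin/env python3
"""Sketch: rank internal hosts by outbound SMB volume from flow records.

Assumption: flow records are available as CSV with src_ip, dst_ip,
dst_port, and bytes columns; thresholds are illustrative only.
"""
import csv
from collections import Counter

SMB_PORT = "445"
BYTES_THRESHOLD = 500_000_000   # illustrative: ~500 MB of SMB traffic per source

def smb_bytes_per_source(csv_path):
    """Return total SMB bytes and the set of file servers contacted, per source host."""
    totals, dests = Counter(), {}
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            if row["dst_port"] != SMB_PORT:
                continue
            totals[row["src_ip"]] += int(row["bytes"])
            dests.setdefault(row["src_ip"], set()).add(row["dst_ip"])
    return totals, dests

if __name__ == "__main__":
    totals, dests = smb_bytes_per_source("flows.csv")
    for src, total in totals.most_common():
        if total >= BYTES_THRESHOLD or len(dests[src]) > 5:
            print(f"{src}: {total} bytes to {len(dests[src])} file servers")
```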
Building Detection Rules
Effective data discovery detection combines multiple indicators (a scoring sketch follows this list):
- Baseline normal access patterns — Understand what legitimate file access looks like for each user role
- Set thresholds — Define what constitutes anomalous file access volume
- Correlate events — A single share access is not suspicious; accessing twenty shares in an hour may be
- Alert on sensitive directories — High-value data locations should have tighter monitoring
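One way to combine these indicators is a simple additive score per account-hour, compared against a per-role baseline and a sensitive-path watchlist. The roles, baseline numbers, multipliers, and share paths in this sketch are placeholders; in practice the baseline should be derived from historical telemetry.

```python
#!/usr/bin/env python3
"""Sketch: combine a per-role baseline, a volume threshold, and a
sensitive-path watchlist into one scoring rule.

Assumptions: role names, baseline values, and watched paths are placeholders.
"""
ROLE_BASELINE = {              # typical distinct shares touched per hour, per role
    "finance-analyst": 3,
    "helpdesk": 8,
    "developer": 5,
}
SENSITIVE_PREFIXES = ("\\\\fileserver\\finance", "\\\\fileserver\\hr")  # illustrative

def score_activity(role, distinct_shares, paths):
    """Return (score, reasons) for one account-hour of file access telemetry."""
    score, reasons = 0, []
    baseline = ROLE_BASELINE.get(role, 5)
    if distinct_shares > 4 * baseline:                 # well above the role's norm
        score += 2
        reasons.append(f"{distinct_shares} shares vs. baseline {baseline}")
    if any(p.lower().startswith(SENSITIVE_PREFIXES) for p in paths):
        score += 2
        reasons.append("touched watched sensitive path")
    return score, reasons

# Example: a helpdesk account that swept 40 shares including a finance path
print(score_activity("helpdesk", 40, ["\\\\fileserver\\finance\\q3.xlsx"]))
```

Scoring rather than hard alerting keeps the single-indicator noise down: one weak signal stays below the alert line, while a volume anomaly plus a sensitive-path touch crosses it.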
Lab Exercise Design
When using the Veil Framework to test data discovery detection (a noise-generation sketch follows these steps):
- Pre-stage sensitive data in known locations across your lab environment
- Enable all relevant audit logging on file servers and endpoints
- Run data discovery techniques from a test account
- Verify that your SIEM generates appropriate alerts
- Document which techniques were detected and which were missed
- Tune detection rules based on findings
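To give the detection pipeline something concrete to catch, the test account can run a deliberately noisy sweep over the staged data. The sketch below only opens and reads files; the mount points are hypothetical stand-ins for wherever you pre-staged the sensitive data in the first step above.

```python
#!/usr/bin/env python3
"""Sketch: generate noisy, detectable file-read activity for the lab exercise.

Run from the test account against the staged lab shares so the 5145/4663
pipeline and any volume thresholds have real activity to alert on.
Assumption: STAGED_ROOTS are hypothetical mount points for the staged data.
"""
from pathlib import Path

STAGED_ROOTS = [Path("/mnt/lab-share-finance"), Path("/mnt/lab-share-hr")]  # hypothetical

def sweep(roots):
    """Read the start of every file once so each access registers in audit telemetry."""
    touched = 0
    for root in roots:
        for path in root.rglob("*"):
            if not path.is_file():
                continue
            try:
                with path.open("rb") as fh:
                    fh.read(4096)   # a small read is enough to generate an access event
                touched += 1
            except OSError:
                pass                # skip unreadable files, keep sweeping
    return touched

if __name__ == "__main__":
    print(f"read {sweep(STAGED_ROOTS)} staged files; now check which alerts fired")
```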
The goal is not 100% detection — it is understanding your detection boundaries and making informed decisions about acceptable risk.
Related
- Hunting Users — Detecting user enumeration
- PowerView Usage Guide — AD enumeration techniques
- Framework Overview — Architecture context
- Guides — All available guides