Data Exfiltration
Detects attempts to extract system prompts and hidden data
API Field: data_exfiltration_enabled
Overview
Data Exfiltration Detection identifies attempts to extract confidential information from your AI system, including system prompts, training data, internal knowledge bases, and other protected information that should remain hidden.
What It Detects
- System prompt extraction attempts
- Training data fishing
- Internal knowledge base probing
- Configuration extraction
- Memory/context manipulation
- Indirect prompt leakage
- Model inversion attacks
Why It Matters
Your system prompts often contain business logic, safety instructions, and proprietary information. Extraction can lead to competitive disadvantage, security bypass, and exposure of sensitive business processes.
Technical Details
Risk Score Range
0.0 - 1.0 (High risk: > 0.6)
Confidence Level
Typically 0.82 - 0.95
Processing Time
< 70ms per scan
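The high-risk cutoff above can be applied in client code. The following is a minimal sketch, assuming the caller only needs a yes/no decision; the constant and function names are hypothetical and not part of the API.

```python
# Threshold taken from the documented "High risk: > 0.6".
HIGH_RISK_THRESHOLD = 0.6


def is_high_risk(risk_score: float) -> bool:
    """Return True when a score falls in the documented high-risk band.

    Hypothetical helper: the name and the range check are illustrative.
    """
    # The documented range is 0.0 - 1.0; reject anything outside it.
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError(f"risk_score out of range: {risk_score}")
    # The documented rule is strictly greater than 0.6.
    return risk_score > HIGH_RISK_THRESHOLD
```

A score of exactly 0.6 is not treated as high risk here, because the documented rule is "> 0.6".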
Common Use Cases
Detection Examples
Direct attempt to extract system prompt.
Indirect extraction using repetition commands.
Obfuscated extraction using translation as a bypass.
API Usage
Enable this scanner by setting data_exfiltration_enabled to true in your API key settings, or include it in your request:
curl -X POST https://benguard.io/api/v1/scan \
  -H "X-API-Key: ben_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Your user input here"
  }'
The scanner settings are configured per API key in your dashboard under Settings → Scanner Configuration.
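For callers who prefer Python to curl, the same request could be sketched with the standard library. This assumes the endpoint, headers, and body shown in the curl example; the function names build_scan_request and scan, and the timeout value, are hypothetical choices for illustration.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
SCAN_URL = "https://benguard.io/api/v1/scan"


def build_scan_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        SCAN_URL,
        data=body,
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scan(prompt: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body.

    The 10-second timeout is an assumption, not a documented value.
    """
    request = build_scan_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling scan("Your user input here", "ben_your_api_key_here") would perform the network request. urllib raises urllib.error.HTTPError for non-2xx responses, so production code would likely wrap the call in error handling.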
Response Format
When this scanner detects a threat, the response will include:
{
  "is_valid": false,
  "status": "threat_detected",
  "risk_score": 0.92,
  "threat_types": ["data_exfiltration"],
  "details": {
    "results": [
      {
        "scanner": "data_exfiltration",
        "threat_detected": true,
        "risk_score": 0.92,
        "confidence": 0.92,
        "details": {
          "reason": "Direct attempt to extract system prompt.",
          "evidence": ["detected pattern in input"]
        }
      }
    ]
  },
  "request_id": "req_abc123"
}
Best Practices
- Design system prompts assuming they may be exposed
- Avoid putting secrets in system prompts
- Implement prompt protection instructions
- Monitor for successful extraction attempts
- Use layered prompting strategies
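One way to support the monitoring practice above is to pull this scanner's findings out of each scan response. The following is a minimal sketch, assuming the response has the shape shown under Response Format; the function name exfiltration_findings and the sample payload are illustrative only.

```python
import json


def exfiltration_findings(response: dict) -> list:
    """Return the per-scanner results where data_exfiltration flagged a threat.

    Hypothetical helper; it assumes the nesting shown under Response Format
    and tolerates missing keys by falling back to empty containers.
    """
    results = response.get("details", {}).get("results", [])
    return [
        result
        for result in results
        if result.get("scanner") == "data_exfiltration"
        and result.get("threat_detected")
    ]


# Sample payload trimmed from the documented response format.
sample = json.loads("""
{
  "is_valid": false,
  "status": "threat_detected",
  "risk_score": 0.92,
  "threat_types": ["data_exfiltration"],
  "details": {
    "results": [
      {
        "scanner": "data_exfiltration",
        "threat_detected": true,
        "risk_score": 0.92,
        "confidence": 0.92,
        "details": {
          "reason": "Direct attempt to extract system prompt.",
          "evidence": ["detected pattern in input"]
        }
      }
    ]
  },
  "request_id": "req_abc123"
}
""")

for finding in exfiltration_findings(sample):
    # Keep request_id alongside the reason so an alert can be traced back.
    print(sample["request_id"], finding["risk_score"], finding["details"]["reason"])
```

In practice the print call would likely be replaced by whatever logging or alerting pipeline the application already uses.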
Related Scanners
Consider enabling these related scanners for comprehensive protection: