Integrating Red with Guard
Lakera Red and Lakera Guard work together to provide comprehensive AI security. Red identifies vulnerabilities through offensive testing, while Guard provides continuous real-time protection. This guide explains how to use Red findings to optimize your Guard deployment.
The Red-Guard Security Lifecycle
Mapping Red Findings to Guard Defenses
Security Category → Prompt Defense
If Red found vulnerabilities in the Security category (instruction override, prompt extraction, etc.):
- Enable Prompt Defense in your Guard policy
- Set threshold based on finding severity:
  - Critical/High findings → L4 (Paranoid)
  - Medium findings → L3 (Strict)
  - Low findings → L2 (Balanced)
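The severity-to-threshold mapping above can be expressed as a small lookup. This is a minimal sketch: the severity labels and level names come from the list above, while `SEVERITY_TO_THRESHOLD` and `prompt_defense_threshold` are hypothetical names for illustration and not part of any Lakera API.

```python
# Hypothetical helper, not a Lakera API: maps Red finding severities to the
# Prompt Defense threshold suggested in this guide.
SEVERITY_TO_THRESHOLD = {
    "critical": "L4",  # Paranoid
    "high": "L4",      # Paranoid
    "medium": "L3",    # Strict
    "low": "L2",       # Balanced
}

# Levels ordered from least to most strict.
_STRICTNESS = ["L2", "L3", "L4"]


def prompt_defense_threshold(severities):
    """Return the strictest threshold implied by a list of finding severities.

    Returns None when there are no findings.
    """
    levels = [SEVERITY_TO_THRESHOLD[s.lower()] for s in severities]
    if not levels:
        return None
    return max(levels, key=_STRICTNESS.index)
```

When a scan reports findings of mixed severity, the sketch picks the strictest level, so `["Low", "Medium"]` resolves to `"L3"`.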
Safety Category → Content Moderation
If Red found vulnerabilities in the Safety category (harmful content generation):
- Enable Content Moderation in your Guard policy
- Screen both inputs AND outputs
- Configure categories based on your Red findings
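Screening both inputs and outputs can be sketched as a wrapper around your model call. In this sketch, `is_flagged` is a stand-in for whatever function sends content to Guard and returns whether it was flagged, and `call_model` is a stand-in for your LLM call. Both names, and the `BlockedByGuard` exception, are assumptions made for illustration.

```python
from typing import Callable


class BlockedByGuard(Exception):
    """Raised when a screening call flags the content (hypothetical name)."""


def guarded_completion(
    user_message: str,
    call_model: Callable[[str], str],
    is_flagged: Callable[[str, str], bool],
) -> str:
    """Screen the input, call the model, then screen the output.

    is_flagged(role, text) is an assumed wrapper around a Guard screening
    request; it is injected so the pattern stays independent of any client.
    """
    # Screen the user input before it reaches the model.
    if is_flagged("user", user_message):
        raise BlockedByGuard("input flagged")

    reply = call_model(user_message)

    # Screen the model output before it reaches the user.
    if is_flagged("assistant", reply):
        raise BlockedByGuard("output flagged")

    return reply
```

Injecting the screening function keeps the pattern testable without network access and lets you swap in your actual Guard integration later.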
Responsible Category → Multiple Defenses
Findings in the Responsible category may require more than one Guard defense.
Guard Policy Configuration
Based on a typical Red scan, here’s a recommended Guard configuration:
For High Security Findings
For High Safety Findings
Complete Integration Pattern
Monitoring and Iteration
After deploying Guard, monitor for attack patterns:
Set Up Alerts
Configure alerts in the Lakera Dashboard for:
- High volumes of flagged requests from single users
- New attack patterns matching Red findings
- Attempts to exploit specific vulnerabilities
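If you also want an application-side check to complement dashboard alerts, the first condition (a high volume of flagged requests from a single user) can be approximated with a sliding-window counter. This is a sketch under stated assumptions: `FlaggedRequestMonitor`, its default limits, and the idea of feeding it timestamps from your own request handler are all illustrative and not a Lakera feature.

```python
from collections import defaultdict, deque


class FlaggedRequestMonitor:
    """Track flagged requests per user within a sliding time window.

    Hypothetical application-side helper; the limits are example values.
    """

    def __init__(self, max_flags=5, window_seconds=60.0):
        self.max_flags = max_flags
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # user_id -> timestamps of flags

    def record_flag(self, user_id, timestamp):
        """Record one flagged request.

        Returns True when the user has more than max_flags flagged requests
        within the last window_seconds.
        """
        events = self._events[user_id]
        events.append(timestamp)
        # Drop events that fell out of the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) > self.max_flags
```

Timestamps are passed in explicitly (for example from `time.time()`) so the logic can be replayed against historical logs.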
Review Guard Logs
Review Guard logs regularly to identify:
- Attack trends and patterns
- False positive rates
- New attack vectors not covered by current policies
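To track false positive rates, you can compute the rate from a sample of log entries that a human has reviewed. The sketch below assumes each entry is a dictionary with a boolean `flagged` field and a `reviewer_label` of `"benign"` or `"attack"`. These field names are hypothetical and do not reflect the Guard log schema.

```python
def false_positive_rate(reviewed_logs):
    """Share of benign requests that were flagged.

    Computed as flagged-benign / all-benign over manually reviewed entries.
    Field names ("flagged", "reviewer_label") are assumptions for this sketch.
    """
    benign = [e for e in reviewed_logs if e["reviewer_label"] == "benign"]
    if not benign:
        return 0.0
    flagged_benign = sum(1 for e in benign if e["flagged"])
    return flagged_benign / len(benign)
```

The denominator is all benign traffic in the sample, not all flagged traffic, so the number reflects how often legitimate users are affected.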
Tune Thresholds
Start strict and relax if needed:
- Begin with L4 (Paranoid) for defenses where Red found vulnerabilities
- Monitor false positive rates in production
- Adjust to L3 if false positives significantly impact user experience
- Never go below L2 for categories where Red found issues
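The tuning rule above can be written as a small decision function. This is a sketch only: the 5% false positive budget is an example value chosen for illustration, and `next_threshold` is a hypothetical helper, not a Guard setting.

```python
# Levels ordered from least to most strict; L2 is the floor for categories
# where Red found issues.
_LEVELS = ["L2", "L3", "L4"]


def next_threshold(current, fp_rate, max_fp_rate=0.05):
    """Relax one level when false positives exceed the budget, never below L2.

    max_fp_rate is an assumed budget; pick a value that fits your product.
    """
    if fp_rate <= max_fp_rate:
        return current  # within budget: keep the stricter setting
    index = _LEVELS.index(current)
    return _LEVELS[max(index - 1, 0)]
```

For example, a defense running at L4 with a 10% false positive rate would move to L3, while one already at L2 stays at L2.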
Continuous Security Cycle
Best Practices
Defense Coverage
Ensure Guard covers all attack surfaces Red tested:
- User inputs (chat messages, form fields)
- External data (RAG sources, API responses, file uploads)
- Model outputs (before displaying to users)
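A simple way to audit coverage is to compare the surfaces your application actually screens against the three listed above. The surface identifiers and the `uncovered_surfaces` helper below are hypothetical labels for illustration.

```python
# Hypothetical labels for the three attack surfaces listed above.
REQUIRED_SURFACES = {"user_input", "external_data", "model_output"}


def uncovered_surfaces(screened):
    """Return the required surfaces that are not yet screened, sorted by name."""
    return sorted(REQUIRED_SURFACES - set(screened))
```

An application that screens only chat messages would get `["external_data", "model_output"]` back, which points to the RAG and output paths as the next integration work.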
Layered Protection
Use Guard as one layer in defense-in-depth:
- Guard catches known attack patterns
- System prompt hardening provides baseline protection
- Application-level validation handles business logic
- Monitoring detects novel attacks
Using Red’s Compare Feature
After implementing Guard:
- Run a follow-up Red scan
- Use Compare to see before/after results
- Verify risk scores decreased in targeted categories
- Identify any remaining gaps
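If you record risk scores from both scans outside the product, the verification step can be sketched as a per-category difference. The score format (one integer per category, higher meaning more risk) and the `compare_scans` helper are assumptions for illustration. Red's Compare feature remains the source of truth.

```python
def compare_scans(before, after, targeted):
    """Report the score change for each targeted category.

    before/after map category name -> risk score (assumed: higher is riskier).
    A category counts as improved only if its score strictly decreased.
    """
    report = {}
    for category in targeted:
        delta = after[category] - before[category]
        report[category] = {"delta": delta, "improved": delta < 0}
    return report


def remaining_gaps(report):
    """List targeted categories whose risk score did not decrease."""
    return sorted(c for c, r in report.items() if not r["improved"])
```

Categories returned by `remaining_gaps` are candidates for a stricter threshold or an additional Guard defense before the next rescan.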
Getting Started
- Review your Red scan findings
- Map each finding to the appropriate Guard defense
- Create a Guard project and configure your policy
- Integrate Guard into your application (see Quickstart)
- Monitor and tune based on production data
- Schedule periodic Red rescans
Need Help?
- Contact Lakera for help translating Red findings to Guard policies
- Review the Guard documentation for detailed configuration options
- Reach out to support@lakera.ai for technical assistance