Allow and Deny Lists
Allow and deny lists provide a mechanism to quickly override Lakera Guard’s flagging decisions. These lists operate using fuzzy match patterns and do not modify the underlying model behavior.
Overview
Allow and deny lists serve as a temporary solution to handle misclassified prompts that may be disrupting critical workflows. They function as an additional safety layer within your policies:
- Allow Lists: Override any detector flags for specific content using near-exact matching. When content is a near-exact match for an entry in your allow list, it will pass through without being flagged, regardless of your policy configuration. To override false positives, add the problematic prompt in full as a string to the list.
- Deny Lists: Force flagging of specific content using fuzzy substring matching. Any content containing a close match to an entry in your deny list will be flagged, regardless of detector confidence. To override false negatives, add the specific part of the prompt that’s problematic as a string to the list. Note that for more permanent flagging of problematic strings, custom detectors should be used instead.
Allow lists take precedence over deny lists and all other detectors.
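To make the difference between the two matching modes concrete, here is a minimal sketch in Python. It is not Lakera Guard's matching algorithm, which this page does not specify. The function names, the normalization step, the similarity measure (the standard library's difflib) and both thresholds are assumptions chosen purely for illustration.

```python
from difflib import SequenceMatcher


def _normalize(text: str) -> str:
    # Hypothetical normalization: lowercase and collapse runs of whitespace.
    return " ".join(text.lower().split())


def _similarity(a: str, b: str) -> float:
    # Similarity ratio in [0, 1]; autojunk is disabled so long strings
    # are compared character by character without heuristics.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def near_exact_match(content: str, entry: str, threshold: float = 0.95) -> bool:
    """Allow-list style check: the WHOLE content must closely match the entry."""
    return _similarity(_normalize(content), _normalize(entry)) >= threshold


def fuzzy_substring_match(content: str, entry: str, threshold: float = 0.9) -> bool:
    """Deny-list style check: SOME part of the content must closely match the entry."""
    text, needle = _normalize(content), _normalize(entry)
    if not needle:
        return False
    if len(text) <= len(needle):
        return _similarity(text, needle) >= threshold
    width = len(needle)
    # Slide a window the size of the entry across the content.
    return any(
        _similarity(text[i:i + width], needle) >= threshold
        for i in range(len(text) - width + 1)
    )
```

Under these assumptions, an allow list entry only matches when the full prompt is essentially the same string, which is why the entry should be the complete prompt. A deny list entry matches whenever the problematic fragment appears anywhere in the content, which is why only the problematic part should be added.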
Temporary Solution Only
Overriding Lakera Guard can introduce security loopholes. These lists should only be used as a temporary measure while reporting misclassified prompts to Lakera for robust fixes. They are not intended as a permanent security solution.
Precedence Rules
- Allow lists have the highest precedence and will override all other decisions
- Deny lists come next and will override standard policy decisions
- Standard detectors (prompt defense, content moderation, etc.) have the lowest precedence
This means if a piece of content matches both an allow list and a deny list entry, it will be allowed.
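The precedence order above can be sketched as a small decision function. This is an illustration of the rules as stated on this page, not Lakera Guard's implementation; the function name, parameters, and reason strings are hypothetical.

```python
def resolve_decision(
    allow_match: bool, deny_match: bool, detector_flagged: bool
) -> tuple[bool, str]:
    """Return (flagged, source_of_decision) following the documented precedence."""
    if allow_match:
        # Highest precedence: an allow list match always lets content through.
        return False, "allow_list"
    if deny_match:
        # Next: a deny list match always flags, whatever the detectors say.
        return True, "deny_list"
    # Lowest precedence: fall back to the standard detector result.
    return detector_flagged, "detectors"


# Content that matches both lists is allowed, because the allow list wins.
both_lists = resolve_decision(allow_match=True, deny_match=True, detector_flagged=True)
```

Here `both_lists` evaluates to `(False, "allow_list")`: the content is not flagged even though a deny list entry and a detector both matched.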
Reporting misclassified requests
Always report misclassified prompts to Lakera for robust model improvements. You can do this directly via the Logs page or by reaching out to support@lakera.ai.
Once problematic content has been addressed and the Lakera models updated, you can remove the relevant override list entries.
Implementation
The implementation details differ between SaaS and self-hosted deployments:
- For SaaS customers: Configure allow and deny lists through editing policies in the platform interface
- For self-hosted customers: Configure through policy files using the override_allow and override_deny detector types
For specific implementation details, see: