Allow and Deny Lists

Allow and deny lists let you quickly override Lakera Guard’s flagging decisions. They work by fuzzy pattern matching against the content being screened and do not change the behavior of the underlying models.

Overview

Allow and deny lists serve as a temporary solution to handle misclassified prompts that may be disrupting critical workflows. They function as an additional safety layer within your policies:

  • Allow Lists: Override any detector flag for specific content using near-exact matching. Content that nearly exactly matches an allow list entry passes through without being flagged, regardless of your policy configuration. To override a false positive, add the full text of the misclassified prompt to the list as a string.

  • Deny Lists: Force flagging of specific content using fuzzy substring matching. Any content that contains a close match to a deny list entry is flagged, regardless of detector confidence. To override a false negative, add only the problematic part of the prompt to the list as a string. For more permanent flagging of problematic strings, use custom detectors instead.

Allow lists take precedence over deny lists and all other detectors.
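
The two matching styles described above can be illustrated with a small sketch. This is not Lakera Guard's actual matching algorithm, which is not specified here: it uses Python's standard difflib.SequenceMatcher as a stand-in, and the function names and the 0.9 threshold are hypothetical. It only shows why an allow list entry should be the whole prompt while a deny list entry should be just the problematic fragment.

```python
from difflib import SequenceMatcher

# Hypothetical similarity cutoff chosen for illustration; not a Lakera value.
THRESHOLD = 0.9


def near_exact_match(content: str, entry: str, threshold: float = THRESHOLD) -> bool:
    """Allow-list style: the whole content must closely resemble the entry."""
    return SequenceMatcher(None, content.lower(), entry.lower()).ratio() >= threshold


def fuzzy_substring_match(content: str, entry: str, threshold: float = THRESHOLD) -> bool:
    """Deny-list style: some window of the content must closely resemble the entry."""
    text, needle = content.lower(), entry.lower()
    if not needle:
        return False
    width = len(needle)
    # Slide a window the size of the entry across the content.
    # If the content is shorter than the entry, compare it as a whole.
    last_start = max(len(text) - width, 0)
    return any(
        SequenceMatcher(None, text[i:i + width], needle).ratio() >= threshold
        for i in range(last_start + 1)
    )
```

Under these assumptions, an allow list entry stops matching once extra text surrounds it, whereas a deny list entry still matches when it appears anywhere inside a longer prompt.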

Temporary Solution Only
Overriding Lakera Guard can introduce security loopholes. These lists should only be used as a temporary measure while reporting misclassified prompts to Lakera for robust fixes. They are not intended as a permanent security solution.

Precedence Rules

  1. Allow lists have the highest precedence and will override all other decisions
  2. Deny lists come next and will override standard policy decisions
  3. Standard detectors (prompt defense, content moderation, etc.) have the lowest precedence

This means if a piece of content matches both an allow list and a deny list entry, it will be allowed.
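
The precedence rules above amount to a simple ordered check. The sketch below is illustrative only: the Signals class and is_flagged function are hypothetical names, not part of the Lakera Guard API, and the three booleans are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Hypothetical per-request inputs to the final flagging decision."""
    allow_match: bool       # content matched an allow list entry
    deny_match: bool        # content matched a deny list entry
    detector_flagged: bool  # a standard detector flagged the content


def is_flagged(signals: Signals) -> bool:
    """Apply the documented precedence: allow list, then deny list, then detectors."""
    if signals.allow_match:
        return False  # highest precedence: always let the content through
    if signals.deny_match:
        return True   # next: force a flag regardless of detector output
    return signals.detector_flagged  # lowest: fall back to the standard decision
```

For example, is_flagged(Signals(allow_match=True, deny_match=True, detector_flagged=True)) evaluates to False, which is the "matches both lists" case described above.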

Reporting misclassified requests

Always report misclassified prompts to Lakera so that the models can be improved. You can do this directly via the Logs page or by reaching out to support@lakera.ai.

Once problematic content has been addressed and the Lakera models updated, you can remove the relevant override list entries.

Implementation

The implementation details differ between SaaS and self-hosted deployments:

  • SaaS customers: Configure allow and deny lists by editing policies in the platform interface
  • Self-hosted customers: Configure allow and deny lists in policy files using the override_allow and override_deny detector types

For specific implementation details, see: