Lakera Guard Integration Guide
Lakera Guard functions as a control layer around a model, assistant, or agent(s). Whether deployed as a self-hosted container or leveraged via the SaaS service, Lakera Guard fits seamlessly into a wide range of use cases and architectures.
Integrating Lakera Guard
Integrating Lakera Guard is as simple as making an API call for each LLM interaction, passing the inputs (both user and reference documents) and LLM outputs for screening. Lakera Guard will respond flagging whether the interaction contains a threat or policy violation.
Based on the Guard flagging response you can define control flows flexibly in your applications or integrations to prevent and respond to threats in real-time.
For advice on integration best practices and common pitfalls to avoid see the Integration recommendations section.
Designing Control Flows
Lakera Guard gives you flexibility to choose how to respond to flagged interactions. This provides full control over designing workflows based on your desired behavior and user flows.
Lakera Guard provides a boolean response of flagging
equals true
or false
. Control over the guardrails used for screening and the flagging sensitivity are controlled via the policy.
Many Guard users opt to block threats in real-time, preventing the potentially compromised output being returned to the user or application. Some Guard users opt to configure more flexible workflows, enabling users to override flagged interactions or setting up escalations which kick in after a certain number of interactions have been flagged for a user.
When first integrating Lakera Guard, you can choose to use a pure monitoring strategy to begin with. This simply means integrating Lakera Guard without creating any flows to take action based on flagged responses. This approach allows you to monitor Guard’s performance, configure policies correctly, and identify threats and vulnerabilities for post-hoc investigation and response.
Sample use cases
Sample Use Case: GenAI Chat Application
Generative chat applications are a popular enterprise use case for Lakera Guard. First, consider the data flow of a chat system that does not leverage security controls for managing model input and output.

In this basic implementation, data flows from the user to the model back to the user. Security is dependent on the model’s ability to handle malicious input and control its own output.
This implementation poses several risks, including malicious prompts such as prompt injections or jailbreaks entering the application. It’s also possible for sensitive data, like PII, to enter the model. Depending on compliance requirements, this may pose additional risks. Additionally, there is concern the model may provide the user with undesirable responses, including hate speech or sexual content. Relying on the foundation model developer to address these risks comprehensively is not optimal. Updates to the model can introduce behavioral changes. There’s also potential for creating lock-in conditions which would make using multiple models or switching providers difficult.
Lakera Guard Implementation
Lakera Guard protects against these risks but is abstracted from the model itself. In the generative chat system, we recommend screening the interaction holistically, screening both inputs and output after the LLM has responded and before returning the response to the user. This optimises for latency as there’s only one screening request and provides Lakera with the full context for higher accuracy.
Alternatively, if you have concerns with data leakage to third party LLM providers, you can send the user input to the Lakera Guard API for screening before passing the prompt to the model and then screen the whole interaction before returning the output to the user.
A common policy configuration for chatbots is to screen the user input for prompt attacks, sensitive user PII, and content violations to check for malicious attacks, data leakage and user misbehavior. On the model output, it’s common to screen for content violations, data leakage, and unknown links in the model response to check for model misbehavior, data exfiltration and phishing links. Note that all guardrails are configurable via the policy.

In the diagram above, the GenAI Chat application is secured with Lakera Guard by making an API call containing the user input and an API call containing the model output. In doing so, a control set has been created to enforce what enters and leaves the model without relying on the model itself.
Sample Use Case: Screening Documents
Screening documents, files or other reference content with Lakera Guard works similarly to user input. Consider a general data flow of a document being passed as context to a model. The first requirement is to handle the document upload and parse it into text. Once parsed, the control flow follows the same structure.

Lakera Guard has a large context window, in line with major models, and does smart chunking internally. When dealing with large documents that exceed model or Lakera context limits, the document can be parsed and chunked, sending smaller parallelized requests to Lakera Guard. Baselining latency helps identify the optimal chunking size and performance trade-off.

Sample Use Case: RAG Architecture
GenAI Applications utilising Retrieval Augmented Generation (RAG) and scaled-out knowledge bases can leverage Lakera Guard as well. This helps to extend protection against documents which may contain poisoned or sensitive data, either from the user or malicious actors targeting the user.
The following diagram shows a question-answer RAG generation pattern, but is applicable for other RAG use cases.

Lakera Guard Implementation
The Lakera Guard integration for RAG works similar to chat applications but with both the user and document inputs being passed as multiple inputs to Guard and screened within the same single Guard request.
For RAG setups where the reference content is pre-set or relatively static then it’s recommended to screen documents directly during the initial upload to identify poisoned documents ahead of time rather than during the user interaction. This can be done following the document integration approach outlined above.
Despite increased architectural complexity, the implementation pattern remains the same. Contained in the API request call to Lakera Guard is the user input and retrieved context. The input is evaluated in its entirety.

Sample Use Case: AI Gateway
Lakera Guard integrates seamlessly within an AI Gateway, providing a centralized access point for managing and securing AI services. This integration ensures consistent control enforcement across all AI interactions. Organizations benefit from this setup through improved efficiency, enhanced observability, and streamlined operations.

Integration recommendations
We recommend following these recommendations in order to avoid common pitfalls:
Set up a project and assign a relevant policy
The Lakera Default Policy has our strictest flagging sensitivity so will flag anything Guard isn’t confident is safe. Use one of the Lakera recommended policies to assess Guard’s accuracy in real deployments.
Pass system prompts separately and correctly
A common attack vector is for users to insert malicious instructions as if they’re system instructions. Therefore in order to prevent Guard flagging your own system instructions make sure they’re passed in a separate message from LLM inputs and outputs with the message role as system
.
Screen the original raw content from user inputs and reference documents
A common practice is to add additional system instructions or decorators to LLM inputs. Similar to the point above, in order to avoid false positives either screen the exact original input or remove any system instructions added to LLM inputs before passing to Guard as these are likely to be flagged as prompt attacks.
Screen for threats holistically
Lakera Guard performs best when screening interactions holistically with all appropriate guardrails applied. This means passing both LLM inputs and outputs in screening requests and using representative policies. The more context Guard has, the more accurate its threat detection will be.