How to Use The Evaluation Framework

The evaluation framework offers a prescriptive guide for setting up Lakera Guard, assessing its efficacy and detection rates, measuring latency, and integrating it into various real-world use cases.

While tailored for Lakera Guard, this framework can be adapted as a general template for standardized detection system evaluation. It enables you to answer three key questions:

1

How good are Lakera Guard's detection capabilities?

Lakera’s threat detection accuracy is market leading. See our Prompt Injection Test (PINT) benchmark here for evidence. This takes into consideration both threat detection effectiveness as well as false positives on benign data importantly. When evaluating Lakera recommends using a Confusion Matrix for a standardized classification evaluation baseline.

2

How performant is Lakera Guard?

The Lakera Guard API is optimized for speed, delivering exceptionally low latency to minimize impact on user experience. Lakera advises collecting baseline latency metrics prior to integration.

3

How easy is Lakera Guard to integrate?

Recognizing the diverse and rapidly evolving use cases for GenAI, Lakera Guard integrates seamlessly into any architecture and deployment strategy. Integrate via a few lines of code to make API calls passing the inputs and outputs for screening per interaction.

Tips for successful evalations

We recommend following these recommendations in order to avoid common pitfalls:

1

Set up a project and assign a relevant policy

The Lakera Default Policy has our strictest flagging sensitivity so will flag anything Guard isn’t confident is safe. Use one of the Lakera recommended policies to assess Guard’s accuracy in real deployments.

2

Pass system prompts separately and correctly

A common attack vector is for users to insert malicious instructions as if they’re system instructions. Therefore in order to prevent Guard flagging your own system instructions make sure they’re passed in a separate message from LLM inputs and outputs with the message role as system.

3

Screen the original raw content from user inputs and reference documents

A common practice is to add additional system instructions or decorators to LLM inputs. Similar to the point above, in order to avoid false positives either screen the exact original input or remove any system instructions added to LLM inputs before passing to Guard as these are likely to be flagged as prompt attacks.

4

Screen for threats holistically

Lakera Guard performs best when screening interactions holistically with all appropriate guardrails applied. This means passing both LLM inputs and outputs in screening requests and using representative policies. The more context Guard has, the more accurate its threat detection will be.

Next step

To get started follow the setup guide for your chosen deployment option: