Datasets
Name | Type | # Prompts | Purpose |
---|---|---|---|
HotpotQA | Q&A | ~200k | Wikipedia-based question-answer pairs that require reasoning across multiple documents. Useful for evaluating false positives and over-triggering on natural Q&A. |
ChatGPT Jailbreak Prompts | Jailbreak | 79 | Collection of jailbreak related prompts for ChatGPT. Useful for evaluating the detection rate of publicly known jailbreaks. |
gandalf_ignore_instructions | Prompt Injection | 1k | Dataset containing a sample of prompt injections submitted to Lakera’s Gandalf prompt injection capture the flag game. Useful for evaluating detection rate on real-world adversarial prompt injection attempts. |
gandalf_summarization | Prompt Injection | 114 | Dataset containing a sample of indirect prompt injections submitted to Lakera’s Gandalf Adventure: the Summarizer level of Lakera’s Gandalf prompt injection capture the flag game. Useful for evaluating detection rate on real-world indirect prompt injections. |
mosscap_prompt_injection | Prompt Injection | ~278k | Dataset of prompts submitted to Lakera’s Mosscap prompt injection capture the flag game that was created as part of the DEF CON 31 AI Village Generative Red Team Challenge. Useful for evaluating detection rate on a large sample of real-world prompts that are a mixture of adversarial techniques and benign prompts. |
OpenAI Moderation Evaluation Dataset | Content Moderation | 1680 | Dataset of inputs that cover a wide range of content moderation use cases. Useful for evaluating the efficacy of content moderation. |
slug: docs/datasets
We update this list periodically, so check back for more public datasets and follow our Lakera HuggingFace repos or Lakera Twitter feed for updates when we release new datasets.