Datasets

NameType# PromptsPurpose
HotpotQAQ&A~200kWikipedia-based question-answer pairs that require reasoning across multiple documents. Useful for evaluating false positives and over-triggering on natural Q&A.
ChatGPT Jailbreak PromptsJailbreak79Collection of jailbreak related prompts for ChatGPT. Useful for evaluating the detection rate of publicly known jailbreaks.
gandalf_ignore_instructionsPrompt Injection1kDataset containing a sample of prompt injections submitted to Lakera’s Gandalf prompt injection capture the flag game. Useful for evaluating detection rate on real-world adversarial prompt injection attempts.
gandalf_summarizationPrompt Injection114Dataset containing a sample of indirect prompt injections submitted to Lakera’s Gandalf Adventure: the Summarizer level of Lakera’s Gandalf prompt injection capture the flag game. Useful for evaluating detection rate on real-world indirect prompt injections.
mosscap_prompt_injectionPrompt Injection~278kDataset of prompts submitted to Lakera’s Mosscap prompt injection capture the flag game that was created as part of the DEF CON 31 AI Village Generative Red Team Challenge. Useful for evaluating detection rate on a large sample of real-world prompts that are a mixture of adversarial techniques and benign prompts.
OpenAI Moderation Evaluation DatasetContent Moderation1680Dataset of inputs that cover a wide range of content moderation use cases. Useful for evaluating the efficacy of content moderation.

slug: docs/datasets

We update this list periodically, so check back for more public datasets and follow our Lakera HuggingFace repos or Lakera Twitter feed for updates when we release new datasets.