Curated dataset of prompts to test scanners detecting prompt injections, jailbreaks, and other potentially risky inputs.
Contains text-embedding-ada-002 embeddings for all “jailbreak” prompts used by Vigil.
Categorised dataset of harmful instructions for the ALERT benchmark. Designed for testing content moderation and safety alignment in instruction-following models.
Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. It serves here as an all-negative dataset for false-positive evaluation: every sample is benign, so any sample a scanner flags counts as a false positive.
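As a rough illustration of false-positive evaluation on an all-negative dataset, the sketch below computes the fraction of benign prompts that a scanner flags. The names `false_positive_rate` and `toy_scanner` and the sample questions are hypothetical and written for this example only; they are not part of any scanner or dataset listed here, and a real evaluation would load the actual dataset and call the scanner under test.

```python
from typing import Callable, Iterable


def false_positive_rate(
    scanner: Callable[[str], bool],
    benign_prompts: Iterable[str],
) -> float:
    """Return the fraction of benign prompts that the scanner flags as risky.

    Every prompt is assumed to be benign (an all-negative dataset),
    so each flagged prompt is a false positive.
    """
    prompts = list(benign_prompts)
    if not prompts:
        raise ValueError("benign_prompts must not be empty")
    flagged = sum(1 for prompt in prompts if scanner(prompt))
    return flagged / len(prompts)


def toy_scanner(prompt: str) -> bool:
    """Hypothetical keyword scanner, used only to make the example runnable."""
    return "ignore previous instructions" in prompt.lower()


# Hypothetical benign questions in the style of a reading comprehension dataset.
benign = [
    "When was the Eiffel Tower completed?",
    "Which river flows through Vienna?",
    "Who wrote the phrase 'ignore previous instructions' in the article?",
    "What is the capital of Australia?",
]

fpr = false_positive_rate(toy_scanner, benign)
print(f"False-positive rate: {fpr:.2f}")  # prints "False-positive rate: 0.25"
```

The third question is benign but contains a phrase that the keyword scanner matches, which is the kind of error an all-negative dataset is meant to surface.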