Connect to your agent
An agent target lets Red test your own API or agent endpoint. You provide a URL and authentication details; Red sends prompts and evaluates the responses. If you want to test a foundation model directly instead, see Connect to a Model.
When to use an agent target
- You are testing your own service: chatbot API, agent API, or wrapper around a model.
- Your endpoint is OpenAI-compatible (full conversation history per request) or implements the stateful contract (session ID, server keeps history).
- You need to test the full stack (your API, auth, and response handling), not just a raw model.
If your endpoint uses a protocol other than OpenAI-compatible or the stateful contract described below, please contact support@lakera.ai for assistance in setting up a custom connection.
API connection configuration
When creating a target, under Target configuration, select the Agent tab. Then configure the connection:

- API Endpoint – The URL Red will POST to (e.g. `https://api.example.com/v1/chat/completions` for stateless). You can find it in your API documentation or integration settings.
- Auth type – How Red authenticates to your API:
- None – No auth.
- API Key – Header or query parameter (you set the key name and value).
- Basic – Username and password (Basic auth header).
- Bearer Token – `Authorization: Bearer <token>`.
Credentials are stored securely and are not shown after saving; to change them, re-enter them in the target form.
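The four auth types map to standard HTTP mechanisms. As a hedged sketch (the function and its argument names are illustrative, not part of Red), the headers each type adds to a request look like this:

```python
import base64

def auth_headers(auth_type: str, **cfg) -> dict:
    """Illustrative: the headers each auth type contributes to a request."""
    if auth_type == "none":
        return {}
    if auth_type == "api_key":
        # Both the header (or query parameter) name and the value are
        # configured in the target form.
        return {cfg["header_name"]: cfg["key"]}
    if auth_type == "basic":
        token = base64.b64encode(
            f"{cfg['username']}:{cfg['password']}".encode()
        ).decode()
        return {"Authorization": f"Basic {token}"}
    if auth_type == "bearer":
        return {"Authorization": f"Bearer {cfg['token']}"}
    raise ValueError(f"unknown auth type: {auth_type}")
```

For API Key auth configured as a query parameter rather than a header, the key/value pair would be appended to the URL instead.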
For details on each auth type, see the Authentication reference.
Choose stateless or stateful
Choose how your API handles conversation history:

Use stateless if your endpoint already follows the OpenAI Chat Completions format. Use stateful if your backend maintains sessions (e.g. chatbot with memory).
Configure an agent target
Under Target configuration, select the Agent tab.

- API Endpoint – The URL where Red sends test requests (e.g. `https://api.example.com/v1/chat/completions`).
- Auth type – How Red authenticates to your endpoint: None, API Key, Basic, or Bearer Token. Credentials are stored securely and not shown after saving.
- Conversation history – Choose how your API handles multi-turn conversations:
- Stateless – Each request includes the full conversation history (OpenAI-compatible).
- Stateful – Server keeps conversation state; Red sends one message per request with a `sessionId`.
- In Advanced settings (optional):
- Additional fields – JSON object merged into every request body (e.g. `temperature`).
- Max concurrent requests – Number of concurrent requests Red sends to your endpoint (default 250; lower for rate-limited or resource-constrained setups).
Choosing the right concurrency:
- Hosted LLM APIs (OpenAI, Anthropic, etc.): the default of 250 works well — these services are built for high concurrency.
- Self-hosted or resource-constrained targets: start at 10–50 to avoid overwhelming your endpoint.
- Unknown capacity: start low (e.g. 20) and increase if scans are slow and your target handles the load comfortably.
Higher concurrency means faster scans but more load on the target. If your target returns errors or times out during a scan, reduce the concurrency.
- Click Test connection to verify Red can reach your endpoint and receive a valid response.
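The Additional fields setting merges your JSON object into every request body. A minimal sketch of that merge for a stateless target (function and variable names here are illustrative, not part of Red):

```python
import json

def build_request_body(messages: list, additional_fields: dict) -> dict:
    """Illustrative: Additional fields are merged into each request body."""
    body = {"messages": messages}
    body.update(additional_fields)  # e.g. {"temperature": 0.2}
    return body

body = build_request_body(
    [{"role": "user", "content": "Hello"}],
    {"temperature": 0.2},
)
print(json.dumps(body))
```

Because the merge applies to every request, this is a convenient place for fixed parameters your endpoint requires, such as a model name or sampling settings.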
Stateless (OpenAI-compatible)
Your endpoint must accept POST requests in the OpenAI chat completions format.
Request
Response
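As a minimal sketch of the exchange, assuming the standard OpenAI chat completions shapes (field values are examples, not literal traffic; consult the OpenAI API reference for the full schema):

```python
# Stateless: each request carries the full conversation history.
request_body = {
    "messages": [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi! How can I help?"},
        {"role": "user", "content": "Tell me a joke"},
    ]
}

# A minimal compatible response: at least one choice with a message.
response_body = {
    "choices": [
        {"message": {"role": "assistant", "content": "Why did the chicken..."}}
    ]
}

# Red reads the assistant's reply from the first choice.
reply = response_body["choices"][0]["message"]["content"]
```

Your endpoint can return additional fields (e.g. `usage`, `id`); only the shape above is required for Red to extract the reply.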
Stateful (session-based)
Your endpoint must conform to the following API contract. If your application uses a different protocol, you will need a translation layer or proxy that converts between your application’s format and the contract below.
Request
- First request: Red omits `sessionId`. Your endpoint should create a new session and return its ID in the response.
- Follow-up requests: Red includes the `sessionId` from the previous response to continue the conversation.
Response
For stateful endpoints, Test connection sends two messages and verifies that the second response returns the same `sessionId`.
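The session flow above can be sketched end-to-end with a stand-in backend. `sessionId` follows the contract described here; the `message` field name and the response's `output` field are illustrative assumptions, so check the contract for the exact names:

```python
class FakeStatefulEndpoint:
    """Stands in for your backend: keeps conversation history per session."""

    def __init__(self):
        self.sessions = {}  # sessionId -> list of messages

    def post(self, body: dict) -> dict:
        session_id = body.get("sessionId")
        if session_id is None:
            # First request: no sessionId, so create a new session.
            session_id = f"sess-{len(self.sessions) + 1}"
            self.sessions[session_id] = []
        self.sessions[session_id].append(body["message"])
        return {"sessionId": session_id, "output": f"echo: {body['message']}"}

endpoint = FakeStatefulEndpoint()
first = endpoint.post({"message": "Hello"})  # Red omits sessionId
second = endpoint.post({"message": "And again?",
                        "sessionId": first["sessionId"]})  # follow-up turn

# This mirrors Test connection's check: both turns share one session.
assert first["sessionId"] == second["sessionId"]
```

A translation layer or proxy in front of a differently shaped backend would perform the same mapping: mint a session ID when none is supplied, and route follow-up messages into the stored conversation.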
Relevant resources
- Connect to a Model – test a foundation model directly without setting up an endpoint.