Malicious Links | Lakera API documentation

Malicious URLs are one of the most common attack vectors for phishing and spreading malware. Attackers often register domains that are misspellings of popular websites, or use a subdomain named after a popular website, to trick users into trusting a link.
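To illustrate why the subdomain and misspelling tricks work, here is a minimal Python sketch that checks whether a URL really belongs to a trusted domain. The function name `belongs_to` and the example hostnames are hypothetical and chosen only for illustration; this is not how any Lakera detector is implemented.

```python
from urllib.parse import urlsplit


def belongs_to(url: str, trusted_domain: str) -> bool:
    """Return True only if the URL's host is trusted_domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    trusted = trusted_domain.lower()
    # The trusted name must be the END of the hostname, not merely appear in it.
    return host == trusted or host.endswith("." + trusted)


# A genuine subdomain of the trusted site.
print(belongs_to("https://www.example.com/login", "example.com"))  # True
# The trusted name is used as a subdomain of an unrelated (hypothetical) domain.
print(belongs_to("https://example.com.account-verify.test/login", "example.com"))  # False
# A misspelling of the trusted name (digit 1 instead of the letter l).
print(belongs_to("https://examp1e.com/login", "example.com"))  # False
```

The second and third URLs look familiar at a glance, but a hostname is resolved from right to left, so neither is controlled by the owner of `example.com`.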

Malicious URLs are especially dangerous in indirect prompt injection attacks, where an attacker embeds content in a document that is then used as context for responding to a user prompt via retrieval-augmented generation (RAG) or a similar GenAI setup.

Any time an LLM receives untrusted content from another source to help respond to an end user, that content should be screened for prompt attacks, and the LLM output should be screened for unknown links. This prevents attackers from manipulating the LLM into displaying a phishing link.

Details

The Unknown Links detector flags URLs whose domain is not among the one million most popular domains.

The detector can be customized with a list of allowed domains that are known to your application but might not be among the one million most popular domains. These trusted domains are not flagged when they appear in screened content.
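The flagging rule described above can be sketched locally in Python. This is only an illustration of the idea, not the detector's actual implementation: `POPULAR_DOMAINS`, `ALLOWED_DOMAINS`, `registered_domain`, and `unknown_links` are hypothetical names, the popularity set is a two-entry stand-in for a real list, and the "last two labels" rule is a simplification that mishandles multi-part suffixes such as `co.uk`.

```python
import re
from typing import List
from urllib.parse import urlsplit

# Hypothetical stand-in for a popularity list (a real one would be far larger).
POPULAR_DOMAINS = {"wikipedia.org", "github.com"}
# Hypothetical application-specific allowlist.
ALLOWED_DOMAINS = {"internal-docs.example"}

# Simple pattern for http(s) URLs in free text; an assumption, not a full URL grammar.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def registered_domain(host: str) -> str:
    """Naive approximation: keep the last two labels of the hostname."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def unknown_links(text: str) -> List[str]:
    """Return URLs whose domain is neither popular nor explicitly allowed."""
    flagged = []
    for raw in URL_PATTERN.findall(text):
        url = raw.rstrip(".,;:!?")  # drop trailing sentence punctuation
        try:
            host = urlsplit(url).hostname
        except ValueError:
            host = None  # malformed URL: treat as unknown
        if not host:
            flagged.append(url)
            continue
        domain = registered_domain(host)
        if domain not in POPULAR_DOMAINS and domain not in ALLOWED_DOMAINS:
            flagged.append(url)
    return flagged


output = (
    "Read https://en.wikipedia.org/wiki/Phishing, then sign in at "
    "http://wikipedia.org.account-check.test/login."
)
print(unknown_links(output))
```

In this sketch only the second URL is reported, because its domain is `account-check.test` even though the hostname begins with a popular name.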

Allowed domains can be added via a policy. Please refer to the Policies documentation for more information and guides to doing this.

Do not include a subdomain when defining your allowed domains. Include only the domain name and top-level domain (TLD), for example example.com rather than docs.example.com.
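Before adding entries to a policy, it can help to check them locally against this rule. The sketch below is a hypothetical pre-check, not part of the Lakera API: the function name `check_allowed_domain` is an assumption, and the check assumes a single-label TLD, so it would wrongly reject a domain under a multi-part suffix such as `co.uk`.

```python
def check_allowed_domain(entry: str) -> str:
    """Return the normalized entry, or raise ValueError if it is not 'name.tld'."""
    value = entry.strip().lower().rstrip(".")
    if "://" in value or "/" in value:
        raise ValueError(f"{entry!r}: give a bare domain, not a URL or path")
    labels = value.split(".")
    # Exactly two non-empty labels: the domain name and the TLD.
    if len(labels) != 2 or not all(labels):
        raise ValueError(f"{entry!r}: expected domain name plus TLD, no subdomain")
    return value


for candidate in ["Example.com", "docs.example.com", "https://example.com"]:
    try:
        print("ok:", check_allowed_domain(candidate))
    except ValueError as err:
        print("rejected:", err)
```

Only the first candidate passes; the other two are rejected because one carries a subdomain and the other is a full URL.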

Note that the confidence threshold of the Unknown Links detector cannot be adjusted.

Learn more

To help you learn more about protecting your application and its users from malicious URLs, we have curated more resources for you.

Indirect Prompt Injection

Learn more about indirect prompt injection from MITRE ATLAS

Malicious URLs

Learn more about malicious URLs from NordLayer

Slack AI Vulnerability

Learn how Slack’s AI can be indirectly prompt injected