Policy Impact Simulator

New Features

  • Dashboard: Policy Impact Simulator - an interactive tool that shows how different sensitivity levels and guardrail configurations would have affected your historical traffic. It is available on the policy view and edit pages, and as a column on the policies list page. Compare flagging rates across L1-L4 and view category-level breakdowns to tune your policies with confidence.
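
The comparison described above can be approximated offline. The following is a minimal sketch, not the simulator's implementation. It assumes each historical sample carries a detection level from L1 to L4 (or none), and that a policy set to a given level flags samples at that level or any lower-numbered one. The record shape, the helper names, and that ordering are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed ordering: a policy at level Lk flags samples labeled L1..Lk.
LEVELS = ["L1", "L2", "L3", "L4"]


def is_flagged(sample_level, policy_level):
    """Return True if a sample with this detection level is flagged by the policy."""
    if sample_level is None:  # nothing was detected for this sample
        return False
    return LEVELS.index(sample_level) <= LEVELS.index(policy_level)


def flagging_rates(records):
    """Return {policy level: fraction of historical records flagged}."""
    if not records:
        return {level: 0.0 for level in LEVELS}
    return {
        level: sum(is_flagged(r["level"], level) for r in records) / len(records)
        for level in LEVELS
    }


def category_breakdown(records, policy_level):
    """Count flagged records per category at one policy level."""
    counts = defaultdict(int)
    for r in records:
        if is_flagged(r["level"], policy_level):
            counts[r["category"]] += 1
    return dict(counts)


# Hypothetical historical traffic, for illustration only.
history = [
    {"category": "prompt_injection", "level": "L1"},
    {"category": "weapons", "level": "L3"},
    {"category": "weapons", "level": "L4"},
    {"category": "benign", "level": None},
]
rates = flagging_rates(history)
breakdown = category_breakdown(history, "L4")
```

With this sample data, the flagging rate rises from 25% at L1 to 75% at L4, which is the kind of trade-off the simulator is meant to surface before a policy change is applied.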

Container version: 2.0.493, tag: stable

New Features

  • Guard: For audio requests, the audio_payload flag in the request gives access to debugging information about the sample.
  • Gateway: For audio requests, sensitive audio material is no longer included in logs.

Quality

  • General: Model improvements based on client feedback.
  • Audio (TensorRT-LLM): Added denoising to audio processing.

Bug Fixes

  • Guard: Fixed a bug in policy handling for audio requests where L2 samples were mistakenly labeled as L3.
  • Gateway: Request metadata is no longer altered for logging purposes. User-supplied metadata is now treated strictly as provided.

Container version: 2.0.474, tag: stable

New Features

  • Gateway: Requests to v2/guard/audio are now also logged for monitoring purposes.

Quality

  • Prompt injection: Improved detection of malicious behavioral instructions.
  • Prompt injection: Improved detection of system prompt exfiltration.
  • Prompt injection: Retrained model with improved coverage on new prompt attack variants.
  • Content moderation: Model update to include customer feedback.

Bug Fixes

  • Guard: Fixed health probes when gRPC encryption is enabled.
  • Gateway: Fixed policy resolution for multi-message requests.

Security

  • Gateway: Security fixes (CVE-2026-25679, CVE-2026-27142, CVE-2026-27139).

Container version: 2.0.410, tag: stable

Improvements

Quality

  • Moderation model improvements: Updated moderation models to reduce false positive rates (FPRs), especially in the weapons category.
  • Text preprocessing robustness: Improved handling of escaped JSON characters and edge cases in text decoding, reducing preprocessing errors and improving classifier reliability.
  • Whitelist refinements: Removed common phrases from the whitelist to improve detection accuracy.
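
To illustrate the kind of edge case the text preprocessing item refers to, here is a minimal sketch of best-effort decoding of JSON escape sequences in input text. It is not the product's preprocessing code. The function name decode_escaped and the fall-back-to-original behavior are assumptions chosen for illustration.

```python
import json
import re

# Matches a double quote that is not preceded by a backslash.
_UNESCAPED_QUOTE = re.compile(r'(?<!\\)"')


def decode_escaped(text: str) -> str:
    """Decode JSON escapes such as \\n or \\u00e9; return the input unchanged on failure."""
    # Escape bare quotes so the text can be wrapped as a JSON string literal.
    wrapped = '"' + _UNESCAPED_QUOTE.sub(r'\\"', text) + '"'
    try:
        # strict=False lets raw control characters (e.g. real newlines) through.
        return json.loads(wrapped, strict=False)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        # Malformed escapes are left as-is instead of raising an error.
        return text


decoded_newline = decode_escaped(r"line1\nline2")
decoded_unicode = decode_escaped(r"caf\u00e9")
left_unchanged = decode_escaped(r"bad \x escape")
```

Returning the original text on failure keeps a single malformed escape from turning into a preprocessing error, so the classifier still sees the input.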

Container version: 2.0.328, tag: stable

Improvements

Platform

  • Fixed a bug where the pricing page was unavailable to community users.
  • Improved loading speed and page performance on Logs and Analytics pages.
  • Logs flagged as “deny list” are now counted as threats on the Logs and Analytics pages.
  • Fixed a bug on the Analytics page where data would not load for certain date ranges.

Quality

  • Fixed a bug in allowed domains handling.
  • Improved model recall on system prompt extraction attacks.