Overview
Policies let you define governance rules that automatically evaluate every AI trace your system produces. When a trace matches a rule (a hallucination is detected, PII leaks into a response, or a prompt injection attempt is made), the policy fires, records a violation, and takes the actions you configured. Policies are the proactive layer of Avaliar. Alerts tell you when something is wrong after the fact; policies watch every trace and immediately flag the ones that matter. Manage policies at app.avaliar.ai/policies.
Policies require the Developer Pro plan. Upgrade your organization to create and import policies.
How Policies Work
Every time a trace is processed by Avaliar, all active policies in your organization are evaluated against it. If the trace matches a policy’s conditions, a violation is recorded and the policy’s configured actions are executed. The evaluation is automatic; you don’t need to trigger it manually. Any trace coming in from the Python SDK or the Proxy is checked.
Policy Types
Policies are organized into four types based on what they govern.
Content
Watches for harmful, unsafe, or inappropriate content in AI outputs. Use content policies to catch toxicity, bias, misuse, misinformation, and prompt injection attempts.
Best for: Moderation, safety guardrails, and responsible AI use.
Compliance
Ensures your AI systems stay aligned with regulatory and legal standards. Compliance policies typically watch for PII/PHI exposure, harmful content, and false claims that could create legal liability.
Best for: HIPAA, GDPR, FINRA, COPPA, and internal data protection requirements.
Threshold
Triggers when detected issues cross a severity level. Use threshold policies to enforce a minimum quality bar, for example blocking any trace with a critical-severity issue before it reaches production.
Best for: Production readiness gates and quality assurance pipelines.
Usage
Monitors operational patterns such as off-topic requests, high-cost misuse, or excessive usage outside the agent’s intended purpose.
Best for: Cost control, scope enforcement, and operational governance.
Conditions and Rules
Each policy has a set of rules that define what to watch for. A rule specifies an issue type and an optional minimum severity threshold.

| Issue Type | What It Detects |
|---|---|
| toxicity | Harmful, offensive, or inappropriate language |
| hallucination | Fabricated facts, fake citations, invented data |
| bias | Discriminatory or unfair outputs |
| pii_leak | Names, emails, phone numbers, addresses, and other personal data |
| prompt_injection | Instruction override and manipulation attempts |
| misinformation | Verifiably false or misleading claims |
| misuse | Responses outside the AI’s intended purpose |
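When a policy has more than one rule, a logic operator determines how the rules combine:
- OR: the policy fires if any rule matches (most common)
- AND: the policy fires only if all rules match
Each rule can also set a minimum severity (low, medium, high, critical). Setting medium+ means the policy only fires if the issue is at medium severity or higher.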
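The matching logic described above can be sketched in a few lines of Python. This is a minimal illustration, not Avaliar's implementation: the `Issue`, `Rule`, `rule_matches`, and `policy_fires` names and the data shapes are hypothetical. Only the issue types, the severity levels, and the OR/AND semantics come from this page.

```python
from dataclasses import dataclass
from typing import Optional

# Severity levels from this page, ordered from least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass
class Issue:
    """A detected issue on a trace (hypothetical shape)."""
    issue_type: str
    severity: str


@dataclass
class Rule:
    """An issue type plus an optional minimum severity (hypothetical shape)."""
    issue_type: str
    min_severity: Optional[str] = None


def rule_matches(rule: Rule, issues: list[Issue]) -> bool:
    """A rule matches if any issue has its type and meets the minimum severity."""
    for issue in issues:
        if issue.issue_type != rule.issue_type:
            continue
        if rule.min_severity is None:
            return True
        if SEVERITY_ORDER.index(issue.severity) >= SEVERITY_ORDER.index(rule.min_severity):
            return True
    return False


def policy_fires(rules: list[Rule], issues: list[Issue], operator: str = "OR") -> bool:
    """OR fires if any rule matches; AND fires only if all rules match."""
    results = [rule_matches(rule, issues) for rule in rules]
    if operator == "AND":
        # Assumption of this sketch: a policy with no rules never fires.
        return bool(results) and all(results)
    return any(results)


rules = [Rule("pii_leak", "medium"), Rule("hallucination")]
issues = [Issue("pii_leak", "low"), Issue("hallucination", "high")]
print(policy_fires(rules, issues, "OR"))   # True: the hallucination rule matches
print(policy_fires(rules, issues, "AND"))  # False: the PII issue is below medium
```

The same trace can therefore fire an OR policy but not an AND policy, which is why AND is better suited to narrow, high-precision rules.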
Enforcement Modes
The enforcement mode controls what happens when a policy condition is met.

| Mode | Behavior |
|---|---|
| Monitor | Record the violation silently. Nothing changes in the trace flow — you get visibility without disruption. |
| Warn | Record the violation and surface a warning. Useful when you want to flag issues without blocking. |
Actions
When a policy fires, it can execute one or more actions:

| Action | Description |
|---|---|
| Notify | Send an in-app notification when a violation occurs |
| Quarantine | Flag the trace for immediate review |
Priority Levels
Priority indicates the urgency of the policy and affects how violations are surfaced in the dashboard.

| Priority | Use Case |
|---|---|
| Low | Background monitoring. Informational only. |
| Normal | Standard governance policies. |
| High | Important rules where violations should be investigated promptly. |
| Critical | Active safety and compliance requirements that demand immediate attention. |
Policy Lifecycle
A policy moves through a defined set of states.

| Status | Meaning |
|---|---|
| Draft | Created but not yet running. No traces are evaluated against it. |
| Active | Running. Every incoming trace is evaluated against this policy. |
| Paused | Temporarily stopped. Violations stop accumulating until you reactivate it. |
| Deprecated | Retired. Kept for historical reference but no longer enforced. |
Approval Workflow
For organizations that need a review step before a policy goes live, Avaliar includes a built-in approval workflow.
- Create a policy in Draft state
- Click Submit for Review; the policy moves to Pending Review
- A reviewer on your team Approves or Rejects the policy
- Approved policies are automatically activated
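The lifecycle states and the approval steps above can be read as a small state machine. The sketch below is illustrative only: the state and event names are hypothetical, and two transitions are assumptions not stated on this page (a rejected policy returning to Draft, and which states can be deprecated).

```python
# Allowed transitions between policy states (hypothetical names and edges).
TRANSITIONS = {
    ("draft", "submit_for_review"): "pending_review",
    ("pending_review", "approve"): "active",   # approved policies activate automatically
    ("pending_review", "reject"): "draft",     # assumption: rejection returns to Draft
    ("active", "pause"): "paused",
    ("paused", "reactivate"): "active",
    ("active", "deprecate"): "deprecated",     # assumption: Active can be retired
    ("paused", "deprecate"): "deprecated",     # assumption: Paused can be retired
}


def apply_event(state: str, event: str) -> str:
    """Return the next state, or raise if the event is not valid in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"cannot {event!r} a policy in state {state!r}") from None


def is_evaluated(state: str) -> bool:
    """Only Active policies are evaluated against incoming traces."""
    return state == "active"


state = "draft"
for event in ["submit_for_review", "approve"]:
    state = apply_event(state, event)
print(state, is_evaluated(state))  # active True
```

Keeping the transitions in one table makes it easy to see that a Draft policy cannot become Active without passing through review in this model.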
Version History and Rollback
Every time you edit a policy, Avaliar saves a version snapshot. You can view the full history of a policy’s changes and roll back to any previous version at any time. To access version history:
- Click a policy to open the detail view
- Go to the History tab
- Each entry shows the change type, summary, and timestamp
- Click Rollback on any entry to restore that version
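As a rough mental model of snapshot-based history, the sketch below stores a copy of the policy on every change and restores one on rollback. The `VersionEntry` and `PolicyHistory` classes are hypothetical, and recording the rollback itself as a new history entry is an assumption of this sketch, not documented behavior.

```python
import copy
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class VersionEntry:
    """One history entry: change type, summary, timestamp, and a snapshot."""
    change_type: str
    summary: str
    timestamp: datetime
    snapshot: dict


@dataclass
class PolicyHistory:
    """A policy plus its version history (hypothetical in-memory model)."""
    current: dict
    entries: list[VersionEntry] = field(default_factory=list)

    def __post_init__(self) -> None:
        self._record("create", "Initial version")

    def _record(self, change_type: str, summary: str) -> None:
        # Deep-copy so later edits cannot mutate stored snapshots.
        self.entries.append(
            VersionEntry(change_type, summary, datetime.now(timezone.utc),
                         copy.deepcopy(self.current))
        )

    def edit(self, changes: dict, summary: str) -> None:
        self.current.update(changes)
        self._record("edit", summary)

    def rollback(self, index: int) -> None:
        # Assumption: a rollback is stored as a new entry, so it can be undone too.
        self.current = copy.deepcopy(self.entries[index].snapshot)
        self._record("rollback", f"Restored version {index}")


history = PolicyHistory({"name": "PII guard", "min_severity": "high"})
history.edit({"min_severity": "medium"}, "Lowered severity threshold")
history.rollback(0)
print(history.current["min_severity"])           # high
print([e.change_type for e in history.entries])  # ['create', 'edit', 'rollback']
```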
Templates
The Templates tab provides a library of pre-built policies for common use cases. Using a template creates a new Draft policy pre-filled with the relevant conditions and actions, which you can review and customize before activating. Templates are grouped into four categories:
Security
Policies for defending against threats and malicious behavior:
- Prompt Injection Defense — Block and alert on injection and jailbreak attempts
- PII/PHI Data Protection — Flag personally identifiable or health information in responses
- Toxic Content Shield — Detect harmful or inappropriate AI-generated content
- Misuse Prevention — Alert when AI is being used outside its intended scope
Compliance
Policies pre-configured for regulatory frameworks:
- HIPAA Compliance — PHI exposure and PII detection for healthcare data
- GDPR Compliance — EU personal data protection
- Financial Services (FINRA) — Prevent unqualified investment advice and hallucinations in financial responses
- Children’s Safety (COPPA) — Block harmful content and data collection for child-focused applications
Quality
Policies for monitoring AI output quality and accuracy:
- Hallucination Guard — Detect fabricated facts and inaccuracies
- Bias & Fairness — Monitor for discriminatory outputs
- Misinformation Prevention — Flag false or misleading information
Operations
Broad governance and operational policies:
- Full Governance Suite — Catch-all policy that monitors all critical and high-severity issues
- Production Readiness Gate — Alert on any critical issue before scaling to production
Sandbox Testing
Before deploying a policy, use the Sandbox tab to test it against a sample prompt and response without affecting live data.
- Go to Policies → Sandbox
- Select a policy to test against (or paste custom conditions)
- Enter a prompt and an optional AI response
- Click Run Test
The test result shows:
- Whether the policy would fire (WOULD FIRE or PASS)
- Which rules matched
- All detected issues with severity, confidence score, and the specific detector that found them
Historical Simulation
The Simulate feature lets you replay a policy against your existing trace history to see how it would have performed.
- Open a policy and go to the Simulate tab
- Choose a time window (last 7, 14, 30, 60, or 90 days)
- Click Run Simulation
The simulation reports the following metrics:

| Metric | Description |
|---|---|
| Traces Evaluated | Total traces in the selected time window |
| Would Have Matched | Number of traces that would have triggered the policy |
| Match Rate | Percentage of traces that would have fired |
| By Issue Type | Breakdown of which issue types drove the matches |
| Sample Traces | A set of example trace IDs that would have matched |
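To make the metrics concrete, here is a small sketch of how they could be derived from per-trace results. The input shape (a trace ID paired with the issue types that matched the policy) and the `summarize_simulation` function are hypothetical; this is not how Avaliar computes or returns simulation output.

```python
from collections import Counter


def summarize_simulation(results: list[tuple[str, list[str]]], sample_size: int = 5) -> dict:
    """Aggregate per-trace match results into the metrics listed in the table.

    Each item in `results` is (trace_id, matched_issue_types); an empty list
    means the policy would not have fired on that trace.
    """
    total = len(results)
    matched = [(trace_id, types) for trace_id, types in results if types]
    # Count each issue type at most once per trace.
    by_issue_type = Counter(t for _, types in matched for t in set(types))
    return {
        "traces_evaluated": total,
        "would_have_matched": len(matched),
        "match_rate": round(100 * len(matched) / total, 1) if total else 0.0,
        "by_issue_type": dict(by_issue_type),
        "sample_traces": [trace_id for trace_id, _ in matched[:sample_size]],
    }


results = [
    ("t1", ["pii_leak"]),
    ("t2", []),
    ("t3", ["pii_leak", "toxicity"]),
    ("t4", []),
]
summary = summarize_simulation(results)
print(summary["match_rate"])     # 50.0
print(summary["sample_traces"])  # ['t1', 't3']
```

A high match rate on historical traces is a signal to tighten the rules or raise the minimum severity before activating the policy.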
Violations
When an active policy fires, it creates a violation: a record linking the policy to the specific trace that triggered it. Each violation is in one of three states:

| Status | Meaning |
|---|---|
| Open | New violation that has not been reviewed |
| Acknowledged | Someone on your team has seen it and is aware |
| Resolved | The issue has been investigated and closed |
Managing Violations
View all violations across all policies in Policies → Violations. You can filter by status: open, acknowledged, or resolved.
Each violation card shows:
- The policy that fired
- The trace ID (links directly to the trace in the Trace Explorer)
- The conditions that matched
- The severity
- When it was triggered
Policy-as-Code (Export and Import)
Policies can be exported and imported as JSON bundles, making it possible to version-control your governance configuration, share policies across organizations, or automate policy deployment.
Exporting Policies
Click Export Policies on the Policies page to download all your policies as a single JSON bundle. To export a single policy, open its detail view and click Export. The exported bundle format looks like this:
Importing Policies
On the New Policy page, select Import from JSON to upload a policy bundle. You can:
- Validate the bundle first (dry-run) to check for errors without applying changes
- Choose a conflict strategy (skip, overwrite, or rename) for policies that share a name with existing ones
- Optionally auto-activate all imported policies immediately
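The three conflict strategies can be illustrated with a dry-run planner that reports what an import would do without changing anything. Everything here is a sketch under assumptions: the `plan_import` function, the bundle shape (a list of dicts with a `name` key), and the numeric suffix used for renaming are hypothetical and may differ from what Avaliar actually does.

```python
def plan_import(existing_names: set[str], bundle: list[dict],
                strategy: str = "skip") -> list[tuple[str, str]]:
    """Dry-run an import: return (action, policy_name) pairs, modifying nothing."""
    if strategy not in {"skip", "overwrite", "rename"}:
        raise ValueError(f"unknown conflict strategy: {strategy!r}")

    taken = set(existing_names)  # work on a copy so the caller's set is untouched
    plan = []
    for policy in bundle:
        name = policy["name"]
        if name not in taken:
            plan.append(("create", name))
            taken.add(name)
        elif strategy == "skip":
            plan.append(("skip", name))
        elif strategy == "overwrite":
            plan.append(("overwrite", name))
        else:
            # Assumption: rename appends the first free numeric suffix.
            suffix = 2
            while f"{name} ({suffix})" in taken:
                suffix += 1
            new_name = f"{name} ({suffix})"
            plan.append(("create", new_name))
            taken.add(new_name)
    return plan


existing = {"PII Guard"}
bundle = [{"name": "PII Guard"}, {"name": "Toxicity Shield"}]
print(plan_import(existing, bundle, "skip"))
# [('skip', 'PII Guard'), ('create', 'Toxicity Shield')]
print(plan_import(existing, bundle, "rename"))
# [('create', 'PII Guard (2)'), ('create', 'Toxicity Shield')]
```

Running a plan like this before applying an import mirrors the validate-first step: you can review every skip, overwrite, or rename before any policy changes.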
Next Steps
Alerts
Set up reactive alerts that fire when metrics cross thresholds — a complement to proactive policies.
Detection
Learn about the detectors that power policy conditions and how they analyze your traces.
Reports
Generate compliance reports that include policy violation history.
Traces
Explore the traces that policy violations link back to.