Overview
Every request routed through the Avaliar Proxy goes through three stages. Stage 1 validates the request and runs synchronous prompt injection detection. Stage 2 forwards the request to the LLM provider and captures the response. Stage 3 runs full safety detection on the complete trace in the background.
Stage 1: Validate & Check
When a request arrives at the proxy, two checks run before anything is forwarded to the LLM provider.
API Key Validation
The proxy validates your Avaliar API key and confirms:
- The key is active and has not been revoked
- The key has the proxy scope
- Your organization has an active Pro plan
If any of these checks fails, the proxy returns a 401 immediately and the request is not forwarded.
Synchronous Prompt Injection Detection
The proxy runs prompt injection detection on the incoming messages before forwarding them. If a prompt injection is detected:
- The proxy returns a blocked response immediately
- The request is never forwarded to the LLM provider
- The blocked attempt is recorded as a trace on Avaliar
Prompt injection detection in Stage 1 is synchronous — harmful prompts are stopped before they ever reach your LLM provider.
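The Stage 1 checks above can be sketched roughly as follows. This is a minimal illustration, not the proxy's actual implementation: the ApiKey fields, the Stage1Result type, the outcome labels, and the toy detector are all hypothetical names chosen for the example. Only the 401 on failed key validation comes from the text. The status code of a blocked response is not stated, so it is left unset here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ApiKey:
    # Hypothetical key record; field names are assumptions for illustration.
    active: bool
    scopes: frozenset
    org_plan: str


@dataclass
class Stage1Result:
    outcome: str                       # "unauthorized", "blocked", or "forward"
    http_status: Optional[int] = None  # 401 for key failures; otherwise unspecified


def stage1_check(key: Optional[ApiKey], messages: List[dict],
                 detect_injection: Callable[[List[dict]], bool]) -> Stage1Result:
    """Validate the key first, then run prompt injection detection synchronously."""
    key_ok = (
        key is not None
        and key.active                 # active and not revoked
        and "proxy" in key.scopes      # has the proxy scope
        and key.org_plan == "pro"      # organization is on an active Pro plan
    )
    if not key_ok:
        return Stage1Result("unauthorized", 401)
    # Detection runs before forwarding, so a hit means the provider is never called.
    if detect_injection(messages):
        return Stage1Result("blocked")
    return Stage1Result("forward")


def toy_detector(messages: List[dict]) -> bool:
    # Stand-in for the real detector, which the text does not describe.
    return any("ignore previous instructions" in m["content"].lower() for m in messages)
```

Because the function returns before any forwarding step, both the unauthorized and the blocked paths guarantee that the provider never sees the request, which is the property Stage 1 is meant to provide.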
Stage 2: Forward & Capture
If the request passes validation and prompt injection checks, the proxy forwards it to the LLM provider (OpenAI, Anthropic, or Gemini).
Once the LLM responds, the proxy captures:
- The full response content
- End-to-end latency and proxy overhead (proxy_overhead_ms)
- Token counts (prompt tokens, completion tokens, total)
- Estimated cost based on model pricing
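As a rough illustration of how the captured values relate to each other, the sketch below builds a metrics record from raw timings and token counts. The TraceMetrics type, the capture_metrics function, the pricing table, and the model name are hypothetical. Treating proxy overhead as end-to-end time minus time spent waiting on the provider is also an assumption, since the text does not define how proxy_overhead_ms is computed.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens; real pricing varies by model and provider.
PRICING_PER_1K = {"example-model": {"prompt": 0.50, "completion": 1.50}}


@dataclass
class TraceMetrics:
    latency_ms: float
    proxy_overhead_ms: float
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    estimated_cost_usd: float


def capture_metrics(model: str, total_ms: float, provider_ms: float,
                    prompt_tokens: int, completion_tokens: int) -> TraceMetrics:
    """Derive the per-trace metrics from timings and token counts."""
    price = PRICING_PER_1K[model]
    cost = (prompt_tokens * price["prompt"]
            + completion_tokens * price["completion"]) / 1000
    return TraceMetrics(
        latency_ms=total_ms,
        # Assumption: overhead is whatever part of the latency the provider did not account for.
        proxy_overhead_ms=total_ms - provider_ms,
        prompt_tokens=prompt_tokens,
        completion_tokens=completion_tokens,
        total_tokens=prompt_tokens + completion_tokens,
        estimated_cost_usd=cost,
    )
```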
If is_confidential: true is set on the request, the prompt and response content are not stored. Token counts and latency are still tracked.
Stage 3: Async Detection
After the response is returned to your application, the proxy runs full safety detection on the complete trace in the background. This step does not block or delay the response.
Detection covers:
- Jailbreak attempts
- Toxicity
- PII / sensitive data exposure
- Bias
- Hallucination
Once detection completes:
- Results are visible in the Avaliar Dashboard under the trace
- Any configured alerts are triggered based on detection results and severity
All detection beyond prompt injection runs asynchronously. Your application receives the LLM response at full speed — detection results appear in the dashboard shortly after.
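The non-blocking behaviour described in this stage can be sketched with Python's asyncio. This is only an illustration of the pattern (schedule the detection, return the response without awaiting it), not the proxy's real code. The function names, the trace shape, and the placeholder results are assumptions.

```python
import asyncio

DETECTORS = ("jailbreak", "toxicity", "pii", "bias", "hallucination")


async def run_detection(trace: dict, results: dict) -> None:
    """Stand-in for the background detectors; stores a placeholder flag per category."""
    await asyncio.sleep(0.05)  # simulate detection work
    results[trace["id"]] = {name: False for name in DETECTORS}


async def handle_request(trace: dict, results: dict, background: set) -> str:
    # Schedule detection without awaiting it, so the response returns immediately.
    task = asyncio.create_task(run_detection(trace, results))
    background.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(background.discard)
    return trace["response"]


async def demo() -> tuple:
    results: dict = {}
    background: set = set()
    trace = {"id": "t1", "response": "hello"}
    response = await handle_request(trace, results, background)
    ready_at_return = "t1" in results  # detection has not finished at this point
    await asyncio.gather(*background)  # wait only so the demo can inspect the results
    return response, ready_at_return, results
```

Calling asyncio.run(demo()) returns the response first and the detection results only after the background task has finished, mirroring how results appear in the dashboard shortly after the response.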
Request Flow Summary
Latency Impact
The proxy is designed to add minimal overhead. Each trace includes a proxy_overhead_ms field so you can monitor the exact latency the proxy adds to each request. In practice, this is typically under 100 ms.
Stage 3 detection runs entirely in the background and has no impact on response latency.
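One way to monitor the overhead is to summarize proxy_overhead_ms across a batch of traces and list the ones above a threshold. The sketch below assumes traces are available as dictionaries with id and proxy_overhead_ms keys. How traces are exported is not covered here, so that shape and the overhead_summary function are hypothetical.

```python
import statistics
from typing import Dict, List


def overhead_summary(traces: List[dict], threshold_ms: float = 100.0) -> Dict[str, object]:
    """Summarize proxy overhead and list traces exceeding the threshold."""
    values = [t["proxy_overhead_ms"] for t in traces]
    return {
        "median_ms": statistics.median(values),
        "max_ms": max(values),
        "over_threshold": [t["id"] for t in traces
                           if t["proxy_overhead_ms"] > threshold_ms],
    }
```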