
Overview

The @traceable decorator automatically captures inputs, outputs, timing, and metadata for any function in your LLM pipeline. Decorate your functions to build a full trace tree that is submitted to the Avaliar platform for monitoring and analysis.
from avaliar import traceable

Parameter Reference

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| span_type | "llm" \| "tool" \| "agent" \| "generic" | required | Type of operation being traced |
| model | str | None | LLM model name (llm spans only) |
| provider | str | None | Provider name (llm spans only) |
| temperature | float | None | Sampling temperature (llm spans only) |
| top_p | float | None | Top-p parameter (llm spans only) |
| detection | bool | False | Enable safety detection on this span |
| detectors | list[DetectorType] | [] | Specific detectors to run |
| detection_mode | "local" \| "cloud" | "local" | Where detection processing runs |
| blocking | bool | False | Submit the prompt to Avaliar before calling the LLM. Raises PromptBlockedError if blocked. Requires Pro plan and span_type="llm". |

Span Types

Each span type captures different metadata and serves a different purpose in your trace tree.
Use "llm" for direct calls to language model APIs. It captures messages, model configuration, and token usage.
@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
    temperature=0.7,
    top_p=1.0,
)
async def chat(messages: list) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return response.choices[0].message.content
Use "tool" for tool executions, function calls, or any discrete operation performed by your agent.
@traceable("tool")
async def search_database(query: str) -> list[dict]:
    results = await db.search(query)
    return results
Use "agent" for high-level agent logic that coordinates multiple sub-operations.
@traceable("agent")
async def research_agent(question: str) -> str:
    context = await search_database(question)
    answer = await chat([
        {"role": "system", "content": f"Context: {context}"},
        {"role": "user", "content": question},
    ])
    return answer
Use "generic" for any operation that does not fit neatly into the other categories.
@traceable("generic")
async def process_request(request: dict) -> dict:
    validated = validate(request)
    result = await handle(validated)
    return result

Examples

Basic LLM Tracing

The simplest use case: trace an async OpenAI call.
from avaliar import traceable
from openai import AsyncOpenAI

client = AsyncOpenAI()

@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
)
async def generate(messages: list) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return response.choices[0].message.content

Nested Spans

Build a trace tree by nesting @traceable functions. The SDK automatically links parent and child spans using context variables.
@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
)
async def summarize(text: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content


@traceable("generic")
async def process_document(document: str) -> dict:
    summary = await summarize(document)
    return {"original": document, "summary": summary}
When process_document calls summarize, the trace tree looks like:
process_document (generic)
  └── summarize (llm)

Detection-Enabled Tracing

Add safety detection to any traced function by setting detection=True and specifying which detectors to run.
from avaliar import traceable
from avaliar.detectors import DetectorType

@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
    detection=True,
    detectors=[
        DetectorType.PROMPT_INJECTION,
        DetectorType.TOXICITY,
        DetectorType.PII,
    ],
    detection_mode="local",
)
async def safe_generate(messages: list) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return response.choices[0].message.content

Sync Function Tracing

The decorator works with synchronous functions as well.
from avaliar import traceable
from openai import OpenAI

client = OpenAI()

@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
)
def generate_sync(messages: list) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return response.choices[0].message.content
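A decorator can support both sync and async functions by checking the wrapped function with inspect.iscoroutinefunction and returning a matching wrapper. The sketch below shows that common pattern under the assumption that the SDK does something similar; toy_timed and calls are hypothetical names, and only timing is recorded.

```python
import asyncio
import functools
import inspect
import time

calls: list = []  # hypothetical in-memory record, for illustration only


def toy_timed(fn):
    """Toy decorator: choose an async or sync wrapper based on the wrapped function."""
    def record(start: float) -> None:
        calls.append((fn.__name__, time.perf_counter() - start))

    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return await fn(*args, **kwargs)
            finally:
                record(start)
        return async_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            record(start)
    return sync_wrapper


@toy_timed
def add_sync(a: int, b: int) -> int:
    return a + b


@toy_timed
async def add_async(a: int, b: int) -> int:
    return a + b


sync_result = add_sync(1, 2)
async_result = asyncio.run(add_async(3, 4))
```

The decorated sync function stays a plain function and the decorated coroutine function stays awaitable, so callers do not need to change how they invoke either one.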

Generator Tracing

Trace generator functions that yield results incrementally. The SDK captures the full concatenated output once the generator completes.
@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
)
async def stream_generate(messages: list):
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True,
    )
    async for chunk in response:
        content = chunk.choices[0].delta.content
        if content:
            yield content
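To make the "full concatenated output" behavior concrete, here is a minimal sketch of how a wrapper could pass chunks through to the caller while keeping a copy to join at the end. It is an illustration of the idea only: toy_capture_stream, captured, and fake_stream are hypothetical names, and the stream is a stub rather than a real model response.

```python
import asyncio
import functools

captured: list = []  # hypothetical sink standing in for a span's recorded output


def toy_capture_stream(fn):
    """Toy decorator: forward each chunk and record the joined text at the end."""
    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        parts = []
        try:
            async for chunk in fn(*args, **kwargs):
                parts.append(chunk)
                yield chunk  # the caller still receives chunks as they arrive
        finally:
            # Runs when the generator finishes or is closed early.
            captured.append("".join(parts))
    return wrapper


@toy_capture_stream
async def fake_stream():
    for piece in ["Hel", "lo", "!"]:
        yield piece


async def main() -> list:
    return [chunk async for chunk in fake_stream()]


chunks = asyncio.run(main())
```

The consumer sees three separate chunks, while the recorded output is the single string "Hello!".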

Updating Token Usage

Use update_current_llm_run to attach token counts to the current LLM span:
from avaliar import traceable
from avaliar.trace import update_current_llm_run
from openai import AsyncOpenAI

client = AsyncOpenAI()

@traceable("llm", model="gpt-4o", provider="openai")
async def generate(messages: list) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    update_current_llm_run(
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
    )
    return response.choices[0].message.content
Call update_current_llm_run from inside the decorated function, before returning. Token counts appear in the Traces dashboard and are included in cost calculations.
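One plausible reason the call has to happen inside the decorated function is that a helper like this needs an active span to attach to. The sketch below illustrates that idea with a context variable; it is an assumption about the mechanism, not the SDK's code. toy_llm_span, toy_update_current_run, and finished_runs are hypothetical names, and the real helper's behavior when no span is active is not stated in this page.

```python
import contextvars
import functools

# Hypothetical stand-ins; not the Avaliar SDK's internals.
_current_run: contextvars.ContextVar = contextvars.ContextVar(
    "current_run", default=None
)
finished_runs: list = []


def toy_llm_span(fn):
    """Toy decorator: expose a mutable run record while the function executes."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        run = {"name": fn.__name__, "input_tokens": None, "output_tokens": None}
        token = _current_run.set(run)
        try:
            return fn(*args, **kwargs)
        finally:
            _current_run.reset(token)
            finished_runs.append(run)
    return wrapper


def toy_update_current_run(input_tokens: int, output_tokens: int) -> bool:
    """Attach token counts to the active run; report whether one was active."""
    run = _current_run.get()
    if run is None:
        return False  # no active span, so there is nothing to update
    run["input_tokens"] = input_tokens
    run["output_tokens"] = output_tokens
    return True


@toy_llm_span
def generate() -> str:
    toy_update_current_run(input_tokens=12, output_tokens=34)
    return "ok"


generate()
updated_outside = toy_update_current_run(input_tokens=1, output_tokens=1)
```

In this toy version the call made inside generate lands on the run record, while the call made afterwards finds no active run and changes nothing.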

LLM Message Extraction

When span_type="llm", the SDK automatically extracts input messages for tracing. It looks for messages in two places:
  1. The first positional argument: generate(messages)
  2. The messages keyword argument: generate(messages=[...])
Messages should follow the standard role/content format:
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
If the SDK cannot find messages in either location, a warning is emitted and the trace is still submitted without message data.
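The lookup order above can be sketched as a small helper. This is a hedged illustration, not the SDK's code: extract_messages and _looks_like_messages are hypothetical names, and the role/content shape check is an assumption added here so that unrelated first arguments are skipped.

```python
from typing import Any, Optional


def _looks_like_messages(value: Any) -> bool:
    """Assumed shape check: a list of dicts that each carry role and content."""
    return isinstance(value, list) and all(
        isinstance(m, dict) and "role" in m and "content" in m for m in value
    )


def extract_messages(args: tuple, kwargs: dict) -> Optional[list]:
    """Hypothetical helper: first positional argument, then the messages keyword."""
    if args and _looks_like_messages(args[0]):
        return args[0]
    candidate = kwargs.get("messages")
    if _looks_like_messages(candidate):
        return candidate
    return None  # caller would emit a warning and submit the trace without messages


msgs = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
from_positional = extract_messages((msgs,), {})
from_keyword = extract_messages((), {"messages": msgs})
not_found = extract_messages(("plain string",), {"limit": 5})
```

Both call styles resolve to the same list, and a call with no recognizable messages yields None.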

How It Works

1. Context Propagation

When a @traceable function is called, a new span is created and linked to any active parent span. Nested @traceable calls automatically inherit the parent, building the trace tree as your code executes.
2. Trace Tree Construction

As your code executes, the SDK builds a trace tree from the parent-child relationships between spans. Each span records its inputs, outputs, start time, end time, and any associated metadata.
3. Background Submission

Once the outermost span completes, the full trace tree is serialized and submitted to the Avaliar API in a background thread. Your application code is never blocked by trace submission.
Note: If the Avaliar API is unreachable, traces are silently dropped and your application continues to run normally.
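The background submission and silent-drop behavior can be pictured with a queue and a worker thread. The following is a minimal sketch under stated assumptions, not the SDK's implementation: BackgroundSubmitter, flaky_send, and delivered are hypothetical names, and the network call is replaced by a stub that fails on demand.

```python
import queue
import threading


class BackgroundSubmitter:
    """Hypothetical sketch: hand traces to a worker thread and drop them on failure."""

    def __init__(self, send):
        self._send = send  # callable that delivers one trace, e.g. an HTTP POST
        self._queue: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, trace: dict) -> None:
        self._queue.put(trace)  # returns immediately; the caller never waits on I/O

    def _run(self) -> None:
        while True:
            trace = self._queue.get()
            try:
                self._send(trace)
            except Exception:
                pass  # delivery failed: drop the trace without raising
            finally:
                self._queue.task_done()

    def flush(self) -> None:
        self._queue.join()  # wait for queued traces, useful at shutdown or in tests


delivered: list = []


def flaky_send(trace: dict) -> None:
    if trace.get("fail"):
        raise ConnectionError("API unreachable")
    delivered.append(trace)


submitter = BackgroundSubmitter(flaky_send)
submitter.submit({"name": "process_document"})
submitter.submit({"name": "lost", "fail": True})
submitter.submit({"name": "summarize"})
submitter.flush()
```

The failing trace is discarded inside the worker, so the submitting code sees no exception and the remaining traces are still delivered in order.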
Use nested @traceable decorators to build a trace tree that shows the full execution flow of your LLM pipeline. This makes it easy to identify bottlenecks, debug issues, and understand how your agents make decisions.