@traceable Decorator

Overview

The @traceable decorator automatically captures inputs, outputs, timing, and metadata for any function in your LLM pipeline. Decorate your functions to build a full trace tree that is submitted to the Avaliar platform for monitoring and analysis.

from avaliar import traceable

Parameter Reference

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `span_type` | `"llm"` \| `"tool"` \| `"agent"` \| `"generic"` | required | Type of operation being traced |
| `model` | `str` | `None` | LLM model name (llm spans only) |
| `provider` | `str` | `None` | Provider name (llm spans only) |
| `temperature` | `float` | `None` | Sampling temperature (llm spans only) |
| `top_p` | `float` | `None` | Top-p parameter (llm spans only) |
| `detection` | `bool` | `False` | Enable safety detection on this span |
| `detectors` | `list[DetectorType]` | `[]` | Specific detectors to run |
| `detection_mode` | `"local"` \| `"cloud"` | `"local"` | Where detection processing runs |
| `blocking` | `bool` | `False` | Submit the prompt to Avaliar before calling the LLM. Raises `PromptBlockedError` if blocked. Requires Pro plan and `span_type="llm"`. |

Span Types

Each span type captures different metadata and serves a different purpose in your trace tree.

llm -- LLM API calls

Use for direct calls to language model APIs. Captures messages, model configuration, and token usage.

@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
    temperature=0.7,
    top_p=1.0,
)
async def chat(messages: list) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return response.choices[0].message.content

tool -- Tool and function calls

Use for tool executions, function calls, or any discrete operation performed by your agent.

@traceable("tool")
async def search_database(query: str) -> list[dict]:
    results = await db.search(query)
    return results

agent -- Agent orchestration steps

Use for high-level agent logic that coordinates multiple sub-operations.

@traceable("agent")
async def research_agent(question: str) -> str:
    context = await search_database(question)
    answer = await chat([
        {"role": "system", "content": f"Context: {context}"},
        {"role": "user", "content": question},
    ])
    return answer

generic -- Any other operation

Use for any operation that does not fit neatly into the other categories.

@traceable("generic")
async def process_request(request: dict) -> dict:
    validated = validate(request)
    result = await handle(validated)
    return result

Examples

Basic LLM Tracing

The simplest use case: trace an async OpenAI call.

from avaliar import traceable
from openai import AsyncOpenAI

client = AsyncOpenAI()

@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
)
async def generate(messages: list) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return response.choices[0].message.content

Nested Spans

Build a trace tree by nesting @traceable functions. The SDK automatically links parent and child spans using context variables.

@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
)
async def summarize(text: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content


@traceable("generic")
async def process_document(document: str) -> dict:
    summary = await summarize(document)
    return {"original": document, "summary": summary}

When process_document calls summarize, the trace tree looks like:

process_document (generic)
  └── summarize (llm)

Detection-Enabled Tracing

Add safety detection to any traced function by setting detection=True and specifying which detectors to run.

from avaliar import traceable
from avaliar.detectors import DetectorType

@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
    detection=True,
    detectors=[
        DetectorType.PROMPT_INJECTION,
        DetectorType.TOXICITY,
        DetectorType.PII,
    ],
    detection_mode="local",
)
async def safe_generate(messages: list) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return response.choices[0].message.content

Sync Function Tracing

The decorator works with synchronous functions as well.

from avaliar import traceable
from openai import OpenAI

client = OpenAI()

@traceable(
    "llm",
    model="gpt-4o",
    provider="openai",
)
def generate_sync(messages: list) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return response.choices[0].message.content
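
One common way for a single decorator to cover both sync and async functions is to check whether the wrapped function is a coroutine function and return a matching wrapper. The sketch below illustrates that pattern only: `traced` and the `calls` list (standing in for recorded spans) are hypothetical names, and nothing here is taken from the Avaliar implementation.

```python
import asyncio
import functools
import inspect

calls = []  # hypothetical stand-in for recorded spans


def traced(func):
    """Hypothetical decorator that wraps both sync and async functions."""
    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            result = await func(*args, **kwargs)
            calls.append((func.__name__, "async", result))
            return result
        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        calls.append((func.__name__, "sync", result))
        return result
    return sync_wrapper


@traced
def add(a, b):
    return a + b


@traced
async def add_async(a, b):
    return a + b


add(1, 2)
asyncio.run(add_async(3, 4))
print(calls)
```

Because the async wrapper is itself a coroutine function, callers still `await` the decorated function exactly as before.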

Generator Tracing

Trace generator functions that yield results incrementally. The SDK captures the full concatenated output once the generator completes.

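from avaliar import traceable
from openai import AsyncOpenAI

# Awaiting the streaming call requires the async client, not the sync OpenAI() client.
client = AsyncOpenAI()

@traceable(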
    "llm",
    model="gpt-4o",
    provider="openai",
)
async def stream_generate(messages: list):
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True,
    )
    async for chunk in response:
        content = chunk.choices[0].delta.content
        if content:
            yield content
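
To make "captures the full concatenated output once the generator completes" concrete, here is a minimal sketch of how a wrapper could pass chunks through while accumulating them. The names `trace_stream`, `captured_outputs`, and `fake_stream` are hypothetical, and the SDK's real mechanism may differ.

```python
import asyncio
import functools

captured_outputs = []  # hypothetical stand-in for the span's recorded output


def trace_stream(func):
    """Hypothetical wrapper for async generator functions."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        chunks = []
        async for chunk in func(*args, **kwargs):
            chunks.append(chunk)
            yield chunk  # pass each chunk through to the caller unchanged
        # Reached only after the wrapped generator is exhausted.
        captured_outputs.append("".join(chunks))
    return wrapper


@trace_stream
async def fake_stream():
    # Stands in for a streaming LLM response.
    for piece in ["Hel", "lo", "!"]:
        yield piece


async def main():
    return [piece async for piece in fake_stream()]


received = asyncio.run(main())
print(received, captured_outputs)
```

In this sketch, a consumer that stops iterating early never triggers the final append. How the SDK treats partially consumed streams is not stated in this page.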

Updating Token Usage

Use update_current_llm_run to attach token counts to the current LLM span:

from avaliar import traceable
from avaliar.trace import update_current_llm_run
from openai import AsyncOpenAI

client = AsyncOpenAI()

@traceable("llm", model="gpt-4o", provider="openai")
async def generate(messages: list) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    update_current_llm_run(
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
    )
    return response.choices[0].message.content

Call update_current_llm_run from inside the decorated function, before returning. Token counts appear in the Traces dashboard and are included in cost calculations.
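
The requirement to call it inside the decorated function makes sense if the helper looks up "the current span" from execution context. The sketch below shows that idea with Python's `contextvars`. All names (`update_current_span`, `run_traced`, `fake_llm_call`) are hypothetical, and returning `False` outside a traced call is an assumption, not documented SDK behaviour.

```python
import contextvars

# Holds the span dict that is active in the current context (None outside a traced call).
_current_llm_span = contextvars.ContextVar("current_llm_span", default=None)


def update_current_span(**fields) -> bool:
    """Hypothetical helper: merge fields into the active span, if there is one."""
    span = _current_llm_span.get()
    if span is None:
        return False  # assumed behaviour: nothing to update outside a traced call
    span.update(fields)
    return True


def run_traced(func):
    """Run func with a fresh span active and return (result, span)."""
    span = {}
    token = _current_llm_span.set(span)
    try:
        return func(), span
    finally:
        _current_llm_span.reset(token)  # the span is no longer current after the call


def fake_llm_call():
    update_current_span(input_tokens=12, output_tokens=34)
    return "hello"


result, span = run_traced(fake_llm_call)
outside = update_current_span(input_tokens=1)  # no active span here
print(result, span, outside)
```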

LLM Message Extraction

When span_type="llm", the SDK automatically extracts input messages for tracing. It looks for messages in two places:

The first positional argument: generate(messages)
The messages keyword argument: generate(messages=[...])

Messages should follow the standard role/content format:

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

If the SDK cannot find messages in either location, a warning is emitted and the trace is still submitted without message data.
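
The lookup order described above can be sketched as a small function. This is an illustration under assumptions, not the SDK's code: `extract_messages` is a hypothetical name, and treating only `list` values as messages is a simplification.

```python
import warnings
from typing import Optional


def extract_messages(args: tuple, kwargs: dict) -> Optional[list]:
    """Hypothetical helper: locate chat messages in a call's arguments."""
    # 1. The first positional argument, e.g. generate(messages)
    if args and isinstance(args[0], list):
        return args[0]
    # 2. The `messages` keyword argument, e.g. generate(messages=[...])
    messages = kwargs.get("messages")
    if isinstance(messages, list):
        return messages
    # Neither location matched: warn and carry on without message data.
    warnings.warn("no messages found; trace will omit message data")
    return None


chat = [{"role": "user", "content": "Hello!"}]
print(extract_messages((chat,), {}))
print(extract_messages((), {"messages": chat}))
```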

How It Works

Context Propagation

When a @traceable function is called, a new span is created and linked to any active parent span. Nested @traceable calls automatically inherit the parent, building the trace tree as your code executes.

Trace Tree Construction

As your code executes, the SDK builds a trace tree from the parent-child relationships between spans. Each span records its inputs, outputs, start time, end time, and any associated metadata.
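
To picture what each node in the tree might hold, here is a hypothetical `SpanRecord` with the fields listed above, plus a `render` helper that prints the tree in the indented style used earlier on this page. The field names and timestamps are illustrative assumptions, not the SDK's schema.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SpanRecord:
    """Hypothetical record of one span; field names are assumptions."""
    name: str
    span_type: str
    inputs: dict
    output: Any = None
    start_time: float = 0.0
    end_time: float = 0.0
    children: list = field(default_factory=list)

    @property
    def duration(self) -> float:
        return self.end_time - self.start_time


def render(span: SpanRecord, depth: int = 0) -> list:
    """Return one line per span, indenting children under their parent."""
    prefix = "" if depth == 0 else "  " * depth + "└── "
    lines = [f"{prefix}{span.name} ({span.span_type})"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


child = SpanRecord("summarize", "llm", {"text": "..."}, output="...",
                   start_time=0.25, end_time=0.75)
root = SpanRecord("process_document", "generic", {"document": "..."},
                  start_time=0.0, end_time=1.0, children=[child])
print("\n".join(render(root)))
```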

Background Submission

Once the outermost span completes, the full trace tree is serialized and submitted to the Avaliar API in a background thread. Your application code is never blocked by trace submission.

If the Avaliar API is unreachable, traces are silently dropped and your application continues to run normally.
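
The background submission and silent-drop behaviour described above can be approximated with a daemon thread that swallows errors. This is a minimal sketch with hypothetical names (`submit_in_background`, `unreachable_api`); the SDK's actual transport, batching, and retry behaviour are not covered here.

```python
import threading


def submit_in_background(send, trace: dict) -> threading.Thread:
    """Hypothetical helper: run send(trace) on a daemon thread, dropping failures."""
    def worker():
        try:
            send(trace)
        except Exception:
            # Unreachable API or any other failure: drop the trace silently.
            pass

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()  # returns immediately; the caller is not blocked
    return thread


def unreachable_api(trace):
    # Stands in for a failed network call.
    raise ConnectionError("API unreachable")


delivered = []
t1 = submit_in_background(delivered.append, {"name": "process_document"})
t2 = submit_in_background(unreachable_api, {"name": "dropped"})
# Joined here only so the result can be inspected; real callers would not wait.
t1.join()
t2.join()
print(delivered)
```

One consequence of this pattern is that a short-lived script can exit before a daemon thread finishes sending. Whether the SDK flushes pending traces on exit is not stated in this page.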

Use nested @traceable decorators to build a trace tree that shows the full execution flow of your LLM pipeline. This makes it easy to identify bottlenecks, debug issues, and understand how your agents make decisions.
