Sampling

Sampling can be used to control the volume of traces collected by the Langfuse server.

You can configure the sample rate by setting the LANGFUSE_SAMPLE_RATE environment variable or by using the sample_rate parameter in the constructors of the Python SDK. The value has to be between 0 and 1.

The default value is 1, meaning that all traces are collected. A value of 0.2 means that only 20% of the traces are collected. The SDK samples at the trace level, meaning that if a trace is sampled, all observations and scores within that trace are sampled as well.
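To make trace-level sampling concrete, here is a minimal sketch of one common way to implement it: hash the trace ID into a stable bucket and compare it to the sample rate. The helpers `is_trace_sampled` and `filter_events` and the event dictionaries are hypothetical and for illustration only; this is not necessarily how the Langfuse SDK makes its decision internally.

```python
import hashlib


def is_trace_sampled(trace_id: str, sample_rate: float) -> bool:
    """Hypothetical helper: decide once per trace ID whether to keep the trace."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    # Hash the trace ID so the same trace always gets the same decision.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < sample_rate * 2**64


def filter_events(events, sample_rate):
    """Keep only events (observations, scores) that belong to a sampled trace."""
    return [e for e in events if is_trace_sampled(e["trace_id"], sample_rate)]


# Assumed event shape: each event carries the ID of the trace it belongs to.
events = [
    {"trace_id": "trace-a", "type": "span"},
    {"trace_id": "trace-a", "type": "score"},
    {"trace_id": "trace-b", "type": "generation"},
]
kept = filter_events(events, sample_rate=0.5)
```

Because the decision depends only on the trace ID, every observation and score of a trace is either kept or dropped together.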

The v3 SDK is currently in beta. Please check out the SDK v3 docs for more details.

With Python SDK v3, you can configure sampling when initializing the client:

import os

from langfuse import Langfuse

# Either set the environment variable or the constructor parameter.
# The constructor parameter takes precedence.
os.environ["LANGFUSE_SAMPLE_RATE"] = "0.5"  # As string in env var

# Or directly in the constructor (as float)
langfuse = Langfuse(sample_rate=0.5)  # 50% of traces will be sampled
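The precedence rule (constructor parameter over environment variable, default of 1) and the 0 to 1 range can be sketched as a small resolution function. `resolve_sample_rate` is a hypothetical helper written for illustration; it is not part of the Langfuse SDK, and the SDK's own validation behavior may differ.

```python
import os
from typing import Mapping, Optional


def resolve_sample_rate(
    constructor_value: Optional[float] = None,
    env: Mapping[str, str] = os.environ,
) -> float:
    """Hypothetical helper: pick the effective sample rate.

    Order of precedence: constructor parameter, then the
    LANGFUSE_SAMPLE_RATE environment variable, then the default of 1.
    """
    if constructor_value is not None:
        rate = float(constructor_value)
    else:
        # Environment variables are strings, so parse to float.
        rate = float(env.get("LANGFUSE_SAMPLE_RATE", "1"))
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"sample rate must be between 0 and 1, got {rate}")
    return rate


# The constructor value wins over the environment variable.
effective = resolve_sample_rate(0.3, {"LANGFUSE_SAMPLE_RATE": "0.5"})
```

Passing the environment as a mapping keeps the sketch easy to test without mutating `os.environ`.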

When using the @observe() decorator:

from langfuse import observe, Langfuse
 
# Initialize the client with sampling
Langfuse(sample_rate=0.3)  # 30% of traces will be sampled
 
@observe()
def process_data():
    # Only ~30% of calls to this function will generate traces
    # The decision is made at the trace level (first span)
    pass

If a trace is not sampled, none of its observations (spans or generations) or associated scores will be sent to Langfuse, which can significantly reduce data volume for high-traffic applications.
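As a rough back-of-the-envelope illustration of the volume reduction, the expected number of events sent scales linearly with the sample rate. The function name and the traffic figures below are made-up assumptions, not measurements.

```python
def expected_events_sent(traces: int, events_per_trace: float, sample_rate: float) -> float:
    """Hypothetical estimate: expected number of events sent after sampling."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    return traces * events_per_trace * sample_rate


# Assumed workload: 1,000,000 traces per day, 10 observations/scores each.
unsampled = expected_events_sent(1_000_000, 10, 1.0)
sampled = expected_events_sent(1_000_000, 10, 0.2)
print(f"without sampling: {unsampled:,.0f} events, at 0.2: {sampled:,.0f} events")
```

Actual volumes fluctuate around this expectation, since sampling is applied per trace and traces differ in size.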

When using the @observe() decorator with Python SDK v2:

import os

from langfuse.decorators import langfuse_context, observe

# Set the sample rate before the first decorated function runs
os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'

@observe()
def fn():
    pass

fn()

When using the low-level SDK with Python SDK v2:

import os

from langfuse import Langfuse

# Either set the environment variable or the constructor parameter. The latter takes precedence.
os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
langfuse = Langfuse(sample_rate=0.5)

trace = langfuse.trace(
  name="Rap Battle",
)

import { Langfuse } from "langfuse";
 
const langfuse = new Langfuse({
  sampleRate: 0.5,
});

See JS/TS SDK docs for more details.

When using the Python SDK v3, the sample rate provided on client initialization will apply to all event inputs and outputs regardless of the Langfuse-maintained integration you are using.

See the Python SDK v3 tab for more details.

When using the OpenAI SDK Integration with Python SDK v2:

import os

# Either set the environment variable or configure the openai import. The latter takes precedence.
os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
 
from langfuse.openai import openai
openai.langfuse_sample_rate = 0.5
 
completion = openai.chat.completions.create(
  name="test-chat",
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a calculator."},
    {"role": "user", "content": "1 + 1 = "}],
)

import OpenAI from "openai";
import { observeOpenAI } from "langfuse";
 
const openai = observeOpenAI(new OpenAI(), {
  clientInitParams: {
    sampleRate: 0.5,
  },
});

See OpenAI Integration (JS/TS) for more details.


When using the CallbackHandler with Python SDK v2:

import os

from langfuse.callback import CallbackHandler
 
# Either set the environment variable or the constructor parameter. The latter takes precedence.
os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
handler = CallbackHandler(
  sample_rate=0.5
)

import { CallbackHandler } from "langfuse-langchain";
 
const handler = new CallbackHandler({
  sampleRate: 0.5,
});

See Langchain Integration (JS/TS) for more details.

When using the Vercel AI SDK Integration:

instrumentation.ts

import { registerOTel } from "@vercel/otel";
import { LangfuseExporter } from "langfuse-vercel";
 
export function register() {
  registerOTel({
    serviceName: "langfuse-vercel-ai-nextjs-example",
    traceExporter: new LangfuseExporter({ sampleRate: 0.5 }),
  });
}

The LlamaIndex integration is not supported in the Python SDK v3. Please use a community-maintained OTEL-based integration instead.

When using the LlamaIndex Integration with Python SDK v2:

import os
from langfuse.llama_index import LlamaIndexInstrumentor
 
# Either set the environment variable or the constructor parameter. The latter takes precedence.
os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
instrumentor = LlamaIndexInstrumentor(sample_rate=0.5)

The LlamaIndex callback integration is not supported in the Python SDK v3. Please use a community-maintained OTEL-based integration instead.

When using the deprecated LlamaIndex Callback Integration with Python SDK v2:

import os

from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler
 
# Either set the environment variable or the constructor parameter. The latter takes precedence.
os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
langfuse_callback_handler = LlamaIndexCallbackHandler(sample_rate=0.5)
 
Settings.callback_manager = CallbackManager([langfuse_callback_handler])
