Tailsampling Processor

Status
Available in: contrib, k8s
Maintainers: @portertech, @Logiraptor, @jmacd
Source: opentelemetry-collector-contrib

Supported Telemetry

Traces

Overview

The tail sampling processor samples traces based on a set of defined policies. All spans for a given trace MUST be received by the same collector instance for effective sampling decisions. Before sampling, spans are grouped by trace_id, so the tail sampling processor can be used directly without the groupbytraceprocessor.
This processor must be placed in pipelines after any processors that rely on context, e.g. k8sattributes. It reassembles spans into new batches, causing them to lose their original context.
Please refer to config.go for the config spec. The following configuration options are required:
  • policies (no default): Policies used to make a sampling decision
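As a hedged illustration of the placement rule above, the sketch below puts tail_sampling after k8sattributes in a traces pipeline and sets the one required option, policies. The otlp receiver, the debug exporter, and the policy name keep-errors are assumptions chosen for the example; they do not come from this page.

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  # Context-dependent processor: must run before tail_sampling.
  k8sattributes:
  tail_sampling:
    policies:
      - name: keep-errors            # hypothetical policy name
        type: status_code
        status_code: {status_codes: [ERROR]}

exporters:
  debug:

service:
  pipelines:
    traces:
      receivers: [otlp]
      # Order matters: tail_sampling re-batches spans and drops their original context.
      processors: [k8sattributes, tail_sampling]
      exporters: [debug]
```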
Multiple policies exist today and it is straightforward to add more. These include:
  • always_sample: Sample all traces
  • latency: Sample based on the duration of the trace. The duration is determined by looking at the earliest start time and latest end time, without taking into consideration what happened in between. Supplying no upper bound will result in a policy sampling anything greater than threshold_ms.
  • numeric_attribute: Sample based on number attributes (resource and record) by min_value and/or max_value
  • probabilistic: Sample a percentage of traces. Read a comparison with the Probabilistic Sampling Processor.
  • status_code: Sample based upon the status code (OK, ERROR or UNSET)
  • string_attribute: Sample based on string attributes (resource and record) value matches, both exact and regex value matches are supported
  • trace_state: Sample based on TraceState value matches
  • trace_flags: Sample if the sampled trace flag was set on any span in the trace
  • rate_limiting: Sample based on the rate of spans per second.
  • bytes_limiting: Sample based on the rate of bytes per second using a token bucket algorithm implemented by golang.org/x/time/rate. This allows for burst traffic up to a configurable capacity while maintaining the average rate over time. The bucket is refilled continuously at the specified rate and has a maximum capacity for burst handling.
  • span_count: Sample based on the minimum and/or maximum number of spans, inclusive. If the sum of all spans in the trace is outside the range threshold, the trace will not be sampled.
  • boolean_attribute: Sample based on boolean attribute (resource and record).
  • ottl_condition: Sample based on given boolean OTTL condition (span and span event).
  • and: Sample based on multiple policies, creates an AND policy
  • not: Sample based on the opposite result of a single policy, creates a NOT policy
  • drop: Drop (not sample) based on multiple policies, creates a DROP policy
  • composite: Sample based on a combination of the above samplers, with ordering and rate allocation per sampler. Rate allocation assigns a percentage of spans to each policy, in policy order. For example, if max_total_spans_per_second is set to 100, rate_allocation can be set as follows:
    1. test-composite-policy-1 = 50 % of max_total_spans_per_second = 50 spans_per_second
    2. test-composite-policy-2 = 25 % of max_total_spans_per_second = 25 spans_per_second
    3. To ensure remaining capacity is filled use always_sample as one of the policies
The following configuration options can also be modified:
  • sampling_strategy (default = trace-complete): Controls decision timing and evaluation scope. trace-complete evaluates accumulated trace data on timer handling; span-ingest evaluates each incoming batch on ingest, finalizing terminal outcomes immediately and non-terminal traces on cleanup. See Sampling Strategies for details.
  • decision_wait (default = 30s): Time before timer handling for a trace. When sampling_strategy is trace-complete, this controls decision timing. When sampling_strategy is span-ingest, this controls pending cleanup finalization timing.
  • decision_wait_after_root_received (default = 0s): Additional root-span-based acceleration for timer handling. When sampling_strategy is trace-complete, this can make decisions earlier. When sampling_strategy is span-ingest, this can finalize pending traces earlier on cleanup. 0s disables it.
  • num_traces (default = 50000): Number of traces kept in memory.
  • expected_new_traces_per_sec (default = 0): Expected number of new traces (helps in allocating data structures)
  • decision_cache: Options for configuring caches for sampling decisions. You may want to vary the size of these caches depending on how many “keep” vs “drop” decisions you expect from your policies. For example, you may allocate a larger non_sampled_cache_size if you expect most traces to be dropped. Additionally, if you use these caches, configure them to be much greater than num_traces so that decisions for trace IDs are kept longer than the span data for the trace.
    • sampled_cache_size (default = 0): Configures the number of trace IDs to be kept in an LRU cache, persisting the “keep” decisions for traces that may have already been released from memory. By default, the size is 0 and the cache is inactive.
    • non_sampled_cache_size (default = 0): Configures the number of trace IDs to be kept in an LRU cache, persisting the “drop” decisions for traces that may have already been released from memory. By default, the size is 0 and the cache is inactive.
  • sample_on_first_match: Make decision as soon as a policy matches
  • drop_pending_traces_on_shutdown: Drop pending traces on shutdown instead of making a decision with the partial data already ingested.
  • maximum_trace_size_bytes: The maximum size a trace can reach, in bytes. Traces larger than this size are immediately dropped from the tail sampling processor in order to protect the system.
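The optional settings above can be combined as in the following sketch. Values marked "default" are taken from the list above; every other value, the boolean form of sample_on_first_match and drop_pending_traces_on_shutdown, and the policy name are assumptions for illustration only.

```yaml
processors:
  tail_sampling:
    sampling_strategy: trace-complete         # default
    decision_wait: 30s                        # default
    decision_wait_after_root_received: 5s     # assumed value; 0s (default) disables it
    num_traces: 50000                         # default
    expected_new_traces_per_sec: 500          # assumed workload
    decision_cache:
      # Assumed sizes, chosen much greater than num_traces so that
      # decisions outlive the span data they refer to.
      sampled_cache_size: 500000
      non_sampled_cache_size: 500000
    sample_on_first_match: true               # assumed to be a boolean
    drop_pending_traces_on_shutdown: false    # assumed to be a boolean
    maximum_trace_size_bytes: 10485760        # assumed limit of 10 MiB
    policies:
      - name: keep-errors                     # hypothetical policy name
        type: status_code
        status_code: {status_codes: [ERROR]}
```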

Sampling Strategies

The sampling_strategy setting controls both decision timing and what data evaluators use:
  • trace-complete (default): evaluates on the timer path using accumulated trace data (after decision_wait, or earlier after root arrival when decision_wait_after_root_received is set). This is the most flexible mode for policies, but with later decisions and higher in-memory/storage pressure.
  • span-ingest: evaluates each incoming batch at ingest time without re-evaluating previously ingested batches. Terminal outcomes (sampled/dropped) finalize immediately; non-terminal outcomes stay pending and are finalized as not sampled during cleanup without policy re-evaluation.
Quick comparison:
  • Policy compatibility: trace-complete supports stateful policies; span-ingest rejects them.
  • Timer controls: in trace-complete, decision_wait and decision_wait_after_root_received affect decision timing; in span-ingest, they affect pending cleanup/finalization timing.
  • Late spans: decision caches remain important in both modes for spans that arrive after in-memory trace data is gone.
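A minimal sketch of a span-ingest configuration follows. This page does not list which policies count as stateful, so the example assumes that status_code is accepted in this mode; the policy name and cache sizes are likewise assumptions.

```yaml
processors:
  tail_sampling:
    sampling_strategy: span-ingest
    # In this mode decision_wait only controls when still-pending traces
    # are finalized as not sampled during cleanup.
    decision_wait: 10s
    decision_cache:
      # Caches keep decisions available for spans that arrive after
      # the in-memory trace data is gone.
      sampled_cache_size: 100000
      non_sampled_cache_size: 100000
    policies:
      - name: keep-errors                     # hypothetical policy name
        type: status_code                     # assumed to be allowed in span-ingest
        status_code: {status_codes: [ERROR]}
```

With a configuration along these lines, a batch containing an ERROR span would be expected to finalize its trace as sampled at ingest, while traces that never match remain pending until cleanup.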

Policy Decision Flow

Each policy will result in a decision, and the processor will evaluate them to make a final decision:
  • When there’s a “drop” decision, the trace is not sampled;
  • When there’s an “inverted not sample” decision, the trace is not sampled (deprecated);
  • When there’s a “sample” decision, the trace is sampled;
  • When there’s an “inverted sample” decision and no “not sample” decisions, the trace is sampled (deprecated);
  • In all other cases, the trace is NOT sampled
An “inverted” decision is one made based on the “invert_match” attribute, such as the one from the string, numeric or boolean tag policy. There is one exception: if the policy is within an and or composite policy, the resulting decision will be either sampled or not sampled. The “inverted” decisions have been deprecated; please make use of either
  • the drop policy to explicitly not sample select traces, or
  • the not policy to sample based on the opposite of the sampling decision of a policy (e.g., if a policy returns a ā€œsampleā€ decision -> not returns a ā€œnot sampleā€ decision)
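The two replacements for invert_match could look like the sketch below, which follows the not and drop schemas used elsewhere on this page. The policy names, attribute keys, and attribute values are hypothetical.

```yaml
processors:
  tail_sampling:
    policies:
      # Instead of string_attribute with invert_match: true, wrap the plain
      # match in a not policy to sample every trace that does NOT match.
      - name: not-internal-services           # hypothetical name
        type: not
        not:
          not_sub_policy:
            name: internal-services           # hypothetical name
            type: string_attribute
            string_attribute: {key: service.name, values: [internal-batch]}
      # To discard matching traces regardless of other policies, use drop.
      - name: drop-health-checks              # hypothetical name
        type: drop
        drop:
          drop_sub_policy:
            - name: health-routes             # hypothetical name
              type: string_attribute
              string_attribute: {key: http.route, values: [/health]}
```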
Examples:
processors:
  tail_sampling:
    decision_wait: 10s
    num_traces: 100
    expected_new_traces_per_sec: 10
    decision_cache:
      sampled_cache_size: 100_000
      non_sampled_cache_size: 100_000
    policies:
      [
          {
            name: test-policy-1,
            type: always_sample
          },
          {
            name: test-policy-2,
            type: latency,
            latency: {threshold_ms: 5000, upper_threshold_ms: 10000}
          },
          {
            name: test-policy-3,
            type: numeric_attribute,
            numeric_attribute: {key: key1, min_value: 50, max_value: 100}
          },
          {
            name: test-policy-4,
            type: probabilistic,
            probabilistic: {sampling_percentage: 10}
          },
          {
            name: test-policy-5,
            type: status_code,
            status_code: {status_codes: [ERROR, UNSET]}
          },
          {
            name: test-policy-6,
            type: string_attribute,
            string_attribute: {key: key2, values: [value1, value2]}
          },
          {
            name: test-policy-7,
            type: string_attribute,
            string_attribute: {key: key2, values: [value1, val*], enabled_regex_matching: true, cache_max_size: 10}
          },
          {
            name: test-policy-8,
            type: rate_limiting,
            rate_limiting: {spans_per_second: 35}
         },
         {
            name: test-policy-9,
            type: bytes_limiting,
            bytes_limiting: {bytes_per_second: 1024000, burst_capacity: 2048000}
         },
         {
            name: test-policy-10,
            type: span_count,
            span_count: {min_spans: 2, max_spans: 20}
         },
         {
             name: test-policy-11,
             type: trace_state,
             trace_state: { key: key3, values: [value1, value2] }
         },
         {
              name: test-policy-12,
              type: boolean_attribute,
              boolean_attribute: {key: key4, value: true}
         },
         {
              name: test-policy-13,
              type: ottl_condition,
              ottl_condition: {
                   error_mode: ignore,
                   span: [
                        "attributes[\"test_attr_key_1\"] == \"test_attr_val_1\"",
                        "attributes[\"test_attr_key_2\"] != \"test_attr_val_1\"",
                   ],
                   spanevent: [
                        "name != \"test_span_event_name\"",
                        "attributes[\"test_event_attr_key_2\"] != \"test_event_attr_val_1\"",
                   ]
              }
         },
         {
            name: and-policy-1,
            type: and,
            and: {
              and_sub_policy:
              [
                {
                  name: test-and-policy-1,
                  type: numeric_attribute,
                  numeric_attribute: { key: key1, min_value: 50, max_value: 100 }
                },
                {
                    name: test-and-policy-2,
                    type: string_attribute,
                    string_attribute: { key: key2, values: [ value1, value2 ] }
                },
              ]
            }
         },
         {
            name: not-policy-1,
            type: not,
            not: {
              not_sub_policy: {
                name: test-not-policy-1,
                type: latency,
                latency: { threshold_ms: 1000 }
              }
            }
         },
         {
            name: drop-policy-1,
            type: drop,
            drop: {
              drop_sub_policy:
              [
                {
                    name: test-drop-policy-1,
                    type: string_attribute,
                    string_attribute: {key: url.path, values: [\/health, \/metrics], enabled_regex_matching: true}
                }
              ]
            }
         },
         {
            name: composite-policy-1,
            type: composite,
            composite:
              {
                max_total_spans_per_second: 1000,
                policy_order: [test-composite-policy-1, test-composite-policy-2, test-composite-policy-3],
                composite_sub_policy:
                  [
                    {
                      name: test-composite-policy-1,
                      type: numeric_attribute,
                      numeric_attribute: {key: key1, min_value: 50}
                    },
                    {
                      name: test-composite-policy-2,
                      type: string_attribute,
                      string_attribute: {key: key2, values: [value1, value2]}
                    },
                    {
                      name: test-composite-policy-3,
                      type: always_sample
                    }
                  ],
                rate_allocation:
                  [
                    {
                      policy: test-composite-policy-1,
                      percent: 50
                    },
                    {
                      policy: test-composite-policy-2,
                      percent: 25
                    }
                  ]
              }
          },
        ]
Refer to tail_sampling_config.yaml for detailed examples on using the processor.

Bytes Limiting Policy

The bytes_limiting policy uses a token bucket algorithm implemented by golang.org/x/time/rate to control the rate of data throughput based on the accurate protobuf marshaled size of traces calculated using the OpenTelemetry Collector’s built-in ProtoMarshaler.TracesSize() method. This policy is particularly useful for:
  • Volume control: Limiting the total amount of trace data processed per unit time
  • Burst handling: Allowing short-term spikes in data volume while maintaining long-term rate limits
  • Memory protection: Preventing downstream systems from being overwhelmed by large traces

Configuration

The bytes_limiting policy supports the following configuration parameters:
  • bytes_per_second: The sustained rate at which bytes are allowed through (required)
  • burst_capacity: The maximum number of bytes that can be processed in a burst (optional, defaults to 2x bytes_per_second)

Token Bucket Algorithm

The policy implements a token bucket algorithm where:
  1. Tokens represent bytes: Each token in the bucket represents one byte of trace data
  2. Continuous refill: Tokens are added to the bucket at the configured bytes_per_second rate
  3. Burst capacity: The bucket can hold up to burst_capacity tokens for handling traffic bursts
  4. Consumption: When a trace arrives, tokens equal to the trace size are consumed from the bucket
  5. Rejection: If insufficient tokens are available, the trace is not sampled

Example Configuration

processors:
  tail_sampling:
    policies:
      - name: volume-control
        type: bytes_limiting
        bytes_limiting:
          bytes_per_second: 1048576    # 1 MB/second sustained rate
          burst_capacity: 5242880      # 5 MB burst capacity
This configuration allows:
  • A sustained throughput of 1 MB/second (1,048,576 bytes/s)
  • Burst traffic up to 5 MB (5,242,880 bytes) before rate limiting kicks in
  • Smooth handling of variable trace sizes and timing
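As a further sketch, burst_capacity can be left out so that the documented default of 2x bytes_per_second applies. The policy name and the rate are assumptions; the arithmetic in the comments follows from the token bucket steps described above.

```yaml
processors:
  tail_sampling:
    policies:
      - name: volume-control-default-burst    # hypothetical policy name
        type: bytes_limiting
        bytes_limiting:
          bytes_per_second: 524288            # 512 KiB/s sustained rate
          # burst_capacity omitted: defaults to 2 x 524288 = 1048576 bytes (1 MiB).
          # From an empty bucket, a 1 MiB trace needs 1048576 / 524288 = 2 s of refill.
```

Under the algorithm as described, a single trace larger than burst_capacity would never find enough tokens, so the capacity should probably be at least as large as the biggest trace you want to keep.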

A Practical Example

Imagine that you wish to configure the processor to implement the following rules:
  1. Rule 1: Not all teams are ready to move to tail sampling. Therefore, sample all traces that are not from the team team_a.
  2. Rule 2: Sample only 0.1 percent of readiness/liveness probes.
  3. Rule 3: service-1 has a noisy endpoint /v1/name/{id}. Sample only 1 percent of such traces.
  4. Rule 4: Other traces from service-1 should be sampled at 100 percent.
  5. Rule 5: Sample all traces if there is an error in any span in the trace.
  6. Rule 6: Add an escape hatch. If there is an attribute called app.force_sample in the span, then sample the trace at 100 percent.
  7. Rule 7: Force spans with app.do_not_sample set to true to not be sampled, even if the other rules yield a sampling decision.
Here is what the configuration would look like:
tail_sampling:
  decision_wait: 10s
  num_traces: 100
  expected_new_traces_per_sec: 10
  policies: [
      {
        # Rule 1: use always_sample policy for services that don't belong to team_a and are not ready to use tail sampling
        name: backwards-compatibility-policy,
        type: and,
        and:
          {
            and_sub_policy:
              [
                {
                  name: services-using-tail_sampling-policy,
                  type: string_attribute,
                  string_attribute:
                    {
                      key: service.name,
                      values:
                        [
                          list,
                          of,
                          services,
                          using,
                          tail_sampling,
                        ],
                      invert_match: true,
                    },
                },
                { name: sample-all-policy, type: always_sample },
              ],
          },
      },
      # BEGIN: policies for team_a
      {
        # Rule 2: low sampling for readiness/liveness probes
        name: team_a-probe,
        type: and,
        and:
          {
            and_sub_policy:
              [
                {
                  # filter by service name
                  name: service-name-policy,
                  type: string_attribute,
                  string_attribute:
                    {
                      key: service.name,
                      values: [service-1, service-2, service-3],
                    },
                },
                {
                  # filter by route
                  name: route-live-ready-policy,
                  type: string_attribute,
                  string_attribute:
                    {
                      key: http.route,
                      values: [/live, /ready],
                      enabled_regex_matching: true,
                    },
                },
                {
                  # apply probabilistic sampling
                  name: probabilistic-policy,
                  type: probabilistic,
                  probabilistic: { sampling_percentage: 0.1 },
                },
              ],
          },
      },
      {
        # Rule 3: low sampling for a noisy endpoint
        name: team_a-noisy-endpoint-1,
        type: and,
        and:
          {
            and_sub_policy:
              [
                {
                  name: service-name-policy,
                  type: string_attribute,
                  string_attribute:
                    { key: service.name, values: [service-1] },
                },
                {
                  # filter by route
                  name: route-name-policy,
                  type: string_attribute,
                  string_attribute:
                    {
                      key: http.route,
                      values: [/v1/name/.+],
                      enabled_regex_matching: true,
                    },
                },
                {
                  # apply probabilistic sampling
                  name: probabilistic-policy,
                  type: probabilistic,
                  probabilistic: { sampling_percentage: 1 },
                },
              ],
          },
      },
      {
        # Rule 4: high sampling for other endpoints
        name: team_a-service-1,
        type: and,
        and:
          {
            and_sub_policy:
              [
                {
                  name: service-name-policy,
                  type: string_attribute,
                  string_attribute:
                    { key: service.name, values: [service-1] },
                },
                {
                  # invert match - apply to all routes except the ones specified
                  name: route-name-policy,
                  type: string_attribute,
                  string_attribute:
                    {
                      key: http.route,
                      values: [/v1/name/.+],
                      enabled_regex_matching: true,
                      invert_match: true,
                    },
                },
                {
                  # apply probabilistic sampling
                  name: probabilistic-policy,
                  type: probabilistic,
                  probabilistic: { sampling_percentage: 100 },
                },
              ],
          },
      },
      {
        # Rule 5: always sample if there is an error
        name: team_a-status-policy,
        type: and,
        and:
          {
            and_sub_policy:
              [
                {
                  name: service-name-policy,
                  type: string_attribute,
                  string_attribute:
                    {
                      key: service.name,
                      values:
                        [
                          list,
                          of,
                          services,
                          using,
                          tail_sampling,
                        ],
                    },
                },
                {
                  name: trace-status-policy,
                  type: status_code,
                  status_code: { status_codes: [ERROR] },
                },
              ],
          },
      },
      {
        # Rule 6:
        # always sample if the force_sample attribute is set to true
        name: team_a-force-sample,
        type: boolean_attribute,
        boolean_attribute: { key: app.force_sample, value: true },
      },
      {
        # Rule 7:
        # never sample if the do_not_sample attribute is set to true
        name: do-not-sample,
        type: drop,
        drop: {
          drop_sub_policy:
            [
              {
                name: team_a-do-not-sample,
                type: boolean_attribute,
                boolean_attribute: { key: app.do_not_sample, value: true }
              }
            ]
        }
      },
      # END: policies for team_a
    ]

Scaling collectors with the tail sampling processor

This processor requires all spans for a given trace to be sent to the same collector instance for the correct sampling decision to be derived. When scaling the collector, you’ll then need to ensure that all spans for the same trace are reaching the same collector. You can achieve this by having two layers of collectors in your infrastructure: one with the load balancing exporter, and one with the tail sampling processor. While it’s technically possible to have one layer of collectors with two pipelines on each instance, we recommend separating the layers in order to have better failure isolation.
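The two-layer layout can be sketched as two separate collector configurations. The first routes spans by trace ID using the load balancing exporter; the second runs the tail sampling processor. The DNS hostname, the insecure TLS setting, the debug exporter, and the policy name are assumptions for illustration, and the load balancing exporter's own documentation remains the reference for its options.

```yaml
# Layer 1: load-balancing collectors (receive from applications)
receivers:
  otlp:
    protocols:
      grpc:
exporters:
  loadbalancing:
    routing_key: traceID          # keep all spans of a trace on one backend
    protocol:
      otlp:
        tls:
          insecure: true          # assumed; use real TLS settings in production
    resolver:
      dns:
        hostname: tail-sampling-collectors.example.svc.cluster.local  # hypothetical
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
---
# Layer 2: tail-sampling collectors (receive from layer 1)
receivers:
  otlp:
    protocols:
      grpc:
processors:
  tail_sampling:
    policies:
      - name: keep-errors         # hypothetical policy name
        type: status_code
        status_code: {status_codes: [ERROR]}
exporters:
  debug:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [debug]
```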

Probabilistic Sampling Processor compared to the Tail Sampling Processor with the Probabilistic policy

The probabilistic sampling processor and the probabilistic tail sampling processor policy work very similarly: based upon a configurable sampling percentage, they sample a fixed ratio of received traces. Depending on the overall processing pipeline, however, you should prefer one over the other. As a rule of thumb, if you want to add probabilistic sampling and…
  • …you are not using the tail sampling processor already: use the probabilistic sampling processor. Running the probabilistic sampling processor is more efficient than the tail sampling processor. The probabilistic sampling policy makes its decision based upon the trace ID, so waiting until more spans have arrived will not influence its decision.
  • …you are already using the tail sampling processor: add the probabilistic sampling policy. You are already incurring the cost of running the tail sampling processor, so the added cost of the probabilistic policy will be negligible. Additionally, using the policy within the tail sampling processor will ensure traces that are sampled by other policies will not be dropped.

FAQ

Q. Why am I seeing high values for the error metric sampling_trace_dropped_too_early?
A. This is likely a load issue. If the collector is processing more traces in-memory than the num_traces configuration option allows, some will have to be dropped before they can be sampled. Increasing the value of num_traces can help resolve this error, at the expense of increased memory usage.

Monitoring and Tuning

See documentation.md for the full list of metrics available for this component and their descriptions.

Dropped Traces

A circular buffer is used to ensure the number of traces in-memory doesn’t exceed num_traces. When a new trace arrives and the buffer is full, the oldest trace is removed. This can cause a trace to be dropped before it’s sampled. To reduce the chance of this happening, either increase num_traces or decrease decision_wait. Increasing num_traces increases memory usage, while decreasing decision_wait leaves less time for all spans of a trace to arrive before the decision is made.
Number of Traces Dropped
otelcol_processor_tail_sampling_sampling_trace_dropped_too_early
Preemptively Preventing Dropped Traces
A trace is dropped without sampling if it’s removed from the circular buffer before decision_wait. To track how long traces remain in the buffer, use:
otelcol_processor_tail_sampling_sampling_trace_removal_age
It may be useful to calculate low percentiles of this removal age, such as p1, and compare that value to decision_wait. If the value is close to decision_wait, traces are at risk of being dropped if trace volume increases.
Slow Sampling Evaluation
otelcol_processor_tail_sampling_sampling_decision_timer_latency
This measures latency of sampling a batch of traces and passing sampled traces through the remainder of the collector pipeline. A latency exceeding 1 second can delay sampling decisions beyond decision_wait, increasing the chance of traces being dropped before sampling. It’s therefore recommended to consume this component’s output with components that are fast or trigger asynchronous processing.
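One rough way to size num_traces, offered as a rule of thumb and not as guidance from this component's maintainers, is to keep it well above the number of traces expected to arrive within one decision_wait window. The workload figure and the policy name in the sketch below are assumptions.

```yaml
processors:
  tail_sampling:
    decision_wait: 10s
    expected_new_traces_per_sec: 2000         # assumed workload
    # Rough lower bound: 2000 traces/s x 10 s = 20000 traces in flight.
    # 50000 leaves 2.5x headroom for bursts, at the cost of more memory.
    num_traces: 50000
    policies:
      - name: keep-errors                     # hypothetical policy name
        type: status_code
        status_code: {status_codes: [ERROR]}
```

Comparing a low percentile of the removal-age metric against decision_wait, as described above, is a way to check whether the chosen headroom holds up in practice.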

Late-Arriving Spans

A span’s arrival is considered “late” if it arrives after its trace’s sampling decision is made. Late spans can cause different sampling decisions for different parts of the trace. There are three scenarios for late-arriving spans:
  • Scenario 1: While the sampling decision of the trace remains in the circular buffer of num_traces length, the late spans inherit that decision. That means late spans do not influence the trace’s sampling decision.
  • Scenario 2: (Default, no decision cache configured) After the sampling decision is removed from the buffer, it’s as if this component has never seen the trace before: The late spans are buffered for decision_wait seconds and then a new sampling decision is made.
  • Scenario 3: (Decision cache is configured) When a “keep” decision is made on a trace, the trace ID is cached. The component will remember which trace IDs it sampled even after it releases the span data from memory. Unless it has been evicted from the cache after some time, it will remember the same “keep trace” decision.
Occurrences of Scenario 1 where late spans are not sampled can be tracked with the below histogram metric.
otelcol_processor_tail_sampling_sampling_late_span_age
It may also be useful to:
  • Calculate the percentage of spans arriving late with otelcol_processor_tail_sampling_sampling_late_span_age{le="+Inf"} / otelcol_processor_tail_sampling_count_spans_sampled. Note that count_spans_sampled requires enabling the processor.tailsamplingprocessor.metricstatcountspanssampled feature gate.
  • Visualize lateness as a histogram to see how much it can be reduced by increasing decision_wait.

Sampling Decision Frequency

Sampled Frequency
To track the percentage of traces that were actually sampled, use:
otelcol_processor_tail_sampling_global_count_traces_sampled{sampled="true"} /
otelcol_processor_tail_sampling_global_count_traces_sampled
Sampling Policy Decision Frequency
To see how often each policy votes to sample a trace, use:
sum (otelcol_processor_tail_sampling_count_traces_sampled{decision="sampled"}) by (policy) /
sum (otelcol_processor_tail_sampling_count_traces_sampled) by (policy)
As a reminder, a policy voting to sample the trace does not guarantee sampling; an “inverted not sample” or “drop” decision from another policy would still discard the trace.
Drop Policy Decision Frequency
To track how often a drop policy votes to drop a trace, use:
sum (otelcol_processor_tail_sampling_count_traces_sampled{decision="dropped"}) by (policy) /
sum (otelcol_processor_tail_sampling_count_traces_sampled) by (policy)

Tracking sampling policy

To better understand which sampling policy made the decision to include a trace, you can enable tracking the policy responsible for sampling a trace via the processor.tailsamplingprocessor.recordpolicy feature gate. When this feature gate is set, this will add additional attributes on each sampled span:
Attribute | Description | Present?
tailsampling.policy | Records the configured name of the policy that sampled a trace | Always, unless trace was sampled by the decision cache
tailsampling.composite_policy | Records the configured name of a composite subpolicy that sampled a trace | When composite policy used
tailsampling.cached_decision | Records whether a trace was sampled by the decision cache | When decision cache used

Disable invert decisions

The invert sampling decisions (InvertSampled and InvertNotSampled) have been deprecated but are still available. To disable them before their complete removal, you can use the processor.tailsamplingprocessor.disableinvertdecisions feature gate. When this feature gate is set, sampling policy invert_match will result in a Sampled or NotSampled decision instead of InvertSampled or InvertNotSampled. This applies to the string, numeric, and boolean tag policies. If you disable invert decisions, you can make use of a drop policy to explicitly not sample select traces or a not policy to sample based on the opposite of a sampling decision.
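Feature gates such as the ones named in this and the previous section are passed to the collector on its command line through the --feature-gates flag, which takes a comma-separated list. The sketch below shows this as a fragment of a Kubernetes container spec; the container name, image tag, and config path are assumptions.

```yaml
# Fragment of a Kubernetes container spec (hypothetical names and paths)
containers:
  - name: otel-collector
    image: otel/opentelemetry-collector-contrib
    args:
      - --config=/conf/collector.yaml
      # Enable policy recording and disable the deprecated invert decisions.
      - --feature-gates=processor.tailsamplingprocessor.recordpolicy,processor.tailsamplingprocessor.disableinvertdecisions
```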

Policy Evaluation Errors

sampling_policy_evaluation_error

Attributes

Attribute Name | Description | Type | Values
decision | The sampling decision | string | sampled, not_sampled, dropped
policy | Name of the policy | string |
sampled | Whether the sampling decision was sampled or not; false can mean either not sampled or dropped | bool |

Last generated: 2026-04-13