Spanpruning Processor
Available in: contrib
Maintainers: @portertech
Source: opentelemetry-collector-contrib
Overview
The Span Pruning Processor identifies duplicate or similar leaf spans within a single trace, groups them, and replaces each group with a single aggregated summary span. When leaf spans are aggregated, the processor also recursively aggregates their parent spans if all children of those parents are being aggregated.
Leaf spans are spans that are not referenced as a parent by any other span in the trace. They typically represent the last actions in an execution call stack (e.g., individual database queries, HTTP calls to external services).
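The leaf definition above can be sketched in a few lines of Python. This is only an illustration of the rule "never referenced as a parent"; the `Span` class and `find_leaf_spans` function are hypothetical names and not part of the processor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    # Hypothetical, minimal span model used only for this sketch.
    span_id: str
    parent_id: Optional[str]
    name: str


def find_leaf_spans(spans):
    """Return spans whose span_id is never used as another span's parent_id."""
    referenced_parents = {s.parent_id for s in spans if s.parent_id is not None}
    return [s for s in spans if s.span_id not in referenced_parents]


trace = [
    Span("a", None, "root-span"),
    Span("b", "a", "SELECT"),
    Span("c", "a", "SELECT"),
    Span("d", "a", "handler"),
    Span("e", "d", "INSERT"),
]
leaf_names = [s.name for s in find_leaf_spans(trace)]
print(leaf_names)  # ['SELECT', 'SELECT', 'INSERT']
```

If span `e` were missing from the batch, `handler` would look like a leaf, which is why leaf detection depends on seeing the whole trace.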
Spans are grouped by:
- Span name - spans must have the same name
- Span kind - spans must have the same kind (Internal, Server, Client, Producer, Consumer)
- Status code - spans must have the same status (OK, Error, or Unset)
- TraceState - spans must have identical TraceState values (for Consistent Probability Sampling compatibility)
- Configured attributes - spans must have matching values for attributes specified in group_by_attributes
- Parent span name - leaf spans must share the same parent span name to be grouped together
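One way to picture the grouping criteria listed above is as a composite key. The sketch below is a simplified illustration, assuming spans are plain dictionaries and that only exact attribute keys are configured; `group_key`, `group_spans`, and `make_span` are hypothetical helpers, not the processor's internals.

```python
from collections import defaultdict


def group_key(span, group_by_keys):
    # Exact attribute keys only; glob expansion is left out of this sketch.
    attr_part = tuple((k, span["attributes"].get(k)) for k in sorted(group_by_keys))
    return (
        span["name"],
        span["kind"],
        span["status"],
        span["trace_state"],
        attr_part,
        span["parent_name"],
    )


def group_spans(leaves, group_by_keys):
    groups = defaultdict(list)
    for span in leaves:
        groups[group_key(span, group_by_keys)].append(span)
    return dict(groups)


def make_span(status, op, parent_name="handler"):
    # Hypothetical test data builder.
    return {
        "name": "SELECT", "kind": "Client", "status": status,
        "trace_state": "", "attributes": {"db.operation": op},
        "parent_name": parent_name,
    }


leaves = [
    make_span("Ok", "select"),
    make_span("Ok", "select"),
    make_span("Error", "select"),
    make_span("Ok", "select", parent_name="worker"),
]
groups = group_spans(leaves, ["db.operation"])
group_sizes = sorted(len(g) for g in groups.values())
print(group_sizes)  # [1, 1, 2]
```

Only the two OK spans under `handler` share every component of the key; a different status or a different parent name is enough to split a group.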
Parent spans are eligible for aggregation when all of their children are aggregated, they share the same name, kind, and status code, and they are not root spans.
This processor is useful for reducing trace data volume while preserving meaningful information about repeated operations.
Use Cases
- Database query optimization: When an application makes many similar database queries (e.g., N+1 queries), aggregate them into a single summary span
- Batch operations: Consolidate many similar leaf operations into a single representative span
- Cost reduction: Reduce trace storage costs by eliminating redundant span data
Configuration
processors:
  spanpruning:
    # Attributes to use for grouping similar leaf spans (supports glob patterns)
    # Spans with the same name AND same values for matching attributes will be grouped
    # Examples:
    # - "db.*" matches db.operation, db.name, db.statement, etc.
    # - "http.request.*" matches http.request.method, http.request.header, etc.
    # - "db.operation" matches only the exact key "db.operation"
    group_by_attributes:
      - "db.*"
      - "http.method"
    # Minimum number of similar leaf spans required before aggregation
    # Default: 5
    min_spans_to_aggregate: 3
    # Maximum depth of parent span aggregation above leaf spans
    # 0 = only aggregate leaf spans (no parent aggregation)
    # -1 = unlimited depth
    # Default: 1
    max_parent_depth: 1
    # Prefix for aggregation statistics attributes
    # Default: "aggregation."
    aggregation_attribute_prefix: "batch."
    # Upper bounds for histogram buckets (latency distribution)
    # Default: [5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s]
    # Set to empty list to disable histogram attributes
    aggregation_histogram_buckets: [10ms, 50ms, 100ms, 500ms, 1s]
Configuration Options
| Field | Type | Default | Description |
|---|---|---|---|
| group_by_attributes | []string | [] | Attribute patterns for grouping (supports glob patterns like db.*) |
| min_spans_to_aggregate | int | 5 | Minimum group size before aggregation occurs |
| max_parent_depth | int | 1 | Max depth of parent aggregation (0=none, -1=unlimited) |
| aggregation_attribute_prefix | string | "aggregation." | Prefix for aggregation statistics attributes |
| aggregation_histogram_buckets | []time.Duration | [5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s] | Upper bounds for latency histogram buckets |
Glob Pattern Support
The group_by_attributes field supports glob patterns for matching attribute keys:
| Pattern | Matches |
|---|---|
| db.* | db.operation, db.name, db.statement, etc. |
| http.request.* | http.request.method, http.request.header.content-type, etc. |
| rpc.* | rpc.method, rpc.service, rpc.system, etc. |
| db.operation | Only the exact key db.operation |
When multiple attributes match a pattern, they are all included in the grouping key (sorted alphabetically for consistency).
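The matching behavior in the table can be approximated with Python's standard `fnmatch` module. This is a sketch under the assumption that `*` matches any run of characters, including dots, as the `http.request.*` row suggests; the processor's own glob syntax may differ in details such as `?` or bracket handling, and `matching_attribute_keys` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def matching_attribute_keys(attributes, patterns):
    """Return the sorted attribute keys that match at least one glob pattern."""
    return sorted(
        key for key in attributes
        if any(fnmatchcase(key, pattern) for pattern in patterns)
    )


attributes = {
    "db.statement": "SELECT 1",
    "db.operation": "select",
    "http.method": "GET",
    "net.peer.name": "db-1",
}
wildcard_keys = matching_attribute_keys(attributes, ["db.*", "http.method"])
exact_keys = matching_attribute_keys(attributes, ["db.operation"])
print(wildcard_keys)  # ['db.operation', 'db.statement', 'http.method']
print(exact_keys)     # ['db.operation']
```

Sorting the matched keys keeps the grouping key stable regardless of the order in which attributes appear on a span.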
Summary Span
When spans are aggregated, the summary span includes:
Properties
- Name: Original span name (e.g., SELECT)
- TraceID: Same as original spans
- SpanID: Newly generated unique ID
- ParentSpanID: Same as original spans (common parent)
- Kind: Same as the template span (the slowest span in the group)
- StartTimestamp: Earliest start time of all spans in the group
- EndTimestamp: Latest end time of all spans in the group
- Status: Same as original spans (spans are grouped by status code)
- TraceState: Inherited from the template span (preserved for Consistent Probability Sampling compatibility)
- Attributes: Inherited from the slowest span in the group
Note: The summary span’s duration (EndTimestamp - StartTimestamp) represents the total time window covered by all aggregated spans, which may exceed duration_max_ns. For example, if spans overlap or are staggered, the time range can be larger than any individual span’s duration. Use duration_max_ns to find the slowest individual operation.
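The difference between the summary span's time window and duration_max_ns is easy to see with staggered spans. The following sketch uses made-up timestamps and a hypothetical `summarize_window` helper.

```python
def summarize_window(spans_ns):
    """spans_ns is a list of (start_ns, end_ns) pairs for one group."""
    start = min(s for s, _ in spans_ns)
    end = max(e for _, e in spans_ns)
    duration_max = max(e - s for s, e in spans_ns)
    return {"window_ns": end - start, "duration_max_ns": duration_max}


MS = 1_000_000  # nanoseconds per millisecond
staggered = [(0, 10 * MS), (20 * MS, 35 * MS), (40 * MS, 52 * MS)]
result = summarize_window(staggered)
print(result)  # {'window_ns': 52000000, 'duration_max_ns': 15000000}
```

Here the summary span would cover 52ms even though the slowest individual span took 15ms.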
What Gets Aggregated Away
When spans are aggregated into a summary span, the following data from non-template spans is lost:
| Data | Behavior |
|---|---|
| Span Events | Only events from the template (slowest) span are preserved |
| Span Links | Only links from the template span are preserved |
| Attributes | Non-matching attribute values are lost |
| Individual Timestamps | Original start/end times replaced by the group’s time range |
| SpanIDs | Original SpanIDs are replaced by a single summary SpanID |
Aggregation Attributes
The following attributes are added to the summary span (shown with default aggregation_attribute_prefix: "aggregation."):
| Attribute | Type | Description |
|---|---|---|
| <prefix>is_summary | bool | Always true to identify summary spans |
| <prefix>span_count | int64 | Number of spans that were aggregated |
| <prefix>duration_min_ns | int64 | Minimum duration in nanoseconds |
| <prefix>duration_max_ns | int64 | Maximum duration in nanoseconds |
| <prefix>duration_avg_ns | int64 | Average duration in nanoseconds |
| <prefix>duration_total_ns | int64 | Total duration in nanoseconds |
| <prefix>histogram_bucket_bounds_s | []float64 | Bucket upper bounds in seconds (excludes +Inf) |
| <prefix>histogram_bucket_counts | []int64 | Cumulative count per bucket (includes +Inf bucket) |
Histogram Buckets
When aggregation_histogram_buckets is configured, summary spans include latency distribution data as cumulative histogram buckets. Cumulative means each bucket count includes all spans with duration less than or equal to that bucket boundary.
Worked example with buckets [10ms, 50ms, 100ms] and span durations [5ms, 15ms, 25ms, 75ms, 150ms]:
histogram_bucket_bounds_s: [0.01, 0.05, 0.1]
histogram_bucket_counts: [1, 3, 4, 5]
- Bucket 0 (<=10ms): 1 span (5ms)
- Bucket 1 (<=50ms): 3 spans (5ms, 15ms, 25ms)
- Bucket 2 (<=100ms): 4 spans (5ms, 15ms, 25ms, 75ms)
- Bucket 3 (+Inf): 5 spans (all spans)
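The worked example above can be reproduced with a short script. This is a sketch of cumulative bucket counting in general, not the processor's code; `cumulative_bucket_counts` is a hypothetical helper and works in milliseconds for readability.

```python
from bisect import bisect_right


def cumulative_bucket_counts(durations_ms, bounds_ms):
    """Return one cumulative count per bound, plus a final +Inf bucket."""
    ordered = sorted(durations_ms)
    # bisect_right gives the number of durations <= bound.
    counts = [bisect_right(ordered, bound) for bound in bounds_ms]
    counts.append(len(ordered))  # +Inf bucket holds every span
    return counts


counts = cumulative_bucket_counts([5, 15, 25, 75, 150], [10, 50, 100])
print(counts)  # [1, 3, 4, 5]
```

Because the counts are cumulative, the number of spans in a single range is the difference between neighbors, for example 3 - 1 = 2 spans between 10ms and 50ms.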
Pipeline Placement
This processor works best when placed after processors that assemble complete traces, because leaf detection depends on seeing every span in a trace:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [groupbytrace, spanpruning, batch]
      exporters: [otlp]
Or with tail sampling:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, spanpruning, batch]
      exporters: [otlp]
Example
Basic Example
A trace with repeated database queries (some failing):
Before Processing:
root-span (parent)
├── SELECT (leaf) - duration: 10ms, db.operation: select, status: OK
├── SELECT (leaf) - duration: 15ms, db.operation: select, status: OK
├── SELECT (leaf) - duration: 12ms, db.operation: select, status: OK
├── SELECT (leaf) - duration: 50ms, db.operation: select, status: Error
├── SELECT (leaf) - duration: 45ms, db.operation: select, status: Error
└── INSERT (leaf) - duration: 20ms, db.operation: insert, status: OK
After Processing (with min_spans_to_aggregate: 2):
root-span (parent)
├── SELECT (summary, status: OK)
│ - aggregation.is_summary: true
│ - aggregation.span_count: 3
│ - aggregation.duration_min_ns: 10000000
│ - aggregation.duration_max_ns: 15000000
│ - aggregation.duration_avg_ns: 12333333
├── SELECT (summary, status: Error)
│ - aggregation.is_summary: true
│ - aggregation.span_count: 2
│ - aggregation.duration_min_ns: 45000000
│ - aggregation.duration_max_ns: 50000000
│ - aggregation.duration_avg_ns: 47500000
└── INSERT (unchanged - only 1 span, below threshold)
Note: Spans with different status codes are grouped separately, preserving error information.
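The statistics shown in this example can be checked with a small calculation. The sketch below assumes the average is truncated to an integer, which is consistent with the 12333333 value above but is not confirmed as the processor's rounding rule; `aggregation_stats` is a hypothetical helper.

```python
def aggregation_stats(durations_ns):
    """Compute summary statistics for one group of span durations."""
    total = sum(durations_ns)
    return {
        "span_count": len(durations_ns),
        "duration_min_ns": min(durations_ns),
        "duration_max_ns": max(durations_ns),
        # Assumption: integer (truncating) division for the average.
        "duration_avg_ns": total // len(durations_ns),
        "duration_total_ns": total,
    }


MS = 1_000_000  # nanoseconds per millisecond
ok_stats = aggregation_stats([10 * MS, 15 * MS, 12 * MS])
error_stats = aggregation_stats([50 * MS, 45 * MS])
print(ok_stats["duration_avg_ns"])     # 12333333
print(error_stats["duration_avg_ns"])  # 47500000
```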
Recursive Parent Aggregation Example
When spans are aggregated, the processor also checks if their parent spans can be aggregated. Parent spans are eligible for aggregation when:
- All of their children are being aggregated
- They share the same name, kind, and status code with other eligible parents
- They are not root spans (must have a parent)
- At least 2 parents meet the criteria
Before Processing (with min_spans_to_aggregate: 2, group_by_attributes: ["db.op"]):
root
├── handler (status: OK)
│ └── SELECT (db.op=select, status: OK) ───┐
├── handler (status: OK) │ leaf group A: 3 OK SELECTs
│ └── SELECT (db.op=select, status: OK) ───┤
├── handler (status: OK) │
│ └── SELECT (db.op=select, status: OK) ───┘
├── handler (status: Error)
│ └── SELECT (db.op=select, status: Error) ┐ leaf group B: 2 Error SELECTs
├── handler (status: Error) │
│ └── SELECT (db.op=select, status: Error) ┘
├── handler (status: OK)
│ └── INSERT (db.op=insert, status: OK) ──── only 1, below threshold
└── worker (status: OK)
└── SELECT (db.op=select, status: OK) ──── different parent name
After Processing:
root
├── handler (summary, status: OK, span_count: 3)
│ └── SELECT (summary, status: OK, span_count: 3)
├── handler (summary, status: Error, span_count: 2)
│ └── SELECT (summary, status: Error, span_count: 2)
├── handler (status: OK)
│ └── INSERT (status: OK) ─────────────────────────── unchanged
└── worker (status: OK)
└── SELECT (status: OK) ─────────────────────────── unchanged
Why each span was handled this way:
| Span | Result | Reason |
|---|---|---|
| 3x handler (OK) with SELECT children | Aggregated | All children aggregated, same name+kind+status |
| 3x SELECT (OK) under handler | Aggregated | Same name + kind + status + attributes + parent name |
| 2x handler (Error) with SELECT children | Aggregated | All children aggregated, same name+kind+status |
| 2x SELECT (Error) under handler | Aggregated | Same name + kind + status + attributes + parent name |
| handler (OK) with INSERT child | Unchanged | Child not aggregated (only 1 INSERT) |
| INSERT (OK) | Unchanged | Below threshold (only 1 span) |
| worker (OK) | Unchanged | Child not aggregated |
| SELECT (OK) under worker | Unchanged | Different parent name than other SELECTs |
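The decisions in this table can be reproduced with a simplified simulation. The sketch below is an illustration under several assumptions: span kind and TraceState are omitted because they are identical in this example, only one level of parent aggregation is modeled, and parents are grouped by name and status alone. `SPANS` and `plan_aggregation` are hypothetical names.

```python
from collections import defaultdict

# (span_id, parent_id, name, status, db.op or None)
SPANS = [
    ("root", None, "root", "OK", None),
    ("h1", "root", "handler", "OK", None),
    ("s1", "h1", "SELECT", "OK", "select"),
    ("h2", "root", "handler", "OK", None),
    ("s2", "h2", "SELECT", "OK", "select"),
    ("h3", "root", "handler", "OK", None),
    ("s3", "h3", "SELECT", "OK", "select"),
    ("h4", "root", "handler", "Error", None),
    ("s4", "h4", "SELECT", "Error", "select"),
    ("h5", "root", "handler", "Error", None),
    ("s5", "h5", "SELECT", "Error", "select"),
    ("h6", "root", "handler", "OK", None),
    ("i1", "h6", "INSERT", "OK", "insert"),
    ("w1", "root", "worker", "OK", None),
    ("s6", "w1", "SELECT", "OK", "select"),
]


def plan_aggregation(spans, min_spans=2):
    by_id = {s[0]: s for s in spans}
    children = defaultdict(list)
    for span_id, parent_id, *_ in spans:
        if parent_id is not None:
            children[parent_id].append(span_id)

    # Step 1: group leaf spans by name, status, attribute, and parent name.
    leaf_groups = defaultdict(list)
    for span_id, parent_id, name, status, db_op in spans:
        if span_id not in children and parent_id is not None:
            key = (name, status, db_op, by_id[parent_id][2])
            leaf_groups[key].append(span_id)
    leaf_groups = {k: v for k, v in leaf_groups.items() if len(v) >= min_spans}
    aggregated = {sid for ids in leaf_groups.values() for sid in ids}

    # Step 2: group non-root parents whose children are all aggregated.
    parent_groups = defaultdict(list)
    for span_id, kids in children.items():
        _, parent_id, name, status, _ = by_id[span_id]
        if parent_id is not None and all(k in aggregated for k in kids):
            parent_groups[(name, status)].append(span_id)
    parent_groups = {k: v for k, v in parent_groups.items() if len(v) >= 2}
    return leaf_groups, parent_groups


leaf_groups, parent_groups = plan_aggregation(SPANS)
print({k: len(v) for k, v in leaf_groups.items()})
print(parent_groups)
```

The run yields two leaf groups (3 OK and 2 Error SELECTs under handler) and two parent groups (`h1`-`h3` and `h4`-`h5`), while `h6`, `w1`, `i1`, and `s6` are left untouched.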
Limitations
- Requires complete traces for accurate leaf detection
- Summary span inherits attributes from the slowest span in the group
- Parent spans are only aggregated when ALL their children are aggregated
Consistent Probability Sampling (CPS) Compatibility
The processor is designed to be compatible with Consistent Probability Sampling (CPS). CPS uses TraceState to carry sampling metadata (ot=th:...;rv:...) where:
- th (threshold) indicates the sampling probability threshold
- rv (randomness value) provides consistent randomness for sampling decisions
Why TraceState matters for aggregation:
Spans with different TraceState values represent different sampling populations with different “adjusted counts” (weights). Aggregating them together would produce statistically incorrect summaries and break downstream sampling decisions.
The processor uses exact TraceState matching (not just the th value) because:
- The rv value affects sampling decisions
- Vendor-specific keys may have semantic meaning
- Key ordering may be significant
Telemetry
The processor emits the following metrics to help monitor its operation:
Counters
| Metric | Description |
|---|---|
| otelcol_processor_spanpruning_spans_received | Total number of spans received by the processor |
| otelcol_processor_spanpruning_spans_pruned | Total number of spans removed by aggregation |
| otelcol_processor_spanpruning_aggregations_created | Total number of aggregation summary spans created |
| otelcol_processor_spanpruning_traces_processed | Total number of traces processed |
Histograms
| Metric | Description |
|---|---|
| otelcol_processor_spanpruning_aggregation_group_size | Distribution of the number of spans per aggregation group |
| otelcol_processor_spanpruning_processing_duration | Time taken to process each batch of traces (in seconds) |
These metrics can be used to:
- Monitor the effectiveness of span pruning (compare spans_received vs spans_pruned)
- Track the compression ratio achieved by aggregation
- Identify processing bottlenecks via processing_duration
- Understand aggregation patterns via aggregation_group_size
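As a rough illustration of the first two points, the counters can be combined into simple effectiveness figures. The sketch below uses made-up counter values and assumes that each summary span is a new span added to the output on top of the spans that were not pruned; `pruning_summary` is a hypothetical helper.

```python
def pruning_summary(spans_received, spans_pruned, aggregations_created):
    """Derive simple effectiveness figures from three counter values."""
    if spans_received == 0:
        return {"pruned_fraction": 0.0, "net_spans_out": 0}
    return {
        "pruned_fraction": spans_pruned / spans_received,
        # Assumption: each summary span is a new span added to the output.
        "net_spans_out": spans_received - spans_pruned + aggregations_created,
    }


summary = pruning_summary(
    spans_received=10_000, spans_pruned=6_000, aggregations_created=500
)
print(summary)  # {'pruned_fraction': 0.6, 'net_spans_out': 4500}
```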
Configuration
Example Configuration
spanpruning:
  group_by_attributes:
    - "db.operation"
  min_spans_to_aggregate: 5
  aggregation_attribute_prefix: "aggregation."
spanpruning/custom:
  group_by_attributes:
    - "db.operation"
    - "db.name"
  min_spans_to_aggregate: 3
  aggregation_attribute_prefix: "batch."
  aggregation_histogram_buckets:
    - "10ms"
    - "50ms"
    - "100ms"
    - "500ms"
    - "1s"
Last generated: 2026-04-20