
Spanmetrics Connector

Status
  • Available in: contrib
  • Maintainers: @portertech, @Frapschen, @iblancasa
  • Source: opentelemetry-collector-contrib

Overview

⚠️ Breaking Change Warning: The default duration metrics unit will change from ms to s to adhere to the OpenTelemetry semantic conventions, controlled by the new feature gate connector.spanmetrics.useSecondAsDefaultMetricsUnit. Currently the feature gate is disabled by default, so the unit remains ms. After one release cycle, the feature gate will be enabled by default and the unit will switch to s.

Aggregates Request, Error and Duration (R.E.D) OpenTelemetry metrics from span data. Request counts are computed as the number of spans seen per unique set of dimensions, including Errors. Multiple metrics can be aggregated if, for instance, a user wishes to view call counts just on service.name and span.name.
traces.span.metrics.calls{service.name="shipping",span.name="get_shipping/{shippingId}",span.kind="SERVER",status.code="Ok"}
Error counts are computed from the Request counts which have an Error Status Code metric dimension.
traces.span.metrics.calls{service.name="shipping",span.name="get_shipping/{shippingId}",span.kind="SERVER",status.code="Error"}
Duration is computed from the difference between the span start and end times and inserted into the relevant duration histogram time bucket for each unique set of dimensions.
traces.span.metrics.duration{service.name="shipping",span.name="get_shipping/{shippingId}",span.kind="SERVER",status.code="Ok"}
Each metric will have at least the following dimensions because they are common across all spans:
  • service.name
  • span.name
  • span.kind
  • status.code (or otel.status_code when the spanmetrics.statusCodeConvention.useOtelPrefix feature gate is enabled)
  • collector.instance.id
The collector.instance.id dimension adds a unique UUID to all metrics, ensuring that the spanmetrics connector does not violate the Single Writer Principle when it is used in a multi-deployment model. Currently, collector.instance.id must be manually enabled via the feature gate connector.spanmetrics.includeCollectorInstanceID. For more detail, see Known Limitation: the Single Writer Principle.
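Collector feature gates are toggled on the command line with the --feature-gates flag; for example, assuming the contrib distribution binary otelcol-contrib:

```shell
# Enable the collector.instance.id dimension on generated span metrics.
# --feature-gates takes a comma-separated list of gate identifiers.
otelcol-contrib --config=config.yaml \
  --feature-gates=connector.spanmetrics.includeCollectorInstanceID
```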

Span to Metrics processor to Span to metrics connector

The spanmetrics connector replaces the spanmetrics processor, with multiple improvements and breaking changes. This was done to bring the component closer to the OpenTelemetry specification and to make it agnostic to exporter logic. The spanmetrics processor effectively mixed OTel and Prometheus conventions by using the OTel data model together with the Prometheus metric and attribute naming conventions. The following changes were made to the connector component. Breaking changes:
  • The operation metric attribute was renamed to span.name.
  • The latency histogram metric name was changed to duration.
  • The _total metric prefix was dropped from generated metrics names.
  • The Prometheus-specific metrics labels sanitization was dropped.
Improvements:
  • Added support for OTel exponential histograms for recording span duration measurements.
  • Added support for the milliseconds and seconds histogram units.
  • Added support for generating metrics resource scope attributes. The spanmetrics connector generates one metrics resource scope for each resource scope of the incoming spans, so more metrics are generated than before. Previously, spanmetrics generated a single metrics resource scope.

Configurations

If you are not already familiar with connectors, you may find it helpful to first visit the Connectors README. The following settings can be optionally configured:
  • histogram (default: explicit): Configures the type of histogram used to record span duration measurements. Must be either explicit or exponential.
    • disable (default: false): Disable all histogram metrics.
    • unit (default: ms): The time unit for recording duration measurements. One of: ms or s.
    • dimensions: additional attributes to add as dimensions to the traces.span.metrics.duration metric, which will be included on top of the common and configured dimensions for span attributes and resource attributes.
    • explicit:
      • buckets: the list of durations defining the duration histogram time buckets. Default buckets: [2ms, 4ms, 6ms, 8ms, 10ms, 50ms, 100ms, 200ms, 400ms, 800ms, 1s, 1400ms, 2s, 5s, 10s, 15s]
    • exponential:
      • max_size (default: 160) the maximum number of buckets per positive or negative number range.
  • dimensions: the list of dimensions to add to the traces.span.metrics.calls, traces.span.metrics.duration and traces.span.metrics.events metrics, in addition to the default dimensions defined above. Each additional dimension is defined with a name, which is looked up in the span’s attributes or resource attributes (also known as process tags), such as ip, host.name or region. If the named attribute is missing in the span, the optionally provided default is used. If no default is provided, this dimension will be omitted from the metric.
  • calls_dimensions: additional attributes to add as dimensions to the traces.span.metrics.calls metric, which will be included on top of the common and configured dimensions for span attributes and resource attributes.
  • exclude_dimensions: the list of dimensions to be excluded from the default set of dimensions. Use to exclude unneeded data from metrics.
  • dimensions_cache_size: this setting is deprecated, please use aggregation_cardinality_limit instead.
  • include_instrumentation_scope: a list of instrumentation scope names to include from the traces.
  • resource_metrics_cache_size (default: 1000): the size of the cache holding metrics for a service. This is mostly relevant for cumulative temporality to avoid memory leaks and correct metric timestamp resets.
  • aggregation_temporality (default: AGGREGATION_TEMPORALITY_CUMULATIVE): Defines the aggregation temporality of the generated metrics. One of either AGGREGATION_TEMPORALITY_CUMULATIVE or AGGREGATION_TEMPORALITY_DELTA.
  • namespace (default: traces.span.metrics): Defines the namespace of the generated metrics. If a namespace is provided, generated metric names are prefixed with the namespace followed by a dot.
  • metrics_flush_interval (default: 60s): Defines the flush interval of the generated metrics.
  • metrics_expiration (default: 0): Defines the expiration time as time.Duration, after which, if no new spans are received, metrics will no longer be exported. Setting to 0 means the metrics will never expire (default behavior).
  • metric_timestamp_cache_size (default 1000): Only relevant for delta temporality span metrics. Controls the size of the cache used to keep track of a metric’s TimestampUnixNano the last time it was flushed. When a metric is evicted from the cache, its next data point will indicate a “reset” in the series. Downstream components converting from delta to cumulative, like prometheusexporter, may handle these resets by setting cumulative counters back to 0.
  • exemplars: Use to configure how to attach exemplars to metrics.
    • enabled (default: false): enabling will add spans as Exemplars to all metrics. Exemplars are only kept for one flush interval.
    • max_per_data_point (default: 5): The maximum number of exemplars to attach to a single metric data point.
  • events: Use to configure the events metric.
    • enabled: (default: false): enabling will add the events metric.
    • dimensions: (mandatory if enabled) the list of the span’s event attributes to add as dimensions to the traces.span.metrics.events metric, which will be included on top of the common and configured dimensions for span attributes and resource attributes.
  • resource_metrics_key_attributes: Filters the resource attributes used to produce the resource metrics key map hash (the attributes are only used to build the hash key, not copied to the metrics resource attributes). Use this in case changing resource attributes (e.g. process ID) are breaking counter metrics.
  • aggregation_cardinality_limit (default: 0): Defines the maximum number of unique combinations of dimensions that will be tracked for metrics aggregation. When the limit is reached, additional unique combinations will be dropped but registered under a new entry with otel.metric.overflow="true". A value of 0 means no limit is applied.
  • add_resource_attributes (default: false): Add the resource attributes to the resulting metrics. This option enables the old behavior before the connector.spanmetrics.excludeResourceMetrics feature gate was introduced. When set to true, resource attributes will be included in the metrics even if the feature gate is enabled. See GitHub issue #42103 for more context.
  • enable_metrics_sampling_method (default: false): When enabled, adds the sampling.method attribute to metrics with value "extrapolated" (when the span has a valid tracestate sampling threshold) or "counted" (otherwise).
The feature gate connector.spanmetrics.legacyMetricNames (disabled by default) makes the connector use legacy metric names.
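For instance, the sampling-method attribute described above is off by default and can be enabled with a minimal configuration sketch like the following:

```yaml
connectors:
  spanmetrics:
    # Adds a sampling.method attribute to generated metrics:
    # "extrapolated" when the span's tracestate carries a valid sampling
    # threshold, "counted" otherwise.
    enable_metrics_sampling_method: true
```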

Examples

The following is a simple example usage of the spanmetrics connector. For configuration examples covering other use cases, please refer to More Examples. The full list of settings exposed for this connector is documented in spanmetricsconnector/config.go.
receivers:
  nop:

exporters:
  nop:

connectors:
  spanmetrics:
    histogram:
      dimensions:
        - name: url.scheme
          default: https
      explicit:
        buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]  
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
    calls_dimensions:
      - name: http.url
        default: /ping
    exemplars:
      enabled: true
    exclude_dimensions: ['status.code']
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"    
    metrics_flush_interval: 15s
    metrics_expiration: 5m
    events:
      enabled: true
      dimensions:
        - name: exception.type
        - name: exception.message
    resource_metrics_key_attributes:
      - service.name
      - telemetry.sdk.language
      - telemetry.sdk.name
    include_instrumentation_scope:
      - express

service:
  pipelines:
    traces:
      receivers: [nop]
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [nop]

Using spanmetrics with Prometheus components

The spanmetrics connector can be used with Prometheus exporter components. For some exporter functionality, e.g. generation of the target_info metric, the incoming spans’ resource attributes must contain the service.name and service.instance.id attributes. Let’s look at an example of using the spanmetrics connector with the prometheusremotewrite exporter:
receivers:
  otlp:
    protocols:
      http:
      grpc:

exporters:
  prometheusremotewrite:
    endpoint: http://localhost:9090/api/v1/write
    target_info:
      enabled: true

connectors:
  spanmetrics:
    namespace: span.metrics

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheusremotewrite]
This configures the spanmetrics connector to generate metrics from received spans and export them via the Prometheus Remote Write exporter. The target_info metric will be generated for each resource scope, while OpenTelemetry metric names and attributes will be normalized to comply with Prometheus naming rules. For example, the generated calls OTel sum metric can result in multiple Prometheus calls_total (counter type) time series alongside the target_info time series:
target_info{job="shippingservice", instance="...", ...} 1
calls_total{span_name="/Address", service_name="shippingservice", span_kind="SPAN_KIND_SERVER", status_code="STATUS_CODE_UNSET", ...} 142

More Examples

For more example configuration covering various other use cases, please visit the testdata directory.

Known Limitation: the Single Writer Principle

Proper configuration of the spanmetricsconnector ensures compliance with the Single Writer Principle, which is a core requirement in the OpenTelemetry metrics data model. Misconfiguration, however, may allow multiple components to write to the same metric stream, resulting in data inconsistency, metric conflicts, or the dropping of time series by metric backends.

Why this happens

This issue typically arises when:
  • Multiple pipelines use the same instance of the spanmetricsconnector
  • The connector is instantiated more than once without ensuring the resulting metric streams are distinct
  • The resource_metrics_key_attributes field is not configured correctly or includes common/shared attributes across all instances

Recommendations

To reduce the risk of conflicting writes:
  • Add resource_metrics_key_attributes to your configuration.
connectors:
  spanmetrics:
    resource_metrics_key_attributes:
      - service.name
      - telemetry.sdk.language
      - telemetry.sdk.name
  • Manually enable the feature gate: connector.spanmetrics.includeCollectorInstanceID to produce uniquely identified metrics.
  • For exporters like Prometheus, which rely on the single writer assumption, use a dedicated pipeline with a single spanmetricsconnector instance
More context is available in GitHub issue #21101.

About resource_metrics_key_attributes

The resource_metrics_key_attributes setting is used to build the key map that determines how metrics are grouped. If this field is left empty, the connector will use all available attributes to compute the resource metric hash. To avoid problems, be cautious when choosing which attributes to include. Avoid attributes that:
  • Change frequently – such as request_id, timestamp, or trace_id. These increase cardinality and create excessive metric streams.
  • Are shared across all sources – values like true, default, or team:backend offer no uniqueness and can lead to multiple writers sharing the same stream.
  • Are optional or inconsistently applied – if an attribute is only present in some spans, this can fragment metric streams (e.g., one stream with the attribute and one without).
Instead, use attributes that are stable, present in all spans, and meaningfully distinguish each stream. Good examples include cluster_id, region, or deployment_environment.

Troubleshooting span metrics high cardinality

High cardinality issues in span metrics commonly manifest in APM dashboards as an excessive number of service operations with non-unique names. Examples include URIs with unique identifiers (e.g., GET /product/1YMWWN1N4O) or HTTP parameters with random values (e.g., GET /?_ga=GA1.2.569539246.1760114706). These patterns render operation lists difficult to interpret and ineffective for monitoring purposes.

This issue stems from violations of OpenTelemetry semantic conventions, which require span names to have low cardinality (e.g. HTTP span name specs). Beyond degrading APM interfaces with numerous non-meaningful operation names, this problem leads to metric time series explosion, resulting in significant performance degradation and increased costs.

The span metrics connector provides an optional circuit breaker through the aggregation_cardinality_limit attribute (disabled by default) to mitigate cardinality explosion. While this feature addresses performance and cost concerns, it does not resolve the underlying issue of semantically meaningless operation names.
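As a sketch, the circuit breaker can be switched on with a single setting (the limit value here is illustrative, not a recommendation):

```yaml
connectors:
  spanmetrics:
    # Track at most 2000 unique dimension combinations; further combinations
    # are aggregated into an overflow series with otel.metric.overflow="true".
    aggregation_cardinality_limit: 2000
```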

Fixing high cardinality span name issues

The ideal long-term solution is to modify the OpenTelemetry instrumentation code to comply with semantic conventions, preventing the generation of non-compliant high cardinality span names. However, deploying updated instrumentation libraries can be time-consuming, often requiring an immediate interim solution to restore observability backend functionality.

Addressing high cardinality span names in the ingestion pipeline

An effective short-term solution is to implement a span sanitization layer within the observability ingestion pipeline. This can be achieved by using the OpenTelemetry Collector Transform Processor’s set_semconv_span_name() function immediately before the Span Metrics Connector to enforce semantic conventions on span names.

Aggressive span name sanitization may be overly restrictive for instrumentations with incomplete resource attributes. For instance, HTTP service operations may be reduced to generic names like GET and POST when HTTP spans lack the http.route attribute. This information loss can impact the monitoring of critical business operations.

To preserve operation granularity, you can manually set the http.route attribute when detailed operation names are required. The missing http.route value can typically be derived through pattern matching on other span attributes such as http.target or url.full.

The following example OpenTelemetry Collector configuration prevents cardinality explosion while preserving meaningful operation names on a service webshop/frontend:
receivers:
  otlp:
  ...

processors:
  transform/sanitize_spans:
    # Sanitize spans to prevent span metrics cardinality explosion caused by
    # non-compliant high cardinality span names:
    # 1. Fix incomplete semconv of critical operation spans to keep meaningful
    #    span metrics operation names, adding missing `http.route` and
    #    `http.request.method`.
    # 2. Sanitize all span names, note that http server spans lacking
    #    `http.route` will default to operations `GET`, `POST`, etc.

    error_mode: ignore
    trace_statements:
      # 1. Fix incomplete semconv on the critical http operations of the `frontend` service
      - context: span
        conditions:
          - span.kind == SPAN_KIND_SERVER and resource.attributes["service.name"] == "frontend" and resource.attributes["service.namespace"] == "webshop" and span.attributes["http.route"] == nil
        statements:
          - set(span.attributes["http.route"], "/api/checkout") where IsMatch(span.attributes["http.target"], "\\/api\\/checkout")  # e.g. # /api/checkout
          - set(span.attributes["http.route"], "/api/products/{productId}") where IsMatch(span.attributes["http.target"], "\\/api\\/products\\/.*")  # e.g. /api/products/1YMWWN1N4O

      # 1. Fix incomplete semconv on the critical http operations of other services...

      # 2. Sanitize all span names to prevent span metrics cardinality explosion.
      #    Unsanitized span names, when different, are kept in the `unsanitized_span_name` attribute
      - context: span
        statements:
          - set_semconv_span_name("1.37.0", "unsanitized_span_name")
  ...

connectors:
  spanmetrics:

exporters:
  otlp_http/observability-backend:
  ...

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform/sanitize_spans, ...]
      exporters: [otlp_http/observability-backend, spanmetrics]
    metrics:
      receivers: [otlp, spanmetrics]
      processors: [...]
      exporters: [otlp_http/observability-backend]
    # ...

Addressing high cardinality span names in the instrumentation code

The preferred long-term solution is to ensure span names and attributes comply with OpenTelemetry Semantic Conventions directly in the instrumentation code. Custom web frameworks are a common source of high cardinality span names. While default OpenTelemetry instrumentation (e.g., Java Servlet) may assign generic span names like GET /my-web-fwk/*, your framework has access to more specific routing information. By overwriting span attributes in your framework code, you can create compliant, low-cardinality span names that preserve operational granularity.

Example: Custom Web Framework in Java

Consider a custom web framework that intercepts the generic route /my-web-fwk/* and dispatches requests like /my-web-fwk/product/123456ABCD or /my-web-fwk/user/john.doe. The default Java Servlet instrumentation produces vague span names (GET /my-web-fwk/*), while directly using request URIs creates high cardinality (GET /my-web-fwk/product/123456ABCD). The solution is to override span attributes with templated route patterns like /my-web-fwk/product/{productId} or /my-web-fwk/user/{userId}:
@WebServlet(urlPatterns = "/my-web-fwk/*")
public class MyWebFrameworkServlet extends HttpServlet {

    @Override
    protected void service(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        // Default Servlet instrumentation sets vague span names: `GET /my-web-fwk/*` (and `http.route=/my-web-fwk/*`)
        // Using the request URI directly would cause high cardinality and violate semantic conventions
        // Instead, use the framework's low-cardinality routing information below

        // Example routing logic
        String uri = request.getRequestURI();
        MyWebOperation myWebOperation = getWebOperation(uri);

        // Fix span details to add details while complying with semantic conventions and maintaining low cardinality
        String httpRoute = "/my-web-fwk/" + myWebOperation.getSubHttpRoute();
        Span.current().setAttribute(HttpAttributes.HTTP_ROUTE, httpRoute);
        Span.current().updateName(request.getMethod() + " " + httpRoute);

        // execute the web operation
        myWebOperation.execute(request, response);
    }
    ...
}

Configuration

Example Configuration

# default configuration
spanmetrics/default:

# default configuration with explicit buckets histogram
spanmetrics/default_explicit_histogram:
  histogram:
    explicit:

# configuration with all possible parameters
spanmetrics/full:
  histogram:
    unit: "s"
    explicit:
      buckets: [ 10ms, 100ms, 250ms ]
  exemplars:
    enabled: true
  resource_metrics_cache_size: 1600

  # Additional list of dimensions on top of:
  # - service.name
  # - span.name
  # - span.kind
  # - status.code
  dimensions:
    # If the span is missing http.method, the connector will insert
    # the http.method dimension with value 'GET'.
    - name: http.method
      default: GET

    # If a default is not provided, the http.status_code dimension will be omitted
    # if the span does not contain http.status_code.
    - name: http.status_code

  # The aggregation temporality of the generated metrics.
  # Default: "AGGREGATION_TEMPORALITY_CUMULATIVE"
  aggregation_temporality: "AGGREGATION_TEMPORALITY_DELTA"

  # The period on which all metrics (whose dimension keys remain in cache) will be emitted.
  # Default: 60s.
  metrics_flush_interval: 30s

# default configuration with exponential buckets histogram
spanmetrics/exponential_histogram:
  histogram:
    exponential:
      max_size: 10

# invalid histogram configuration
spanmetrics/exponential_and_explicit_histogram:
  histogram:
    exponential:
      max_size: 10
    explicit:
      buckets: [ 10ms, 100ms, 250ms ]

spanmetrics/invalid_histogram_unit:
  histogram:
    unit: "h"

spanmetrics/invalid_metrics_expiration:
  metrics_expiration: -20s

# exemplars enabled 
spanmetrics/exemplars_enabled:
  exemplars:
    enabled: true

# exemplars enabled with max per datapoint configured
spanmetrics/exemplars_enabled_with_max_per_datapoint:
  exemplars:
    enabled: true
    max_per_data_point: 10

# resource metrics key attributes filter
spanmetrics/resource_metrics_key_attributes:
  resource_metrics_key_attributes:
    - service.name
    - telemetry.sdk.language
    - telemetry.sdk.name

spanmetrics/custom_delta_timestamp_cache_size:
  aggregation_temporality: "AGGREGATION_TEMPORALITY_DELTA"
  metric_timestamp_cache_size: 123

spanmetrics/invalid_delta_timestamp_cache_size:
  aggregation_temporality: "AGGREGATION_TEMPORALITY_DELTA"
  metric_timestamp_cache_size: 0

spanmetrics/default_delta_timestamp_cache_size:
  aggregation_temporality: "AGGREGATION_TEMPORALITY_DELTA"

spanmetrics/separate_calls_and_duration_dimensions:
  histogram:
    dimensions:
      - name: http.status_code
  dimensions:
    - name: http.method
      default: GET
  calls_dimensions:
    - name: http.url

Last generated: 2026-04-13