Prometheus Receiver

Status
Available in: core, contrib, k8s
Maintainers: @Aneurysm9, @dashpole, @ArthurSens, @krajorama
Source: opentelemetry-collector-contrib

Supported Telemetry

Metrics

Overview

See the design documentation for additional information on this receiver.

⚠️ Warning

Note: This component is a work in progress. It has several limitations; don't use it if any of the following are a concern:
  • The collector cannot yet auto-scale scraping when multiple replicas are run.
  • Multiple replicas running with the same config will each scrape the targets, producing duplicate data.
  • To manually shard scraping, each replica must be configured with a different scrape configuration.
  • The Prometheus receiver is a stateful component.

Unsupported features

The Prometheus receiver is meant to be a minimal drop-in replacement for Prometheus. However, some advanced Prometheus features are not supported, and the receiver will explicitly return an error if its configuration contains any of the following:
  • alert_config.alertmanagers
  • alert_config.relabel_configs
  • remote_read
  • remote_write
  • rule_files
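For example, a configuration containing remote_write will be rejected by the receiver (the endpoint URL below is a placeholder):

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'demo'
      # Not supported: the receiver returns an error for this section
      remote_write:
        - url: http://prometheus:9090/api/v1/write
```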

Getting Started

This receiver is a drop-in replacement for getting Prometheus to scrape your services. It supports the full set of Prometheus configuration in scrape_config, including service discovery, just as you would write it in a YAML configuration file before starting Prometheus, for example:
prometheus --config.file=prom.yaml
Note: Since the collector configuration supports environment variable substitution, $ characters in your Prometheus configuration are interpreted as environment variables. To use a literal $, escape it as $$.
You can copy and paste that same configuration under the config attribute:
receivers:
    prometheus:
      config:
        scrape_configs:
          - job_name: 'otel-collector'
            scrape_interval: 5s
            static_configs:
              - targets: ['0.0.0.0:8888']
          - job_name: k8s
            kubernetes_sd_configs:
            - role: pod
            relabel_configs:
            - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
              regex: "true"
              action: keep
            metric_relabel_configs:
            - source_labels: [__name__]
              regex: "(request_duration_seconds.*|response_duration_seconds.*)"
              action: keep
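As noted above, a literal $ in the Prometheus configuration must be escaped as $$ so the collector's environment variable substitution leaves it intact. A minimal sketch using a relabeling replacement:

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'escaped-dollar'
          static_configs:
            - targets: ['0.0.0.0:8888']
          relabel_configs:
            # Prometheus should see the replacement as `${1}`; writing it
            # as `$${1}` stops the collector expanding it as an env var.
            - source_labels: [__address__]
              regex: '(.*):\d+'
              target_label: host
              replacement: '$${1}'
```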
The prometheus receiver also supports additional top-level options:
  • trim_metric_suffixes: [Experimental] When set to true, this enables trimming unit and some counter type suffixes from metric names. For example, it would cause singing_duration_seconds_total to be trimmed to singing_duration. This can be useful when trying to restore the original metric names used in OpenTelemetry instrumentation. Defaults to false.
Example configuration:
receivers:
    prometheus:
      trim_metric_suffixes: true
      config:
        scrape_configs:
          - job_name: 'otel-collector'
            scrape_interval: 5s
            static_configs:
              - targets: ['0.0.0.0:8888']

Complete Configuration Example

The following example demonstrates a complete end-to-end configuration showing how to use the Prometheus receiver with processors and exporters in a service pipeline:
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'my-service'
          scrape_interval: 5s
          static_configs:
            - targets: ['localhost:9090']
          # Filter metrics to keep only those matching the regex pattern
          metric_relabel_configs:
            - source_labels: [__name__]
              regex: 'http_request_duration_seconds.*'
              action: keep

processors:
  resource:
    attributes:
      # Note: service.name is automatically set by the prometheus receiver from job_name
      - key: deployment.environment
        value: production
        action: upsert

exporters:
  otlp:
    endpoint: otel-collector:4317
    tls:
      insecure: false
      # For local testing only, you may set `insecure: true`, but avoid this in production.
    sending_queue:
      batch:
        timeout: 10s
        send_batch_size: 1000
  prometheusremotewrite:
    endpoint: https://prometheus:9090/api/v1/write
    sending_queue:
      batch:
        timeout: 10s
        send_batch_size: 1000

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resource]
      exporters: [otlp, prometheusremotewrite]
This configuration:
  • Scrapes metrics from a service running on localhost:9090 every 5 seconds
  • Filters metrics to keep only those matching http_request_duration_seconds.* using metric_relabel_configs
  • Adds resource attributes (deployment.environment) to all metrics (note: service.name is automatically set from the job name)
  • Uses exporter-level batching via sending_queue.batch to improve efficiency when multiple scrapes occur
  • Exports metrics to both an OTLP endpoint and Prometheus remote write endpoint

Prometheus native histograms

Native histograms are a Prometheus data type; see the specification for more information. The Prometheus receiver automatically converts native histograms to OpenTelemetry exponential histograms. To enable scraping and ingestion of native histograms, configure two things in your Prometheus scrape config:
  1. Enable native histogram scraping: Set scrape_native_histograms: true (globally or per-job)
  2. Use the protobuf scrape protocol: Include PrometheusProto in scrape_protocols (required until Prometheus supports native histograms over text formats)
receivers:
  prometheus:
    config:
      global:
        # Required: Include PrometheusProto to scrape native histograms
        scrape_protocols: [ PrometheusProto, OpenMetricsText1.0.0, OpenMetricsText0.0.1, PrometheusText0.0.4 ]
        # Enable native histogram scraping globally
        scrape_native_histograms: true
      scrape_configs:
        - job_name: 'my-app'
          # Per-job setting takes precedence over global
          # scrape_native_histograms: true
          static_configs:
            - targets: ['localhost:8080']
This feature applies to the most common integer counter histograms; gauge histograms are dropped. If a metric exposes both conventional (aka classic) buckets and native histogram buckets, only the native histogram buckets are used to create the corresponding exponential histogram. To scrape the classic buckets instead, use the always_scrape_classic_histograms scrape option.
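For instance, the always_scrape_classic_histograms option mentioned above can be set per job to keep the conventional buckets alongside native histogram scraping (the target address is a placeholder):

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'my-app'
          scrape_native_histograms: true
          # Also ingest the classic bucketed series for this job
          always_scrape_classic_histograms: true
          static_configs:
            - targets: ['localhost:8080']
```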

OpenTelemetry Operator

In addition to static job definitions, this receiver can query a list of jobs from the OpenTelemetry Operator's TargetAllocator or a compatible endpoint.
receivers:
  prometheus:
    target_allocator:
      endpoint: http://my-targetallocator-service
      interval: 30s
      collector_id: collector-1
The target_allocator section embeds the full confighttp client configuration.
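Because target_allocator embeds the confighttp client configuration, standard HTTP client options such as TLS settings can sit alongside the allocator-specific fields; a sketch (the endpoint and certificate path are placeholders):

```yaml
receivers:
  prometheus:
    target_allocator:
      endpoint: https://my-targetallocator-service
      interval: 30s
      collector_id: collector-1
      # Standard confighttp client options:
      timeout: 10s
      tls:
        ca_file: /path/to/ca.crt
```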

Exemplars

This receiver accepts exemplars in Prometheus format and converts them to OTLP format.
  1. The value is expected to be received as a float64
  2. The timestamp is expected to be received in ms
  3. Labels with key span_id in Prometheus exemplars are set as the OTLP span ID, and labels with key trace_id as the trace ID
  4. The remaining labels are copied as-is to OTLP format
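For illustration, an OpenMetrics exposition line carrying an exemplar (all values below are made up) looks like:

```
http_request_duration_seconds_bucket{le="0.5"} 129 # {trace_id="4bf92f3577b34da6a3ce929d0e0e4736",span_id="00f067aa0ba902b7"} 0.23 1719920400.123
```

Here 0.23 becomes the float64 exemplar value, the trailing timestamp is converted, trace_id and span_id populate the OTLP trace and span IDs, and any other exemplar labels would be copied over as attributes.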

Resource and Scope

This receiver drops the target_info prometheus metric, if present, and uses attributes on that metric to populate the OpenTelemetry Resource. It drops otel_scope_name and otel_scope_version labels, if present, from metrics, and uses them to populate the OpenTelemetry Instrumentation Scope name and version. It drops the otel_scope_info metric, and uses attributes (other than otel_scope_name and otel_scope_version) to populate Scope Attributes.
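For illustration, given exposition data such as the following (label names beyond the otel_* conventions are made up):

```
target_info{job="my-app",instance="host:8080",service_version="1.2.3"} 1
http_requests_total{otel_scope_name="my.library",otel_scope_version="0.1.0",path="/"} 42
```

the target_info series is dropped and service_version becomes a Resource attribute, while http_requests_total is emitted under the Instrumentation Scope my.library (version 0.1.0) with only the path label remaining on the metric.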

Prometheus API Server

The Prometheus API server can be enabled to serve information about the Prometheus targets, config, service discovery, and metrics. The server_config can be specified using the OpenTelemetry confighttp package. An example configuration:
receivers:
  prometheus:
    api_server:
      enabled: true
      server_config:
        endpoint: "localhost:9090"
The API server hosts the same paths as the Prometheus agent-mode API. More information about querying /api/v1/ and the data format returned can be found in the Prometheus documentation.

Feature gates

See documentation.md for the complete list of feature gates supported by this receiver. Feature gates can be enabled using the --feature-gates flag:
"--feature-gates=<feature-gate>"

Benchmark Results

Current Prometheus receiver benchmark results are published on the Collector Benchmarks page, with CPU and memory charts for each scenario:
  • Baseline, 1k metrics: CPU, Memory
  • Baseline, 10k metrics: CPU, Memory
  • Native histograms, 10k metrics: CPU, Memory
  • target_info enabled, 10k metrics: CPU, Memory

Troubleshooting and Best Practices

This section provides guidance for common issues, performance optimization, and best practices when using the Prometheus receiver in production environments.

Common Issues and Solutions

Metrics Not Appearing

Symptoms: Metrics are not being scraped or exported despite correct configuration. Possible Causes and Solutions:
  1. Target Not Reachable
    • Verify network connectivity between the collector and target endpoints
    • Check firewall rules and security groups
    • Test connectivity using curl or wget to the target’s metrics endpoint
  2. Incorrect Scrape Configuration
    • Verify scrape_configs syntax matches Prometheus format
    • Check that targets are correctly formatted (e.g., ['hostname:port'])
    • Ensure job_name is unique and descriptive
  3. Metric Filtering Too Aggressive
    • Review metric_relabel_configs to ensure desired metrics are not being dropped
    • Temporarily remove filters to verify metrics are being scraped
    • Use the Prometheus API server (if enabled) to inspect active targets
  4. Service Discovery Not Working
    • For Kubernetes service discovery, verify RBAC permissions for service account
    • Check that service discovery configurations match your environment
    • Review collector logs for service discovery errors
Debugging Steps:
  • Enable the Prometheus API server to inspect target status:
receivers:
  prometheus:
    api_server:
      enabled: true
      server_config:
        endpoint: "localhost:9090"
Then query /api/v1/targets to see target status and any scrape errors.
  • Enable debug logs: You can also enable debug-level logging in the collector:
service:
  telemetry:
    logs:
      level: debug  # Use with caution in production
This will surface detailed scrape errors and help diagnose connectivity or configuration issues.

High CPU Usage

Symptoms: Collector consuming excessive CPU resources, especially with high metric volumes. Solutions:
  1. Optimize Scrape Intervals
    • Increase scrape_interval for less critical metrics
    • Use different intervals for different jobs based on metric importance
    • Consider using scrape_timeout to prevent long-running scrapes
  2. Reduce Metric Volume
    • Use metric_relabel_configs to drop unnecessary metrics at scrape time
    • Filter metrics before they enter the pipeline to reduce processing overhead
    • Consider using the filter processor for more complex filtering logic
  3. Disable Expensive Features
    • Avoid enabling receiver.prometheusreceiver.EnableCreatedTimestampZeroIngestion unless necessary (known CPU impact)
    • Use exporter-level batching to reduce export frequency
    • Consider disabling extra scrape metrics if not needed
Example Configuration:
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'high-frequency'
          scrape_interval: 30s  # Increased from default
          scrape_timeout: 10s    # Prevent hanging scrapes
          metric_relabel_configs:
            # Drop verbose metrics to reduce volume
            - source_labels: [__name__]
              regex: 'go_.*'
              action: drop
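The filter processor mentioned above can complement scrape-time relabeling when more complex conditions are needed; a minimal sketch that drops the same Go runtime metrics after they enter the pipeline:

```yaml
processors:
  filter/drop_go_runtime:
    error_mode: ignore
    metrics:
      metric:
        # Drop any metric whose name matches the regex
        - 'IsMatch(name, "go_.*")'
```

Prefer metric_relabel_configs where possible, since series dropped at scrape time never consume pipeline resources.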

Memory Issues

Symptoms: Collector running out of memory, especially with many targets or long scrape intervals. Solutions:
  1. Limit Target Count
    • Use service discovery filters to reduce number of targets
    • Implement manual sharding across multiple collector instances
    • Use TargetAllocator for automatic sharding in Kubernetes
  2. Optimize Batch Processing
    • Configure exporter-level batching with appropriate send_batch_size and timeout via sending_queue.batch
    • Balance between memory usage (smaller batches) and efficiency (larger batches)
  3. Monitor Memory Usage
    • Enable the memory_limiter processor to prevent OOM conditions
    • Set appropriate memory limits based on your metric volume
Example Configuration:
processors:
  memory_limiter:
    limit_mib: 512
    check_interval: 1s

exporters:
  otlp:
    endpoint: otel-collector:4317
    sending_queue:
      batch:
        timeout: 10s
        send_batch_size: 1000  # Adjust based on memory constraints

Best Practices for Production

Multi-Replica Deployments

When running multiple collector replicas, manually shard scraping to avoid duplicate metrics.
Option 1: Manual Sharding by Job
# Collector Replica 1
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'service-a'
          static_configs:
            - targets: ['service-a:9090']
        - job_name: 'service-b'
          static_configs:
            - targets: ['service-b:9090']

# Collector Replica 2
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'service-c'
          static_configs:
            - targets: ['service-c:9090']
        - job_name: 'service-d'
          static_configs:
            - targets: ['service-d:9090']
Option 2: Use TargetAllocator (Recommended for Kubernetes)
receivers:
  prometheus:
    target_allocator:
      endpoint: http://targetallocator-service:8080
      interval: 30s
      collector_id: ${HOSTNAME}  # Unique per replica

Performance Optimization

  1. Scrape Interval Tuning
    • Critical metrics: 5-15 seconds
    • Standard metrics: 30-60 seconds
    • Low-priority metrics: 2-5 minutes
  2. Metric Filtering Strategy
    • Filter at scrape time using metric_relabel_configs (most efficient)
    • Use filter processor for complex logic
    • Avoid filtering in exporters when possible
  3. Resource Management
    • Always use memory_limiter processor in production
    • Configure appropriate resource limits in Kubernetes
    • Monitor collector metrics (CPU, memory, scrape duration)
Example Production Configuration:
receivers:
  prometheus:
    config:
      global:
        scrape_interval: 30s
        scrape_timeout: 10s
      scrape_configs:
        - job_name: 'critical-services'
          scrape_interval: 15s
          static_configs:
            - targets: ['service1:9090', 'service2:9090']
          metric_relabel_configs:
            - source_labels: [__name__]
              regex: 'http_request_duration_seconds.*|http_request_total'
              action: keep

processors:
  memory_limiter:
    limit_mib: 1024
    check_interval: 1s
  resource:
    attributes:
      - key: deployment.environment
        value: production
        action: upsert

exporters:
  otlp:
    endpoint: otel-collector:4317
    tls:
      insecure: false
      ca_file: /etc/ssl/certs/ca-certificates.crt
    sending_queue:
      batch:
        timeout: 10s
        send_batch_size: 2000

Monitoring the Receiver

Monitor the Prometheus receiver itself to ensure it’s operating correctly:
  1. Enable Extra Scrape Metrics
    • In the Prometheus config set extra_scrape_metrics to true in the global section.
  2. Key Metrics to Monitor:
    • prometheus_receiver_scrapes_total: Total number of scrapes
    • prometheus_receiver_scrape_errors_total: Number of failed scrapes
    • prometheus_receiver_target_scrapes_exceeded_timeout_total: Timeouts
    • Collector’s internal metrics (CPU, memory, pipeline metrics)
  3. Set Up Alerts:
    • Alert on high scrape error rates
    • Alert on scrape timeouts
    • Alert on collector memory/CPU usage
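Assuming the collector's own telemetry is scraped by a Prometheus server, an alerting rule on the scrape error rate could be sketched as follows (the threshold and durations are illustrative):

```yaml
groups:
  - name: prometheus-receiver
    rules:
      - alert: PrometheusReceiverHighScrapeErrorRate
        # More than 5% of scrapes failing over the last 5 minutes
        expr: >
          rate(prometheus_receiver_scrape_errors_total[5m])
            / rate(prometheus_receiver_scrapes_total[5m]) > 0.05
        for: 10m
        labels:
          severity: warning
```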

Security Considerations

  1. TLS Configuration
    • Always use TLS for exporter endpoints in production
    • Use proper certificate management
    • Consider using mTLS for enhanced security
  2. Network Security
    • Restrict network access to metrics endpoints
    • Use service meshes or network policies to limit exposure
    • Consider using authentication for sensitive metrics endpoints
  3. Configuration Security
    • Avoid hardcoding credentials in configuration files
    • Use environment variable substitution for sensitive values
    • Implement proper secret management (e.g., Kubernetes secrets)
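For example, scrape credentials can be pulled from the environment with the collector's ${env:VAR} substitution syntax (the variable names and target below are placeholders):

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'secured-service'
          basic_auth:
            # Resolved from the collector's environment at load time
            username: ${env:SCRAPE_USERNAME}
            password: ${env:SCRAPE_PASSWORD}
          static_configs:
            - targets: ['service:9090']
```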

Debugging Tips

  1. Enable Verbose Logging
    service:
      telemetry:
        logs:
          level: debug  # Use with caution in production
    
  2. Use Prometheus API Server
    • Enable API server to inspect targets, config, and scrape pools
    • Query /api/v1/targets to see target health
    • Check /api/v1/status/config to verify configuration
  3. Test Configuration
    • Validate YAML syntax before deployment
    • Test with a single job first, then expand
    • Use the collector's validate subcommand (e.g. otelcol validate --config=<file>) to check the configuration without starting the collector
  4. Check Collector Logs
    • Look for scrape errors, timeouts, or connection issues
    • Monitor for memory or CPU warnings
    • Review service discovery logs for Kubernetes deployments

Configuration

config.yaml (testdata)

prometheus:
prometheus/customname:
  trim_metric_suffixes: true
  target_allocator:
    endpoint: http://my-targetallocator-service
    interval: 30s
    collector_id: collector-1
    # imported struct from the Prometheus code base. Can be used optionally to configure the jobs as seen in the docs
    # https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config
    http_sd_config:
      refresh_interval: 60s
      basic_auth:
        username: "prometheus"
        password: "changeme"
    http_scrape_config:
      basic_auth:
        username: "scrape_prometheus"
        password: "scrape_changeme"
  config:
    scrape_configs:
      - job_name: 'demo'
        scrape_interval: 5s

config.yaml (testdata)

config:
  scrape_configs:
    - job_name: 'kong'
      scrape_interval: 1s
      static_configs:
        - targets:
            - "localhost:18001"

Last generated: 2026-04-13