Redaction Processor

Status

Available in: contrib, k8s
Maintainers: @dmitryax, @mx-psi, @TylerHelmuth, @iblancasa
Source: opentelemetry-collector-contrib

Supported Telemetry

Logs Metrics Traces

Overview

Use Cases

Typical use-cases:
  • Prevent sensitive fields from accidentally leaking into traces
  • Ensure compliance with legal, privacy, or security requirements
For example:
  • The EU General Data Protection Regulation (GDPR) prohibits the transfer of any personal data, such as birthdates, addresses, or IP addresses, across borders without explicit consent from the data subject. Popular trace aggregation services are located in the US, not in the EU. You can use the redaction processor to scrub personal data from your telemetry.
  • PRC legislation prohibits the transfer of geographic coordinates outside of the PRC. Popular trace aggregation services are located in the US, not in the PRC. You can use the redaction processor to scrub geographic coordinates from your data.
  • Payment Card Industry (PCI) Data Security Standards prohibit logging certain data or storing it unencrypted. You can use the redaction processor to scrub such data from your traces.
The above is written by an engineer, not a lawyer. The redaction processor is intended as one line of defence rather than the only compliance measure in place.

Processor Configuration

Please refer to config.go for the config spec. Examples:
processors:
  redaction:
    # allow_all_keys is a flag that disables the allowed_keys list when set to true.
    # The list of blocked_values is applied regardless. If you just want to block values, set this to true.
    allow_all_keys: false
    # allowed_keys is a list of span/log/datapoint attribute keys that are kept on the span/log/datapoint and
    # processed. The list is designed to fail closed. If allowed_keys is empty,
    # no attributes are allowed and all span attributes are removed. To
    # allow all keys, set allow_all_keys to true.
    allowed_keys:
      - description
      - group
      - id
      - name
    # Ignore the following attributes, allow them to pass without redaction.
    # Any keys in this list are allowed so they don't need to be in both lists.
    ignored_keys:
      - safe_attribute
    # ignored_key_patterns is a list of regular expressions for ignoring keys.
    # Keys matching any of these patterns are allowed to pass through without
    # their values being checked or modified.
    ignored_key_patterns:
      - "^safe_.*"
      - ".*_trusted$"
    # redact_all_types checks incoming fields for sensitive data based on their
    # AsString() representation. This allows the processor to redact sensitive
    # data from ints, which is useful for redacting credit card numbers.
    redact_all_types: true
    # blocked_key_patterns is a list of blocked span attribute key patterns. Span attributes
    # matching the regexes on the list are masked.
    blocked_key_patterns:
      - ".*token.*"
      - ".*api_key.*"
    # blocked_values is a list of regular expressions for blocking values of
    # allowed span attributes. Values that match are masked.
    blocked_values:
      - "4[0-9]{12}(?:[0-9]{3})?" ## Visa credit card number
      - "(5[1-5][0-9]{14})"       ## MasterCard number
    # allowed_values is a list of regular expressions for allowing values of
    # blocked span attributes. Values that match are not masked.
    allowed_values:
      - "[email protected]"
    # hash_function defines the function for hashing the values instead of
    # masking them with a fixed string. By default, no hash function is used
    # and masking with a fixed string is performed.
    hash_function: md5
    # summary controls the verbosity level of the diagnostic attributes that
    # the processor adds to the spans/logs/datapoints when it redacts or masks other
    # attributes. In some contexts a list of redacted attributes leaks
    # information, while it is valuable when integrating and testing a new
    # configuration. Possible values:
    # - `debug` includes both redacted key counts and names in the summary
    # - `info` includes just the redacted key counts in the summary
    # - `silent` omits the summary attributes
    summary: debug
    # url_sanitizer configures URL sanitization, which removes variable elements
    # from URLs that would otherwise cause high-cardinality issues.
    url_sanitizer:
      # enabled controls whether URL sanitization is active
      enabled: true
      # attributes is a list of attribute keys that contain URLs to be sanitized
      attributes: ["http.url", "url"]
      # sanitize_span_name controls whether span names should be sanitized for URLs (default: true)
      # When enabled, span names containing "/" will be sanitized to reduce cardinality
      # Set to false to disable span name sanitization while keeping attribute sanitization active
      sanitize_span_name: true
Refer to config.yaml for how to fit the configuration into an OpenTelemetry Collector pipeline definition.

Ignored attributes are processed first, so they are always allowed and never blocked. Use this field only where you know the data is always safe to send to the telemetry system. You can use either ignored_keys for exact key matches or ignored_key_patterns for regex-based pattern matching.

Only span/log/datapoint attributes included in the list of allowed keys are retained. If allowed_keys is empty, no attributes are allowed and all attributes are removed. To keep all span attributes, explicitly set allow_all_keys to true.

blocked_values and allowed_values apply to the values of the allowed keys. If the value of an allowed key matches the regular expression for an allowed value, the matching part of the value is not masked even if it matches the regular expression for a blocked value. If the value matches only the regular expression for a blocked value, the matching part of the value is masked with a fixed length of asterisks.
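As a rough sketch of how the processor might be wired into a pipeline definition (config.yaml in the repository remains the reference), the fragment below assumes an OTLP receiver and the debug exporter. Both are illustrative choices and are not taken from this page; the redaction settings reuse options shown in the example above.

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  redaction:
    # Keep all attribute keys and only mask values that look like card numbers.
    allow_all_keys: true
    blocked_values:
      - "4[0-9]{12}(?:[0-9]{3})?" # Visa credit card number
    summary: info

exporters:
  debug:

service:
  pipelines:
    traces:
      receivers: [otlp]
      # The processor only takes effect once it is listed in a pipeline.
      processors: [redaction]
      exporters: [debug]
```

The same processor name can be listed in logs and metrics pipelines as well, since all three signals are supported.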

Precedence between allowed_values and blocked_values

When both allowed_values and blocked_values are configured, allowed_values takes precedence. This means that if a value matches an entry in allowed_values, it will not be masked even if it also matches blocked_values. This behavior is intentional and allows operators to explicitly whitelist known-safe values while still blocking broader patterns.

Example

processors:
  redaction:
    blocked_values:
      - "mycompany.com"
    allowed_values:
      - "support.mycompany.com"

In this example, a value such as support.mycompany.com matches allowed_values and is left unmasked, while a value such as billing.mycompany.com matches only blocked_values, so the matching part is masked.

blocked_key_patterns applies to the values of the keys matching one of the patterns. The value is then masked according to the configuration.

hash_function defines the function for hashing values of matched keys or matches in values instead of masking them with a fixed string. By default, no hash function is used and masking with a fixed string is performed. The supported hash functions are md5, sha1, sha3 (SHA3-256), hmac-sha256, and hmac-sha512.
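A minimal sketch of combining blocked_key_patterns with a hash function, using only options that appear elsewhere on this page. The choice of sha1 and the key patterns are illustrative assumptions, not a recommendation.

```yaml
processors:
  redaction:
    # Keep every attribute key; the key patterns decide what gets hashed.
    allow_all_keys: true
    blocked_key_patterns:
      - ".*token.*"
      - ".*api_key.*"
    # Values of matching keys are replaced by their hash instead of asterisks.
    hash_function: sha1
    summary: info
```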

HMAC Hash Functions

For enhanced security, especially when dealing with low-entropy data like IP addresses, HMAC (Hash-based Message Authentication Code) hash functions are recommended over simple hash functions like MD5, SHA1, or SHA3.

Configuration Example

processors:
  redaction:
    allow_all_keys: true
    blocked_values:
      - "(?:[0-9]{1,3}\\.){3}[0-9]{1,3}"  # IPv4 addresses
      - "(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}"  # IPv6 addresses
    hash_function: hmac-sha256  # or hmac-sha512
    hmac_key: "${env:REDACTION_SECRET_KEY}"  # Load from environment variable
    summary: silent

Audit Trail

When summary is set to debug or info, the processor appends diagnostic attributes to each span, log record, or metric datapoint describing the actions it took. Setting summary: silent suppresses all audit attributes.

Attribute-level audit (spans, logs, metric datapoints)

These attributes are added to the record’s attribute map:
| Attribute | info | debug | Description |
| --- | --- | --- | --- |
| redaction.redacted.keys | no | yes | Comma-separated list of attribute keys removed because they were not in allowed_keys |
| redaction.redacted.count | yes | yes | Number of attributes removed |
| redaction.masked.keys | no | yes | Comma-separated list of attribute keys whose values matched a blocked_values pattern and were masked |
| redaction.masked.count | yes | yes | Number of attribute values masked |
| redaction.allowed.keys | no | yes | Comma-separated list of attribute keys that passed through |
| redaction.allowed.count | yes | yes | Number of attributes allowed through |
| redaction.ignored.count | yes | yes | Number of attributes skipped due to ignored_keys or ignored_key_patterns |

Log body audit

For log records whose body is a map, the processor additionally appends audit attributes into the body map itself:
| Attribute | info | debug | Description |
| --- | --- | --- | --- |
| redaction.body.redacted.keys | no | yes | Comma-separated list of body map keys removed |
| redaction.body.redacted.count | yes | yes | Number of body map keys removed |
| redaction.body.masked.keys | no | yes | Comma-separated list of body map keys whose values were masked |
| redaction.body.masked.count | yes | yes | Number of body map values masked |
| redaction.body.allowed.keys | no | yes | Comma-separated list of body map keys permitted |
| redaction.body.allowed.count | yes | yes | Number of body map keys allowed through |
| redaction.body.ignored.count | yes | yes | Number of body map keys ignored |

Example

Given this configuration:
processors:
  redaction:
    allowed_keys:
      - description
      - email
    blocked_values:
      - "4[0-9]{12}(?:[0-9]{3})?"  ## Visa credit card number
    summary: debug
A span arriving with these attributes:

| Key | Value |
| --- | --- |
| description | "payment processed" |
| email | "[email protected]" |
| credit_card | "4111111111111111" |
| internal_id | "abc-123" |

would be emitted with:

| Key | Value |
| --- | --- |
| description | "payment processed" |
| email | "[email protected]" |
| redaction.redacted.keys | "credit_card,internal_id" |
| redaction.redacted.count | 2 |
| redaction.allowed.keys | "description,email" |
| redaction.allowed.count | 2 |
Note that credit_card was removed (not masked) because it was not in allowed_keys, so its value never reached the blocked_values check. If credit_card had been in allowed_keys, its value 4111111111111111 would have matched the Visa pattern and redaction.masked.keys would show "credit_card" instead.

Attributes with a zero count (e.g. redaction.masked.count, redaction.ignored.count) are not emitted: the processor only writes audit attributes when at least one relevant action occurred.
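To illustrate the alternative described above, the following sketch adds credit_card to allowed_keys. The configuration is otherwise the same as in the example; the trailing comments restate the outcome described in the text and are not captured processor output.

```yaml
processors:
  redaction:
    allowed_keys:
      - description
      - email
      - credit_card   # now retained, so its value reaches the blocked_values check
    blocked_values:
      - "4[0-9]{12}(?:[0-9]{3})?"  # Visa credit card number
    summary: debug
# For the span in the example, credit_card is kept with its value masked and
# redaction.masked.keys lists "credit_card". internal_id is still removed
# because it is not in allowed_keys.
```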

URL Sanitization

The url_sanitizer configuration enables sanitization of URLs in specified attributes by removing potentially sensitive information like UUIDs, timestamps, and other non-essential path segments. This is particularly useful for reducing cardinality in telemetry data while preserving the essential parts of URLs for troubleshooting.

Span Name Sanitization

By default, when URL sanitization is enabled, the names of client and server spans that contain "/" characters are automatically sanitized. This helps reduce cardinality issues caused by high-variability URL paths in span names while preserving essential routing information. You can control this behavior using the sanitize_span_name option:
  • true (default): Span names will be sanitized along with attributes
  • false: Only attributes are sanitized, span names remain unchanged
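As a sketch of the false setting, the fragment below keeps URL attribute sanitization active while leaving span names unchanged. The attribute list is taken from the configuration example earlier on this page; allow_all_keys: true is an assumption added here so that other attributes are not removed.

```yaml
processors:
  redaction:
    allow_all_keys: true
    url_sanitizer:
      enabled: true
      # Attribute keys whose values contain URLs to sanitize.
      attributes: ["http.url", "url"]
      # Leave span names as they are; only the attributes above are sanitized.
      sanitize_span_name: false
```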
This option is available independently for both URL and database sanitization, allowing fine-grained control over which span names should be redacted.

For example, if notes is on the list of allowed keys, then the notes attribute is retained. However, if there is a value such as a credit card number in the notes field that matches a regular expression on the list of blocked values, then that value is masked.

Database Query Sanitization

The redaction processor supports sanitizing database queries and commands to remove sensitive information. This feature supports multiple database systems:
  • SQL databases
  • Redis
  • Memcached
  • MongoDB
  • OpenSearch
  • Elasticsearch
Example configuration with database sanitization:
processors:
  redaction:
    # ... other redaction settings ...

    # Database sanitization configuration
    db_sanitizer:
      # sanitize_span_name controls whether span names should be sanitized for database queries (default: true)
      # When enabled, span names will be obfuscated to remove sensitive query details
      # Set to false to disable span name sanitization while keeping attribute sanitization active
      sanitize_span_name: true
      sql:
        enabled: true
        attributes: ["db.statement", "db.query"]
      redis:
        enabled: true
        attributes: ["db.statement", "redis.command"]
      memcached:
        enabled: true
        attributes: ["db.statement", "memcached.command"]
      mongo:
        enabled: true
        attributes: ["db.statement", "mongodb.query"]
      opensearch:
        enabled: true
        attributes: ["db.statement", "opensearch.body"]
      es:
        enabled: true
        attributes: ["db.statement", "elasticsearch.body"]
The database sanitizer will:
  • Remove sensitive data like literal values from SQL queries
  • Redact command arguments from Redis/Memcached commands
  • Sanitize MongoDB queries and JSON payloads
  • Process only specified attributes if provided
  • Preserve query structure while removing sensitive data
  • Sanitize span names containing database queries (can be controlled with sanitize_span_name)
By default, database query sanitization also applies to span names for client span types. You can disable this behavior by setting sanitize_span_name: false in the db_sanitizer configuration, which allows you to keep original database query span names while still sanitizing the query values in attributes. This provides an additional layer of protection when collecting telemetry that includes database operations.

Trace and metric behaviour: Database sanitization for spans and metric attributes only runs when the telemetry includes a db.system.name or db.system attribute and the span kind is CLIENT or SERVER. This prevents non-database spans from being rewritten.

Logs automatically enable a sequential fallback internally, so database attributes without db.system can still be sanitized when they appear in log records.
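A minimal sketch of the setting described above: span names are left untouched while SQL attributes are still sanitized. Only the sql block is enabled here for brevity, and allow_all_keys: true is an assumption added so that unrelated attributes are not removed.

```yaml
processors:
  redaction:
    allow_all_keys: true
    db_sanitizer:
      # Keep original database span names; attribute values are still sanitized.
      sanitize_span_name: false
      sql:
        enabled: true
        attributes: ["db.statement", "db.query"]
```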

Configuration

Example Configuration

redaction:
  # Flag to allow all span attribute keys. Setting this to true disables the
  # allowed_keys list. The list of blocked_values is applied regardless. If
  # you just want to block values, set this to true.
  allow_all_keys: false
  # Allowlist for span attribute keys. The list is designed to fail closed.
  # If allowed_keys is empty, no span attributes are allowed and all span
  # attributes are removed. To allow all keys, set allow_all_keys to true.
  # To allow the span attributes you know are good, add them to the list.
  allowed_keys:
    - description
    - group
    - id
    - name
  # Ignore the following attributes, allow them to pass without redaction.
  # Any keys in this list are allowed so they don't need to be in both lists.
  ignored_keys:
    - safe_attribute
  # blocked_key_patterns is a list of blocked span attribute key patterns. Span attributes
  # matching the regexes on the list are masked.
  blocked_key_patterns:
    - .*(token|api_key).*
  # blocked_values is a list of regular expressions for blocking values of
  # allowed span attributes. Values that match are masked.
  blocked_values:
    - "4[0-9]{12}(?:[0-9]{3})?" ## Visa credit card number
    - "(5[1-5][0-9]{14})"       ## MasterCard number
  # allowed_values is a list of regular expressions for allowing values of
  # blocked span attributes. Values that match are not masked.
  allowed_values:
    - "[email protected]"
  # hash_function defines the function for hashing the values instead of
  # masking them with a fixed string. By default, no hash function is used
  # and masking with a fixed string is performed.
  hash_function: md5
  # summary controls the verbosity level of the diagnostic attributes that
  # the processor adds to the spans when it redacts or masks other
  # attributes. In some contexts a list of redacted attributes leaks
  # information, while it is valuable when integrating and testing a new
  # configuration. Possible values are `debug`, `info`, and `silent`.
  summary: debug

redaction/empty:

Last generated: 2026-04-13