K8sattributes Processor
contrib, k8s
Maintainers: @dmitryax, @fatsheep9146, @TylerHelmuth, @ChrsMark, @odubajDT
Source: opentelemetry-collector-contrib
Supported Telemetry
Overview
The processor automatically discovers k8s resources (pods), extracts metadata from them and adds the extracted metadata to the relevant spans, metrics and logs as resource attributes. The processor uses the Kubernetes API to discover all pods running in a cluster and keeps a record of their IP addresses, pod UIDs and interesting metadata. The rules for associating the data passing through the processor (spans, metrics and logs) with specific Pod Metadata are configured via the “pod_association” key. It represents a list of associations that are executed in the specified order until the first one is able to do the match.
Configuration
The processor stores the list of running pods and the associated metadata. When it sees a datapoint (log, trace or metric), it tries to associate the datapoint with the pod from which it originated, so that the relevant pod metadata can be added to the datapoint. By default, it associates the incoming connection IP with the Pod IP. For cases where this approach doesn’t work (sending through a proxy, etc.), a custom association rule can be specified.
Each association is specified as a list of sources. The maximum number of sources within an association is 4. A source is a rule that matches metadata from the datapoint to pod metadata. For an association to be applied, all of its sources need to match. Each source rule is specified as a pair of from (representing the rule type) and name (representing the attribute name if from is set to resource_attribute).
The following rule types are available:
- connection: Takes the IP attribute from connection context (if available). In this case the processor must appear before any batching or tail sampling, which remove this information.
- resource_attribute: Allows specifying the attribute name to look up in the list of attributes of the received Resource. Semantic convention should be used for naming.
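As a minimal sketch of the rules above, the following fragment of the processor's configuration block defines three associations that are tried in order. The attribute names are illustrative; which of them your telemetry actually carries is an assumption.

```yaml
pod_association:
  # First association: match on the k8s.pod.ip resource attribute alone.
  - sources:
      - from: resource_attribute
        name: k8s.pod.ip
  # Second association: both sources must match for the rule to apply.
  - sources:
      - from: resource_attribute
        name: k8s.pod.name
      - from: resource_attribute
        name: k8s.namespace.name
  # Last association: fall back to the IP of the incoming connection.
  - sources:
      - from: connection
```

Evaluation stops at the first association whose sources all match, so more specific rules are usually listed first.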
The metadata configuration defines the list of resource attributes to be added. Items in the list are named exactly the same as the resource attributes that will be added.
The following attributes are added by default:
- k8s.namespace.name
- k8s.pod.name
- k8s.pod.uid
- k8s.pod.start_time
- k8s.deployment.name (derived from the ReplicaSet name by default. Set the deprecated deployment_name_from_replicaset option to false to use the ReplicaSet informer for deployment name lookup.)
- k8s.node.name
The metadata section can also be extended with additional attributes which, if present in the metadata section, are then also available for use within association rules. Available attributes are:
- k8s.namespace.name
- k8s.pod.name
- k8s.pod.hostname
- k8s.pod.ip
- k8s.pod.start_time
- k8s.pod.uid
- k8s.replicaset.uid
- k8s.replicaset.name
- k8s.deployment.uid
- k8s.deployment.name
- k8s.daemonset.uid
- k8s.daemonset.name
- k8s.statefulset.uid
- k8s.statefulset.name
- k8s.cronjob.uid
- k8s.cronjob.name (by default uses a Job-name heuristic when only the name is needed; the Job informer is used when k8s.cronjob.uid is enabled and/or labels or annotations are extracted with from: job)
- k8s.job.uid
- k8s.job.name
- k8s.node.name
- k8s.node.uid
- k8s.cluster.uid
- service.namespace
- service.name
- service.version (cannot be used for source rules in the pod_association when it’s calculated based on container’s image tag/digest)
- service.instance.id (cannot be used for source rules in the pod_association)
- Any tags extracted from the pod labels and annotations, as described in extracting attributes from pod labels and annotations
Only attribute names listed in metadata should be used for pod_association’s resource_attribute, because empty or non-existing values will be ignored.
Additional container level attributes can be extracted. If a pod contains more than one container,
either the container.id, or the k8s.container.name attribute must be provided in the incoming resource attributes to
correctly associate the matching container to the resource:
- If the container.id resource attribute is provided, the following additional attributes will be available:
  - k8s.container.name
  - container.image.name
  - container.image.tag
  - container.image.repo_digests (if k8s CRI populates repository digest field)
  - service.version
  - service.instance.id
- If the k8s.container.name resource attribute is provided, the following additional attributes will be available:
  - container.id (if the k8s.container.restart_count resource attribute is not provided, it’s not guaranteed to get the right container ID.)
  - container.image.name
  - container.image.tag
  - container.image.repo_digests (if k8s CRI populates repository digest field)
  - service.version
  - service.instance.id
- If the k8s.container.restart_count resource attribute is provided, it can be used to associate with a particular container instance. If it’s not set, the latest container instance will be used:
  - container.id (not added by default, has to be specified in metadata)
The container.id attribute can be used for source rules in the pod_association. To use container.id in pod association, at least one container attribute must be included in the metadata extraction configuration (e.g., container.id, container.image.name, etc.).
Example for extracting container level attributes:
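A minimal sketch of such a configuration, written as a fragment of the processor's configuration block. The selection of attributes is illustrative; any container attribute from the lists above could be used instead.

```yaml
pod_association:
  - sources:
      - from: connection
extract:
  metadata:
    - k8s.pod.name
    - k8s.pod.uid
    # Container-level attributes; they are only resolved when the incoming
    # resource carries container.id or k8s.container.name.
    - container.image.name
    - container.image.tag
    - k8s.container.name
```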
Such a configuration attaches the attributes listed in the metadata section to all resources received from a matching pod that carry the k8s.container.name attribute.
To make the processor wait until the metadata is synced before it reports ready, set the wait_for_metadata option to true.
The processor is then not ready until the metadata is fully synced, which blocks the start-up of the Collector. If the sync does not complete before the timeout is reached, the processor fails to start and returns an error, which causes the Collector to exit.
The timeout defaults to 10s and can be configured with the wait_for_metadata_timeout option.
Example for setting the processor to wait for metadata to be synced before it is ready:
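A sketch of the two options as they would appear in the processor's configuration block. The 30s value is an illustrative choice, not a recommendation.

```yaml
# Block Collector start-up until the pod metadata cache is synced.
wait_for_metadata: true
# Fail start-up if the sync takes longer than this (default: 10s).
wait_for_metadata_timeout: 30s
```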
Informer Cache Resync Period
Reprocessing the informer cache periodically (resyncing) enqueues all cached K8s objects back into event handlers. In large clusters (e.g., 100K pods), this causes significant CPU spikes, memory churn, and garbage collection overhead. Because resource state modifications are already pushed immediately via Kubernetes watch events, a resync period is almost entirely unnecessary.
- watch_sync_period (default: 5m): The resync period for K8s informers. You may set this to 0s to disable resyncing completely (recommended for large clusters).
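For a large cluster, the setting described above could look like this in the processor's configuration block:

```yaml
# Disable periodic informer resyncs; watch events still deliver updates.
watch_sync_period: 0s
```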
Pod Deletion Grace Period
After receiving a pod deletion event, the processor can keep the pod’s metadata in its lookup cache for a short period before eviction. This grace window ensures that delayed spans, metrics, or logs that belong to the deleted pod can still be correctly enriched.
- pod_delete_grace_period (default: 120s): The grace period to wait before deleting a pod’s metadata from the lookup cache after a deletion event.
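A sketch of tuning the grace window in the processor's configuration block. The 5m value is an assumption chosen for a pipeline with slow, buffered exporters upstream.

```yaml
# Keep metadata of deleted pods for 5 minutes so that late telemetry
# can still be enriched (default: 120s).
pod_delete_grace_period: 5m
```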
Extracting attributes from pod labels and annotations
The k8sattributesprocessor can also set resource attributes from k8s labels and annotations of pods, namespaces, deployments, statefulsets, daemonsets, jobs and nodes. The association of the data passing through the processor (spans, metrics and logs) with specific Pod/Namespace/Deployment/StatefulSet/DaemonSet/Job/Node annotations/labels is configured via the “annotations” and “labels” keys. This config represents a list of annotations/labels that are extracted from pods/namespaces/deployments/statefulsets/daemonsets/jobs/nodes and added to spans, metrics and logs. Each item is specified as a config of tag_name (representing the tag name to tag the spans with), key (representing the key used to extract the value) and from (representing the Kubernetes object used to extract the value). The “from” field has the following possible values: “pod”, “namespace”, “deployment”, “statefulset”, “daemonset”, “job” and “node”, and defaults to “pod” if none is specified. By default, extracting metadata from Deployments, StatefulSets, DaemonSets and Jobs is disabled. Enabling extraction of this metadata comes with an extra memory consumption cost.
A few examples to use this config are as follows:
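The following sketch shows the shape of such rules as a fragment of the processor's configuration block. All annotation keys, label keys and tag names are hypothetical.

```yaml
extract:
  annotations:
    # Pod annotation `annotation-one` becomes resource attribute `a1`.
    - tag_name: a1
      key: annotation-one
      from: pod
    # Namespace annotation `annotation-two` becomes resource attribute `a2`.
    - tag_name: a2
      key: annotation-two
      from: namespace
  labels:
    # Namespace label `label1` becomes resource attribute `l1`.
    - tag_name: l1
      key: label1
      from: namespace
    # Node label `label2` becomes resource attribute `l2`.
    - tag_name: l2
      key: label2
      from: node
    # key_regex selects every matching pod label; tag_name is omitted here,
    # so the attribute names are auto-generated.
    - key_regex: 'app\.kubernetes\.io/.*'
      from: pod
```

Rules with from: deployment, statefulset, daemonset or job would additionally enable the corresponding workload lookups, with the memory cost mentioned above.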
Configuring recommended resource attributes
The processor can be configured to set the recommended resource attributes:
- otel_annotations will translate resource.opentelemetry.io/foo to the foo resource attribute, etc.
- deployment_name_from_replicaset is deprecated and will be removed in future releases. Deployment names are derived from ReplicaSet names by default by trimming the pod-template-hash suffix. Set this option to false only to force ReplicaSet informer lookup for deployment names. If k8s.deployment.uid is included in the extract metadata section, or deployment labels or annotations are being extracted (i.e. any extract.labels or extract.annotations rule with from: deployment), then the Deployment/ReplicaSet informers are started and this setting is ignored. Important: You must still include k8s.deployment.name (or service.name) in the extract.metadata section for the deployment name to be extracted. The processor derives the deployment name from the ReplicaSet’s naming convention without requiring direct access to Deployment resources, but the extraction rules must be enabled. For example, for a pod whose ownerReference points to a ReplicaSet named opentelemetry-collector followed by a pod-template-hash suffix, the derived deployment name is opentelemetry-collector.
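To illustrate the derivation, here is a sketch of the relevant excerpt of a pod's metadata. The hash suffix and the uid are hypothetical values.

```yaml
# Excerpt of a pod managed by a Deployment (values are hypothetical).
ownerReferences:
  - apiVersion: apps/v1
    kind: ReplicaSet
    name: opentelemetry-collector-6c45f8dd6f
    controller: true
    blockOwnerDeletion: true
    uid: 00000000-0000-0000-0000-000000000000
# Trimming the pod-template-hash suffix "-6c45f8dd6f" from the ReplicaSet
# name yields k8s.deployment.name: opentelemetry-collector
```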
Note: When deployment names are derived from ReplicaSet names, in rare cases where deployment names are between 247 and 253 characters, Kubernetes may truncate the name in the ReplicaSet to fit the pod template hash suffix within the DNS subdomain limit (253 chars), causing the extracted k8s.deployment.name to be slightly truncated. If this affects your workloads, you can set deployment_name_from_replicaset: false or enable the k8s.deployment.uid attribute for accurate retrieval from the Kubernetes API, but at an extra cost in memory.
Also note that for CronJob names (k8s.cronjob.name) a similar pattern applies, but it uses the Job informer (not ReplicaSet) and there is no deployment_name_from_replicaset-style flag. With only k8s.cronjob.name in extract.metadata, the processor derives the CronJob name from the Job’s name using a heuristic (8-digit time suffix aligned with pod creation time) and does not start a Job informer. The Job informer is started when k8s.cronjob.uid is enabled, or when labels or annotations are extracted with from: job, in which case the CronJob name can be resolved from the API when available. That reduces RBAC needs and memory use when you only need the CronJob name (no jobs watch for that attribute alone).
Example:
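A sketch of what such a configuration might look like, enabling otel_annotations together with the service.* attributes from the list of available metadata. The exact selection of attributes is an assumption.

```yaml
extract:
  # Translate resource.opentelemetry.io/<name> pod annotations
  # into <name> resource attributes.
  otel_annotations: true
  metadata:
    - service.namespace
    - service.name
    - service.version
    - service.instance.id
```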
Config example
Common Use Cases
Example 1: Basic Agent Deployment (DaemonSet)
Minimal configuration for an agent collecting telemetry from pods on the same node:
Example 2: Gateway Deployment with Resource Attribute Association
Gateway configuration that receives telemetry from agents that have already added pod IP:
Example 3: Production Deployment with Namespace Filtering
Configuration for monitoring a specific namespace with comprehensive metadata:
Example 4: Memory-Optimized Configuration
Minimal memory footprint configuration for large clusters:
Example 5: Multi-Container Pod Support
Configuration for extracting container-level metadata:
Role-based access control
Cluster-scoped RBAC
If you’d like to set up the k8sattributesprocessor to receive telemetry from across namespaces, it will need get, watch and list permissions on both pods and namespaces resources, for all namespaces and pods included in the configured filters. Additionally, the processor needs get, watch and list permissions for replicasets resources when using k8s.deployment.uid, when using k8s.deployment.name with the deprecated deployment_name_from_replicaset: false, or when extracting labels or annotations with from: deployment.
When using k8s.node.uid or extracting metadata from node, the processor needs get, watch and list permissions for nodes resources.
With only k8s.cronjob.name (and no k8s.cronjob.uid, and no label or annotation extraction with from: job), the processor does not need get, watch and list permissions for jobs resources.
When using k8s.cronjob.uid, or when extracting labels or annotations with from: job, the processor also needs get, watch and list permissions for jobs resources.
Here is an example of a ClusterRole to give a ServiceAccount the necessary permissions for all pods, nodes, and namespaces in the cluster (replace <OTEL_COL_NAMESPACE> with a namespace where collector is deployed):
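A sketch of such a set of manifests follows. The resource names collector and otel-collector are hypothetical, and the replicasets and jobs rules are only needed in the cases described above.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: collector
  namespace: <OTEL_COL_NAMESPACE>
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces", "nodes"]
    verbs: ["get", "watch", "list"]
  # Needed only for ReplicaSet-backed deployment lookups.
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get", "watch", "list"]
  # Needed only with k8s.cronjob.uid or `from: job` extraction.
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
subjects:
  - kind: ServiceAccount
    name: collector
    namespace: <OTEL_COL_NAMESPACE>
roleRef:
  kind: ClusterRole
  name: otel-collector
  apiGroup: rbac.authorization.k8s.io
```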
Namespace-scoped RBAC
When running the k8sattributesprocessor to receive telemetry traffic from pods in a specific namespace, you can use a k8s Role and RoleBinding to give the collector access to query pods and replicasets in that namespace. This requires setting the filter::namespace config.
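In the processor's configuration block, the namespace filter could be sketched as follows. `<WORKLOAD_NAMESPACE>` is a placeholder introduced here for the namespace being watched.

```yaml
filter:
  # Only pods in this namespace are discovered and cached.
  namespace: <WORKLOAD_NAMESPACE>
```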
With this filter set, the processor only looks up pods, and replicasets (when using deployment_name_from_replicaset: false, k8s.deployment.uid, or deployment label/annotation extraction), in the selected namespace. Note that with just a role binding, the processor cannot query metadata such as labels and annotations from k8s nodes and namespaces, which are cluster-scoped objects. This also means that the processor cannot set the value of the k8s.cluster.uid attribute if enabled, since that attribute is set to the UID of the kube-system namespace, which is not queryable with namespaced RBAC.
Please note that when extracting workload-related attributes, the corresponding workloads need to be present in the Role with the correct permissions. For example, to extract k8s.deployment.label.* attributes, deployments need to be present in the Role.
Example Role and RoleBinding to create in the namespace being watched.
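A sketch of the two objects, assuming a ServiceAccount named otel-collector already exists in the collector's namespace. The object names and the `<WORKLOAD_NAMESPACE>` placeholder are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: otel-collector
  namespace: <WORKLOAD_NAMESPACE>
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
  # Needed only for ReplicaSet-backed deployment lookups.
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: otel-collector
  namespace: <WORKLOAD_NAMESPACE>
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: <OTEL_COL_NAMESPACE>
roleRef:
  kind: Role
  name: otel-collector
  apiGroup: rbac.authorization.k8s.io
```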
Deployment scenarios
The processor can be used in collectors deployed either as an agent (Kubernetes DaemonSet) or as a gateway (Kubernetes Deployment).
As an agent
When running as an agent, the processor detects IP addresses of pods sending spans, metrics or logs to the agent and uses this information to extract metadata from pods. When running as an agent, it is important to apply a discovery filter so that the processor only discovers pods from the same host that it is running on. Not using such a filter can result in unnecessary resource usage, especially on very large clusters. Once the filter is applied, each processor will only query the k8s API for pods running on its own node.
A node filter can be applied by setting the filter.node config option to the name of a k8s node. While this works as expected, in most cases it cannot be used to automatically filter pods by the node that the processor is running on, as it is not known beforehand which node a pod will be scheduled on. Luckily, Kubernetes has a solution for this called the downward API. To automatically filter pods by the node the processor is running on, you’ll need to complete the following steps:
- Use the downward API to inject the node name as an environment variable. Add the following snippet under the pod env section of the OpenTelemetry container.
- Set “filter.node_from_env_var” to the name of the environment variable holding the node name.
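The two steps can be sketched as follows. The first document is an excerpt of the collector's pod spec and the second is a fragment of the processor's configuration block; the container name is hypothetical.

```yaml
# Step 1: inject the node name through the downward API.
spec:
  containers:
    - name: otel-collector   # hypothetical container name
      env:
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
---
# Step 2: tell the processor which environment variable holds the node name.
filter:
  node_from_env_var: KUBE_NODE_NAME
```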
As a gateway
When running as a gateway, the processor cannot correctly detect the IP address of the pods generating the telemetry data without any of the well-known IP attributes, when it receives them from an agent instead of receiving them directly from the pods. To work around this issue, agents deployed with the k8s_attributes processor can be configured to detect the IP addresses and forward them along with the telemetry data resources. The collector can then match this IP address with k8s pods and enrich the records with the metadata. In order to set this up, you’ll need to complete the following steps:
- Set up agents in passthrough mode: Configure the agents’ k8s_attributes processors to run in passthrough mode.
- Configure the collector as usual: No special configuration changes are needed on the collector. It’ll automatically detect the IP address of spans, logs and metrics sent by the agents as well as directly by other services/pods.
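On the agent side, the first step amounts to a single option in the processor's configuration block, sketched here:

```yaml
# Agent: only attach the pod IP and skip metadata extraction,
# so no K8s API calls are made by the agent's processor.
passthrough: true
```

The gateway keeps its regular configuration and performs the actual enrichment.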
Complete Configuration Options
Below is a comprehensive configuration example with all available options:
Configuration Options Reference
Top-Level Options
| Option | Type | Default | Description |
|---|---|---|---|
| auth_type | string | serviceAccount | Authentication method for K8s API: none, serviceAccount, or kubeConfig |
| context | string | "" | K8s context to use (only when auth_type: kubeConfig) |
| kube_api_qps | float32 | 5 | Max queries per second to K8s API. Increase if you see client-side throttling warnings |
| kube_api_burst | int | 10 | Max burst of requests to K8s API. Increase if you see client-side throttling warnings |
| passthrough | bool | false | Only add pod IP without extracting metadata (no K8s API calls) |
| wait_for_metadata | bool | false | Block collector startup until metadata is synced |
| wait_for_metadata_timeout | duration | 10s | Max wait time for metadata sync on startup |
| watch_sync_period | duration | 5m | Resync period for K8s informers (0s disables resync completely) |
| pod_delete_grace_period | duration | 120s | Grace period to wait before deleting pod metadata from the cache on deletion |
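As a sketch of how the authentication and rate-limit options combine, the fragment below runs the processor against a kubeconfig context with doubled API limits. The context name and the chosen limits are assumptions for illustration.

```yaml
# Authenticate with a local kubeconfig instead of the pod's service account.
auth_type: kubeConfig
context: my-cluster     # hypothetical kubeconfig context
# Raise client-side limits if throttling warnings appear in the logs.
kube_api_qps: 10
kube_api_burst: 20
```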
Extract Options
| Option | Type | Default | Description |
|---|---|---|---|
| metadata | []string | See below | List of metadata fields to extract as resource attributes |
| annotations | []FieldExtractConfig | [] | Pod/namespace/node annotations to extract |
| labels | []FieldExtractConfig | [] | Pod/namespace/node labels to extract |
| otel_annotations | bool | false | Extract OpenTelemetry resource attributes from pod annotations with prefix resource.opentelemetry.io/ |
| deployment_name_from_replicaset | bool | true | Deprecated; will be removed in future releases. When true, ReplicaSet informer is not run for deployment names only, relying on a heuristic instead. |
Default metadata fields:
- k8s.namespace.name
- k8s.pod.name
- k8s.pod.uid
- k8s.pod.start_time
- k8s.deployment.name
- k8s.node.name
To extract other fields, list them in extract.metadata.
FieldExtractConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| tag_name | string | Auto-generated | Resource attribute name (supports regex backreferences with key_regex) |
| key | string | "" | Exact annotation/label key to extract (mutually exclusive with key_regex) |
| key_regex | string | "" | Regex pattern to match annotation/label keys (mutually exclusive with key) |
| from | string | pod | Source to extract from: pod, namespace, deployment, statefulset, daemonset, job, or node |
Filter Options
| Option | Type | Default | Description |
|---|---|---|---|
| node | string | "" | Filter pods by specific node name |
| node_from_env_var | string | "" | Environment variable containing node name to filter by |
| namespace | string | "" | Filter pods by specific namespace |
| fields | []FieldFilterConfig | [] | Filter by K8s field selectors |
| labels | []FieldFilterConfig | [] | Filter by K8s label selectors |
FieldFilterConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| key | string | Required | Field or label key |
| value | string | "" | Field or label value |
| op | string | equals | Operation: equals, not-equals (fields); equals, not-equals, exists, does-not-exist (labels) |
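The filter options from the two tables above can be combined as in the following sketch of the processor's configuration block. The namespace, label keys and values are hypothetical; status.phase is a standard Kubernetes pod field selector.

```yaml
filter:
  namespace: my-namespace   # hypothetical namespace
  labels:
    # Keep pods labelled app=checkout ...
    - key: app
      value: checkout
      op: equals
    # ... that do not carry a `canary` label at all.
    - key: canary
      op: does-not-exist
  fields:
    # Keep only pods that are currently running.
    - key: status.phase
      value: Running
      op: equals
```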
PodAssociationConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| sources | []AssociationSource | Required | List of sources to match (maximum 4, all must match) |
AssociationSource Options
| Option | Type | Default | Description |
|---|---|---|---|
| from | string | Required | Source type: connection or resource_attribute |
| name | string | Conditional | Resource attribute name (required when from: resource_attribute) |
Exclude Options
| Option | Type | Default | Description |
|---|---|---|---|
| pods | []ExcludePodConfig | Default excludes | List of pods to exclude from processing |
ExcludePodConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| name | string | Required | Pod name pattern (regex) to exclude |
Default excludes:
- jaeger-agent
- jaeger-collector
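A sketch of an explicit exclude list in the processor's configuration block. Whether a custom list replaces or extends the defaults is not stated here, so the defaults are repeated; the last pattern is hypothetical.

```yaml
exclude:
  pods:
    - name: jaeger-agent
    - name: jaeger-collector
    # Hypothetical regex: skip short-lived debugging pods.
    - name: "debug-.*"
```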
Caveats
There are some edge cases and scenarios where k8s_attributes will not work properly.
Host networking mode
The processor cannot correctly identify pods running in the host network mode, and enriching telemetry data generated by such pods is not supported at the moment, unless the association rule is not based on the IP attribute.
As a sidecar
The processor does not support detecting containers from the same pods when running as a sidecar. While this can be done, we think it is simpler to just use the Kubernetes downward API to inject environment variables into the pods and directly use their values as tags.
Compatibility
Kubernetes Versions
This processor is tested against the Kubernetes versions specified in the e2e-tests.yml workflow. These tested versions represent the officially supported Kubernetes versions for this component.
Production Deployment Guide
Scaling Considerations
Memory Consumption
The processor maintains an in-memory cache of K8s metadata for all pods it monitors. Memory usage scales with:
- Number of pods monitored: Each pod’s metadata (labels, annotations, owner references) is cached
- Metadata fields extracted: More fields = more memory per pod
- Label/annotation extraction rules: Regex patterns and multiple rules increase overhead
- Workload metadata: Extracting deployment/statefulset/daemonset/job metadata adds additional caching
Typical memory usage:
- Agent mode (node-filtered): ~50-200 MB for 100 pods per node
- Gateway mode (cluster-wide): ~500 MB - 2 GB for 1000-10000 pods
- With workload metadata: Add 20-30% overhead
Optimization strategies:
- Use node filtering in agent deployments: filter.node_from_env_var: KUBE_NODE_NAME
- Limit metadata extraction: Only extract fields you need
- Rely on the default deployment-name heuristic: Reduces memory by not caching replicaset data
- Filter by namespace: Limits scope when monitoring specific applications
- Avoid extracting workload metadata unless necessary (deployment, statefulset, etc.)
CPU Usage
CPU usage is generally low but increases with:
- High telemetry throughput: Each data point requires pod lookup and attribute enrichment
- Frequent pod churn: More K8s API watch events to process
- Complex association rules: Multiple rules with many sources
Suggested resource sizing:
- Agent mode: 100-500m CPU, 256-512 Mi memory
- Gateway mode: 500m-2 CPU, 1-4 Gi memory
High Availability
For gateway deployments, run multiple replicas with:
- Load balancer distributing telemetry traffic
- Each replica independently queries K8s API and maintains its own cache
- No shared state between replicas
- Horizontal scaling based on CPU/memory usage
Graceful Shutdown
The processor is stateless and requires no special shutdown procedures:
- Collector receives SIGTERM
- Processor stops watching K8s API
- In-flight telemetry data is processed
- Collector shuts down cleanly
Timestamp Format
By default, the k8s.pod.start_time attribute uses Time.MarshalText() to format the timestamp value as an RFC3339 compliant timestamp.
Self-Observability Features
The processor exposes internal telemetry metrics for monitoring its operation. For a complete list of all available metrics, see the Internal Telemetry documentation. Key metrics to monitor:
- otelcol_otelsvc_k8s_ip_lookup_miss: Number of times pod lookup by IP failed
  - High values indicate association issues
- otelcol_otelsvc_k8s_pod_added / otelcol_otelsvc_k8s_pod_deleted: Track pod churn rates
  - Monitor for unexpected spikes in pod lifecycle events
- otelcol_otelsvc_k8s_pod_table_size: Current size of pod metadata cache
  - Use to monitor memory consumption trends
Warnings
- Memory consumption: Since the processor fetches and caches the K8s metadata for the resources of the node it is on, it consumes more memory than other processors. That consumption is compounded if users don’t filter down to only the metadata for the node the processor is running on.
Feature Gates
See documentation.md for the complete list of feature gates supported by this processor. Feature gates can be enabled using the --feature-gates flag:
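In a Kubernetes deployment the flag is typically passed through the container arguments. The excerpt below is a sketch; the config path is hypothetical, and the gate shown is one of the processor's gates listed in this document.

```yaml
# Excerpt of the Collector container spec.
args:
  - --config=/conf/collector.yaml   # hypothetical path
  # Comma-separate several gates; prefix a gate with "-" to disable it.
  - --feature-gates=processor.k8sattributes.EmitV1K8sConventions
```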
Semantic Conventions Compatibility
The processor is moving towards the latest Semantic Conventions through the following feature gates:
- processor.k8sattributes.DontEmitV0K8sConventions
- processor.k8sattributes.EmitV1K8sConventions
The gates cover the following attribute renames:
- container.image.tag -> container.image.tags
- k8s.pod.labels.<key> -> k8s.pod.label.<key>
- k8s.pod.annotations.<key> -> k8s.pod.annotation.<key>
- k8s.node.labels.<key> -> k8s.node.label.<key>
- k8s.node.annotations.<key> -> k8s.node.annotation.<key>
- k8s.namespace.labels.<key> -> k8s.namespace.label.<key>
- k8s.namespace.annotations.<key> -> k8s.namespace.annotation.<key>
The attributes enabled by the processor.k8sattributes.EmitV1K8sConventions feature gate are currently in release_candidate stability and are actively moving towards stable stability.
Available Benchmarks
The component is tested as part of the project’s load tests, with the results being publicly available at the benchmarks page. On that page, users can find details such as memory and CPU performance when the component is used in K8s clusters (tests use KWOK) with varying numbers of workloads. Refer to the test for more information about the setup.
Attributes
| Attribute Name | Description | Type | Values |
|---|---|---|---|
| otelcol.signal | The signal type the telemetry metric is associated with | string | metrics, traces, logs, profiles |
| pod_identifier | The source(s) used to identify the pod, formatted as ‘from/name’ (e.g., ‘connection’, ‘resource_attribute/k8s.pod.ip’). Does not contain actual identifier values to avoid high cardinality. | string | |
| status | The status of the pod association operation | string | success, error |
Resource Attributes
| Attribute Name | Description | Type | Enabled |
|---|---|---|---|
| container.id | Container ID. Usually a UUID, as for example used to identify Docker containers. The UUID might be abbreviated. Requires k8s.container.restart_count. | string | ❌ |
| container.image.name | Name of the image the container was built on. Requires container.id or k8s.container.name. | string | ✅ |
| container.image.repo_digests | Repo digests of the container image as provided by the container runtime. | slice | ❌ |
| container.image.tag | Container image tag. Defaults to “latest” if not provided (unless digest also in image path). Requires container.id or k8s.container.name. Deprecated, use container.image.tags instead. | string | ✅ |
| container.image.tags | Container image tags. Defaults to “latest” if not provided (unless digest also in image path). Requires container.id or k8s.container.name. | slice | ✅ |
| k8s.cluster.uid | Gives cluster uid identified with kube-system namespace | string | ❌ |
| k8s.container.name | The name of the Container in a Pod template. Requires container.id. | string | ❌ |
| k8s.cronjob.name | The name of the CronJob. | string | ❌ |
| k8s.cronjob.uid | The uid of the CronJob. | string | ❌ |
| k8s.daemonset.name | The name of the DaemonSet. | string | ❌ |
| k8s.daemonset.uid | The UID of the DaemonSet. | string | ❌ |
| k8s.deployment.name | The name of the Deployment. | string | ✅ |
| k8s.deployment.uid | The UID of the Deployment. | string | ❌ |
| k8s.job.name | The name of the Job. | string | ❌ |
| k8s.job.uid | The UID of the Job. | string | ❌ |
| k8s.namespace.name | The name of the namespace that the pod is running in. | string | ✅ |
| k8s.node.name | The name of the Node. | string | ✅ |
| k8s.node.uid | The UID of the Node. | string | ❌ |
| k8s.pod.hostname | The hostname of the Pod. | string | ❌ |
| k8s.pod.ip | The IP address of the Pod. | string | ❌ |
| k8s.pod.name | The name of the Pod. | string | ✅ |
| k8s.pod.start_time | The start time of the Pod. | string | ✅ |
| k8s.pod.uid | The UID of the Pod. | string | ✅ |
| k8s.replicaset.name | The name of the ReplicaSet. | string | ❌ |
| k8s.replicaset.uid | The UID of the ReplicaSet. | string | ❌ |
| k8s.statefulset.name | The name of the StatefulSet. | string | ❌ |
| k8s.statefulset.uid | The UID of the StatefulSet. | string | ❌ |
| service.instance.id | The instance ID of the service. | string | ❌ |
| service.name | The name of the service. | string | ❌ |
| service.namespace | The namespace of the service. | string | ❌ |
| service.version | The version of the service. | string | ❌ |
Configuration
Example Configuration
Last generated: 2026-06-01