K8sattributes Processor
contrib, k8s
Maintainers: @dmitryax, @fatsheep9146, @TylerHelmuth, @ChrsMark, @odubajDT
Source: opentelemetry-collector-contrib
Supported Telemetry
Overview
The processor automatically discovers k8s resources (pods), extracts metadata from them, and adds the extracted metadata to the relevant spans, metrics and logs as resource attributes. The processor uses the Kubernetes API to discover all pods running in a cluster and keeps a record of their IP addresses, pod UIDs and interesting metadata. The rules for associating the data passing through the processor (spans, metrics and logs) with specific pod metadata are configured via the “pod_association” key. It represents a list of associations that are executed in the specified order until the first one is able to make the match.
Configuration
The processor stores the list of running pods and the associated metadata. When it sees a datapoint (log, trace or metric), it tries to associate the datapoint with the pod it originated from, so that the relevant pod metadata can be added to the datapoint. By default, it associates the incoming connection IP with the pod IP. For cases where this approach doesn’t work (sending through a proxy, etc.), a custom association rule can be specified. Each association is specified as a list of sources. The maximum number of sources within an association is 4. A source is a rule that matches metadata from the datapoint to pod metadata; for an association to be applied, all of its sources must match. Each source rule is specified as a pair of from (representing the rule type) and name (representing the attribute name if from is set to resource_attribute).
The following rule types are available:
- connection: Takes the IP attribute from the connection context (if available). In this case the processor must appear before any batching or tail sampling processors, which remove this information.
- resource_attribute: Allows specifying the attribute name to look up in the list of attributes of the received resource. Semantic conventions should be used for naming.
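As an illustration, the following minimal sketch defines a pod_association that first tries to match on the k8s.pod.ip resource attribute and falls back to the connection IP:

```yaml
processors:
  k8sattributes:
    pod_association:
      # First association: match on the k8s.pod.ip resource attribute
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      # Fallback: use the IP from the incoming connection context
      - sources:
          - from: connection
```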
The metadata configuration option defines the list of resource attributes to be added. Items in the list must be named exactly the same as the resource attributes that will be added.
The following attributes are added by default:
- k8s.namespace.name
- k8s.pod.name
- k8s.pod.uid
- k8s.pod.start_time
- k8s.deployment.name (requires watching Deployment resources unless deployment_name_from_replicaset is enabled)
- k8s.node.name
The metadata section can also be extended with additional attributes which, if present in the metadata section, are then also available for use within association rules. Available attributes are:
- k8s.namespace.name
- k8s.pod.name
- k8s.pod.hostname
- k8s.pod.ip
- k8s.pod.start_time
- k8s.pod.uid
- k8s.replicaset.uid
- k8s.replicaset.name
- k8s.deployment.uid
- k8s.deployment.name
- k8s.daemonset.uid
- k8s.daemonset.name
- k8s.statefulset.uid
- k8s.statefulset.name
- k8s.cronjob.uid
- k8s.cronjob.name
- k8s.job.uid
- k8s.job.name
- k8s.node.name
- k8s.cluster.uid
- service.namespace
- service.name
- service.version (cannot be used for source rules in the pod_association when it’s calculated based on the container’s image tag/digest)
- service.instance.id (cannot be used for source rules in the pod_association)
- Any tags extracted from the pod labels and annotations, as described in extracting attributes from pod labels and annotations
Only attribute names present in metadata should be used for pod_association’s resource_attribute, because empty or non-existing values will be ignored.
Additional container level attributes can be extracted. If a pod contains more than one container,
either the container.id, or the k8s.container.name attribute must be provided in the incoming resource attributes to
correctly associate the matching container to the resource:
- If the container.id resource attribute is provided, the following additional attributes will be available:
  - k8s.container.name
  - container.image.name
  - container.image.tag
  - container.image.repo_digests (if k8s CRI populates repository digest field)
  - service.version
  - service.instance.id
- If the k8s.container.name resource attribute is provided, the following additional attributes will be available:
  - container.id (if the k8s.container.restart_count resource attribute is not provided, it’s not guaranteed to get the right container ID)
  - container.image.name
  - container.image.tag
  - container.image.repo_digests (if k8s CRI populates repository digest field)
  - service.version
  - service.instance.id
- If the k8s.container.restart_count resource attribute is provided, it can be used to associate with a particular container instance. If it’s not set, the latest container instance will be used:
  - container.id (not added by default, has to be specified in metadata)
The container.id attribute can be used for source rules in the pod_association. To use container.id in pod association, at least one container attribute must be included in the metadata extraction configuration (e.g., container.id, container.image.name, etc.).
Example for extracting container level attributes:
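One possible configuration (a sketch; extract only the fields you need):

```yaml
processors:
  k8sattributes:
    extract:
      metadata:
        - k8s.pod.name
        - k8s.pod.uid
        # Container-level attributes; the incoming resource must carry
        # container.id or k8s.container.name for correct association
        - container.id
        - container.image.name
        - container.image.tag
```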
Such a configuration adds the container-level attributes listed in the metadata section to all resources received from a matching pod, provided the k8s.container.name attribute is present.
To have the processor wait for pod metadata to be synced before it starts processing, set the wait_for_metadata option to true.
The processor will then not be ready until the metadata is fully synced, blocking the start-up of the Collector. If the metadata cannot be synced, the Collector will ultimately fail to start.
If a timeout is reached, the processor will fail to start and return an error, which will cause the collector to exit.
The timeout defaults to 10s and can be configured with the wait_for_metadata_timeout option.
Example for setting the processor to wait for metadata to be synced before it is ready:
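A minimal sketch using the options described above (the 30s timeout is an arbitrary illustration; the default is 10s):

```yaml
processors:
  k8sattributes:
    # Block collector start-up until pod metadata is fully synced
    wait_for_metadata: true
    # Fail start-up if the sync takes longer than this
    wait_for_metadata_timeout: 30s
```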
Extracting attributes from pod labels and annotations
The k8sattributes processor can also set resource attributes from k8s labels and annotations of pods, namespaces, deployments, statefulsets, daemonsets, jobs and nodes. The config for associating the data passing through the processor (spans, metrics and logs) with specific pod/namespace/deployment/statefulset/daemonset/job/node annotations/labels is set via the “annotations” and “labels” keys. This config represents a list of annotations/labels that are extracted from pods/namespaces/deployments/statefulsets/daemonsets/jobs/nodes and added to spans, metrics and logs. Each item is specified as a config of tag_name (representing the tag name to tag the spans with), key (representing the key used to extract the value) and from (representing the kubernetes object used to extract the value). The from field accepts only the values “pod”, “namespace”, “deployment”, “statefulset”, “daemonset”, “job” and “node”, and defaults to “pod” if none is specified. By default, extracting metadata from Deployments, StatefulSets, DaemonSets and Jobs is disabled; enabling it comes with an extra memory consumption cost.
A few examples to use this config are as follows:
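For instance, the following sketch extracts one annotation and a set of labels (the tag name and keys are hypothetical):

```yaml
processors:
  k8sattributes:
    extract:
      annotations:
        # Extract the value of the hypothetical "workload-owner" pod
        # annotation into the resource attribute "owner"
        - tag_name: owner
          key: workload-owner
          from: pod
      labels:
        # Extract all namespace labels matching a regex; tag names are
        # auto-generated when tag_name is omitted
        - key_regex: app.kubernetes.io/(.*)
          from: namespace
```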
Configuring recommended resource attributes
The processor can be configured to set the recommended resource attributes:
- otel_annotations will translate resource.opentelemetry.io/foo to the foo resource attribute, etc.
- deployment_name_from_replicaset allows extracting the deployment name from the replicaset name by trimming the pod template hash. This disables watching for replicaset resources, which can be useful in environments with limited RBAC permissions, as the processor will not need get, watch, and list permissions for deployments. It also reduces memory consumption of the processor. Important: when deployment_name_from_replicaset: true is set, you must still include k8s.deployment.name (or service.name) in the extract.metadata section for the deployment name to be extracted. The processor derives the deployment name from the ReplicaSet’s naming convention without requiring direct access to Deployment resources, but the extraction rules must be enabled. Take the following ownerReference of a pod managed by a deployment for example:
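A sketch of what such an ownerReference might look like (names are illustrative):

```yaml
ownerReferences:
  - apiVersion: apps/v1
    kind: ReplicaSet
    # The pod template hash suffix ("6c45f8d6f6") is trimmed to
    # derive the deployment name "opentelemetry-collector"
    name: opentelemetry-collector-6c45f8d6f6
    controller: true
```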
Trimming the pod template hash from the ReplicaSet name yields a k8s.deployment.name of opentelemetry-collector.
Please note, if your pods are managed by a replicaset that is not owned by a deployment, k8s.deployment.name will be set incorrectly. For example, if the replicaset is named opentelemetry-collector-6c45f8d6f6, the feature will still set the deployment name of the pod to opentelemetry-collector, because it skips watching for the deployment and has no context on whether the pod is managed by a deployment or a standalone replicaset. Another edge case to be aware of is when the deployment name is long. Kubernetes may truncate it in the ReplicaSet name to ensure there is enough space for the pod template hash suffix, so the full name fits within the DNS subdomain limit (253 characters). In such cases, the extracted k8s.deployment.name will be the truncated form, not the original full deployment name.
Config example
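A representative sketch combining the options discussed on this page (not an exhaustive listing):

```yaml
processors:
  k8sattributes:
    auth_type: serviceAccount
    passthrough: false
    filter:
      # Only watch pods on the node named by this env var (agent mode)
      node_from_env_var: KUBE_NODE_NAME
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.pod.start_time
        - k8s.deployment.name
        - k8s.node.name
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: connection
```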
Common Use Cases
Example 1: Basic Agent Deployment (DaemonSet)
Minimal configuration for an agent collecting telemetry from pods on the same node.
Example 2: Gateway Deployment with Resource Attribute Association
Gateway configuration that receives telemetry from agents that have already added the pod IP.
Example 3: Production Deployment with Namespace Filtering
Configuration for monitoring a specific namespace with comprehensive metadata.
Example 4: Memory-Optimized Configuration
Minimal memory footprint configuration for large clusters.
Example 5: Multi-Container Pod Support
Configuration for extracting container-level metadata.
Role-based access control
Cluster-scoped RBAC
If you’d like to set up the k8sattributes processor to receive telemetry from across namespaces, it will need get, watch and list permissions on both pods and namespaces resources, for all namespaces and pods included in the configured filters. Additionally, when using k8s.deployment.name (which is enabled by default) or k8s.deployment.uid, the processor also needs get, watch and list permissions for replicasets resources (unless deployment_name_from_replicaset is enabled). When using k8s.node.uid or extracting metadata from nodes, the processor needs get, watch and list permissions for nodes resources. When using k8s.cronjob.uid, the processor also needs get, watch and list permissions for jobs resources.
Here is an example of a ClusterRole to give a ServiceAccount the necessary permissions for all pods, nodes, and namespaces in the cluster (replace <OTEL_COL_NAMESPACE> with the namespace where the collector is deployed):
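A sketch of such a ClusterRole and its binding (the resource and ServiceAccount names are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  # Core resources the processor watches cluster-wide
  - apiGroups: [""]
    resources: ["pods", "namespaces", "nodes"]
    verbs: ["get", "watch", "list"]
  # Needed for k8s.deployment.name/uid unless
  # deployment_name_from_replicaset is enabled
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: <OTEL_COL_NAMESPACE>
roleRef:
  kind: ClusterRole
  name: otel-collector
  apiGroup: rbac.authorization.k8s.io
```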
Namespace-scoped RBAC
When running the k8sattributes processor to receive telemetry traffic from pods in a specific namespace, you can use a Kubernetes Role and RoleBinding to give the collector access to query pods and replicasets in that namespace. This requires setting the filter::namespace config as shown below.
The Role needs permissions for pods and replicasets (the latter only if deployment_name_from_replicaset is not enabled) in the selected namespace. Note that with just a role binding, the processor cannot query metadata such as labels and annotations from k8s nodes and namespaces, which are cluster-scoped objects. This also means that the processor cannot set the value for the k8s.cluster.uid attribute if enabled, since the k8s.cluster.uid attribute is set to the UID of the kube-system namespace, which is not queryable with namespaced RBAC.
Please note that when extracting workload-related attributes, the corresponding workload resources need to be present in the Role with the correct permissions. For example, to extract k8s.deployment.label.* attributes, deployments need to be present in the Role.
Example Role and RoleBinding to create in the namespace being watched.
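A sketch of such a Role and RoleBinding (names and the <WATCHED_NAMESPACE>/<OTEL_COL_NAMESPACE> placeholders are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: otel-collector
  namespace: <WATCHED_NAMESPACE>
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
  # Omit if deployment_name_from_replicaset is enabled
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: otel-collector
  namespace: <WATCHED_NAMESPACE>
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: <OTEL_COL_NAMESPACE>
roleRef:
  kind: Role
  name: otel-collector
  apiGroup: rbac.authorization.k8s.io
```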
Deployment scenarios
The processor can be used in collectors deployed both as an agent (Kubernetes DaemonSet) or as a gateway (Kubernetes Deployment).
As an agent
When running as an agent, the processor detects the IP addresses of pods sending spans, metrics or logs to the agent and uses this information to extract metadata from pods. When running as an agent, it is important to apply a discovery filter so that the processor only discovers pods from the same host it is running on. Not using such a filter can result in unnecessary resource usage, especially on very large clusters. Once the filter is applied, each processor will only query the k8s API for pods running on its own node. The node filter can be applied by setting the filter.node config option to the name of a k8s node. While this works as expected, it cannot be used to automatically filter pods by the node that the processor is running on in most cases, as it is not known beforehand which node a pod will be scheduled on. Luckily, Kubernetes has a solution for this called the downward API. To automatically filter pods by the node the processor is running on, you’ll need to complete the following steps:
- Use the downward API to inject the node name as an environment variable. Add the following snippet under the pod env section of the OpenTelemetry container.
- Set “filter.node_from_env_var” to the name of the environment variable holding the node name.
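The two steps above could be sketched as follows (KUBE_NODE_NAME is an arbitrary variable name):

```yaml
# 1) In the collector pod spec, inject the node name via the downward API:
env:
  - name: KUBE_NODE_NAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: spec.nodeName

# 2) In the collector config, reference that variable:
processors:
  k8sattributes:
    filter:
      node_from_env_var: KUBE_NODE_NAME
```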
As a gateway
When running as a gateway, the processor cannot correctly detect the IP address of the pods generating the telemetry data when it receives the data from an agent instead of directly from the pods, unless one of the well-known IP attributes is present. To work around this issue, agents deployed with the k8sattributes processor can be configured to detect the IP addresses and forward them along with the telemetry data resources. The collector can then match this IP address with k8s pods and enrich the records with the metadata. To set this up, you’ll need to complete the following steps:
- Set up agents in passthrough mode: configure the agents’ k8sattributes processors to run in passthrough mode.
- Configure the collector as usual: no special configuration changes need to be made on the collector. It will automatically detect the IP address of spans, logs and metrics sent by the agents, as well as those sent directly by other services/pods.
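The agent side of this setup can be sketched as:

```yaml
processors:
  k8sattributes:
    # Only attach the pod IP to the resource; the gateway performs the
    # full metadata lookup (no K8s API calls are made by the agent)
    passthrough: true
```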
Complete Configuration Options
The tables below provide a reference for all available configuration options.
Configuration Options Reference
Top-Level Options
| Option | Type | Default | Description |
|---|---|---|---|
| auth_type | string | serviceAccount | Authentication method for K8s API: none, serviceAccount, or kubeConfig |
| kube_config_path | string | "" | Path to kubeconfig file (only when auth_type: kubeConfig) |
| context | string | "" | K8s context to use (only when auth_type: kubeConfig) |
| passthrough | bool | false | Only add pod IP without extracting metadata (no K8s API calls) |
| wait_for_metadata | bool | false | Block collector startup until metadata is synced |
| wait_for_metadata_timeout | duration | 10s | Max wait time for metadata sync on startup |
Extract Options
| Option | Type | Default | Description |
|---|---|---|---|
| metadata | []string | See below | List of metadata fields to extract as resource attributes |
| annotations | []FieldExtractConfig | [] | Pod/namespace/node annotations to extract |
| labels | []FieldExtractConfig | [] | Pod/namespace/node labels to extract |
| otel_annotations | bool | false | Extract OpenTelemetry resource attributes from pod annotations with prefix resource.opentelemetry.io/ |
| deployment_name_from_replicaset | bool | false | Extract deployment name from replicaset name (disables replicaset watching) |
The default metadata list is: k8s.namespace.name, k8s.pod.name, k8s.pod.uid, k8s.pod.start_time, k8s.deployment.name, k8s.node.name. These fields are configured under extract.metadata.
FieldExtractConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| tag_name | string | Auto-generated | Resource attribute name (supports regex backreferences with key_regex) |
| key | string | "" | Exact annotation/label key to extract (mutually exclusive with key_regex) |
| key_regex | string | "" | Regex pattern to match annotation/label keys (mutually exclusive with key) |
| from | string | pod | Source to extract from: pod, namespace, deployment, statefulset, daemonset, job, or node |
Filter Options
| Option | Type | Default | Description |
|---|---|---|---|
| node | string | "" | Filter pods by specific node name |
| node_from_env_var | string | "" | Environment variable containing node name to filter by |
| namespace | string | "" | Filter pods by specific namespace |
| fields | []FieldFilterConfig | [] | Filter by K8s field selectors |
| labels | []FieldFilterConfig | [] | Filter by K8s label selectors |
FieldFilterConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| key | string | Required | Field or label key |
| value | string | "" | Field or label value |
| op | string | equals | Operation: equals, not-equals (fields); equals, not-equals, exists, does-not-exist (labels) |
PodAssociationConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| sources | []AssociationSource | Required | List of sources to match (maximum 4, all must match) |
AssociationSource Options
| Option | Type | Default | Description |
|---|---|---|---|
| from | string | Required | Source type: connection or resource_attribute |
| name | string | Conditional | Resource attribute name (required when from: resource_attribute) |
Exclude Options
| Option | Type | Default | Description |
|---|---|---|---|
| pods | []ExcludePodConfig | Default excludes | List of pods to exclude from processing |
ExcludePodConfig Options
| Option | Type | Default | Description |
|---|---|---|---|
| name | string | Required | Pod name pattern (regex) to exclude |
By default, pods named jaeger-agent and jaeger-collector are excluded.
Caveats
There are some edge cases and scenarios where k8sattributes will not work properly.
Host networking mode
The processor cannot correctly identify pods running in host network mode, and enriching telemetry data generated by such pods is not supported at the moment, unless the association rule is not based on the IP attribute.
As a sidecar
The processor does not support detecting containers from the same pods when running as a sidecar. While this can be done, we think it is simpler to just use the Kubernetes downward API to inject environment variables into the pods and directly use their values as tags.
Compatibility
Kubernetes Versions
This processor is tested against the Kubernetes versions specified in the e2e-tests.yml workflow. These tested versions represent the officially supported Kubernetes versions for this component.
Production Deployment Guide
Scaling Considerations
Memory Consumption
The processor maintains an in-memory cache of K8s metadata for all pods it monitors. Memory usage scales with:
- Number of pods monitored: each pod’s metadata (labels, annotations, owner references) is cached
- Metadata fields extracted: More fields = more memory per pod
- Label/annotation extraction rules: Regex patterns and multiple rules increase overhead
- Workload metadata: Extracting deployment/statefulset/daemonset/job metadata adds additional caching
Typical memory usage:
- Agent mode (node-filtered): ~50-200 MB for 100 pods per node
- Gateway mode (cluster-wide): ~500 MB - 2 GB for 1000-10000 pods
- With workload metadata: add 20-30% overhead
To reduce memory consumption:
- Use node filtering in agent deployments: filter.node_from_env_var: KUBE_NODE_NAME
- Limit metadata extraction: only extract the fields you need
- Use deployment_name_from_replicaset: true: reduces memory by not caching replicaset data
- Filter by namespace: limits scope when monitoring specific applications
- Avoid extracting workload metadata unless necessary (deployment, statefulset, etc.)
CPU Usage
CPU usage is generally low but increases with:
- High telemetry throughput: each data point requires a pod lookup and attribute enrichment
- Frequent pod churn: More K8s API watch events to process
- Complex association rules: Multiple rules with many sources
Recommended resource allocations:
- Agent mode: 100-500m CPU, 256-512 Mi memory
- Gateway mode: 500m-2 CPU, 1-4 Gi memory
High Availability
For gateway deployments, run multiple replicas with:
- A load balancer distributing telemetry traffic
- Each replica independently queries K8s API and maintains its own cache
- No shared state between replicas
- Horizontal scaling based on CPU/memory usage
Graceful Shutdown
The processor is stateless and requires no special shutdown procedures:
- Collector receives SIGTERM
- Processor stops watching K8s API
- In-flight telemetry data is processed
- Collector shuts down cleanly
Performance Benchmarks
Based on testing with 1000 pods using the default configuration:
| Signal Type | Throughput | Latency | Memory | CPU |
|---|---|---|---|---|
| Traces | 50k spans/sec | <1ms added | 800 MB | 400m |
| Metrics | 100k metrics/sec | <0.5ms added | 750 MB | 350m |
| Logs | 75k logs/sec | <0.7ms added | 850 MB | 380m |
| Profiles | 10k profiles/sec | <2ms added | 700 MB | 300m |
Timestamp Format
By default, the k8s.pod.start_time attribute uses Time.MarshalText() to format the timestamp value as an RFC3339-compliant timestamp.
Self-Observability Features
The processor exposes internal telemetry metrics for monitoring its operation. For a complete list of all available metrics, see the Internal Telemetry documentation. Key metrics to monitor:
- otelcol_otelsvc_k8s_ip_lookup_miss: number of times a pod lookup by IP failed. High values indicate association issues.
- otelcol_otelsvc_k8s_pod_added / otelcol_otelsvc_k8s_pod_deleted: track pod churn rates. Monitor for unexpected spikes in pod lifecycle events.
- otelcol_otelsvc_k8s_pod_table_size: current size of the pod metadata cache. Use to monitor memory consumption trends.
Warnings
- Memory consumption: Since the processor fetches and caches the K8s metadata for the resources of the node it is on, it consumes more memory than other processors. That consumption is compounded if users don’t filter down to only the metadata for the node the processor is running on.
Feature Gates
See documentation.md for the complete list of feature gates supported by this processor. Feature gates can be enabled using the --feature-gates flag:
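For example (the binary name is illustrative; the gate name is one of those listed below):

```shell
otelcol-contrib --config=config.yaml \
  --feature-gates=processor.k8sattributes.EmitV1K8sConventions
```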
Semantic Conventions Compatibility
The processor is moving towards the latest Semantic Conventions through the following feature gates:
- processor.k8sattributes.DontEmitV0K8sConventions
- processor.k8sattributes.EmitV1K8sConventions
The attribute renames introduced by these gates are:
- container.image.tag -> container.image.tags
- k8s.pod.labels.<key> -> k8s.pod.label.<key>
- k8s.pod.annotations.<key> -> k8s.pod.annotation.<key>
- k8s.node.labels.<key> -> k8s.node.label.<key>
- k8s.node.annotations.<key> -> k8s.node.annotation.<key>
- k8s.namespace.labels.<key> -> k8s.namespace.label.<key>
- k8s.namespace.annotations.<key> -> k8s.namespace.annotation.<key>
The conventions emitted under the processor.k8sattributes.EmitV1K8sConventions feature gate are currently in beta stability and are actively moving towards stable.
Available Benchmarks
The component is tested as part of the project’s load tests, with the results being publicly available at the benchmarks page. On that page, users can find details such as memory and CPU performance when the component is used in K8s clusters (tests use KWOK) with a range of workload counts. Refer to the test for more information about the setup.
Attributes
| Attribute Name | Description | Type | Values |
|---|---|---|---|
| otelcol.signal | The signal type the telemetry metric is associated with | string | metrics, traces, logs, profiles |
| pod_identifier | The pod identifier value(s) used for the association attempt | string | |
| status | The status of the pod association operation | string | success, error |
Resource Attributes
| Attribute Name | Description | Type | Enabled |
|---|---|---|---|
| container.id | Container ID. Usually a UUID, as for example used to identify Docker containers. The UUID might be abbreviated. Requires k8s.container.restart_count. | string | ❌ |
| container.image.name | Name of the image the container was built on. Requires container.id or k8s.container.name. | string | ✅ |
| container.image.repo_digests | Repo digests of the container image as provided by the container runtime. | slice | ❌ |
| container.image.tag | Container image tag. Defaults to “latest” if not provided (unless digest also in image path). Requires container.id or k8s.container.name. Deprecated, use container.image.tags instead. | string | ✅ |
| container.image.tags | Container image tags. Requires container.id or k8s.container.name. | slice | ✅ |
| k8s.cluster.uid | Gives the cluster UID, identified via the kube-system namespace | string | ❌ |
| k8s.container.name | The name of the Container in a Pod template. Requires container.id. | string | ❌ |
| k8s.cronjob.name | The name of the CronJob. | string | ❌ |
| k8s.cronjob.uid | The UID of the CronJob. | string | ❌ |
| k8s.daemonset.name | The name of the DaemonSet. | string | ❌ |
| k8s.daemonset.uid | The UID of the DaemonSet. | string | ❌ |
| k8s.deployment.name | The name of the Deployment. | string | ✅ |
| k8s.deployment.uid | The UID of the Deployment. | string | ❌ |
| k8s.job.name | The name of the Job. | string | ❌ |
| k8s.job.uid | The UID of the Job. | string | ❌ |
| k8s.namespace.name | The name of the namespace that the pod is running in. | string | ✅ |
| k8s.node.name | The name of the Node. | string | ✅ |
| k8s.node.uid | The UID of the Node. | string | ❌ |
| k8s.pod.hostname | The hostname of the Pod. | string | ❌ |
| k8s.pod.ip | The IP address of the Pod. | string | ❌ |
| k8s.pod.name | The name of the Pod. | string | ✅ |
| k8s.pod.start_time | The start time of the Pod. | string | ✅ |
| k8s.pod.uid | The UID of the Pod. | string | ✅ |
| k8s.replicaset.name | The name of the ReplicaSet. | string | ❌ |
| k8s.replicaset.uid | The UID of the ReplicaSet. | string | ❌ |
| k8s.statefulset.name | The name of the StatefulSet. | string | ❌ |
| k8s.statefulset.uid | The UID of the StatefulSet. | string | ❌ |
| service.instance.id | The instance ID of the service. | string | ❌ |
| service.name | The name of the service. | string | ❌ |
| service.namespace | The namespace of the service. | string | ❌ |
| service.version | The version of the service. | string | ❌ |
Last generated: 2026-04-13