Apachespark Receiver
contrib
Maintainers: @Caleb-Hurshman, @mrsillydog
Source: opentelemetry-collector-contrib
Supported Telemetry
Overview
Purpose
The purpose of this component is to monitor Apache Spark clusters and the applications running on them through the collection of performance metrics like memory utilization, CPU utilization, shuffle operations, garbage collection time, I/O operations, and more.
Prerequisites
This receiver supports Apache Spark versions:
- 3.3.2+
Configuration
These configuration options are for connecting to an Apache Spark application. The following settings are optional:
- collection_interval (default = 60s): This receiver collects metrics on an interval. This value must be a string readable by Golang's time.ParseDuration. Valid time units are ns, us (or µs), ms, s, m, h.
- initial_delay (default = 1s): Defines how long this receiver waits before starting.
- endpoint (default = http://localhost:4040): Apache Spark endpoint to connect to, in the form [http][://]{host}[:{port}].
- application_names: An array of Spark application names for which metrics should be collected. If no application names are specified, metrics will be collected for all Spark applications running on the cluster at the specified endpoint.
Example Configuration
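A minimal sketch of a collector configuration using the options described above. The application names are hypothetical placeholders, and the debug exporter and pipeline wiring are assumptions added only to make the fragment complete; substitute whatever exporter your deployment uses.

```yaml
receivers:
  apachespark:
    # How often metrics are collected (default: 60s).
    collection_interval: 60s
    # How long the receiver waits before starting (default: 1s).
    initial_delay: 1s
    # Apache Spark endpoint to connect to (default: http://localhost:4040).
    endpoint: http://localhost:4040
    # Hypothetical application names; omit this key to collect
    # metrics for all applications at the endpoint.
    application_names:
      - ExampleSparkApp
      - AnotherSparkApp

exporters:
  # Assumption: the debug exporter is used here purely for illustration.
  debug:

service:
  pipelines:
    metrics:
      receivers: [apachespark]
      exporters: [debug]
```

Leaving out application_names collects metrics for every Spark application running at the configured endpoint.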
Metrics
Details about the metrics produced by this receiver can be found in metadata.yaml.
| Metric Name | Description | Unit | Type | Attributes |
|---|---|---|---|---|
✅ spark.driver.block_manager.disk.usage | Disk space used by the BlockManager. | mb | UpDownCounter | |
✅ spark.driver.block_manager.memory.usage | Memory usage for the driver’s BlockManager. | mb | UpDownCounter | location, state |
✅ spark.driver.code_generator.compilation.average_time | Average time spent during CodeGenerator source code compilation operations. | ms | Gauge | |
✅ spark.driver.code_generator.compilation.count | Number of source code compilation operations performed by the CodeGenerator. | { compilation } | Counter | |
✅ spark.driver.code_generator.generated_class.average_size | Average class size of the classes generated by the CodeGenerator. | bytes | Gauge | |
✅ spark.driver.code_generator.generated_class.count | Number of classes generated by the CodeGenerator. | { class } | Counter | |
✅ spark.driver.code_generator.generated_method.average_size | Average method size of the classes generated by the CodeGenerator. | bytes | Gauge | |
✅ spark.driver.code_generator.generated_method.count | Number of methods generated by the CodeGenerator. | { method } | Counter | |
✅ spark.driver.code_generator.source_code.average_size | Average size of the source code generated by a CodeGenerator code generation operation. | bytes | Gauge | |
✅ spark.driver.code_generator.source_code.operations | Number of source code generation operations performed by the CodeGenerator. | { operation } | Counter | |
✅ spark.driver.dag_scheduler.job.active | Number of active jobs currently being processed by the DAGScheduler. | { job } | UpDownCounter | |
✅ spark.driver.dag_scheduler.job.count | Number of jobs that have been submitted to the DAGScheduler. | { job } | Counter | |
✅ spark.driver.dag_scheduler.stage.count | Number of stages the DAGScheduler is either running or needs to run. | { stage } | UpDownCounter | scheduler_status |
✅ spark.driver.dag_scheduler.stage.failed | Number of failed stages run by the DAGScheduler. | { stage } | Counter | |
✅ spark.driver.executor.gc.operations | Number of garbage collection operations performed by the driver. | { gc_operation } | Counter | gc_type |
✅ spark.driver.executor.gc.time | Total elapsed time during garbage collection operations performed by the driver. | ms | Counter | gc_type |
✅ spark.driver.executor.memory.execution | Amount of execution memory currently used by the driver. | bytes | UpDownCounter | location |
✅ spark.driver.executor.memory.jvm | Amount of memory used by the driver’s JVM. | bytes | UpDownCounter | location |
✅ spark.driver.executor.memory.pool | Amount of pool memory currently used by the driver. | bytes | UpDownCounter | pool_memory_type |
✅ spark.driver.executor.memory.storage | Amount of storage memory currently used by the driver. | bytes | UpDownCounter | location |
✅ spark.driver.hive_external_catalog.file_cache_hits | Number of file cache hits on the HiveExternalCatalog. | { hit } | Counter | |
✅ spark.driver.hive_external_catalog.files_discovered | Number of files discovered while listing the partitions of a table in the Hive metastore. | { file } | Counter | |
✅ spark.driver.hive_external_catalog.hive_client_calls | Number of calls to the underlying Hive Metastore client made by the Spark application. | { call } | Counter | |
✅ spark.driver.hive_external_catalog.parallel_listing_jobs | Number of parallel listing jobs initiated by the HiveExternalCatalog when listing partitions of a table. | { listing_job } | Counter | |
✅ spark.driver.hive_external_catalog.partitions_fetched | Table partitions fetched by the HiveExternalCatalog. | { partition } | Counter | |
✅ spark.driver.jvm_cpu_time | Current CPU time taken by the Spark driver. | ns | Counter | |
✅ spark.driver.live_listener_bus.dropped | Number of events that have been dropped by the LiveListenerBus. | { event } | Counter | |
✅ spark.driver.live_listener_bus.posted | Number of events that have been posted on the LiveListenerBus. | { event } | Counter | |
✅ spark.driver.live_listener_bus.processing_time.average | Average time taken for the LiveListenerBus to process an event posted to it. | ms | Gauge | |
✅ spark.driver.live_listener_bus.queue_size | Number of events currently waiting to be processed by the LiveListenerBus. | { event } | UpDownCounter | |
✅ spark.executor.disk.usage | Disk space used by this executor for RDD storage. | bytes | UpDownCounter | |
✅ spark.executor.gc_time | Elapsed time the JVM spent in garbage collection in this executor. | ms | Counter | |
✅ spark.executor.input_size | Amount of data input for this executor. | bytes | Counter | |
✅ spark.executor.memory.usage | Storage memory used by this executor. | bytes | UpDownCounter | |
✅ spark.executor.shuffle.io.size | Amount of data written and read during shuffle operations for this executor. | bytes | Counter | direction |
✅ spark.executor.storage_memory.usage | The executor’s storage memory usage. | bytes | UpDownCounter | location, state |
✅ spark.executor.task.active | Number of tasks currently running in this executor. | { task } | UpDownCounter | |
✅ spark.executor.task.limit | Maximum number of tasks that can run concurrently in this executor. | { task } | UpDownCounter | |
✅ spark.executor.task.result | Number of tasks with a specific result in this executor. | { task } | Counter | executor_task_result |
✅ spark.executor.time | Elapsed time the JVM spent executing tasks in this executor. | ms | Counter | |
✅ spark.job.stage.active | Number of active stages in this job. | { stage } | UpDownCounter | |
✅ spark.job.stage.result | Number of stages with a specific result in this job. | { stage } | Counter | job_result |
✅ spark.job.task.active | Number of active tasks in this job. | { task } | UpDownCounter | |
✅ spark.job.task.result | Number of tasks with a specific result in this job. | { task } | Counter | job_result |
✅ spark.stage.disk.spilled | The amount of disk space used for storing portions of overly large data chunks that couldn’t fit in memory in this stage. | bytes | Counter | |
✅ spark.stage.executor.cpu_time | CPU time spent by the executor in this stage. | ns | Counter | |
✅ spark.stage.executor.run_time | Amount of time spent by the executor in this stage. | ms | Counter | |
✅ spark.stage.io.records | Number of records written and read in this stage. | { record } | Counter | direction |
✅ spark.stage.io.size | Amount of data written and read at this stage. | bytes | Counter | direction |
✅ spark.stage.jvm_gc_time | The amount of time the JVM spent on garbage collection in this stage. | ms | Counter | |
✅ spark.stage.memory.peak | Peak memory used by internal data structures created during shuffles, aggregations and joins in this stage. | bytes | Counter | |
✅ spark.stage.memory.spilled | The amount of memory moved to disk due to size constraints (spilled) in this stage. | bytes | Counter | |
✅ spark.stage.shuffle.blocks_fetched | Number of blocks fetched in shuffle operations in this stage. | { block } | Counter | source |
✅ spark.stage.shuffle.fetch_wait_time | Time spent in this stage waiting for remote shuffle blocks. | ms | Counter | |
✅ spark.stage.shuffle.io.disk | Amount of data read to disk in shuffle operations (sometimes required for large blocks, as opposed to the default behavior of reading into memory). | bytes | Counter | |
✅ spark.stage.shuffle.io.read.size | Amount of data read in shuffle operations in this stage. | bytes | Counter | source |
✅ spark.stage.shuffle.io.records | Number of records written or read in shuffle operations in this stage. | { record } | Counter | direction |
✅ spark.stage.shuffle.io.write.size | Amount of data written in shuffle operations in this stage. | bytes | Counter | |
✅ spark.stage.shuffle.write_time | Time spent blocking on writes to disk or buffer cache in this stage. | ns | Counter | |
✅ spark.stage.status | A one-hot encoding representing the status of this stage. | { status } | UpDownCounter | stage_active, stage_complete, stage_pending, stage_failed |
✅ spark.stage.task.active | Number of active tasks in this stage. | { task } | UpDownCounter | |
✅ spark.stage.task.result | Number of tasks with a specific result in this stage. | { task } | Counter | stage_task_result |
✅ spark.stage.task.result_size | The amount of data transmitted back to the driver by all the tasks in this stage. | bytes | Counter | |
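Receivers whose metrics are described in metadata.yaml generally allow individual metrics to be toggled under a metrics key. The fragment below is a sketch under that assumption, and it also assumes that the ✅ marker in the table means "enabled by default". Confirm the exact keys against the receiver's metadata.yaml before relying on it.

```yaml
receivers:
  apachespark:
    endpoint: http://localhost:4040
    metrics:
      # Assumed syntax: turn off two metrics that are enabled by default.
      spark.driver.code_generator.compilation.count:
        enabled: false
      spark.driver.code_generator.compilation.average_time:
        enabled: false
```

Disabling metrics you do not chart can reduce the volume of data the collector exports, which may matter on clusters with many stages and executors.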
Attributes
| Attribute Name | Description | Type | Values |
|---|---|---|---|
direction | Whether the metric relates to input or output operations. | string | in, out |
result | The result of the executor tasks for which the metric was recorded. | string | completed, failed |
gc_type | The type of the garbage collection performed for the metric. | string | major, minor |
result | The result of the job stages or tasks for which the metric was recorded. | string | completed, failed, skipped |
location | The location of the memory for which the metric was recorded. | string | on_heap, off_heap |
type | The type of pool memory for which the metric was recorded. | string | direct, mapped |
status | The status of the DAGScheduler stages for which the metric was recorded. | string | waiting, running |
source | The source from which data was fetched for the metric. | string | local, remote |
active | Whether the stage for which the metric was recorded is active. | bool | |
complete | Whether the stage for which the metric was recorded is complete. | bool | |
failed | Whether the stage for which the metric was recorded is failed. | bool | |
pending | Whether the stage for which the metric was recorded is pending. | bool | |
result | The result of the stage tasks for which the metric was recorded. | string | completed, failed, killed |
state | The state of the memory for which the metric was recorded. | string | used, free |
Resource Attributes
| Attribute Name | Description | Type | Enabled |
|---|---|---|---|
spark.application.id | The ID of the application for which the metric was recorded. | string | ✅ |
spark.application.name | The name of the application for which the metric was recorded. | string | ✅ |
spark.executor.id | The ID of the executor for which the metric was recorded. | string | ✅ |
spark.job.id | The ID of the job for which the metric was recorded. | int | ✅ |
spark.stage.attempt.id | The ID of the stage attempt for which the metric was recorded. | int | ❌ |
spark.stage.id | The ID of the application stage for which the metric was recorded. | int | ✅ |
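The table above shows spark.stage.attempt.id as disabled (❌). As a hedged sketch, and assuming this receiver follows the usual resource_attributes toggle convention for components generated from metadata.yaml, it could be switched on like this:

```yaml
receivers:
  apachespark:
    endpoint: http://localhost:4040
    resource_attributes:
      # Assumed syntax: enable a resource attribute that is off by default.
      spark.stage.attempt.id:
        enabled: true
```

With the attribute enabled, stage metrics from retried stages would be expected to appear as separate resources per attempt, so downstream cardinality may increase.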
Last generated: 2026-04-20