
Apache Spark Receiver

Status
  • Available in: contrib
  • Maintainers: @Caleb-Hurshman, @mrsillydog
  • Source: opentelemetry-collector-contrib

Supported Telemetry

Metrics

Overview

Purpose

The purpose of this component is to monitor Apache Spark clusters and the applications running on them through the collection of performance metrics like memory utilization, CPU utilization, shuffle operations, garbage collection time, I/O operations, and more.

Prerequisites

This receiver supports Apache Spark versions:
  • 3.3.2+

Configuration

The following settings configure the connection to an Apache Spark application. All settings are optional:
  • collection_interval (default = 60s): This receiver collects metrics on an interval. This value must be a string readable by Golang’s time.ParseDuration. Valid time units are ns, us (or µs), ms, s, m, h.
  • initial_delay (default = 1s): Defines how long this receiver waits before starting.
  • endpoint (default = http://localhost:4040): Apache Spark endpoint to connect to in the form of [http][://]{host}[:{port}].
  • application_names: An array of Spark application names for which metrics should be collected. If no application names are specified, metrics will be collected for all Spark applications running on the cluster at the specified endpoint.

Example Configuration

```yaml
receivers:
  apachespark:
    collection_interval: 60s
    endpoint: http://localhost:4040
    application_names:
      - PythonStatusAPIDemo
      - PythonLR
```

The full list of settings exposed for this receiver is documented in config.go, with detailed sample configurations in testdata/config.yaml.
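The receiver gathers these metrics from Spark's monitoring REST API (for example, GET /api/v1/applications on the configured endpoint returns the running applications as JSON). As a rough sketch of how application_names filtering behaves — collect everything when the list is empty, otherwise only matching applications — one might filter that response as below; filterApps and the struct are illustrative, not receiver code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sparkApplication mirrors a subset of the JSON returned by Spark's
// monitoring REST API at /api/v1/applications.
type sparkApplication struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// filterApps decodes an /api/v1/applications response and keeps only
// the applications whose names appear in wanted; an empty/nil filter
// keeps everything (matching the receiver's documented behavior).
func filterApps(data []byte, wanted map[string]bool) ([]sparkApplication, error) {
	var apps []sparkApplication
	if err := json.Unmarshal(data, &apps); err != nil {
		return nil, err
	}
	if len(wanted) == 0 {
		return apps, nil // no names configured: collect all applications
	}
	var out []sparkApplication
	for _, a := range apps {
		if wanted[a.Name] {
			out = append(out, a)
		}
	}
	return out, nil
}

func main() {
	sample := []byte(`[
		{"id": "app-20240101000000-0000", "name": "PythonStatusAPIDemo"},
		{"id": "app-20240101000000-0001", "name": "SomeOtherJob"}
	]`)
	apps, err := filterApps(sample, map[string]bool{"PythonStatusAPIDemo": true})
	if err != nil {
		panic(err)
	}
	for _, a := range apps {
		fmt.Printf("%s (%s)\n", a.Name, a.ID)
	}
}
```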

Metrics

Details about the metrics produced by this receiver can be found in metadata.yaml.

| Enabled | Metric Name | Description | Unit | Type | Attributes |
|---|---|---|---|---|---|
| ✅ | spark.driver.block_manager.disk.usage | Disk space used by the BlockManager. | mb | UpDownCounter | |
| ✅ | spark.driver.block_manager.memory.usage | Memory usage for the driver’s BlockManager. | mb | UpDownCounter | location, state |
| ✅ | spark.driver.code_generator.compilation.average_time | Average time spent during CodeGenerator source code compilation operations. | ms | Gauge | |
| ✅ | spark.driver.code_generator.compilation.count | Number of source code compilation operations performed by the CodeGenerator. | {compilation} | Counter | |
| ✅ | spark.driver.code_generator.generated_class.average_size | Average class size of the classes generated by the CodeGenerator. | bytes | Gauge | |
| ✅ | spark.driver.code_generator.generated_class.count | Number of classes generated by the CodeGenerator. | {class} | Counter | |
| ✅ | spark.driver.code_generator.generated_method.average_size | Average method size of the classes generated by the CodeGenerator. | bytes | Gauge | |
| ✅ | spark.driver.code_generator.generated_method.count | Number of methods generated by the CodeGenerator. | {method} | Counter | |
| ✅ | spark.driver.code_generator.source_code.average_size | Average size of the source code generated by a CodeGenerator code generation operation. | bytes | Gauge | |
| ✅ | spark.driver.code_generator.source_code.operations | Number of source code generation operations performed by the CodeGenerator. | {operation} | Counter | |
| ✅ | spark.driver.dag_scheduler.job.active | Number of active jobs currently being processed by the DAGScheduler. | {job} | UpDownCounter | |
| ✅ | spark.driver.dag_scheduler.job.count | Number of jobs that have been submitted to the DAGScheduler. | {job} | Counter | |
| ✅ | spark.driver.dag_scheduler.stage.count | Number of stages the DAGScheduler is either running or needs to run. | {stage} | UpDownCounter | scheduler_status |
| ✅ | spark.driver.dag_scheduler.stage.failed | Number of failed stages run by the DAGScheduler. | {stage} | Counter | |
| ✅ | spark.driver.executor.gc.operations | Number of garbage collection operations performed by the driver. | {gc_operation} | Counter | gc_type |
| ✅ | spark.driver.executor.gc.time | Total elapsed time during garbage collection operations performed by the driver. | ms | Counter | gc_type |
| ✅ | spark.driver.executor.memory.execution | Amount of execution memory currently used by the driver. | bytes | UpDownCounter | location |
| ✅ | spark.driver.executor.memory.jvm | Amount of memory used by the driver’s JVM. | bytes | UpDownCounter | location |
| ✅ | spark.driver.executor.memory.pool | Amount of pool memory currently used by the driver. | bytes | UpDownCounter | pool_memory_type |
| ✅ | spark.driver.executor.memory.storage | Amount of storage memory currently used by the driver. | bytes | UpDownCounter | location |
| ✅ | spark.driver.hive_external_catalog.file_cache_hits | Number of file cache hits on the HiveExternalCatalog. | {hit} | Counter | |
| ✅ | spark.driver.hive_external_catalog.files_discovered | Number of files discovered while listing the partitions of a table in the Hive metastore. | {file} | Counter | |
| ✅ | spark.driver.hive_external_catalog.hive_client_calls | Number of calls to the underlying Hive Metastore client made by the Spark application. | {call} | Counter | |
| ✅ | spark.driver.hive_external_catalog.parallel_listing_jobs | Number of parallel listing jobs initiated by the HiveExternalCatalog when listing partitions of a table. | {listing_job} | Counter | |
| ✅ | spark.driver.hive_external_catalog.partitions_fetched | Table partitions fetched by the HiveExternalCatalog. | {partition} | Counter | |
| ✅ | spark.driver.jvm_cpu_time | Current CPU time taken by the Spark driver. | ns | Counter | |
| ✅ | spark.driver.live_listener_bus.dropped | Number of events that have been dropped by the LiveListenerBus. | {event} | Counter | |
| ✅ | spark.driver.live_listener_bus.posted | Number of events that have been posted on the LiveListenerBus. | {event} | Counter | |
| ✅ | spark.driver.live_listener_bus.processing_time.average | Average time taken for the LiveListenerBus to process an event posted to it. | ms | Gauge | |
| ✅ | spark.driver.live_listener_bus.queue_size | Number of events currently waiting to be processed by the LiveListenerBus. | {event} | UpDownCounter | |
| ✅ | spark.executor.disk.usage | Disk space used by this executor for RDD storage. | bytes | UpDownCounter | |
| ✅ | spark.executor.gc_time | Elapsed time the JVM spent in garbage collection in this executor. | ms | Counter | |
| ✅ | spark.executor.input_size | Amount of data input for this executor. | bytes | Counter | |
| ✅ | spark.executor.memory.usage | Storage memory used by this executor. | bytes | UpDownCounter | |
| ✅ | spark.executor.shuffle.io.size | Amount of data written and read during shuffle operations for this executor. | bytes | Counter | direction |
| ✅ | spark.executor.storage_memory.usage | The executor’s storage memory usage. | bytes | UpDownCounter | location, state |
| ✅ | spark.executor.task.active | Number of tasks currently running in this executor. | {task} | UpDownCounter | |
| ✅ | spark.executor.task.limit | Maximum number of tasks that can run concurrently in this executor. | {task} | UpDownCounter | |
| ✅ | spark.executor.task.result | Number of tasks with a specific result in this executor. | {task} | Counter | executor_task_result |
| ✅ | spark.executor.time | Elapsed time the JVM spent executing tasks in this executor. | ms | Counter | |
| ✅ | spark.job.stage.active | Number of active stages in this job. | {stage} | UpDownCounter | |
| ✅ | spark.job.stage.result | Number of stages with a specific result in this job. | {stage} | Counter | job_result |
| ✅ | spark.job.task.active | Number of active tasks in this job. | {task} | UpDownCounter | |
| ✅ | spark.job.task.result | Number of tasks with a specific result in this job. | {task} | Counter | job_result |
| ✅ | spark.stage.disk.spilled | The amount of disk space used for storing portions of overly large data chunks that couldn’t fit in memory in this stage. | bytes | Counter | |
| ✅ | spark.stage.executor.cpu_time | CPU time spent by the executor in this stage. | ns | Counter | |
| ✅ | spark.stage.executor.run_time | Amount of time spent by the executor in this stage. | ms | Counter | |
| ✅ | spark.stage.io.records | Number of records written and read in this stage. | {record} | Counter | direction |
| ✅ | spark.stage.io.size | Amount of data written and read at this stage. | bytes | Counter | direction |
| ✅ | spark.stage.jvm_gc_time | The amount of time the JVM spent on garbage collection in this stage. | ms | Counter | |
| ✅ | spark.stage.memory.peak | Peak memory used by internal data structures created during shuffles, aggregations and joins in this stage. | bytes | Counter | |
| ✅ | spark.stage.memory.spilled | The amount of memory moved to disk due to size constraints (spilled) in this stage. | bytes | Counter | |
| ✅ | spark.stage.shuffle.blocks_fetched | Number of blocks fetched in shuffle operations in this stage. | {block} | Counter | source |
| ✅ | spark.stage.shuffle.fetch_wait_time | Time spent in this stage waiting for remote shuffle blocks. | ms | Counter | |
| ✅ | spark.stage.shuffle.io.disk | Amount of data read to disk in shuffle operations (sometimes required for large blocks, as opposed to the default behavior of reading into memory). | bytes | Counter | |
| ✅ | spark.stage.shuffle.io.read.size | Amount of data read in shuffle operations in this stage. | bytes | Counter | source |
| ✅ | spark.stage.shuffle.io.records | Number of records written or read in shuffle operations in this stage. | {record} | Counter | direction |
| ✅ | spark.stage.shuffle.io.write.size | Amount of data written in shuffle operations in this stage. | bytes | Counter | |
| ✅ | spark.stage.shuffle.write_time | Time spent blocking on writes to disk or buffer cache in this stage. | ns | Counter | |
| ✅ | spark.stage.status | A one-hot encoding representing the status of this stage. | {status} | UpDownCounter | stage_active, stage_complete, stage_pending, stage_failed |
| ✅ | spark.stage.task.active | Number of active tasks in this stage. | {task} | UpDownCounter | |
| ✅ | spark.stage.task.result | Number of tasks with a specific result in this stage. | {task} | Counter | stage_task_result |
| ✅ | spark.stage.task.result_size | The amount of data transmitted back to the driver by all the tasks in this stage. | bytes | Counter | |

Attributes

| Attribute Name | Description | Type | Values |
|---|---|---|---|
| direction | Whether the metric relates to input or output operations. | string | in, out |
| executor_task_result | The result of the executor tasks for which the metric was recorded. | string | completed, failed |
| gc_type | The type of garbage collection performed for the metric. | string | major, minor |
| job_result | The result of the job stages or tasks for which the metric was recorded. | string | completed, failed, skipped |
| location | The location of the memory for which the metric was recorded. | string | on_heap, off_heap |
| pool_memory_type | The type of pool memory for which the metric was recorded. | string | direct, mapped |
| scheduler_status | The status of the DAGScheduler stages for which the metric was recorded. | string | waiting, running |
| source | The source from which data was fetched for the metric. | string | local, remote |
| stage_active | Whether the stage for which the metric was recorded is active. | bool | |
| stage_complete | Whether the stage for which the metric was recorded is complete. | bool | |
| stage_failed | Whether the stage for which the metric was recorded failed. | bool | |
| stage_pending | Whether the stage for which the metric was recorded is pending. | bool | |
| stage_task_result | The result of the stage tasks for which the metric was recorded. | string | completed, failed, killed |
| state | The state of the memory for which the metric was recorded. | string | used, free |

Resource Attributes

| Attribute Name | Description | Type | Enabled |
|---|---|---|---|
| spark.application.id | The ID of the application for which the metric was recorded. | string | ✅ |
| spark.application.name | The name of the application for which the metric was recorded. | string | ✅ |
| spark.executor.id | The ID of the executor for which the metric was recorded. | string | ✅ |
| spark.job.id | The ID of the job for which the metric was recorded. | int | ✅ |
| spark.stage.attempt.id | The ID of the stage attempt for which the metric was recorded. | int | ❌ |
| spark.stage.id | The ID of the application stage for which the metric was recorded. | int | ✅ |

Example Pipeline Configuration

```yaml
# ./bin/otelcontribcol_darwin_arm64 --config ./receiver/apachesparkreceiver/testdata/config.yaml
receivers:
  apachespark:
    collection_interval: 15s

exporters:
  file:
    path: ./receiver/apachesparkreceiver/output/metrics.json

service:
  pipelines:
    metrics:
      receivers: [apachespark]
      exporters: [file]
```

Last generated: 2026-04-13