Dataset Exporter
contrib
Maintainers: @atoulme, @martin-majlis-s1, @zdaratom-s1, @tomaz-s1
Source: opentelemetry-collector-contrib
Supported Telemetry
Overview
This exporter sends logs to DataSet. See the Getting Started guide.
Configuration
Required Settings
- dataset_url (no default): The URL of the DataSet API that ingests the data. Most likely https://app.scalyr.com.
- api_key (no default): The "Log Write" API key required to use the API. See the instructions on how to get an API key.
If you do not want to store api_key in the file, you can use the built-in environment variable expansion and set api_key: ${env:DATASET_API_KEY}.
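As a minimal sketch, the two required settings could be wired into a collector configuration as follows. The otlp receiver and the pipeline layout are assumptions made for illustration; only dataset_url and api_key come from the settings above.

```yaml
receivers:
  otlp:                # assumed receiver, any log-producing receiver works
    protocols:
      grpc:

exporters:
  dataset:
    dataset_url: https://app.scalyr.com
    # Read the key from the environment instead of storing it in the file.
    api_key: ${env:DATASET_API_KEY}

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [dataset]
```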
Server Host Settings
Specifying the server host is crucial for ensuring the correct functionality of DataSet. DataSet expects the server host value to be provided in the serverHost attribute.
If the server host value is stored in a different attribute, you can use the resourceprocessor or attributesprocessor to copy it into the serverHost attribute.
You can also utilize the server_host settings (described below) to populate the serverHost attribute with different values.
The process of populating the serverHost attribute works as follows:
- If the serverHost attribute is specified and not empty in the log or trace, then it is used.
- If the serverHost attribute is specified and not empty in the resource, then it is used.
- If the host.name attribute is specified and not empty in the resource, then it is used.
- If the server_host.server_host setting is specified and not empty, then it is used.
- If the server_host.use_hostname setting is set to true, the hostname of the node is used.
Make sure to provide the appropriate server host value in the serverHost attribute to ensure the proper functionality of DataSet and accurate handling of events.
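The last two fallbacks in the list above are controlled by the exporter's own server_host block. A short sketch, assuming a hypothetical host label collector-eu-1:

```yaml
exporters:
  dataset:
    dataset_url: https://app.scalyr.com
    api_key: ${env:DATASET_API_KEY}
    server_host:
      # Used only when neither serverHost nor host.name is present on the data.
      server_host: collector-eu-1   # hypothetical value
      # Last resort: fall back to the hostname of the node.
      use_hostname: true
```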
Optional Settings
- debug (default = false): Adds session_key to the server fields. It's useful for debugging throughput issues.
- buffer:
  - max_lifetime (default = 5s): The maximum delay between sending batches from the same session.
  - purge_older_than (default = 30s): The maximum delay between receiving data for the same session after which resources associated with it are purged.
  - group_by (default = []): The list of attributes based on which events should be grouped. They are moved from the event attributes to the session info and shown as server fields in the UI.
  - retry_initial_interval (default = 5s): Time to wait after the first failure before retrying.
  - retry_max_interval (default = 30s): The upper bound on backoff.
  - retry_max_elapsed_time (default = 300s): The maximum amount of time spent trying to send a buffer.
  - retry_shutdown_timeout (default = 30s): The maximum time for which it will try to send data to DataSet during shutdown. This value should be shorter than the container's grace period.
  - max_parallel_outgoing (default = 100): The maximum number of parallel outgoing requests.
- logs:
  - export_resource_info_on_event (default = false): Include LogRecord resource information (if available) on the DataSet event.
  - export_resource_prefix (default = "resource.attributes."): A prefix string for the resource, if export_resource_info_on_event is enabled.
  - export_scope_info_on_event (default = true): Include LogRecord scope information (if available) on the DataSet event.
  - export_scope_prefix (default = "scope.attributes."): A prefix string for the scope, if export_scope_info_on_event is enabled.
  - export_separator (default = "."): The separator to add between keys when flattening nested structures (maps, arrays).
  - export_distinguishing_suffix (default = "_"): A suffix string to resolve naming collisions when flattening.
  - decompose_complex_message_field (default = false): Decompose complex body / message field types (e.g. maps, arrays) into separate fields.
  - decomposed_complex_message_prefix (default = "body.map."): A prefix string to use when a complex message is decomposed.
- traces:
  - export_separator (default = "."): The separator to add between keys when flattening nested structures (maps, arrays).
  - export_distinguishing_suffix (default = "_"): A suffix string to resolve naming collisions when flattening.
- server_host:
  - server_host (default = ""): Specifies the server host to be used for the events.
  - use_hostname (default = true): Determines whether the hostname of the node should be used as the server host for the events. When set to true, the node's hostname is automatically used.
- retry_on_failure: See retry_on_failure
- sending_queue: See sending_queue
- timeout: See timeout
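To show how the optional settings nest, here is a hedged sketch that overrides a few of them. The nesting follows the list above; the attribute names under group_by and the chosen values are hypothetical and only illustrate the shape of the configuration.

```yaml
exporters:
  dataset:
    dataset_url: https://app.scalyr.com
    api_key: ${env:DATASET_API_KEY}
    buffer:
      # Flush each session at least every 2s instead of the default 5s.
      max_lifetime: 2s
      # Group events by these attributes; they become server fields in the UI.
      group_by:
        - k8s.namespace.name   # hypothetical attribute names
        - k8s.pod.name
      # Keep shutdown retries shorter than the container's grace period.
      retry_shutdown_timeout: 20s
    logs:
      export_resource_info_on_event: true
    traces:
      export_separator: "."
```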
Attributes
Enabled attributes are exported in the order:
- Log properties
- Body
- Resource attributes
- Scope attributes
- Log attributes
If two attributes have the same name, the export_distinguishing_suffix value is appended to the later attribute's name. If the export_distinguishing_suffix value is an empty string, then the value from the last attribute is used.
Example
Example LogRecord:
- Default settings for logs:
  - Event:
- Everything enabled:
  - Configuration:
  - Event:
- Everything enabled, prefixes are empty strings:
  - Configuration:
  - Event:
- Everything enabled, prefixes are empty strings, suffix is empty string:
  - Configuration:
  - Event:
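The configurations for the non-default cases are not reproduced here. As a sketch inferred from the logs settings listed under Optional Settings (not the original example), they could look roughly like this, with the commented lines marking what the second and third variants would add:

```yaml
exporters:
  dataset:
    logs:
      # "Everything enabled": turn on every optional export.
      export_resource_info_on_event: true
      export_scope_info_on_event: true
      decompose_complex_message_field: true
      # "Prefixes are empty strings": uncomment to drop the prefixes,
      # which makes name collisions between sources possible.
      # export_resource_prefix: ""
      # export_scope_prefix: ""
      # decomposed_complex_message_prefix: ""
      # "Suffix is empty string": uncomment so that, on a collision,
      # the value from the last attribute is used.
      # export_distinguishing_suffix: ""
```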
. dots, _ underscores, and - hyphens. You must escape slashes in Search and PowerQueries. For example, search the field name app.kubernetes.io/component as app.kubernetes.io\/component.
Example
Handling serverHost Attribute
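The configuration for this example is not reproduced here. A sketch that is consistent with the scenarios in this section, assuming the attributes and resource processors are both in the logs pipeline and that SERVER_HOST is set in the collector's environment, might look like this:

```yaml
processors:
  attributes:
    actions:
      # Copy the log attribute container_id into serverHost when it exists.
      - key: serverHost
        action: insert
        from_attribute: container_id
  resource:
    attributes:
      # Copy the resource attribute node_id into serverHost when it exists.
      - key: serverHost
        action: insert
        from_attribute: node_id

exporters:
  dataset:
    dataset_url: https://app.scalyr.com
    api_key: ${env:DATASET_API_KEY}
    server_host:
      server_host: ${env:SERVER_HOST}
      use_hostname: true
```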
Based on the given configuration and scenarios, here's the expected behavior:
- Resource: {'node_id': 'node-pay-01', 'host.name': 'host-pay-01'}, Log: {'container_id': 'cont-pay-01'}, Env: SERVER_HOST='server-pay-01', Hostname: ip-172-31-27-19
  - Since the attribute container_id is set, attributesprocessor will copy this value to the serverHost.
  - Used serverHost will be cont-pay-01.
- Resource: {'node_id': 'node-pay-01', 'host.name': 'host-pay-01'}, Log: {'attribute.foo': 'Bar'}, Env: SERVER_HOST='server-pay-01', Hostname: ip-172-31-27-19
  - Since the resource attribute node_id is set, resourceprocessor will copy this value to the serverHost.
  - Used serverHost will be node-pay-01.
- Resource: {'host.name': 'host-pay-01'}, Log: {'attribute.foo': 'Bar'}, Env: SERVER_HOST='server-pay-01', Hostname: ip-172-31-27-19
  - Since the resource attribute host.name is set, it will be used.
  - Used serverHost will be host-pay-01.
- Resource: {}, Log: {'attribute.foo': 'Bar'}, Env: SERVER_HOST='server-pay-01', Hostname: ip-172-31-27-19
  - Since the attribute container_id is not set, the value from the environment variable SERVER_HOST will be copied to the serverHost.
  - Used serverHost will be server-pay-01.
- Resource: {}, Log: {'attribute.foo': 'Bar'}, Env: SERVER_HOST='', Hostname: ip-172-31-27-19
  - Since the attribute container_id is not set and the environment variable SERVER_HOST is empty, the hostname of the node (ip-172-31-27-19) will be used as the fallback value for serverHost.
  - Used serverHost will be ip-172-31-27-19.
Metrics
To enable metrics you have to:
- Run the collector with the feature gate telemetry.useOtelForInternalMetrics enabled. This can be done by executing it with one additional parameter: --feature-gates=telemetry.useOtelForInternalMetrics.
- Enable metrics scraping as part of the configuration and add the receiver into services:
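A minimal sketch of such a scraping setup follows. It assumes the collector serves its internal metrics on localhost:8888 and uses the prometheus receiver to scrape them; the job name is hypothetical, and the debug exporter is only a stand-in for whichever metrics backend you actually use.

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: otel-collector-metrics   # hypothetical job name
          scrape_interval: 5s
          static_configs:
            # Assumes the collector serves its internal metrics on port 8888.
            - targets: ["localhost:8888"]

exporters:
  debug:   # stand-in for a real metrics backend

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [debug]
```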
Available Metrics
Available metrics contain dataset in their name. There are counters related to the
number of processed events (events), buffers (buffer), sessions (sessions), and transferred bytes (bytes).
There are also histograms related to response times (responseTime) and payload size (payloadSize).
There are several counters related to events/buffers:
- enqueued - the number of received entities
- processed - the number of entities that were accepted by the next layer
- dropped - the number of entities that were not accepted by the next layer
- broken - the number of entities that were somehow corrupted during processing (should be 0)
The number of entities still waiting to be processed can be computed as enqueued - (processed + dropped + broken).
Configuration
Example Configuration
Last generated: 2026-04-13