Otlpjsonfile Receiver

Status

Available in: contrib
Maintainers: @atoulme
Source: opentelemetry-collector-contrib

Supported Telemetry

Logs Metrics Traces

Overview

The receiver watches the configured directory and reads the files it finds there. If a file is added or updated, the receiver reads it again in its entirety. The data is expected to be serialized according to the OpenTelemetry Protocol File Exporter specification.

Getting Started

The following settings are required:
  • include: set a glob path of files to include in data collection
Example:
receivers:
  otlp_json_file:
    include:
      - "/var/log/*.log"
    exclude:
      - "/var/log/example.log"
Note: The deprecated component type otlpjsonfile (without the underscores) can still be used as an alias, but it logs a deprecation warning.
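To show where the receiver block above sits in a complete Collector configuration, here is a minimal sketch. It assumes the debug exporter is available in your Collector build, and the include path is a hypothetical example; adjust both to your environment. Since the receiver supports logs, metrics, and traces, it is listed in all three pipelines.

```yaml
receivers:
  otlp_json_file:
    include:
      - "/var/log/otlp/*.json"   # hypothetical path, not from the original docs

exporters:
  debug:                         # assumed to be present in your Collector build
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [otlp_json_file]
      exporters: [debug]
    metrics:
      receivers: [otlp_json_file]
      exporters: [debug]
    traces:
      receivers: [otlp_json_file]
      exporters: [debug]
```

A pipeline only needs to be declared for the signal types you actually expect in the files.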

Configuration

| Field | Default | Description |
| --- | --- | --- |
| include | required | A list of file glob patterns that match the file paths to be read. |
| exclude | [] | A list of file glob patterns to exclude from reading. This is applied against the paths matched by include. |
| exclude_older_than | | Exclude files whose modification time is older than the specified age. |
| start_at | end | At startup, where to start reading logs from the file. Options are beginning or end. |
| multiline | | A multiline configuration block. See the File Log Receiver documentation for details. |
| force_flush_period | 500ms | Time since new data was last found in the file, after which a partial log at the end of the file may be emitted. |
| encoding | utf-8 | The encoding of the file being read. See the list of supported encodings below for available options. |
| preserve_leading_whitespaces | false | Whether to preserve leading whitespaces. |
| preserve_trailing_whitespaces | false | Whether to preserve trailing whitespaces. |
| include_file_name | true | Whether to add the file name as the attribute log.file.name. |
| include_file_path | false | Whether to add the file path as the attribute log.file.path. |
| include_file_name_resolved | false | Whether to add the file name after symlinks resolution as the attribute log.file.name_resolved. |
| include_file_path_resolved | false | Whether to add the file path after symlinks resolution as the attribute log.file.path_resolved. |
| include_file_owner_name | false | Whether to add the file owner name as the attribute log.file.owner.name. Not supported on Windows. |
| include_file_owner_group_name | false | Whether to add the file group name as the attribute log.file.owner.group.name. Not supported on Windows. |
| include_file_permissions | false | Whether to add the file permissions as the attribute log.file.permissions in 3-digit octal format (e.g., 755). Not supported on Windows. |
| include_file_record_number | false | Whether to add the record number in the file as the attribute log.file.record_number. |
| include_file_record_offset | false | Whether to add the record offset in the file as the attribute log.file.record_offset. |
| poll_interval | 200ms | The duration between filesystem polls. |
| fingerprint_size | 1000 | The number of bytes, read from the start of a file, used to uniquely identify it. Must be at least 16. Decreasing this value will trigger re-ingestion of files larger than the new fingerprint size. |
| initial_buffer_size | 16KiB | The initial size of the read buffer for headers and logs; the buffer will be grown as necessary. Larger values may lead to unnecessarily large buffer allocations, and smaller values may lead to many copies while growing the buffer. |
| max_log_size | 1MiB | The maximum size of a log entry to read. The behavior for oversized log entries is controlled by max_log_size_behavior. Protects against reading large amounts of data into memory. |
| max_log_size_behavior | split | Behavior when a log entry exceeds max_log_size. Options are split (default), which splits oversized entries into multiple log entries, or truncate, which truncates the entry and drops the remainder. |
| max_concurrent_files | 1024 | The maximum number of log files from which logs will be read concurrently. If the number of files matched in the include pattern exceeds this number, then files will be processed in batches. |
| max_batches | 0 | Only applicable when files must be batched in order to respect max_concurrent_files. This value limits the number of batches that will be processed during a single poll interval. A value of 0 indicates no limit. |
| delete_after_read | false | If true, each log file will be read and then immediately deleted. Requires that the filelog.allowFileDeletion feature gate is enabled. Must be false when start_at is set to end. |
| acquire_fs_lock | false | Whether to attempt to acquire a filesystem lock before reading a file (Unix only). |
| file_cache_advise | false | Hints the operating system to release cached file pages after they are read, helping reduce page cache usage for large sequential workloads (Linux only). |
| storage | none | The ID of a storage extension to be used to store file offsets. File offsets allow the receiver to pick up where it left off in the case of a collector restart. If no storage extension is used, the receiver will manage offsets in memory only. |
| header | nil | Specifies options for parsing header metadata. Requires that the filelog.allowHeaderMetadataParsing feature gate is enabled. Must not be set when start_at is set to end. Note: because this receiver does not run a stanza pipeline, attributes produced by header.metadata_operators are not propagated to emitted records; the header block is accepted only to allow header lines to be consumed without being parsed as OTLP JSON. |
| header.pattern | required for header metadata parsing | A regex that matches every header line. |
| header.metadata_operators | required for header metadata parsing | A list of operators used to parse metadata from the header. |
| ordering_criteria.regex | | Regular expression used for sorting. Should contain the named capture groups that are to be used in regex_key. |
| ordering_criteria.group_by | | Regular expression used for grouping, which is done before sorting. Should contain named capture groups. |
| ordering_criteria.top_n | 1 | The number of files to track when using file ordering. The top N files are tracked after applying the ordering criteria. |
| ordering_criteria.sort_by.regex_key | | Regular expression named capture group defined in ordering_criteria.regex to use for sorting. |
| ordering_criteria.sort_by.sort_type | | Type of sorting to be performed (e.g., numeric, alphabetical, timestamp, mtime). |
| ordering_criteria.sort_by.location | | Relevant if sort_type is set to timestamp. Defines the location of the timestamp of the file. |
| ordering_criteria.sort_by.format | | Relevant if sort_type is set to timestamp. Defines the strptime format of the timestamp being sorted. |
| ordering_criteria.sort_by.ascending | | Sort direction. |
| compression | | Indicates the compression format of input files. If set, files will be read using a reader that uncompresses the file before scanning its content. Options are "" (empty string), gzip, or auto. The auto option auto-detects the file compression type. Currently, gzip files are the only compressed files that are auto-detected, based on their headers (see RFC 1952). The auto option is useful when ingesting a mix of compressed and uncompressed files with the same receiver. |
| polls_to_archive | 0 | Controls the number of poll cycles to store on disk, rather than being discarded. By default, the receiver will purge the record of readers that have existed for 3 generations. See archiving in the File Log Receiver documentation and polling for more details. Note: This feature is experimental. |
| on_truncate | ignore | Behavior when a file with the same fingerprint is detected but with a smaller size (indicating a copytruncate rotation). Options are ignore, read_whole_file, or read_new. See handling copytruncate rotation. |
| replay_file | false | If true, the receiver will not track file offsets and will re-read files from the beginning on every poll. |
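As an illustration of the storage setting, the sketch below persists file offsets across collector restarts. It assumes the file_storage extension from opentelemetry-collector-contrib and the debug exporter are included in your Collector build; the directory and include paths are hypothetical and must be adapted.

```yaml
extensions:
  file_storage:
    # hypothetical directory; it must already exist and be writable by the collector
    directory: /var/lib/otelcol/file_storage

receivers:
  otlp_json_file:
    include:
      - "/var/log/otlp/*.json"   # hypothetical path
    start_at: beginning          # read existing content the first time a file is seen
    storage: file_storage        # ID of the extension declared above

exporters:
  debug: {}

service:
  extensions: [file_storage]     # the extension must also be enabled here
  pipelines:
    logs:
      receivers: [otlp_json_file]
      exporters: [debug]
```

Without the storage line, offsets are kept in memory only, as described in the table.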

Supported encodings

| Key | Description |
| --- | --- |
| nop | No encoding validation. Treats the file as a stream of raw bytes |
| utf-8 | UTF-8 encoding |
| utf-8-raw | UTF-8 encoding without replacing invalid UTF-8 bytes |
| utf-16le | UTF-16 encoding with little-endian byte order |
| utf-16be | UTF-16 encoding with big-endian byte order |
| ascii | ASCII encoding |
| big5 | The Big5 Chinese character encoding |
Other less common encodings are supported on a best-effort basis. See https://www.iana.org/assignments/character-sets/character-sets.xhtml for other encodings available.
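For example, a receiver reading files written in UTF-16 with little-endian byte order might be configured as in this sketch. The include path is a hypothetical placeholder; the encoding key is taken from the list of supported encodings.

```yaml
receivers:
  otlp_json_file:
    include:
      - "/var/log/otlp/*.json"   # hypothetical path
    encoding: utf-16le           # one of the keys listed under Supported encodings
```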

Time parameters

All time parameters must specify a unit of time, for example 200ms, 1s, or 1m.
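A short sketch of how the duration-valued fields from the configuration table could be written. The values and the include path are illustrative assumptions, not recommendations.

```yaml
receivers:
  otlp_json_file:
    include:
      - "/var/log/otlp/*.json"   # hypothetical path
    poll_interval: 1s            # check the filesystem once per second
    force_flush_period: 500ms    # same as the documented default, written with its unit
    exclude_older_than: 24h      # skip files not modified within the last 24 hours
```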

Handling Copytruncate Rotation

When log files are rotated using the copytruncate strategy (where the file is copied and then truncated in place), the receiver can detect when a file has been truncated by comparing the stored offset with the current file size. The on_truncate setting controls how the receiver behaves when truncation is detected:
  • ignore (default): The receiver keeps the original offset and will not read any data until the file grows past the original offset. This prevents duplicate log ingestion when a file is rotated.
  • read_whole_file: The receiver resets the offset to 0 and reads the entire file from the beginning. Use this mode when you want to ensure no data loss, even if it means potentially re-reading some logs.
  • read_new: The receiver updates the offset to the current file size (the position after truncation). This allows reading new data that is written after the truncation without re-reading existing content.
Example configuration:
receivers:
  otlp_json_file:
    include:
      - /var/log/otlp/*.json
    on_truncate: read_whole_file  # Read entire file after copytruncate rotation

Configuration

Example Configuration

otlp_json_file:
  include:
    - "/var/log/*.log"
  exclude:
    - "/var/log/example.log"
otlp_json_file/all:
  include_file_name: true
  include_file_path: true
  include_file_name_resolved: true
  include_file_path_resolved: true
  start_at: "beginning"
  fingerprint_size: 32768
  max_log_size: 10000
  max_concurrent_files: 4
  encoding: "UTF-8"
  on_truncate: "ignore"
  multiline:
    line_start_pattern: "<"
    line_end_pattern: ">"
  include:
    - "/var/log/*.log"
    - "/tmp/*.log"
  exclude:
    - "/var/log/example.log"

Last generated: 2026-06-01