LinuxCommandLibrary

logstash

Process and forward logs to a central location

TLDR

Check validity of a Logstash configuration

$ logstash --config.test_and_exit -f [logstash_config.conf]

Run Logstash using a configuration file
$ sudo logstash -f [logstash_config.conf]

Run Logstash with the most basic inline configuration string
$ sudo logstash -e 'input {} filter {} output {}'
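
For a slightly more useful smoke test, the inline string can define a pipeline that echoes stdin back to stdout (a sketch; requires a local Logstash installation):

$ sudo logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'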

SYNOPSIS

logstash [OPTIONS]
logstash -f CONFIG_PATH [OPTIONS]
logstash -e CONFIG_STRING [OPTIONS]

PARAMETERS

-f, --path.config CONFIG_PATH
    Specifies the path to the Logstash configuration file(s) or directory. Can be a single file, a comma-separated list of files, or a directory containing configuration files.

-e, --config.string CONFIG_STRING
    Executes the provided configuration string directly, without needing a separate file. Useful for quick tests or small configurations.

-r, --config.reload.automatic
    Enables automatic configuration reloading. Logstash monitors the config file(s) for changes and reloads them without a restart. The polling interval is controlled separately by --config.reload.interval (e.g., --config.reload.interval 5s; defaults to 3s).
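
Automatic reloading can also be enabled persistently in logstash.yml rather than per invocation (the interval value here is illustrative):

    config.reload.automatic: true
    config.reload.interval: 5s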

--path.data PATH
    Sets the path to the Logstash data directory, used for persistent queues, metrics, and other internal data. Defaults to data/ inside the Logstash home directory.

--path.settings PATH
    Specifies the path to the directory containing Logstash settings files (e.g., logstash.yml, jvm.options). Defaults to config/.

--log.level LEVEL
    Sets the logging level for Logstash. Common levels include fatal, error, warn, info, debug, trace. Defaults to info.

-t, --config.test_and_exit
    Tests the configuration for syntax errors and validity, then exits. Does not start the Logstash pipeline.

--allow-env
    Enables the use of environment variables within configuration files using ${ENV_VAR} syntax. Only required on Logstash 2.x, where the feature was experimental; from version 5.0 onward, environment-variable substitution is always enabled and this flag no longer exists.
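
For example, a beats input can take its port from an environment variable, with 5044 as a hypothetical fallback (the ${VAR:default} form supplies a default when the variable is unset):

    input {
      beats {
        port => "${BEATS_PORT:5044}"
      }
    }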

-w, --pipeline.workers COUNT
    Specifies the number of worker threads to use for filter and output processing. Defaults to the number of CPU cores.

-b, --pipeline.batch.size SIZE
    Sets the maximum number of events an individual worker thread will pull from the input queue at once. Defaults to 125.
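
Both pipeline tunables can likewise be pinned in logstash.yml instead of being passed on the command line (values are illustrative, not recommendations):

    pipeline.workers: 4
    pipeline.batch.size: 250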

DESCRIPTION

Logstash is an open-source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite 'stash', often Elasticsearch.

It's a core component of the Elastic Stack (formerly known as the ELK Stack, comprising Elasticsearch, Logstash, and Kibana). Logstash uses a flexible plugin-based architecture, allowing it to collect data from a wide range of sources (log files, metrics, web applications, data stores) through input plugins, process and enrich this data using filter plugins (e.g., parsing, transforming, adding context), and then output the processed data to different destinations via output plugins (e.g., Elasticsearch, Kafka, S3, files). Its primary use case is centralized log management and analytics.

CAVEATS

Logstash is built on JRuby and runs on the Java Virtual Machine (JVM), which can lead to significant memory and CPU consumption, especially for large volumes of data. Proper resource allocation and JVM tuning are crucial for production deployments. Configuration can become complex with multiple inputs, filters, and outputs, requiring careful design and testing. Debugging can be challenging, often relying heavily on log output and the --config.test_and_exit option.

TYPICAL USAGE

Logstash is commonly deployed as a daemon or service that continuously runs in the background, consuming data from various sources (e.g., Filebeat agents, message queues), processing it according to its configured pipelines, and then forwarding it to a central data store like Elasticsearch for indexing and analysis. It's a critical component for centralizing and enriching diverse data types.

CONFIGURATION STRUCTURE

A Logstash configuration file is divided into three main sections: input, filter, and output.

  • Input: Defines where Logstash gets its data (e.g., file, beats, http, kafka).
  • Filter: Processes and transforms the data (e.g., grok for parsing, mutate for modifying fields, geoip for adding location data).
  • Output: Specifies where the processed data should be sent (e.g., elasticsearch, stdout, file, s3).
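
A minimal pipeline combining the three sections might look like this (the file path, grok pattern, and Elasticsearch host are illustrative assumptions, not defaults):

    input {
      file {
        path => "/var/log/nginx/access.log"
        start_position => "beginning"
      }
    }

    filter {
      grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
      }
      geoip {
        source => "clientip"
      }
    }

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
      }
      stdout { codec => rubydebug }
    }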

HISTORY

Logstash was originally created by Jordan Sissel in 2009 as a flexible tool for managing event data. Its ability to collect, parse, and store logs efficiently quickly gained traction within the developer community. It became a foundational component of the Elastic Stack (ELK Stack) after being acquired by Elastic, the company behind Elasticsearch and Kibana. Over the years, its architecture has evolved, emphasizing modularity through plugins and robustness for enterprise-grade data pipelines.

SEE ALSO

elasticsearch(1), kibana(1), filebeat(1), metricbeat(1), systemd(1)
