logstash
Process and forward logs to a central location
TLDR
Check validity of a Logstash configuration:
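    logstash --config.test_and_exit -f path/to/logstash.conf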
Run Logstash with a given configuration file:
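    logstash -f path/to/logstash.conf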
Run Logstash with the most basic inline configuration string:
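    logstash -e 'input { stdin {} } output { stdout {} }'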
SYNOPSIS
logstash [OPTIONS]
logstash -f CONFIG_PATH [OPTIONS]
logstash -e CONFIG_STRING [OPTIONS]
PARAMETERS
-f, --path.config CONFIG_PATH
Specifies the path to the Logstash configuration file(s). Can be a single file, a directory (all files within it are concatenated in lexicographical order), or a glob pattern matching several files.
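Example (the directory path is illustrative):
    logstash -f /etc/logstash/conf.d/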
-e, --config.string CONFIG_STRING
Executes the provided configuration string directly, without needing a separate file. Useful for quick tests or small configurations.
-r, --config.reload.automatic
Enables automatic configuration reloading. Logstash monitors the configuration file(s) for changes and reloads them when they are modified; the polling interval is set with --config.reload.interval, which defaults to 3s (e.g., --config.reload.interval 5s).
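Example (the path and interval are illustrative):
    logstash -f /etc/logstash/conf.d/ --config.reload.automatic --config.reload.interval 30s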
--path.data PATH
Sets the path to the Logstash data directory, used for persistent queues, metrics, and other internal data. Defaults to data/ inside the Logstash home directory.
--path.settings PATH
Specifies the path to the directory containing Logstash settings files (e.g., logstash.yml, jvm.options). Defaults to config/.
--log.level LEVEL
Sets the logging level for Logstash. Common levels include fatal, error, warn, info, debug, trace. Defaults to info.
-t, --config.test_and_exit
Tests the configuration for syntax errors and validity, then exits. Does not start the Logstash pipeline.
--allow-env
Historically enabled the use of environment variables inside configuration files via ${ENV_VAR} syntax. In current Logstash releases this substitution is enabled by default and the flag has been removed; ${ENV_VAR}, or ${ENV_VAR:default} with a fallback value, can be used directly.
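For example, a pipeline can reference an environment variable with an inline fallback (BEATS_PORT is an illustrative variable name):
    input {
      beats {
        port => "${BEATS_PORT:5044}"    # falls back to 5044 if BEATS_PORT is unset
      }
    }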
-w, --pipeline.workers COUNT
Specifies the number of worker threads to use for filter and output processing. Defaults to the number of CPU cores.
-b, --pipeline.batch.size SIZE
Sets the maximum number of events an individual worker thread will pull from the input queue at once. Defaults to 125.
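Example of tuning both settings at startup (the values are illustrative starting points, not recommendations):
    logstash -f pipeline.conf -w 4 -b 250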
DESCRIPTION
Logstash is an open-source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite 'stash', often Elasticsearch.
It's a core component of the Elastic Stack (formerly known as the ELK Stack, comprising Elasticsearch, Logstash, and Kibana). Logstash uses a flexible plugin-based architecture: input plugins collect data from a wide range of sources (log files, metrics, web applications, data stores), filter plugins process and enrich that data (e.g., parsing, transforming, adding context), and output plugins send the processed events to different destinations (e.g., Elasticsearch, Kafka, S3, files). Its primary use case is centralized log management and analytics.
CAVEATS
Logstash is built on JRuby and runs on the Java Virtual Machine (JVM), which can lead to significant memory and CPU consumption, especially for large volumes of data. Proper resource allocation and JVM tuning are crucial for production deployments. Configuration can become complex with multiple inputs, filters, and outputs, requiring careful design and testing. Debugging can be challenging, often relying heavily on log output and the --config.test_and_exit option.
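The JVM heap is configured in the jvm.options file rather than on the command line; a common starting point (the sizes below are illustrative and workload-dependent) is:
    # jvm.options (in the --path.settings directory)
    -Xms1g
    -Xmx1g
Keeping -Xms and -Xmx equal avoids heap-resize pauses at runtime.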
TYPICAL USAGE
Logstash is commonly deployed as a daemon or service that continuously runs in the background, consuming data from various sources (e.g., Filebeat agents, message queues), processing it according to its configured pipelines, and then forwarding it to a central data store like Elasticsearch for indexing and analysis. It's a critical component for centralizing and enriching diverse data types.
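On systems where Logstash was installed from an official package that ships a systemd unit (an assumption; archive installs are managed differently), it is typically run as a service:
    sudo systemctl enable --now logstash
    journalctl -u logstash -f
The journalctl command follows the service's own logs, which is often the first step when debugging a pipeline that fails to start.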
CONFIGURATION STRUCTURE
A Logstash configuration file is divided into three main sections: input, filter, and output.
- Input: Defines where Logstash gets its data (e.g., file, beats, http, kafka).
- Filter: Processes and transforms the data (e.g., grok for parsing, mutate for modifying fields, geoip for adding location data).
- Output: Specifies where the processed data should be sent (e.g., elasticsearch, stdout, file, s3).
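A minimal pipeline tying the three sections together (the Beats port, grok pattern, Elasticsearch host, and index name are illustrative assumptions, not defaults to copy verbatim):
    input {
      beats {
        port => 5044                 # receive events from Filebeat agents
      }
    }
    filter {
      grok {
        # Parse a "LEVEL message" line into structured fields
        match => { "message" => "%{LOGLEVEL:level} %{GREEDYDATA:msg}" }
      }
    }
    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "logs-%{+YYYY.MM.dd}"
      }
    }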
HISTORY
Logstash was originally created by Jordan Sissel in 2009 as a flexible tool for managing event data. Its ability to collect, parse, and store logs efficiently quickly gained traction within the developer community. It became a foundational component of the Elastic Stack (ELK Stack) after Sissel joined Elasticsearch (now Elastic), the company behind Elasticsearch and Kibana, in 2013. Over the years, its architecture has evolved, emphasizing modularity through plugins and robustness for enterprise-grade data pipelines.
SEE ALSO
elasticsearch(1), kibana(1), filebeat(1), metricbeat(1), systemd(1)