LinuxCommandLibrary

luigi

TLDR

Run task

$ luigi --module [mymodule] [MyTask]
Run with parameters
$ luigi --module [mymodule] [MyTask] --[param]=[value]
Start the central scheduler daemon
$ luigid
Run with workers
$ luigi --module [mymodule] [MyTask] --workers [4]
Run with the local scheduler (no central daemon needed)
$ luigi --module [mymodule] [MyTask] --local-scheduler

SYNOPSIS

luigi [options] task [task-params]

DESCRIPTION

Luigi is a Python workflow engine for building complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, and failure handling.
Each task declares its dependencies in requires(), its results in output(), and its work in run(). Luigi runs a task only after its dependencies are complete.
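As a rough illustration of what running tasks in dependency order means, the sketch below walks a small task graph depth-first so that every dependency appears before the task that needs it. It is a toy model in plain Python, not Luigi's scheduler; the task names and the REQUIRES mapping are hypothetical.

```python
# Toy model of dependency-ordered execution; this is NOT Luigi's scheduler.
from typing import Dict, List

# Hypothetical task graph: each task name maps to the tasks it requires.
REQUIRES: Dict[str, List[str]] = {
    "Report": ["Aggregate"],
    "Aggregate": ["ExtractA", "ExtractB"],
    "ExtractA": [],
    "ExtractB": [],
}


def run_order(task: str, requires: Dict[str, List[str]]) -> List[str]:
    """Return tasks so that every dependency precedes its dependent."""
    order: List[str] = []
    visiting = set()

    def visit(name: str) -> None:
        if name in order:
            return  # already placed in the order
        if name in visiting:
            raise ValueError(f"dependency cycle at {name}")
        visiting.add(name)
        for dep in requires[name]:
            visit(dep)  # place dependencies first
        visiting.remove(name)
        order.append(name)

    visit(task)
    return order


order = run_order("Report", REQUIRES)
print(order)  # ['ExtractA', 'ExtractB', 'Aggregate', 'Report']
```

Requesting only the final task is enough here: everything upstream is discovered by following the dependency mapping, which mirrors how a single task name on the luigi command line pulls in its requirements.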

PARAMETERS

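--module name
Python module to import; it must define the task to run.
--workers n
Maximum number of tasks to run in parallel (default: 1).
--local-scheduler
Use an in-process scheduler instead of connecting to the central scheduler (luigid).
--scheduler-host host
Hostname of the central scheduler (default: localhost).
--log-level level
Logging level, e.g. DEBUG, INFO, WARNING, ERROR.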

TASK EXAMPLE

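import luigi


class UpstreamTask(luigi.Task):
    """Illustrative dependency; must be complete before MyTask runs."""

    def output(self):
        return luigi.LocalTarget('upstream.txt')

    def run(self):
        with self.output().open('w') as f:
            f.write('upstream data\n')


class MyTask(luigi.Task):
    param = luigi.Parameter()  # set on the command line with --param value

    def requires(self):
        return UpstreamTask()

    def output(self):
        # The task counts as complete once this file exists.
        return luigi.LocalTarget('output.txt')

    def run(self):
        with self.output().open('w') as f:
            f.write(self.param)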

CAVEATS

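The central scheduler (luigid) is recommended for production; --local-scheduler is intended for development and testing. Luigi has no built-in triggering, so runs must be started externally (for example with cron). A task counts as complete when all of its output targets exist, so a task whose output is already present is not run again. Python 3 is required.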
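The point that targets define completion can be sketched with a toy model: a task is treated as done when its output file exists, so a second build attempt skips it. The ToyTask class and build function below are hypothetical stand-ins written in plain Python; they mimic the idea only and are not Luigi's implementation.

```python
# Toy model of target-based completion; it mimics the idea, not Luigi's code.
import os
import tempfile


class ToyTask:
    """Hypothetical stand-in for a task with a single file output."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.runs = 0  # counts how often run() was actually executed

    def complete(self) -> bool:
        # The task counts as done when its output file exists.
        return os.path.exists(self.path)

    def run(self) -> None:
        self.runs += 1
        with open(self.path, "w") as f:
            f.write("done\n")


def build(task: ToyTask) -> bool:
    """Run the task only if it is not complete; return True if it ran."""
    if task.complete():
        return False
    task.run()
    return True


with tempfile.TemporaryDirectory() as tmp:
    task = ToyTask(os.path.join(tmp, "output.txt"))
    first = build(task)   # output missing, so the task runs
    second = build(task)  # output exists, so the task is skipped
    runs = task.runs

print(first, second, runs)  # True False 1
```

By the same logic, removing the output file is what makes a task eligible to run again, and an output path that does not vary with the task's parameters will cause runs with different parameter values to be treated as already complete.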

HISTORY

Luigi was developed at Spotify and open-sourced in 2012 for managing complex data pipelines and machine learning workflows.

SEE ALSO

airflow(1), prefect(1), dask(1), celery(1)
