luigi
TLDR
Run task
$ luigi --module [mymodule] [MyTask]
Run with parameters
$ luigi --module [mymodule] [MyTask] --[param]=[value]
Start the central scheduler daemon
$ luigid
Run with workers
$ luigi --module [mymodule] [MyTask] --workers [4]
Run with the local scheduler (no luigid needed)
$ luigi --module [mymodule] [MyTask] --local-scheduler
SYNOPSIS
luigi [options] task [task-params]
DESCRIPTION
Luigi is a Python workflow engine for building complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, and failure handling.
Tasks declare their dependencies via requires(), and Luigi ensures that tasks run in the correct order.
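The ordering guarantee can be pictured with a small toy model. The sketch below is not Luigi's scheduler and does not import Luigi. ToyTask and build are hypothetical names that only mirror the requires(), complete(), and run() methods of a Luigi task, and a boolean flag stands in for "the output target exists".

```python
class ToyTask:
    """Hypothetical stand-in for luigi.Task; not part of Luigi."""

    def __init__(self, name, deps=()):
        self.name = name
        self.deps = list(deps)
        self.done = False  # stands in for "the output target exists"

    def requires(self):
        return self.deps

    def complete(self):
        return self.done

    def run(self, log):
        log.append(self.name)
        self.done = True


def build(task, log):
    """Run every incomplete dependency depth-first, then the task itself."""
    if task.complete():
        return  # already finished, nothing to do
    for dep in task.requires():
        build(dep, log)
    task.run(log)


extract = ToyTask("extract")
clean = ToyTask("clean", deps=[extract])
report = ToyTask("report", deps=[clean, extract])

order = []
build(report, order)
print(order)  # ['extract', 'clean', 'report']
```

Calling build(report, ...) a second time runs nothing, because every task already reports itself as complete. Real Luigi additionally handles parallel workers, failures, and retries, none of which this sketch attempts.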
PARAMETERS
--module name
Python module containing tasks.
--workers n
Number of workers.
--local-scheduler
Use local instead of central scheduler.
--scheduler-host host
Central scheduler host.
--log-level level
Logging level.
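When luigi is launched from another script, it can help to assemble these options as an argument list. The helper below is a hypothetical sketch, not part of Luigi. It uses only the flags listed above and assumes the usual Luigi convention that task parameters become flags with underscores replaced by hyphens.

```python
import shlex


def luigi_argv(module, task, params=None, workers=None,
               local_scheduler=False, scheduler_host=None, log_level=None):
    """Hypothetical helper: build a luigi command line as an argument list."""
    argv = ["luigi", "--module", module, task]
    for name, value in (params or {}).items():
        # Task parameters are passed as flags; underscores become hyphens.
        argv += ["--" + name.replace("_", "-"), str(value)]
    if workers is not None:
        argv += ["--workers", str(workers)]
    if local_scheduler:
        argv.append("--local-scheduler")
    if scheduler_host is not None:
        argv += ["--scheduler-host", scheduler_host]
    if log_level is not None:
        argv += ["--log-level", log_level]
    return argv


argv = luigi_argv("mymodule", "MyTask", params={"param": "hello"},
                  workers=4, local_scheduler=True)
print(shlex.join(argv))
# luigi --module mymodule MyTask --param hello --workers 4 --local-scheduler
```

The resulting list can be handed to subprocess.run(argv, check=True). Building a list rather than a single string avoids shell quoting problems when parameter values contain spaces.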
TASK EXAMPLE
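import luigi

# UpstreamTask is assumed to be defined elsewhere in the same module.
class MyTask(luigi.Task):
    param = luigi.Parameter()

    def requires(self):
        # Tasks that must complete before this one runs.
        return UpstreamTask()

    def output(self):
        # The task counts as complete once this target exists.
        return luigi.LocalTarget('output.txt')

    def run(self):
        # Write the parameter value to the output target.
        with self.output().open('w') as f:
            f.write(self.param)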
CAVEATS
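A central scheduler (luigid) is recommended for production. Luigi has no built-in triggering, so use cron or a similar tool to start pipelines. A task is considered complete when its output targets exist. Python 3 is required.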
HISTORY
Luigi was developed at Spotify and open-sourced in 2012 for managing complex data pipelines and machine learning workflows.


