Airflow#

Note

This is a quick reference; refer to the full tutorial for a step-by-step walkthrough.

Step 1: Add target environment#

Tip

To try these commands out, you can use one of the sample pipelines from the documentation.

KubernetesPodOperator#

# add a target environment named 'airflow' (uses KubernetesPodOperator)
soopervisor add airflow --backend airflow
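The add command records the target environment's configuration in soopervisor.yaml. As a rough sketch of the generated entry (exact keys depend on your soopervisor version, and the repository value is a placeholder you would replace):

```yaml
# soopervisor.yaml -- illustrative sketch, not the exact generated output
airflow:
  backend: airflow
  # placeholder: the Docker image repository your tasks will be pushed to
  repository: your-repository/your-project
```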

Note

Using the --preset option requires soopervisor>=0.7

# add a target environment named 'airflow-k8s' (uses KubernetesPodOperator)
soopervisor add airflow-k8s --backend airflow --preset kubernetes

BashOperator#

Important

If using --preset bash, the generated BashOperator tasks use the ploomber CLI to execute your pipeline. Edit the cwd argument of each BashOperator so the DAG runs in a directory where it can find your project's pipeline.yaml and source code.

# add a target environment named 'airflow-bash' (uses BashOperator)
soopervisor add airflow-bash --backend airflow --preset bash
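To illustrate why cwd matters, here is a minimal, self-contained sketch (the path and file are stand-ins, not part of the generated DAG): a subprocess, like the shell command a BashOperator runs, resolves relative paths against its working directory, so pipeline.yaml is only found when cwd points at the project root.

```python
import pathlib
import subprocess
import tempfile

# Stand-in for your project root (a throwaway temporary directory here)
project_dir = pathlib.Path(tempfile.mkdtemp())
(project_dir / "pipeline.yaml").write_text("tasks: []\n")  # placeholder file

# Equivalent in spirit to BashOperator(bash_command="ls pipeline.yaml",
# cwd=str(project_dir)): the relative path resolves because cwd is set
# to the directory that contains the file
result = subprocess.run(
    ["ls", "pipeline.yaml"],
    cwd=project_dir,
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```

Run without `cwd=project_dir`, the same command would fail unless the process happened to start inside the project directory.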

DockerOperator#

Important

Due to a bug in the DockerOperator, you must set enable_xcom_pickling = True in the airflow.cfg file. By default, this file is located at ~/airflow/airflow.cfg.
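In airflow.cfg, this setting lives under the [core] section:

```ini
# ~/airflow/airflow.cfg
[core]
enable_xcom_pickling = True
```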

# add a target environment named 'airflow-docker' (uses DockerOperator)
soopervisor add airflow-docker --backend airflow --preset docker

Step 2: Generate Airflow DAG#

# export target environment named 'airflow'
soopervisor export airflow

Important

For your pipeline to run successfully, tasks must write their outputs to a common location. You can do this either by mounting a shared disk or by adding a storage client to your pipeline.
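As a sketch of the storage client approach, a File client can be declared in pipeline.yaml; the dotted path, module, and bucket below are illustrative assumptions, not generated values:

```yaml
# pipeline.yaml -- illustrative sketch
clients:
  # hypothetical dotted path to a function in clients.py that returns
  # a storage client (e.g., one backed by an S3 or GCS bucket shared
  # by all tasks)
  File: clients.get_client
```

With a client configured, each task uploads its products to the shared bucket, so downstream tasks running in separate containers can fetch their upstream dependencies.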