Airflow#
Note
This is a quick reference. For a full tutorial, click here.
Step 1: Add target environment#
Tip
To get a sample pipeline to try this out, see this.
KubernetesPodOperator#
# add a target environment named 'airflow' (uses KubernetesPodOperator)
soopervisor add airflow --backend airflow
Note
Using the --preset option requires soopervisor>=0.7
# add a target environment named 'airflow-k8s' (uses KubernetesPodOperator)
soopervisor add airflow-k8s --backend airflow --preset kubernetes
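The add command registers the target environment in a soopervisor.yaml file at the root of your project. As a sketch (the exact keys may vary across soopervisor versions), the entry for the command above would look roughly like:

```yaml
# soopervisor.yaml (sketch; verify against the file soopervisor generates)
airflow-k8s:
  backend: airflow
  preset: kubernetes
```

Editing this file later lets you adjust the target environment without re-running soopervisor add.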
BashOperator#
Important
If using --preset bash, the BashOperator tasks will use the ploomber CLI to execute your pipeline. Edit the cwd argument in BashOperator so your DAG runs in a directory where it can import your project’s pipeline.yaml and source code.
# add a target environment named 'airflow-bash' (uses BashOperator)
soopervisor add airflow-bash --backend airflow --preset bash
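The exact tasks depend on your pipeline, but the cwd edit in the generated DAG file follows this shape. A minimal sketch, assuming a task named fit and a project directory under ~/projects/my-pipeline (both hypothetical — substitute your own task names and path):

```python
from pathlib import Path

# Hypothetical project directory: it must contain pipeline.yaml and your
# source code so the ploomber CLI can locate the pipeline at runtime.
project_dir = str(Path.home() / "projects" / "my-pipeline")

# In the DAG file soopervisor generates, pass that directory as cwd so the
# `ploomber task` command runs where pipeline.yaml is importable, e.g.:
#
#   BashOperator(
#       task_id="fit",
#       bash_command="ploomber task fit",
#       cwd=project_dir,
#   )
print(project_dir)
```

Using an absolute path for cwd avoids surprises, since Airflow workers rarely start in your project directory.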
DockerOperator#
Important
Due to a bug in the DockerOperator, we must set enable_xcom_pickling = True in the airflow.cfg file. By default, this file is located at ~/airflow/airflow.cfg.
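As a sketch, the setting belongs in the [core] section of airflow.cfg (check your Airflow version's configuration reference if the section differs):

```ini
[core]
enable_xcom_pickling = True
```

Restart the Airflow scheduler and webserver after editing the file so the change takes effect.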
# add a target environment named 'airflow-docker' (uses DockerOperator)
soopervisor add airflow-docker --backend airflow --preset docker
Step 2: Generate Airflow DAG#
# export target environment named 'airflow'
soopervisor export airflow
Important
For your pipeline to run successfully, tasks must write their outputs to a common location. You can do this either by creating a shared disk or by adding a storage client. Click here to learn more.
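One way to add a storage client (an assumption on our part — a shared disk works just as well) is through the clients section of pipeline.yaml, pointing File products at a function that returns a client. The dotted path clients.get_client is hypothetical; define it in a module of your own:

```yaml
# pipeline.yaml (sketch): route File products through a storage client.
# clients.get_client is a hypothetical function in a clients.py module
# that returns a ploomber storage client (e.g., an S3 or Cloud Storage
# client) — see the ploomber clients documentation for exact parameters.
clients:
  File: clients.get_client
```

With this in place, each task uploads its products after running, so downstream tasks on other workers can fetch their upstream dependencies.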