airflow.cfg:

# airflow version = 1.10.1
executor = LocalExecutor
parallelism = 32
dag_concurrency = 16

And a dag.py:

from airflow import DAG

with DAG('mydag', schedule_interval="@hourly") as dag:
    pass  # define tasks

But sometimes my task takes longer to finish than my schedule_interval, and Airflow schedules the next run before the previous one is done.

This is causing all sorts of awful race conditions.

Is there a way I can explicitly prevent overlapping tasks from being scheduled? Even if that means skipping a run entirely?

Roman

1 Answer

EDIT-1

updated as per comment by @Chengzhi

While the above would likely solve your problem, if you wish to skip overlapping DagRuns entirely, use

cmaher
y2k-shubham
    To add to @y2k-shubham's point: max_active_runs_per_dag is the default for all DAGs. If you want to set it for a specific DAG, override it at the DAG level (https://github.com/apache/airflow/blob/v1-10-stable/airflow/models/dag.py#L140-L143). Also, if you don't care too much about each dag_run, setting `catchup` to False can avoid the backpressure of each trigger. – Chengzhi Jul 27 '19 at 15:30
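
The DAG-level override described in the comment could look roughly like the sketch below. It assumes the `max_active_runs` and `catchup` arguments of the Airflow 1.10 `DAG` constructor. The `start_date` value is a hypothetical placeholder that is not part of the question, and the guarded import is there only so the snippet also loads on a machine without Airflow.

```python
from datetime import datetime

# DAG-level settings for the question's hourly DAG.
dag_settings = {
    "schedule_interval": "@hourly",
    "start_date": datetime(2019, 7, 1),  # hypothetical; not given in the question
    "max_active_runs": 1,  # allow at most one active DagRun of this DAG at a time
    "catchup": False,      # do not backfill intervals that were missed
}

try:
    from airflow import DAG
except ImportError:
    DAG = None  # Airflow is not installed; only the settings are defined

if DAG is not None:
    with DAG("mydag", **dag_settings) as dag:
        pass  # define tasks here
```

With `max_active_runs=1` the scheduler should hold back a new run while one is still active rather than start it in parallel, and `catchup=False` should keep it from queuing up every interval that was missed in the meantime. Whether this counts as "skipping" a run in the sense the question asks is worth verifying against your own scheduler.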