Steps to Trigger an Airflow DAG
In Airflow, a DAG (Directed Acyclic Graph) is a collection of the tasks you want to run, organized in a way that reflects their relationships and dependencies.
A DAG is defined in a Python script, which represents the DAG's structure (tasks and their dependencies) as code.
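For orientation, below is a minimal sketch of what such a script looks like. The dag_id adaptor_dag, the task name, and the callable are illustrative assumptions, not the actual production DAG:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def process_for_date(ds, **kwargs):
    # 'ds' is the logical date string Airflow passes into the task context.
    print(f"Processing data for {ds}")

with DAG(
    dag_id="adaptor_dag",              # placeholder name, not the real DAG id
    start_date=datetime(2022, 3, 1),
    schedule_interval="0 0 * * *",     # every day at midnight
    catchup=False,
) as dag:
    PythonOperator(task_id="process", python_callable=process_for_date)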
Manual Trigger
1. Log in to the Punjab Prod server using the credentials:
URL: Sign In - Airflow
username: admin
password: admin
2. Trigger the DAG by clicking on the “Trigger DAG with Config” option.
3. Enter a date and click on the Trigger button (a sample payload, and how the DAG reads it, are sketched after these steps).
Format: {"date": "dd-MM-yyyy"}
4. Click on the Log option and expand the DAG to view the logs. Choose a stage for any module.
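For example, to run the DAG for 1 March 2022, the config payload would be {"date": "01-03-2022"}. The snippet below is an illustrative sketch of how a task can read that value from the trigger config; it is not the production code:

from datetime import datetime

def read_trigger_date(**context):
    # The payload passed via "Trigger DAG with Config" is available on the
    # dag_run object, e.g. {"date": "01-03-2022"}.
    conf = context["dag_run"].conf or {}
    raw = conf.get("date")
    # dd-MM-yyyy corresponds to %d-%m-%Y in Python's strptime.
    return datetime.strptime(raw, "%d-%m-%Y")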
Logs can also be viewed in the Elasticsearch index adaptor_logs:
GET adaptor_logs/_search - the timestamp is filtered based on the day for which the logs are being searched.
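A sketch of such a query from Python, assuming the Elasticsearch host and an @timestamp field on the documents (both are assumptions; adjust to the actual cluster and index mapping):

import requests

ES_HOST = "http://localhost:9200"  # assumption: replace with the real Elasticsearch host

query = {
    "query": {
        "range": {
            "@timestamp": {  # assumption: the timestamp field name may differ
                "gte": "2022-03-01T00:00:00",
                "lt": "2022-03-02T00:00:00",
            }
        }
    }
}

response = requests.get(f"{ES_HOST}/adaptor_logs/_search", json=query)
print(response.json())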
The DAG also triggers automatically every day at midnight, processing data for the previous day.
Bulk Insert For A Date Range
To run the DAG for a date range on the staging NDB, execute the script:
sh iterate_over_date.sh <start-date> <end-date>
Example: sh iterate_over_date.sh 01-03-2022 05-03-2022
Dates must be in the format dd-mm-YYYY.
The range is exclusive of the end date, i.e. [start-date, end-date). In the example above, the script triggers the DAG on 1st, 2nd, 3rd and 4th March; it is not triggered on 5th March.
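The half-open iteration can be pictured with the following Python sketch. It illustrates the script's behavior, not its actual contents, and the dag_id adaptor_dag is again a placeholder:

import subprocess
import sys
from datetime import datetime, timedelta

start = datetime.strptime(sys.argv[1], "%d-%m-%Y")  # e.g. 01-03-2022
end = datetime.strptime(sys.argv[2], "%d-%m-%Y")    # e.g. 05-03-2022

current = start
while current < end:  # strictly less than: the end date itself is excluded
    conf = '{"date": "%s"}' % current.strftime("%d-%m-%Y")
    # 'airflow dags trigger' is the standard Airflow 2 CLI; 'adaptor_dag'
    # is a placeholder DAG id.
    subprocess.run(
        ["airflow", "dags", "trigger", "--conf", conf, "adaptor_dag"],
        check=True,
    )
    current += timedelta(days=1)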