Note: This page covers only the documentation for collecting Airflow integration metrics and logs using the Datadog Agent.
The Datadog Agent collects many metrics from Airflow, including those for:
- DAGs (Directed Acyclic Graphs): Number of DAG processes, DAG bag size, etc.
- Tasks: Task failures, successes, killed, etc.
- Pools: Open slots, used slots, etc.
- Executors: Open slots, queued tasks, running tasks, etc.
Metrics are collected through the Airflow StatsD plugin and sent to Datadog's DogStatsD.
In addition to metrics, the Datadog Agent also sends service checks related to Airflow's health.
Minimum Agent version: 7.17.0
All steps below are needed for the Airflow integration to work properly. Before you begin, install the Datadog Agent version >=6.17 or >=7.17, which includes the StatsD/DogStatsD mapping feature.
There are two parts of the Airflow integration:
- The Datadog Agent portion, which makes requests to a provided endpoint for Airflow to report whether it can connect and is healthy. The Agent integration also queries Airflow to produce some of its own metrics. This portion supports Airflow V1 and V2.
- The Airflow StatsD portion, where Airflow is configured to send metrics to the Datadog Agent, which can remap the Airflow notation to Datadog notation. This portion supports Airflow V1, V2, and V3.
The Airflow integration's metrics come from both the Agent and StatsD portions.
Note: The Datadog Agent's airflow integration does not support Airflow V3.
Configure the airflow check included in the Datadog Agent package to collect health metrics and service checks. To do so, edit the url in the airflow.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory. See the sample airflow.d/conf.yaml for all available configuration options.
Ensure that url matches your Airflow webserver base_url, the URL used to connect to your Airflow instance.
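As a rough illustration of what a health probe against that url looks at, the sketch below interprets a JSON payload shaped like the one an Airflow webserver health endpoint returns. The payload shape (per-component objects with a status field), the sample values, and the function name summarize_health are assumptions made for illustration; this is not the Agent's implementation.

```python
import json


def summarize_health(payload: str) -> dict:
    """Reduce a health payload to an overall flag plus per-component statuses."""
    data = json.loads(payload)
    # Keep only entries that look like {"status": "..."} component reports.
    statuses = {
        component: details.get("status", "unknown")
        for component, details in data.items()
        if isinstance(details, dict)
    }
    # Healthy only if at least one component reported and all report "healthy".
    healthy = bool(statuses) and all(s == "healthy" for s in statuses.values())
    return {"healthy": healthy, "components": statuses}


# Hypothetical sample payload, not captured from a real instance.
sample = '{"metadatabase": {"status": "healthy"}, "scheduler": {"status": "unhealthy"}}'
print(summarize_health(sample))
```

A payload in which any component reports something other than healthy yields an overall unhealthy result, which is the kind of signal a service check would surface.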
Connect Airflow to DogStatsD (included in the Datadog Agent) by using the Airflow statsd feature to collect metrics. For more information about the metrics reported by the Airflow version used and the additional configuration options, see the Airflow documentation below:
Note: Presence or absence of StatsD metrics reported by Airflow might vary depending on the Airflow Executor used. For example: airflow.ti_failures/successes, airflow.operator_failures/successes, airflow.dag.task.duration are not reported for KubernetesExecutor.
- Install the Airflow StatsD plugin.

      pip install 'apache-airflow[statsd]'

- Update the Airflow configuration file airflow.cfg by adding the following configs:

      [metrics]
      statsd_on = True
      # Hostname or IP of server running the Datadog Agent
      statsd_host = localhost
      # DogStatsD port configured in the Datadog Agent
      statsd_port = 8125
      statsd_prefix = airflow

  Do not set statsd_datadog_enabled without first installing the Datadog DogStatsD package.

- Update the Datadog Agent main configuration file
datadog.yamlby adding the following configuration to remap the Airflow notation to Datadog notation:# dogstatsd_mapper_cache_size: 1000 # default to 1000 dogstatsd_mapper_profiles: - name: airflow prefix: "airflow." mappings: - match: "airflow.*_start" name: "airflow.job.start" tags: job_name: "$1" - match: "airflow.*_end" name: "airflow.job.end" tags: job_name: "$1" - match: "airflow.*_heartbeat_failure" name: airflow.job.heartbeat.failure tags: job_name: "$1" - match: "airflow.operator_failures_*" name: "airflow.operator_failures" tags: operator_name: "$1" - match: "airflow.operator_successes_*" name: "airflow.operator_successes" tags: operator_name: "$1" - match: 'airflow\.dag_processing\.last_runtime\.(.*)' match_type: "regex" name: "airflow.dag_processing.last_runtime" tags: dag_file: "$1" - match: 'airflow\.dag_processing\.last_run\.seconds_ago\.(.*)' match_type: "regex" name: "airflow.dag_processing.last_run.seconds_ago" tags: dag_file: "$1" - match: 'airflow\.dag\.loading-duration\.(.*)' match_type: "regex" name: "airflow.dag.loading_duration" tags: dag_file: "$1" - match: "airflow.local_task_job.task_exit.*.*.*.*" name: "airflow.local_task_job.task_exit" tags: job_id: "$1" dag_id: "$2" task_id: "$3" return_code: "$4" - match: "airflow.dag.*.*.queue_duration" name: "airflow.dag.queue_duration" tags: dag_id: "$1" task_id: "$2" - match: "airflow.dag.*.*.queued_duration" name: "airflow.dag.queued_duration" tags: dag_id: "$1" task_id: "$2" - match: "airflow.dag.*.*.scheduled_duration" name: "airflow.dag.scheduled_duration" tags: dag_id: "$1" task_id: "$2" - match: "airflow.dagrun.*.first_task_scheduling_delay" name: "airflow.dagrun.first_task_scheduling_delay" tags: dag_id: "$1" - match: "airflow.pool.open_slots.*" name: "airflow.pool.open_slots" tags: pool_name: "$1" - match: "airflow.pool.queued_slots.*" name: "airflow.pool.queued_slots" tags: pool_name: "$1" - match: "airflow.pool.running_slots.*" name: "airflow.pool.running_slots" tags: pool_name: "$1" - 
match: "airflow.pool.used_slots.*" name: "airflow.pool.used_slots" tags: pool_name: "$1" - match: "airflow.pool.starving_tasks.*" name: "airflow.pool.starving_tasks" tags: pool_name: "$1" - match: "airflow.pool.deferred_slots.*" name: "airflow.pool.deferred_slots" tags: pool_name: "$1" - match: "airflow.pool.scheduled_slots.*" name: "airflow.pool.scheduled_slots" tags: pool_name: "$1" - match: 'airflow\.dagrun\.dependency-check\.(.*)' match_type: "regex" name: "airflow.dagrun.dependency_check" tags: dag_id: "$1" - match: 'airflow\.dag\.(.*)\.([^.]*)\.duration' match_type: "regex" name: "airflow.dag.task.duration" tags: dag_id: "$1" task_id: "$2" - match: 'airflow\.task\.cpu_usage\.(.*)\.(.*)' match_type: "regex" name: "airflow.task.cpu_usage" tags: dag_id: "$1" task_id: "$2" - match: 'airflow\.task\.mem_usage\.(.*)\.(.*)' match_type: "regex" name: "airflow.task.mem_usage" tags: dag_id: "$1" task_id: "$2" - match: "airflow.task.duration" name: "airflow.task.duration" - match: "airflow.task.queued_duration" name: "airflow.task.queued_duration" - match: "airflow.task.scheduled_duration" name: "airflow.task.scheduled_duration" - match: 'airflow\.dag_processing\.last_duration\.(.*)' match_type: "regex" name: "airflow.dag_processing.last_duration" tags: dag_file: "$1" - match: 'airflow\.dagrun\.duration\.success\.(.*)' match_type: "regex" name: "airflow.dagrun.duration.success" tags: dag_id: "$1" - match: 'airflow\.dagrun\.duration\.failed\.(.*)' match_type: "regex" name: "airflow.dagrun.duration.failed" tags: dag_id: "$1" - match: 'airflow\.dagrun\.schedule_delay\.(.*)' match_type: "regex" name: "airflow.dagrun.schedule_delay" tags: dag_id: "$1" - match: "airflow.dagrun.dependency-check" name: "airflow.dagrun.dependency_check" - match: "airflow.dagrun.duration.success" name: "airflow.dagrun.duration.success" - match: "airflow.dagrun.duration.failed" name: "airflow.dagrun.duration.failed" - match: "airflow.dagrun.schedule_delay" name: "airflow.dagrun.schedule_delay" - 
match: "airflow.dagrun.first_task_scheduling_delay" name: "airflow.dagrun.first_task_scheduling_delay" - match: "airflow.scheduler.tasks.running" name: "airflow.scheduler.tasks.running" - match: "airflow.scheduler.tasks.starving" name: "airflow.scheduler.tasks.starving" - match: "airflow.sla_email_notification_failure" name: "airflow.sla_email_notification_failure" - match: "airflow.sla_missed" name: "airflow.sla_missed" - match: "airflow.sla_callback_notification_failure" name: "airflow.sla_callback_notification_failure" - match: "airflow.scheduler.critical_section_query_duration" name: "airflow.scheduler.critical_section_query_duration" - match: "airflow.scheduler.scheduler_loop_duration" name: "airflow.scheduler.scheduler_loop_duration" - match: 'airflow\.task_removed_from_dag\.(.*)' match_type: "regex" name: "airflow.dag.task_removed" tags: dag_id: "$1" - match: 'airflow\.task_restored_to_dag\.(.*)' match_type: "regex" name: "airflow.dag.task_restored" tags: dag_id: "$1" - match: "airflow.task_instance_created-*" name: "airflow.task.instance_created" tags: task_class: "$1" - match: "airflow.task_instance_created" name: "airflow.task.instance_created" - match: "airflow.task_instance_created_*" name: "airflow.task.instance_created" tags: operator_name: "$1" - match: "airflow.task_removed_from_dag" name: "airflow.dag.task_removed" - match: 'airflow\.ti\.start\.(.+)\.(\w+)' match_type: regex name: airflow.ti.start tags: dag_id: "$1" task_id: "$2" - match: 'airflow\.ti\.finish\.(\w+)\.(.+)\.(\w+)' name: airflow.ti.finish match_type: regex tags: dag_id: "$1" task_id: "$2" state: "$3" - match: "airflow.ti.start" name: "airflow.ti.start" - match: "airflow.ti.finish" name: "airflow.ti.finish" - match: "airflow.celery.execute_command.failure" name: "airflow.celery.execute_command.failure" - match: "airflow.triggerer_heartbeat" name: "airflow.triggerer_heartbeat" - match: "airflow.triggers.blocked_main_thread" name: "airflow.triggers.blocked_main_thread" - match: 
"airflow.triggers.failed" name: "airflow.triggers.failed" - match: "airflow.triggers.succeeded" name: "airflow.triggers.succeeded" - match: "airflow.triggers.running" name: "airflow.triggers.running" - match: 'airflow\.triggers\.running\.(.*)' match_type: "regex" name: "airflow.triggers.running" tags: hostname: "$1" - match: "airflow.dataset.updates" name: "airflow.dataset.updates" - match: "airflow.dataset.orphaned" name: "airflow.dataset.orphaned" - match: "airflow.dataset.triggered_dagruns" name: "airflow.dataset.triggered_dagruns" - match: "airflow.executor.open_slots" name: "airflow.executor.open_slots" - match: 'airflow\.executor\.open_slots\.(.*)' match_type: "regex" name: "airflow.executor.open_slots" tags: executor_class_name: "$1" - match: "airflow.executor.queued_tasks" name: "airflow.executor.queued_tasks" - match: 'airflow\.executor\.queued_tasks\.(.*)' match_type: "regex" name: "airflow.executor.queued_tasks" tags: executor_class_name: "$1" - match: "airflow.executor.running_tasks" name: "airflow.executor.running_tasks" - match: 'airflow\.executor\.running_tasks\.(.*)' match_type: "regex" name: "airflow.executor.running_tasks" tags: executor_class_name: "$1" - match: "airflow.kubernetes_executor.adopt_task_instances.duration" name: "airflow.kubernetes_executor.adopt_task_instances.duration" - match: "airflow.kubernetes_executor.clear_not_launched_queued_tasks.duration" name: "airflow.kubernetes_executor.clear_not_launched_queued_tasks.duration" - match: "airflow.dag_processing.file_path_queue_size" name: "airflow.dag_processing.file_path_queue_size" - match: "airflow.dag_processing.file_path_queue_update_count" name: "airflow.dag_processing.file_path_queue_update_count" - match: 'airflow\.dag_processing\.last_num_of_db_queries\.(.*)' match_type: "regex" name: "airflow.dag_processing.last_num_of_db_queries" tags: dag_file: "$1" - match: "airflow.dag_processing.other_callback_count" name: "airflow.dag_processing.other_callback_count" - match: 
"airflow.dag_processing.sla_callback_count" name: "airflow.dag_processing.sla_callback_count" - match: "airflow.dag_file_processor_timeouts" name: "airflow.dag_file_processor_timeouts"
- Restart the Agent.
- Restart Airflow to start sending your Airflow metrics to the Agent DogStatsD endpoint.
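To make the remapping step more concrete, the following sketch emulates how a wildcard mapping such as airflow.pool.open_slots.* could turn a StatsD metric name into a Datadog metric name plus tags. It is a simplified illustration only: the helper names wildcard_to_regex and apply_mapping are hypothetical, and the rule that each * captures one dot-free segment is an assumption of this sketch rather than a description of the Agent's code.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Compile a wildcard pattern; each * becomes a capture group without dots."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "([^.]*)".join(literal_parts) + "$")


def apply_mapping(metric: str, match: str, name: str, tags: dict):
    """Return (name, resolved_tags) if the metric matches, otherwise None."""
    found = wildcard_to_regex(match).match(metric)
    if found is None:
        return None
    # Replace $1, $2, ... in tag values with the corresponding captured segment.
    resolved = {
        key: re.sub(r"\$(\d+)", lambda ref: found.group(int(ref.group(1))), value)
        for key, value in tags.items()
    }
    return name, resolved


print(apply_mapping(
    "airflow.pool.open_slots.default_pool",
    "airflow.pool.open_slots.*",
    "airflow.pool.open_slots",
    {"pool_name": "$1"},
))
```

The variable part of the metric name moves into a tag, so every pool reports under one metric name and can be filtered by pool_name.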
Use the default configuration in your airflow.d/conf.yaml file to activate your Airflow service checks. See the sample airflow.d/conf.yaml for all available configuration options.
Available for Agent versions >6.0
- Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

      logs_enabled: true

- Uncomment and edit this configuration block at the bottom of your airflow.d/conf.yaml. Change the path and service parameter values and configure them for your environment.

  - Configuration for DAG processor manager and Scheduler logs:

        logs:
          - type: file
            path: "<PATH_TO_AIRFLOW>/logs/dag_processor_manager/dag_processor_manager.log"
            source: airflow
            log_processing_rules:
              - type: multi_line
                name: new_log_start_with_date
                pattern: \[\d{4}\-\d{2}\-\d{2}
          - type: file
            path: "<PATH_TO_AIRFLOW>/logs/scheduler/latest/*.log"
            source: airflow
            log_processing_rules:
              - type: multi_line
                name: new_log_start_with_date
                pattern: \[\d{4}\-\d{2}\-\d{2}

    Regular clean up is recommended for scheduler logs with daily log rotation.

  - Additional configuration for DAG tasks logs:

        logs:
          - type: file
            path: "<PATH_TO_AIRFLOW>/logs/*/*/*/*.log"
            source: airflow
            log_processing_rules:
              - type: multi_line
                name: new_log_start_with_date
                pattern: \[\d{4}\-\d{2}\-\d{2}

    Caveat: By default Airflow uses this log file template for tasks: log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log. The number of log files grows quickly if not cleaned regularly. This pattern is used by the Airflow UI to display logs individually for each executed task.

    If you do not view logs in the Airflow UI, Datadog recommends this configuration in airflow.cfg: log_filename_template = dag_tasks.log. Then log rotate this file and use this configuration:

        logs:
          - type: file
            path: "<PATH_TO_AIRFLOW>/logs/dag_tasks.log"
            source: airflow
            log_processing_rules:
              - type: multi_line
                name: new_log_start_with_date
                pattern: \[\d{4}\-\d{2}\-\d{2}
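The multi_line rule used in the log configurations above starts a new log entry whenever a line begins with a bracketed date. The sketch below shows that grouping idea with the same pattern; the sample log lines and the function name group_entries are made up for illustration, and the code is not how the Agent itself is implemented.

```python
import re

# Same pattern as in the log_processing_rules above: "[YYYY-MM-DD" at line start.
NEW_ENTRY = re.compile(r"\[\d{4}\-\d{2}\-\d{2}")


def group_entries(lines):
    """Merge continuation lines into the entry opened by the last dated line."""
    entries = []
    for line in lines:
        if NEW_ENTRY.match(line) or not entries:
            entries.append(line)          # a dated line opens a new entry
        else:
            entries[-1] += "\n" + line    # anything else continues the previous one
    return entries


# Hypothetical sample lines resembling a scheduler log with a traceback.
sample_lines = [
    "[2024-01-15 10:00:00,123] {scheduler_job.py:100} ERROR - Task failed",
    "Traceback (most recent call last):",
    '  File "dag.py", line 12, in run',
    "[2024-01-15 10:00:01,456] {scheduler_job.py:120} INFO - Retrying",
]
for entry in group_entries(sample_lines):
    print(repr(entry))
```

Without such a rule, each traceback line would arrive as a separate log event.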
Note: The Datadog Agent's airflow integration does not support Airflow V3.
For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.
| Parameter | Value |
|---|---|
| <INTEGRATION_NAME> | airflow |
| <INIT_CONFIG> | blank or {} |
| <INSTANCE_CONFIG> | {"url": "http://%%host%%:8080"} |
Ensure that url matches your Airflow webserver base_url, the URL used to connect to your Airflow instance. Replace localhost with the template variable %%host%%.
If you are using the official Airflow Helm chart, this should be applied on the webserver pod and its webserver container. For example, with the webserver.podAnnotations, your Autodiscovery Annotations may look like the following:
webserver:
  podAnnotations:
    ad.datadoghq.com/webserver.checks: |
      {
        "airflow": {
          "instances": [
            {
              "url": "http://%%host%%:8080"
            }
          ]
        }
      }

Adjust the ad.datadoghq.com/<CONTAINER_NAME>.checks annotation accordingly if your container name differs.
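Because the annotation key and its JSON body follow a fixed pattern, they can be generated for any container name. The sketch below builds the same structure as the example above; the function name build_check_annotation and the default port of 8080 are assumptions taken from this page's example rather than requirements.

```python
import json


def build_check_annotation(container_name: str, port: int = 8080) -> dict:
    """Return a one-entry dict: annotation key -> JSON check configuration."""
    key = f"ad.datadoghq.com/{container_name}.checks"
    # %%host%% is left as-is; it is a template variable resolved at runtime.
    value = {"airflow": {"instances": [{"url": f"http://%%host%%:{port}"}]}}
    return {key: json.dumps(value, indent=2)}


for key, value in build_check_annotation("webserver").items():
    print(key)
    print(value)
```

Passing a different container name changes only the key, which is the part that must match the container running the Airflow webserver.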
Connect Airflow to DogStatsD (included in the Datadog Agent) by using the Airflow statsd feature to collect metrics. For more information about the metrics reported by the Airflow version used and the additional configuration options, see the Airflow documentation below:
Note: Presence or absence of StatsD metrics reported by Airflow might vary depending on the Airflow Executor used. For example: airflow.ti_failures/successes, airflow.operator_failures/successes, airflow.dag.task.duration are not reported for KubernetesExecutor.
The Airflow StatsD configuration can be enabled with the following environment variables with the Airflow Helm Chart:
env:
  - name: AIRFLOW__METRICS__STATSD_ON
    value: "True"
  - name: AIRFLOW__METRICS__STATSD_PORT
    value: "8125"
  - name: AIRFLOW__METRICS__STATSD_PREFIX
    value: "airflow"
extraEnv: |
  - name: AIRFLOW__METRICS__STATSD_HOST
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP

Note: The Airflow Helm Chart requires the valueFrom based environment variables to be set with extraEnv. Do not set AIRFLOW__METRICS__STATSD_DATADOG_ENABLED without first installing the Datadog package.
The environment variable for the metrics endpoint AIRFLOW__METRICS__STATSD_HOST is supplied with the node's host IP address to route the StatsD data to the Datadog Agent pod on the same node as the Airflow pod. This setup also requires the Agent to have a hostPort open for this port 8125 and accepting non-local StatsD traffic. For more information, see DogStatsD on Kubernetes Setup. This should direct the StatsD traffic from the Airflow container to a Datadog Agent ready to accept the incoming data.
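One way to check this routing from inside an Airflow container is to send a single StatsD counter over UDP to the node's host IP and then look for it in Datadog. The sketch below is a minimal example under stated assumptions: the metric name connectivity_test is hypothetical, the airflow prefix mirrors the statsd_prefix used on this page, and because UDP is connectionless a successful send does not prove the Agent received the datagram.

```python
import socket


def format_counter(name: str, value: int = 1, prefix: str = "airflow") -> bytes:
    """Build a plain StatsD counter datagram: <prefix>.<name>:<value>|c"""
    return f"{prefix}.{name}:{value}|c".encode("ascii")


def send_test_metric(host: str, port: int = 8125) -> int:
    """Send one counter to host:port over UDP; returns the number of bytes sent."""
    payload = format_counter("connectivity_test")  # hypothetical metric name
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


print(format_counter("connectivity_test"))
# Example call (not executed here), using the value of AIRFLOW__METRICS__STATSD_HOST:
# send_test_metric(os.environ["AIRFLOW__METRICS__STATSD_HOST"])
```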
You must also update the Datadog Agent with the corresponding dogstatsd_mapper_profiles. To do this, copy the dogstatsd_mapper_profiles provided in the Host installation into your datadog.yaml file. Alternatively, you can also deploy your Datadog Agent with the equivalent JSON configuration in the environment variable DD_DOGSTATSD_MAPPER_PROFILES. For Kubernetes, use the complete configuration below:
env:
- name: DD_DOGSTATSD_MAPPER_PROFILES
value: >
[{"name":"airflow","prefix":"airflow.","mappings":[{"match":"airflow.*_start","name":"airflow.job.start","tags":{"job_name":"$1"}},{"match":"airflow.*_end","name":"airflow.job.end","tags":{"job_name":"$1"}},{"match":"airflow.*_heartbeat_failure","name":"airflow.job.heartbeat.failure","tags":{"job_name":"$1"}},{"match":"airflow.operator_failures_*","name":"airflow.operator_failures","tags":{"operator_name":"$1"}},{"match":"airflow.operator_successes_*","name":"airflow.operator_successes","tags":{"operator_name":"$1"}},{"match":"airflow\\.dag_processing\\.last_runtime\\.(.*)","match_type":"regex","name":"airflow.dag_processing.last_runtime","tags":{"dag_file":"$1"}},{"match":"airflow\\.dag_processing\\.last_run\\.seconds_ago\\.(.*)","match_type":"regex","name":"airflow.dag_processing.last_run.seconds_ago","tags":{"dag_file":"$1"}},{"match":"airflow\\.dag\\.loading-duration\\.(.*)","match_type":"regex","name":"airflow.dag.loading_duration","tags":{"dag_file":"$1"}},{"match":"airflow.local_task_job.task_exit.*.*.*.*","name":"airflow.local_task_job.task_exit","tags":{"job_id":"$1","dag_id":"$2","task_id":"$3","return_code":"$4"}},{"match":"airflow.dag.*.*.queue_duration","name":"airflow.dag.queue_duration","tags":{"dag_id":"$1","task_id":"$2"}},{"match":"airflow.dag.*.*.queued_duration","name":"airflow.dag.queued_duration","tags":{"dag_id":"$1","task_id":"$2"}},{"match":"airflow.dag.*.*.scheduled_duration","name":"airflow.dag.scheduled_duration","tags":{"dag_id":"$1","task_id":"$2"}},{"match":"airflow.dagrun.*.first_task_scheduling_delay","name":"airflow.dagrun.first_task_scheduling_delay","tags":{"dag_id":"$1"}},{"match":"airflow.pool.open_slots.*","name":"airflow.pool.open_slots","tags":{"pool_name":"$1"}},{"match":"airflow.pool.queued_slots.*","name":"airflow.pool.queued_slots","tags":{"pool_name":"$1"}},{"match":"airflow.pool.running_slots.*","name":"airflow.pool.running_slots","tags":{"pool_name":"$1"}},{"match":"airflow.pool.used_slots.*","name":"airflow.pool.used_
slots","tags":{"pool_name":"$1"}},{"match":"airflow.pool.starving_tasks.*","name":"airflow.pool.starving_tasks","tags":{"pool_name":"$1"}},{"match":"airflow.pool.deferred_slots.*","name":"airflow.pool.deferred_slots","tags":{"pool_name":"$1"}},{"match":"airflow.pool.scheduled_slots.*","name":"airflow.pool.scheduled_slots","tags":{"pool_name":"$1"}},{"match":"airflow\\.dagrun\\.dependency-check\\.(.*)","match_type":"regex","name":"airflow.dagrun.dependency_check","tags":{"dag_id":"$1"}},{"match":"airflow\\.dag\\.(.*)\\.([^.]*)\\.duration","match_type":"regex","name":"airflow.dag.task.duration","tags":{"dag_id":"$1","task_id":"$2"}},{"match":"airflow\\.task\\.cpu_usage\\.(.*)\\.(.*)","match_type":"regex","name":"airflow.task.cpu_usage","tags":{"dag_id":"$1","task_id":"$2"}},{"match":"airflow\\.task\\.mem_usage\\.(.*)\\.(.*)","match_type":"regex","name":"airflow.task.mem_usage","tags":{"dag_id":"$1","task_id":"$2"}},{"match":"airflow.task.duration","name":"airflow.task.duration"},{"match":"airflow.task.queued_duration","name":"airflow.task.queued_duration"},{"match":"airflow.task.scheduled_duration","name":"airflow.task.scheduled_duration"},{"match":"airflow\\.dag_processing\\.last_duration\\.(.*)","match_type":"regex","name":"airflow.dag_processing.last_duration","tags":{"dag_file":"$1"}},{"match":"airflow\\.dagrun\\.duration\\.success\\.(.*)","match_type":"regex","name":"airflow.dagrun.duration.success","tags":{"dag_id":"$1"}},{"match":"airflow\\.dagrun\\.duration\\.failed\\.(.*)","match_type":"regex","name":"airflow.dagrun.duration.failed","tags":{"dag_id":"$1"}},{"match":"airflow\\.dagrun\\.schedule_delay\\.(.*)","match_type":"regex","name":"airflow.dagrun.schedule_delay","tags":{"dag_id":"$1"}},{"match":"airflow.dagrun.dependency-check","name":"airflow.dagrun.dependency_check"},{"match":"airflow.dagrun.duration.success","name":"airflow.dagrun.duration.success"},{"match":"airflow.dagrun.duration.failed","name":"airflow.dagrun.duration.failed"},{"match":"airflow.dag
run.schedule_delay","name":"airflow.dagrun.schedule_delay"},{"match":"airflow.dagrun.first_task_scheduling_delay","name":"airflow.dagrun.first_task_scheduling_delay"},{"match":"airflow.scheduler.tasks.running","name":"airflow.scheduler.tasks.running"},{"match":"airflow.scheduler.tasks.starving","name":"airflow.scheduler.tasks.starving"},{"match":"airflow.sla_email_notification_failure","name":"airflow.sla_email_notification_failure"},{"match":"airflow.sla_missed","name":"airflow.sla_missed"},{"match":"airflow.sla_callback_notification_failure","name":"airflow.sla_callback_notification_failure"},{"match":"airflow.scheduler.critical_section_query_duration","name":"airflow.scheduler.critical_section_query_duration"},{"match":"airflow.scheduler.scheduler_loop_duration","name":"airflow.scheduler.scheduler_loop_duration"},{"match":"airflow\\.task_removed_from_dag\\.(.*)","match_type":"regex","name":"airflow.dag.task_removed","tags":{"dag_id":"$1"}},{"match":"airflow\\.task_restored_to_dag\\.(.*)","match_type":"regex","name":"airflow.dag.task_restored","tags":{"dag_id":"$1"}},{"match":"airflow.task_instance_created-*","name":"airflow.task.instance_created","tags":{"task_class":"$1"}},{"match":"airflow.task_instance_created","name":"airflow.task.instance_created"},{"match":"airflow.task_instance_created_*","name":"airflow.task.instance_created","tags":{"operator_name":"$1"}},{"match":"airflow.task_removed_from_dag","name":"airflow.dag.task_removed"},{"match":"airflow\\.ti\\.start\\.(.+)\\.(\\w+)","match_type":"regex","name":"airflow.ti.start","tags":{"dag_id":"$1","task_id":"$2"}},{"match":"airflow\\.ti\\.finish\\.(\\w+)\\.(.+)\\.(\\w+)","match_type":"regex","name":"airflow.ti.finish","tags":{"dag_id":"$1","task_id":"$2","state":"$3"}},{"match":"airflow.ti.start","name":"airflow.ti.start"},{"match":"airflow.ti.finish","name":"airflow.ti.finish"},{"match":"airflow.celery.execute_command.failure","name":"airflow.celery.execute_command.failure"},{"match":"airflow.triggerer_hea
rtbeat","name":"airflow.triggerer_heartbeat"},{"match":"airflow.triggers.blocked_main_thread","name":"airflow.triggers.blocked_main_thread"},{"match":"airflow.triggers.failed","name":"airflow.triggers.failed"},{"match":"airflow.triggers.succeeded","name":"airflow.triggers.succeeded"},{"match":"airflow.triggers.running","name":"airflow.triggers.running"},{"match":"airflow\\.triggers\\.running\\.(.*)","match_type":"regex","name":"airflow.triggers.running","tags":{"hostname":"$1"}},{"match":"airflow.dataset.updates","name":"airflow.dataset.updates"},{"match":"airflow.dataset.orphaned","name":"airflow.dataset.orphaned"},{"match":"airflow.dataset.triggered_dagruns","name":"airflow.dataset.triggered_dagruns"},{"match":"airflow.executor.open_slots","name":"airflow.executor.open_slots"},{"match":"airflow\\.executor\\.open_slots\\.(.*)","match_type":"regex","name":"airflow.executor.open_slots","tags":{"executor_class_name":"$1"}},{"match":"airflow.executor.queued_tasks","name":"airflow.executor.queued_tasks"},{"match":"airflow\\.executor\\.queued_tasks\\.(.*)","match_type":"regex","name":"airflow.executor.queued_tasks","tags":{"executor_class_name":"$1"}},{"match":"airflow.executor.running_tasks","name":"airflow.executor.running_tasks"},{"match":"airflow\\.executor\\.running_tasks\\.(.*)","match_type":"regex","name":"airflow.executor.running_tasks","tags":{"executor_class_name":"$1"}},{"match":"airflow.kubernetes_executor.adopt_task_instances.duration","name":"airflow.kubernetes_executor.adopt_task_instances.duration"},{"match":"airflow.kubernetes_executor.clear_not_launched_queued_tasks.duration","name":"airflow.kubernetes_executor.clear_not_launched_queued_tasks.duration"},{"match":"airflow.dag_processing.file_path_queue_size","name":"airflow.dag_processing.file_path_queue_size"},{"match":"airflow.dag_processing.file_path_queue_update_count","name":"airflow.dag_processing.file_path_queue_update_count"},{"match":"airflow\\.dag_processing\\.last_num_of_db_queries\\.(.*)","ma
tch_type":"regex","name":"airflow.dag_processing.last_num_of_db_queries","tags":{"dag_file":"$1"}},{"match":"airflow.dag_processing.other_callback_count","name":"airflow.dag_processing.other_callback_count"},{"match":"airflow.dag_processing.sla_callback_count","name":"airflow.dag_processing.sla_callback_count"},{"match":"airflow.dag_file_processor_timeouts","name":"airflow.dag_file_processor_timeouts"}]}]

To add non-static tags to the StatsD metrics, you must use DogStatsD mapper profiles. See an example mapper profile that adds service and env tags.
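The JSON value of DD_DOGSTATSD_MAPPER_PROFILES has the same structure as the YAML dogstatsd_mapper_profiles list, so one can be derived from the other. The sketch below serializes a two-mapping excerpt of the profile to compact JSON; the excerpt is abbreviated for illustration and the function name to_env_value is hypothetical. Note how the single backslashes of the regex patterns become double backslashes once JSON-encoded, as in the full value above.

```python
import json

# Abbreviated excerpt of the airflow profile, for illustration only.
profiles = [
    {
        "name": "airflow",
        "prefix": "airflow.",
        "mappings": [
            {
                "match": "airflow.pool.open_slots.*",
                "name": "airflow.pool.open_slots",
                "tags": {"pool_name": "$1"},
            },
            {
                "match": r"airflow\.dagrun\.schedule_delay\.(.*)",
                "match_type": "regex",
                "name": "airflow.dagrun.schedule_delay",
                "tags": {"dag_id": "$1"},
            },
        ],
    }
]


def to_env_value(mapper_profiles: list) -> str:
    """Serialize mapper profiles to single-line JSON without extra whitespace."""
    return json.dumps(mapper_profiles, separators=(",", ":"))


print(to_env_value(profiles))
```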
Available for Agent versions >6.0
Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.
| Parameter | Value |
|---|---|
| <LOG_CONFIG> | {"source": "airflow", "service": "<YOUR_APP_NAME>"} |
Run the Agent's status subcommand and look for airflow under the Checks section.
In addition, Airflow DatadogHook can be used to interact with Datadog:
- Send Metric
- Query Metric
- Post Event
See metadata.csv for a list of metrics provided by this check.
Note: airflow.healthy, airflow.can_connect, airflow.dag.task.total_running, and airflow.dag.task.ongoing_duration metrics are collected from the Agent portion of the integration. All other metrics come from StatsD.
The Airflow check does not include any events.
See service_checks.json for a list of service checks provided by this integration.
You may need to configure parameters for the Datadog Agent to make authenticated requests to Airflow's API. Use one of the available configuration options.
Airflow can use its own StatsD library, as well as the Datadog Python DogStatsD logger. Using the Datadog Python DogStatsD logger can provide extra tagging options, including Origin Detection in Kubernetes.
However, this does not come installed by default in Airflow. You need to install the Datadog provider package. For host installations, you can install it directly with pip install apache-airflow-providers-datadog.
For containerized environments, Airflow recommends building a custom image with this package installed. For example, the following Dockerfile can be used with your desired version tag (for example, 2.8.4 or 3.0.2):
FROM apache/airflow:<VERSION>
RUN pip install apache-airflow-providers-datadog
After that is running, provide the environment variable to your Airflow containers to enable this:
- name: AIRFLOW__METRICS__STATSD_DATADOG_ENABLED
  value: "true"

Because this option switches Airflow from using the Airflow StatsD library to the Datadog DogStatsD library, this option supports Datadog tagging options, including Origin Detection out-of-the-box on the Airflow side. You need to enable Origin Detection on the Datadog Agent side to match.
If you try to enable the DogStatsD plugin without this package installed, no metrics are sent, and an error like the following occurs:
{stats.py:42} ERROR - Could not configure StatsClient: No module named 'datadog', using NoStatsLogger instead.
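Since the error above comes from a missing datadog Python module, a quick pre-flight check in the Airflow environment can catch the problem before the option is enabled. The sketch below uses only the standard library; the function names are hypothetical, and it checks only that the module can be found, not that the provider package is correctly configured.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(name) is not None


def datadog_statsd_ready() -> bool:
    """Check for the 'datadog' module named in the error message above."""
    return module_available("datadog")


if datadog_statsd_ready():
    print("datadog module found")
else:
    print("datadog module missing: install the Datadog provider package first")
```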
Need help? Contact Datadog support.