add airflow manifests (#1412) #23

Open
wants to merge 1 commit into
base: main
89 changes: 89 additions & 0 deletions manifests/airflow.yaml
@@ -0,0 +1,89 @@
id: airflow
name: Airflow
description: |
Airflow integration
blocks:
- title: Overview
content: |+
The Airflow integration adds data discovery and observability for Airflow workflows to the ODD Platform.
## Using the ODD Airflow Collector with Airflow
To use OpenDataDiscovery with Airflow, set up the [ODD Airflow Collector](https://github.com/opendatadiscovery/odd-airflow-adapter).


> The ODD Airflow Collector does not run additional analytical queries against the actual data,
so some metrics may be approximate.
For instance, the `rows_count` metric might report an estimated number of rows in a table rather than an exact figure.
- title: Configure
content: |+
## ODD Platform
To integrate the ODD Airflow Collector with the ODD Platform, first create a collector entity in the platform:

1. Navigate to the "Management" section and select "Collectors".
2. Click the "Add Collector" button.
3. Enter a unique name for the collector. Optionally, add a namespace to categorize the collector and a brief description to give it more context.
4. Click "Save" and note the generated token. The ODD Airflow Collector's YAML configuration file requires this token so that the two components can communicate securely.

## ODD Airflow Collector
Configure the ODD Airflow Collector with a single YAML configuration file:

````yaml
# ODD Platform's host URL
platform_host_url: https://your.odd.platform

# Default pulling interval in minutes
default_pulling_interval: 10

# Collector's specific security token
token: ""

plugins:
  - type: airflow
    # Data source's name
    name: airflow_adapter
    # Optional Data source's description
    description: "Airflow sample workflows"
    # Airflow host
    host: host
    # Airflow port
    port: port
    # Airflow user
    user: user
    # Airflow user password
    password: password
````
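
For illustration, a filled-in configuration might look like the sketch below. Every value is a hypothetical placeholder: the platform URL, the token, the host name, the port, and the credentials are invented for this example and are not defaults of the collector. The point to notice is that all keys after `type` belong to the same `plugins` list item.

```yaml
# Hypothetical values throughout; replace each one with your own.
platform_host_url: https://odd.example.com
default_pulling_interval: 10
# Token generated when the collector was saved in the ODD Platform
token: "example-collector-token"

plugins:
  - type: airflow
    name: airflow_adapter
    description: "Airflow sample workflows"
    host: airflow.example.com
    port: 8080  # example value only
    user: odd_reader
    password: "example-password"
```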
snippets:
- template: |+
````yaml
platform_host_url: {{ platform_url }}
default_pulling_interval: 10
token: <Security token>
plugins:
  - type: airflow
    name: {{ ds_name }}
    description: {{ plugin_description }}
    host: {{ plugin_host }}
    port: {{ plugin_port }}
    user: {{ plugin_user }}
    password: <Password for {{ plugin_user }}>
````
arguments:
- parameter: platform_url
type: STRING
static: true
- parameter: ds_name
name: Data source name
type: STRING
- parameter: plugin_description
name: Data source description
type: STRING
- parameter: plugin_host
name: Host
type: STRING
- parameter: plugin_port
name: Port
type: INTEGER
- parameter: plugin_user
name: User
type: STRING
89 changes: 89 additions & 0 deletions manifests/airflow2.yaml
@@ -0,0 +1,89 @@
id: airflow2
name: Airflow 2
description: |
Airflow 2 integration
blocks:
- title: Overview
content: |+
The Airflow 2 integration adds data discovery and observability for Airflow 2 workflows to the ODD Platform.
## Using the ODD Airflow 2 Collector with Airflow 2
To use OpenDataDiscovery with Airflow 2, set up the [ODD Airflow 2 Collector](https://github.com/opendatadiscovery/odd-airflow-2).


> The ODD Airflow 2 Collector does not run additional analytical queries against the actual data,
so some metrics may be approximate.
For instance, the `rows_count` metric might report an estimated number of rows in a table rather than an exact figure.
- title: Configure
content: |+
## ODD Platform
To integrate the ODD Airflow 2 Collector with the ODD Platform, first create a collector entity in the platform:

1. Navigate to the "Management" section and select "Collectors".
2. Click the "Add Collector" button.
3. Enter a unique name for the collector. Optionally, add a namespace to categorize the collector and a brief description to give it more context.
4. Click "Save" and note the generated token. The ODD Airflow 2 Collector's YAML configuration file requires this token so that the two components can communicate securely.

## ODD Airflow 2 Collector
Configure the ODD Airflow 2 Collector with a single YAML configuration file:

````yaml
# ODD Platform's host URL
platform_host_url: https://your.odd.platform

# Default pulling interval in minutes
default_pulling_interval: 10

# Collector's specific security token
token: ""

plugins:
  - type: airflow2
    # Data source's name
    name: airflow2_adapter
    # Optional Data source's description
    description: "Airflow 2 sample workflows"
    # Airflow 2 host
    host: host
    # Airflow 2 port
    port: port
    # Airflow 2 user
    user: user
    # Airflow 2 user password
    password: password
````
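
As a rough sketch of value types in the `plugins` section, the fragment below replaces the placeholders with concrete values. The host, port, user, and password are hypothetical and chosen only for this example. The port is written unquoted so that YAML reads it as an integer, and the password is quoted so that special characters are kept as a plain string.

```yaml
plugins:
  - type: airflow2
    name: airflow2_adapter
    description: "Airflow 2 sample workflows"
    host: airflow2.example.com    # hypothetical host
    port: 8080                    # unquoted integer; example value only
    user: odd_reader              # hypothetical user
    password: "example-password"  # quoted string; hypothetical value
```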
snippets:
- template: |+
````yaml
platform_host_url: {{ platform_url }}
default_pulling_interval: 10
token: <Security token>
plugins:
  - type: airflow2
    name: {{ ds_name }}
    description: {{ plugin_description }}
    host: {{ plugin_host }}
    port: {{ plugin_port }}
    user: {{ plugin_user }}
    password: <Password for {{ plugin_user }}>
````
arguments:
- parameter: platform_url
type: STRING
static: true
- parameter: ds_name
name: Data source name
type: STRING
- parameter: plugin_description
name: Data source description
type: STRING
- parameter: plugin_host
name: Host
type: STRING
- parameter: plugin_port
name: Port
type: INTEGER
- parameter: plugin_user
name: User
type: STRING