transfer #2762
WanYixian committed Nov 22, 2024
1 parent 80a7f3f commit 671bb6c
Showing 4 changed files with 212 additions and 1 deletion.
122 changes: 122 additions & 0 deletions integrations/sources/github-webhook.mdx
@@ -0,0 +1,122 @@
---
title: "Ingest data from GitHub Webhook"
description: "Describes how to ingest data from GitHub Webhook into RisingWave."
sidebarTitle: GitHub Webhook
---

GitHub Webhooks allow you to build or set up integrations that subscribe to certain events on GitHub.com. When one of those events is triggered, GitHub sends an HTTP POST payload to the webhook's configured URL. Webhooks can be used to update an external issue tracker, trigger CI builds, update a backup mirror, or even deploy to your production server.

In this guide, we'll walk through the steps to set up RisingWave as a destination for GitHub Webhooks. This enables you to ingest GitHub events directly into your RisingWave database for real-time processing and analytics.

## Steps to Ingest Data from GitHub via Webhook

### 1. Create a Secret in RisingWave

First, create a secret in RisingWave to securely store a secret string. This secret will be used to validate incoming webhook requests from GitHub.

```sql
CREATE SECRET test_secret WITH (backend = 'meta') AS 'TEST_WEBHOOK';
```

Explanation:
- `test_secret`: The name of the secret.
- `'TEST_WEBHOOK'`: The secret string used for signing and verifying webhook payloads. Replace this with a secure, random string.

### 2. Create a Table in RisingWave to Receive Webhook Data

Next, create a table configured to accept webhook data from GitHub.

```sql
CREATE TABLE wbhtable (
data JSONB
) WITH (
connector = 'webhook'
) VALIDATE SECRET test_secret AS secure_compare(
headers->>'x-hub-signature-256',
'sha256=' || encode(hmac(test_secret, data, 'sha256'), 'hex')
);
```

Explanation:
- `data JSONB`: Defines the column that stores the JSON payload from the webhook. Currently, only the `JSONB` type is supported for webhook tables.
- `headers->>'x-hub-signature-256'`: Extracts the signature that GitHub provides in the `x-hub-signature-256` HTTP header, for example `sha256=f37a93a68fef1505d75e920a15d0543199557be72d2182e5cf8c15d7f9a6260f`. Inside `secure_compare()`, the HTTP headers as a whole are interpreted as a JSONB object, so you can read a header value with the `->>` operator. Use only lower-case header names with `->>`; otherwise, the verification fails.
- `'sha256=' || encode(hmac(test_secret, data, 'sha256'), 'hex')`: Computes the expected signature: it generates an HMAC SHA-256 hash of the payload (`data`) using the secret (`test_secret`), encodes it in hexadecimal, and prefixes it with `sha256=`.

The `secure_compare()` function compares the signature from the request header with the computed signature. If they match, the request is accepted; otherwise, it is rejected. This ensures that only authentic requests from GitHub are processed.
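
To make the comparison concrete, the same check can be mirrored outside RisingWave. The following is a minimal Python sketch, not RisingWave's implementation: it assumes the signature covers the raw request body byte for byte, and the function names `expected_signature` and `is_authentic` are hypothetical.

```python
import hashlib
import hmac


def expected_signature(secret: str, body: bytes) -> str:
    """Mirror 'sha256=' || encode(hmac(secret, data, 'sha256'), 'hex')."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_authentic(secret: str, body: bytes, headers: dict) -> bool:
    """Compare the header value with the computed signature in constant time."""
    # Header names are looked up in lower case, as in the VALIDATE clause.
    received = headers.get("x-hub-signature-256", "")
    return hmac.compare_digest(received, expected_signature(secret, body))


if __name__ == "__main__":
    # Hypothetical payload, used only for illustration.
    body = b'{"zen": "Keep it logically awesome."}'
    headers = {"x-hub-signature-256": expected_signature("TEST_WEBHOOK", body)}
    print(is_authentic("TEST_WEBHOOK", body, headers))  # same secret
    print(is_authentic("WRONG_SECRET", body, headers))  # different secret
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not depend on where the two strings first differ, which is presumably the same concern that `secure_compare()` addresses.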

GitHub sends both an HMAC SHA-1 signature (in `x-hub-signature`) and an HMAC SHA-256 signature (in `x-hub-signature-256`) with each signed delivery. The example above validates the SHA-256 signature. To validate the SHA-1 signature instead, replace `x-hub-signature-256` with `x-hub-signature` and `sha256` with `sha1` in the `VALIDATE` clause. For example:

```sql
-- Skip this statement if test_secret already exists.
CREATE SECRET test_secret WITH (backend = 'meta') AS 'TEST_WEBHOOK';

-- Webhook table for GitHub, validated with the SHA-1 signature.
CREATE TABLE wbhtable (
data JSONB
) WITH (
connector = 'webhook'
) VALIDATE SECRET test_secret AS secure_compare(
headers->>'x-hub-signature',
'sha1=' || encode(hmac(test_secret, data, 'sha1'), 'hex')
);
```

### 3. Set Up the Webhook in GitHub

After configuring RisingWave to accept webhook data, set up GitHub to send events to your RisingWave instance.

#### RisingWave Webhook URL

The webhook URL should follow this format:
```
https://<HOST>/webhook/<database>/<schema_name>/<table_name>
```

Explanation:

- `<HOST>`: The hostname or IP address where your RisingWave instance is accessible. This could be a domain name or an IP address.
- `<database>`: The name of the RisingWave database where your table resides.
- `<schema_name>`: The schema name of your table, typically `public` unless specified otherwise.
- `<table_name>`: The name of the table you created to receive webhook data (e.g., `wbhtable` in the above example).
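
As an illustration of the format, the URL can be assembled from its four parts. This is a small Python sketch; the helper name `webhook_url` and the sample values `rw.example.com` and `dev` are placeholders, not values taken from this guide.

```python
from urllib.parse import quote


def webhook_url(host: str, database: str, schema: str, table: str) -> str:
    """Assemble https://<HOST>/webhook/<database>/<schema_name>/<table_name>."""
    # Percent-encode each path segment so an unusual identifier stays one segment.
    path = "/".join(quote(part, safe="") for part in (database, schema, table))
    return f"https://{host}/webhook/{path}"


if __name__ == "__main__":
    # Placeholder host and database name.
    print(webhook_url("rw.example.com", "dev", "public", "wbhtable"))
```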

#### Configuring the Webhook in GitHub

1. Navigate to Your Repository Settings:

- Go to your GitHub repository.
- Click on the **Settings** tab.

2. Add a New Webhook:

- In the left sidebar, click on **Webhooks**.
- Click the **Add webhook** button.

3. Configure the Webhook Settings:

- **Payload URL**: Enter your RisingWave webhook URL.
- **Content type**: Select `application/json`.
- **Secret**: Enter the same secret string you used when creating the RisingWave secret (e.g., `TEST_WEBHOOK`, without the SQL quotes). GitHub signs the payloads with this secret, which allows RisingWave to validate them.
- **Which events would you like to trigger this webhook?**: Choose the events you want to subscribe to. For testing purposes, you might start with **Just the push event**.
- **Active**: Ensure the webhook is set to active.


4. Save the Webhook:
- Click the **Add webhook** button at the bottom of the page.

### 4. Push Data from GitHub via Webhook

With the webhook configured, GitHub will automatically send HTTP POST requests to your RisingWave webhook URL whenever the specified events occur (e.g., pushes to the repository). RisingWave will receive these requests, validate the signatures, and insert the payload data into the target table.
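
To exercise the table before wiring up a real repository, you can imitate a GitHub delivery by hand. The sketch below is an assumption-laden illustration rather than an official client: `build_delivery` and `send_delivery` are hypothetical helpers, the event content is made up, and it assumes the endpoint validates the SHA-256 signature as configured in step 2.

```python
import hashlib
import hmac
import json
import urllib.request
from typing import Dict, Tuple


def build_delivery(secret: str, event: dict) -> Tuple[bytes, Dict[str, str]]:
    """Serialize the event and sign the exact bytes that will be sent."""
    body = json.dumps(event).encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    headers = {
        "Content-Type": "application/json",
        "X-Hub-Signature-256": "sha256=" + digest,
    }
    return body, headers


def send_delivery(url: str, secret: str, event: dict) -> int:
    """POST the signed event and return the HTTP status code."""
    body, headers = build_delivery(secret, event)
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling `send_delivery` with your webhook URL, the secret string, and a small dictionary should then produce one new row in the target table if the signature is accepted. The signature is computed over the serialized bytes, so the body must not be re-encoded between signing and sending.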

### 5. Further Event Processing
The data in the table is ready for further processing. You can access individual fields with `data->'field_name'` in SQL queries.
For example, you can create a materialized view that extracts specific fields from the JSON payload:

```sql
CREATE MATERIALIZED VIEW github_events AS
SELECT
data->>'action' AS action,
data->'repository'->>'full_name' AS repository_name,
data->'sender'->>'login' AS sender_login,
data->>'created_at' AS event_time
FROM wbhtable;
```

You can now query `github_events` like a regular table to perform analytics, generate reports, or trigger further processing.
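
To preview what the view returns for a given payload, the extraction can be imitated in Python. This is a rough analogue that assumes PostgreSQL-style semantics for `->` and `->>` (a missing key yields `NULL`); the helper names and the sample payload are hypothetical.

```python
import json
from typing import Any, Optional


def get_text(obj: Any, *path: str) -> Optional[str]:
    """Follow the keys in path and return text, or None where SQL gives NULL."""
    for key in path:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    if obj is None:
        return None
    # Strings come back unquoted; other JSON values as their JSON text.
    return obj if isinstance(obj, str) else json.dumps(obj)


def to_row(payload: dict) -> dict:
    """Build the columns selected by the github_events view."""
    return {
        "action": get_text(payload, "action"),
        "repository_name": get_text(payload, "repository", "full_name"),
        "sender_login": get_text(payload, "sender", "login"),
        "event_time": get_text(payload, "created_at"),
    }


if __name__ == "__main__":
    # Made-up payload without "action" or "created_at".
    sample = {
        "repository": {"full_name": "octo-org/demo"},
        "sender": {"login": "octocat"},
    }
    print(to_row(sample))
```

Under that assumption, payloads that lack a selected field still produce a row, with `NULL` in the corresponding column, so downstream queries may need to filter or coalesce those values.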
67 changes: 67 additions & 0 deletions integrations/sources/webhook.mdx
@@ -0,0 +1,67 @@
---
title: "Ingest data from Webhook"
sidebarTitle: Overview
description: Describes how to ingest data from Webhook to RisingWave
---

A webhook is a mechanism that allows one application to send real-time data to another application whenever a specific event occurs. Instead of continuously polling for updates, webhooks enable applications to receive immediate notifications, making data transfer more efficient and timely. They are commonly used for integrating different services, such as receiving updates from third-party platforms or triggering actions in response to specific events.

With the support of webhook source, RisingWave can act as a webhook destination. This means it can accept incoming HTTP requests from external services and store the data directly into its tables. When an event triggers a webhook from a source application, the data is sent to RisingWave, which then processes and ingests the information in real-time.

This capability eliminates the need for an intermediary message broker like Kafka. Instead of setting up and maintaining an extra Kafka cluster, you can directly send data to RisingWave, which is able to handle and process the data in real-time. This simplifies the architecture and reduces overhead, enabling efficient data ingestion and stream processing without additional infrastructure.

<Tip>
**PREMIUM EDITION FEATURE**

This feature is only available in the premium edition of RisingWave. The premium edition offers additional advanced features and capabilities beyond the free and community editions. If you have any questions about upgrading to the premium edition, please contact our sales team at [[email protected]](mailto:[email protected]).
</Tip>

<Note>
**PUBLIC PREVIEW**

This feature is in the public preview stage, meaning it's nearing the final product but is not yet fully stable. If you encounter any issues or have feedback, please contact us through our [Slack channel](https://www.risingwave.com/slack). Your input is valuable in helping us improve the feature. For more information, see our [Public preview feature list](/product-lifecycle/#features-in-the-public-preview-stage).
</Note>

## Creating a Webhook Table in RisingWave

To utilize webhook sources in RisingWave, you need to create a table configured to accept webhook requests. Below is a basic example of how to set up such a table:

```sql
CREATE SECRET test_secret WITH (backend = 'meta') AS 'secret_value';

CREATE TABLE wbhtable (
data JSONB
) WITH (
connector = 'webhook'
) VALIDATE SECRET test_secret AS secure_compare(
headers->>'{header of signature}',
{signature generation expressions}
);
```

Explanation:

- `CREATE SECRET`: Securely stores a secret value (`'secret_value'`) in RisingWave, which can be used for validating incoming requests.
- `CREATE TABLE wbhtable`: Defines a new table named `wbhtable` with a single column, `data`, of type `JSONB` that stores the JSON payload from the webhook.
- `WITH (connector = 'webhook')`: Specifies that the table uses the webhook connector to accept incoming HTTP requests.
- `VALIDATE SECRET test_secret AS secure_compare(...)`: Uses the stored secret `test_secret` to authenticate incoming webhook requests by comparing the signature provided in the headers with a computed one.
  - First argument: `headers->>'{header of signature}'` names the HTTP header in which the webhook sender places the generated signature, and retrieves that signature from the incoming request headers.
  - Second argument: `{signature generation expressions}` is a user-specified expression that computes the expected signature from the secret and the payload data (and possibly other header values).

The `secure_compare(...)` function compares the signature provided in the request header with the computed signature. If they match, the request is considered authentic and is accepted; otherwise, it is rejected. This mechanism ensures that only verified requests from trusted sources are processed by RisingWave.

## Supported Webhook Sources and Authentication Methods
RisingWave has been verified to work with the following webhook sources and authentication methods:

|Webhook Source|Authentication Methods|
|---|---|
|GitHub| SHA-1 HMAC, SHA-256 HMAC |
|Rudderstack| Bearer Token |
|Segment| SHA-1 HMAC |
|AWS EventBridge| Bearer Token |
|HubSpot| API Key, Signature V2 |

<Note>While only the above sources have been thoroughly tested, RisingWave's existing functions are capable of supporting additional webhook sources and authentication methods. You can integrate other services using similar configurations, although they may not have been officially verified yet.</Note>

## Further Guidance
Detailed instructions and guides are available for integrating RisingWave with the verified webhook sources mentioned above. These guides provide step-by-step processes to help you set up and configure your webhook sources effectively.
7 changes: 7 additions & 0 deletions mint.json
@@ -704,6 +704,13 @@
"integrations/sources/emqx",
"integrations/sources/hivemq"
]
},
{
"group": "Webhook",
"pages": [
"integrations/sources/webhook",
"integrations/sources/github-webhook"
]
}
]
}
17 changes: 16 additions & 1 deletion sql/functions/cryptographic.mdx
@@ -3,7 +3,7 @@ title: "Cryptographic functions"
description: "Raw encryption functions are basic encryption functions that perform encryption and decryption of data using cryptographic algorithms."
---

### `Raw encryption functions`
## `Raw encryption functions`

Please note they solely apply a cipher to the data and do not provide additional security measures.

@@ -56,3 +56,18 @@ SELECT decrypt('\x9cf6a49f90b3ac816aeeeed286606fdb','my_secret_key111', 'aes-cbc
(1 row)

```

## `hmac`

Returns the HMAC of the payload, computed with the given secret and hash algorithm. See [`HMAC`](https://en.wikipedia.org/wiki/HMAC) for background on the construction. Currently, the supported values for `hash_algo` are `sha1` and `sha256`.

```sql Syntax
hmac (secret varchar, payload bytea, hash_algo varchar) -> signature bytea
```

```sql Example
SELECT hmac('secret', 'payload'::bytea, 'sha256');
----RESULT
\xb82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4
(1 row)
```
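
To cross-check a result outside the database, the same computation can be done with Python's standard library. This is a sketch under the assumption that the secret is encoded as UTF-8 before hashing; the helper name `hmac_hex` is hypothetical.

```python
import hashlib
import hmac

# The two algorithms that the hmac SQL function is documented to accept.
_ALGOS = {"sha1": hashlib.sha1, "sha256": hashlib.sha256}


def hmac_hex(secret: str, payload: bytes, hash_algo: str) -> str:
    """Return the HMAC of payload as a lower-case hex string."""
    if hash_algo not in _ALGOS:
        raise ValueError(f"unsupported hash_algo: {hash_algo}")
    return hmac.new(secret.encode("utf-8"), payload, _ALGOS[hash_algo]).hexdigest()


if __name__ == "__main__":
    # Prefix with \x to match the bytea display format used in the SQL example.
    print("\\x" + hmac_hex("secret", b"payload", "sha256"))
```

The printed value can be compared with the output of the SQL example for the same secret and payload.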
