Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(secret): introduce secret management #17456

Merged
merged 22 commits into from
Jul 15, 2024
Merged

Conversation

yuhao-su
Copy link
Contributor

@yuhao-su yuhao-su commented Jun 26, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

  • Create/Drop secret
    On create/drop secret, frontend will run command on meta and meta will distribute the plaintext secret to meta, frontend and compute node via notification service.
    image

  • Handle ref secret in frontend binder

  1. with properties was parsed to WithOptions (src/connector/src/with_options.rs), which contains two parts, non-ref-secret properties and ref-secret properties.
  2. secret properties in WithOptions was resolved (bound) to WithOptionsSecResolved, in which process ObjectName was resolved to PbSecretRef.
  3. WithOptionsSecResolved will be used in binder to check properties in binder just as WithOptions used to do. It will also be used to build connections in frontend to get schema registry and iceberg schema etc..
  4. WithOptionsSecResolved will into_part() to properties map and secret properties map and used by source/sink catalogs and further used by plan nodes (Source/FsSource/Sink etc..) in planner.
  • Fill secret
    Secret will be filled from SecretManager and merge with other properties in fill_secrets.
  1. In source connector: ConnectorProperties::extract is used to convert btreemap to ConnectorProperties, now we fill the secret in that function with one exception ExternalTableConfig.

  2. In sink connector: SinkParam is used for sink connectors. We fill the secret before create the SinkParam

TODO:

  • Remove the Deref BTreeMap for WithOptions and WithOptionsSecResolve. One of the reason is if we ref secret for properties like connection, RW will yield connection not found since it's in secret_refs, but currently WithOptions::get will only return those in inner
  • use a concrete type for properties with filled secrets. This is to avoid forgetting to use secret ref.
  • ACL
  • more test

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Secret Management

create secret

CREATE SECRET secret-name WITH ( backend = 'meta') AS 'your-secret';
Currently only meta backend is supported

DROP SECRET secret-name;

example:

CREATE SECRET mysql_pwd WITH (
  backend = 'meta'
) AS '123';

use secret

Use SECRET your-secret-name as the option value in the with clause.

example:

create source mysql_source with (
 connector = 'mysql-cdc',
 hostname = 'localhost',
 port = '8306',
 username = 'rwcdc',
 password = secret mysql_pwd,
 database.name = 'test',
 server.id = '5601'
);

Use as file

Convert a secret stored to a file path. The file content is the secret stored.

CREATE TABLE district (
    d_id INTEGER,
    PRIMARY KEY (d_id, d_w_id)
) with (
    connector = 'kafka',
    topic = 'your-topic',
    properties.bootstrap.server = 'your-broker-address:29092',
    ssl.ca.location =  SECRET kafka_ca AS FILE,
    ssl.certificate.location = SECRET kafka_cert AS FILE,
    ssl.key.location = SECRET kafka_key AS FILE,
   ssl.key.password = SECRET kafka_password,
) FORMAT DEBEZIUM ENCODE JSON;

To use secret management

you need to set the environment variable RW_SECRET_STORE_PRIVATE_KEY_HEX as a hex representation of a 128-bit key. e.g. 0123456789abcdef0123456789abcdef

Set the temporary secret file directory RW_TEMP_SECRET_FILE_DIR. It will only be used when using as file

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@yuhao-su yuhao-su requested a review from a team as a code owner June 26, 2024 00:04
Copy link

gitguardian bot commented Jun 26, 2024

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@graphite-app graphite-app bot requested a review from a team June 26, 2024 00:04
@yuhao-su yuhao-su changed the title WIP: introduce secret management WIP: feat(secret): introduce secret management Jun 26, 2024
Copy link
Member

@xxchan xxchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about separating into 2 PRs (or 2 commits), one for backend (this part might not be testable, but at least it will be more reviewable) and another one for frontend?

@yuhao-su
Copy link
Contributor Author

yuhao-su commented Jun 26, 2024

What about separating into 2 PRs (or 2 commits), one for backend (this part might not be testable, but at least it will be more reviewable) and another one for frontend?

I understand this is too big to review. But it can be hard to separate it because some data types are used as concrete type for both frontend and backend. Also I might not be that helpful for review considering filling secret are used in both frontend and backend (if you consider compute node as backend) and also meta node.

I'll add more implementation detail/structure in the PR description. I'll also add comments about those who might raise concerns and guide for review.

@yuhao-su
Copy link
Contributor Author

It will be great if we RW can know about if it's a test environment. We need a test encryption key for secret manger or we need to pass the key env var everywhere we start the test. cc @xxchan @BugenZhao

@xxchan
Copy link
Member

xxchan commented Jun 26, 2024

But it can be hard to separate it because some data types are used as concrete type for both frontend and backend. Also I might not be that helpful for review considering filling secret are used in both frontend and backend (if you consider compute node as backend) and also meta node.

Oh, I got it. Do you mean things like ConnectorProperties is used in both fe, be, and meta..? That's indeed tricky and cannot be separated cleanly.

I'm thinking if there are still relatively self-contained things can be splitted out, e.g., the meta's RPC and storage of secrets.

e.g., is this possible:

  1. change .proto and all related things. types in connector are not changed. I guess this mainly affects meta
  2. change connector, and every where, fe, meta and compute
  3. change user-facing part, i.e., parser and fe handler

src/common/src/lib.rs Outdated Show resolved Hide resolved
@tabVersion
Copy link
Contributor

It will be great if we RW can know about if it's a test environment. We need a test encryption key for secret manger or we need to pass the key env var everywhere we start the test. cc @xxchan @BugenZhao

I think it can be done by specifying a test private key? We will remind the users to change it before going into prod.

@BugenZhao
Copy link
Member

It will be great if we RW can know about if it's a test environment. We need a test encryption key for secret manger or we need to pass the key env var everywhere we start the test. cc @xxchan @BugenZhao

Opened an issue: #17465

@yuhao-su
Copy link
Contributor Author

Do you mean things like ConnectorProperties is used in both fe, be, and meta..?

Not necessarily ConnectorProperties, but secrets need to be filled in both fe, be, and meta (e.g. for validation, schema registry, build reader...)

change .proto and all related things. types in connector are not changed. I guess this mainly affects meta
change connector, and every where, fe, meta and compute
change user-facing part, i.e., parser and fe handler

I managed to do 1 in #17474. But 2 and 3 are a little tricky. I need to use WithOptionsSecResolved and new WithOptions in fe handler, which propagates pretty wildly. Also filling secret also relies on WithOptionsSecResolved. So I guess it's probably not worth doing that. I think we can try start review after I add the review guide. I'll see what I can do if you still find it hard to review.

commit 0646dcf491390901a955895be2c8f26bf0f2319d
Author: Yuhao Su <[email protected]>
Date:   Wed Jun 26 12:17:11 2024 -0500

    fix test

commit ee935f5
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 25 22:30:52 2024 -0500

    fix

commit 73d8614
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 25 20:01:29 2024 -0500

    improve error

commit affdccb
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 25 19:19:03 2024 -0500

    fix check

commit eab7ff4
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 25 19:13:19 2024 -0500

    fix check

commit 2fa532b
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 25 19:03:24 2024 -0500

    clippy

commit 4653e00
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 25 15:56:17 2024 -0500

    toml

commit c7e776f
Merge: ebcbc4f 2adcca3
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 25 15:52:31 2024 -0500

    conflict

commit ebcbc4f
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 25 15:40:41 2024 -0500

    fmt

commit 0df138c
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 25 15:40:12 2024 -0500

    fix

commit 2002dbc
Author: Yuhao Su <[email protected]>
Date:   Fri Jun 21 12:20:27 2024 -0500

    fix

commit 8bec501
Merge: bb5edb4 af8f9a5
Author: Yuhao Su <[email protected]>
Date:   Thu Jun 20 11:17:14 2024 -0500

    resolve conflict

commit bb5edb4
Author: Yuhao Su <[email protected]>
Date:   Thu Jun 20 11:16:22 2024 -0500

    remove inner

commit a8b4778
Author: Yuhao Su <[email protected]>
Date:   Wed Jun 19 20:28:27 2024 -0500

    fix

commit 5a0bc05
Merge: 2a484ad 2a413ff
Author: Yuhao Su <[email protected]>
Date:   Wed Jun 19 11:52:50 2024 -0500

    resolve

commit 2a484ad
Author: Yuhao Su <[email protected]>
Date:   Tue Jun 18 19:39:07 2024 -0500

    xxx

commit 7151ec5
Merge: e9c466b d5e10d8
Author: Yuhao Su <[email protected]>
Date:   Mon Jun 17 15:58:16 2024 -0500

    resolve

commit e9c466b
Author: Yuhao Su <[email protected]>
Date:   Sun Jun 16 21:32:10 2024 -0500

    init
@yuhao-su yuhao-su force-pushed the yuhao/create-secret-p2 branch from ee935f5 to b4920f2 Compare June 27, 2024 17:26
@yuhao-su
Copy link
Contributor Author

yuhao-su commented Jul 2, 2024

We will remind the users to change it before going into prod.

If they forget, we don't support alter the key

@yuhao-su yuhao-su changed the title WIP: feat(secret): introduce secret management feat(secret): introduce secret management Jul 2, 2024
@yuhao-su yuhao-su requested a review from wenym1 July 2, 2024 23:49
@yuhao-su yuhao-su requested review from fuyufjh and xxchan July 8, 2024 19:24
Copy link
Member

@fuyufjh fuyufjh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Will approve once the following comment gets resolved.

/// Initialize the secret manager with the given temp file path, cluster id, and encryption key.
/// # Panics
/// Panics if fail to create the secret file directory.
pub fn init(temp_file_dir: Option<String>, cluster_id: String) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Continue from #17456 (comment))

Introducing the cluster_id just for simulation tests is unacceptable to me. :(

Maybe hack in risingwave_simulation::start to add a suffix?

It's not a hack. IIUC, simulation test runs multiple RW instance at the same time, so it should prepare separated working folders for them.

Copy link
Contributor Author

@yuhao-su yuhao-su Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a problem not just limited to test. Users can 1. start multiple RW clusters on 1 machine. 2. start multiple nodes on 1 machine.

Making users config a distinct path for each cluster and each node is not ideal. Misconfiging this can cause unexpected race on 1 directory. So not just cluster id, I actually think we should also add node id.

Such race condition can also happens if we have file cache later.

Copy link
Member

@fuyufjh fuyufjh Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making users config a distinct path for each cluster and each node is not ideal.

Hmmm, I think this is quite natural. Image that you are running multiple MySQL/PG service in one Linux machine, you can't use the default data dir /var/mysql/. Instead, you must manually create and provide multiple data dir to them.

On the other hand, our simulation test framework was developed early when RisingWave doesn't use local disk at all, so it overlooked this part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we need a way like lock file to make sure a dir won't be shared by two RW instance

Not only different RW instances, different working nodes and their replication can be potentially deployed on a single machine.

Using RW cluster id to distinguish different RW instances can be used in other circumstances like connection name when two RW instance connection to the same up/downstream. So I guess users shouldn't be surprised to find a dir with uuid name

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the same question. Shall we accept the hypothesis that tempfile's name seldom collide?


In my design, a secret file on CN is never reused so we'd pass the content in each request.
Using a local manager to keep the files live long enough is okay, but why we need to take the local one as the ground truth.

Copy link
Contributor Author

@yuhao-su yuhao-su Jul 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in each request

If you mean the request from meta to cn, we have a lot of those request including altering/create/recovery... also frontend need secret files.

seldom collide

It can still happen. Do we have any reason not to handle this case?

@yuhao-su yuhao-su added ci/run-e2e-sql-migration-tests user-facing-changes Contains changes that are visible to users labels Jul 11, 2024
proto/meta.proto Show resolved Hide resolved
src/connector/src/sink/mod.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@tabVersion tabVersion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just confirm: we are distributing encrypted files to nodes, right?

@yuhao-su
Copy link
Contributor Author

just confirm: we are distributing encrypted files to nodes, right?

More precisely, we are distributing file content to node. And form files on each node.

@yuhao-su yuhao-su added this pull request to the merge queue Jul 15, 2024
Merged via the queue into main with commit 96de664 Jul 15, 2024
38 of 39 checks passed
@yuhao-su yuhao-su deleted the yuhao/create-secret-p2 branch July 15, 2024 16:48
Comment on lines +309 to +312
let options_with_secret = match with_properties {
Either::Left(options) => resolve_secret_ref_in_with_options(options.clone(), session)?,
Either::Right(options_with_secret) => options_with_secret.clone(),
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuhao-su is there a reason why introducing an Either here? I see the right branch is only called in refresh_sr_and_get_columns_diff, why don't we need to resolve props coming from the func?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because that function is using a WithOptionsSecResolved type option from SourceCatalog. That means it's already been resolved before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants