-
Notifications
You must be signed in to change notification settings - Fork 593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(udf): move UDF implementations to expr_impl
#16759
Conversation
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
85a3a41
to
dc71c16
Compare
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Others generally LGTM. This also makes expr_udf
cleaner, and maybe makes it easier to add new UDF implementations. Cool!
default = ["rw-static-link"] | ||
rw-static-link = ["workspace-config/rw-static-link"] | ||
rw-dynamic-link = ["workspace-config/rw-dynamic-link"] | ||
embedded-deno-udf = ["risingwave_expr/embedded-deno-udf"] | ||
embedded-python-udf = ["risingwave_expr/embedded-python-udf"] | ||
default = ["rw-static-link"] | ||
all-udf = ["external-udf", "wasm-udf", "js-udf", "deno-udf", "python-udf"] | ||
external-udf = ["risingwave_expr_impl/external-udf"] | ||
wasm-udf = ["risingwave_expr_impl/wasm-udf"] | ||
js-udf = ["risingwave_expr_impl/js-udf"] | ||
deno-udf = ["risingwave_expr_impl/deno-udf"] | ||
python-udf = ["risingwave_expr_impl/python-udf"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBO I don't very like feature flags, but acceptable since there are already existing ones. 1 flag is as bad as N flags...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My consideration is that each UDF backend brings many dependencies, and most of our developers don't touch UDF at all. So it would be better to skip them in local development. The same thing could be applied to connectors. For example, I almost never use any connector in my development. Those dependencies could be skipped as well to save my disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I think it's time to split connector implementations off and design a clear interface for it, as there are now community developers willing to develop new connectors. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I think it's time to split connector implementations off
BTW, #12981 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I noticed this PR. It's a little pity it was not merged, but I think it's in the right direction.
#[derive(Debug, Clone)] | ||
struct MetricsVec { | ||
/// Number of successful UDF calls. | ||
success_count: IntCounterVec, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to use LabelGuardedIntCounterVec
, etc. #14838
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's do it in the following PR.
Signed-off-by: Runji Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest looks amazing 🤩
@@ -71,8 +71,8 @@ if [ "${ARCH}" == "aarch64" ]; then | |||
# see https://github.com/tikv/jemallocator/blob/802969384ae0c581255f3375ee2ba774c8d2a754/jemalloc-sys/build.rs#L218 | |||
export JEMALLOC_SYS_WITH_LG_PAGE=16 | |||
fi | |||
cargo build -p risingwave_cmd_all --features "rw-static-link" --profile release | |||
cargo build -p risingwave_cmd --bin risectl --features "rw-static-link" --profile release | |||
cargo build -p risingwave_cmd_all --features "rw-static-link" --features all-udf --profile release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Does it mean that we used to release versions without support for embedded
UDF?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm considering whether we should make it a default feature but always use --no-default-features
for RiseDev builds, so that we can avoid such mistakes in the future...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, the released version did not contain python UDF and deno UDF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But making these features as default will cause them to be built by rust-analyzer...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm considering whether we should make it a default feature but always use
--no-default-features
for RiseDev builds, so that we can avoid such mistakes in the future...
I don't think it's worth it:
- When release failed, we can just fix it. And it's ok if the script is dirty. We just need to get it done right once.
- But
default
features is to make developers happy, which can be hard to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But making these features as default will cause them to be built by rust-analyzer...
Can also suggest developers to set rust-analyzer.cargo.noDefaultFeatures
in the editor. 😕
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- But
default
features is to make developers happy, which can be hard to fix.
The UDF module is touched by few people, so it makes sense to require manual enabling on the features.
However, imagine we're going to add conditional compiles for the connector module, which is actively developed and maintained by a considerable amount of developers. We still need the effort to inform them how to enable these features, especially the IDE part, which is not better than the approach above.
The point is the most of developers do not pay much attention to the compile time. If something comes naturally, it's good. If it requires extra effort, they might give up. Thus in my opinion, it's fine to treat it as a hidden treasure needed to be discovered by some manual efforts. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point is the most of developers do not pay much attention to the compile time. If something comes naturally, it's good. If it requires extra effort, they might give up.
I 100% agree. However, my opinion from this fact is that it's better to offer "free" fast compile time for developers. If they are not good at it, we'd better make the default faster. Saving every one's time does good to the whole world!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how to enable these features, especially the IDE part
I'm thinking whether it's possible to offer IDE experience without config for feature flags. e.g., cfg_or_panic(madsim)
.
One idea is that: we put each connector impl in separated crate, which is an optional dependency. If so, maybe the IDE can work without config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go to #16841
Signed-off-by: Runji Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: Runji Wang [email protected]I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
This PR moves UDF implementations from
expr_core
toexpr_impl
and introduces adistributed_slice
inexpr_core
to collect them at link time. It makes it easier for developers to add new UDF languages. It also moves a lot dependencies out ofexpr_core
, which can reduce build time and improve DX.This PR makes each UDF a feature and they are not enabled by default to avoid building a lot of dependencies. For those who need to develop UDF, please enable them from
./risedev configure
.Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.