-
Notifications
You must be signed in to change notification settings - Fork 595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve: stream key derive: the hidden join key can be replaced by its vnode #12845
Comments
This is really a good catch. We may also apply this to over-window and group top-n. |
Do they have this problem? I think |
Yeah. Ideally, these operators may keep the primary key unchanged. We have to also include the group key / partition key to prevent disordering. BTW, what do you mean by "explicit"? 👀 |
When I was trying to implement this feature, I found that our current representation of stream_key Join schema (for inner join type) is simply a concatenation of LHS and RHS and there is no place for the vnode column, so we can't express stream_key with the vnode column. Another way to achieve this goal is by using a project and overwriting its stream key instead of deriving, however, the project could get merged by optimizer rules and we would easily lose this overwriting. It is too invasive to implement. I haven't got a good idea to implement this feature elegantly. Any better ideas? |
This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned. |
As we know, the join key must be included in the downstream stream key. The reason is:
However, actually we can use
vnode(join_key)
as a substitute, which is much shorter (only 16-bit) than thejoin_key
itself. This is because that the disordering only happen between different vnodes (or more accurately - different actors, but actor may be scaled up/down from time to time)The text was updated successfully, but these errors were encountered: