Is your feature request related to a problem? Please describe.
In data analysis, SCD (slowly changing dimension) is an important concept with many applications. Currently RisingWave supports two main sink types: append-only and upsert. I think a sink type for SCD type 2 would give RisingWave an advantage over other applications.
For example: MySQL CDC => RisingWave => Iceberg/Doris/StarRocks/BigQuery (SCD type 2) would be awesome.
Describe the solution you'd like
I think the solution could be: the source table must have an ID column, and the sink table gets the additional columns __rw_valid_flag, __rw_valid_from, and __rw_valid_to.
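To make the proposed columns concrete, here is a minimal Python sketch of SCD type 2 bookkeeping. It assumes each change event carries an ID, a value, and a timestamp. The function name `apply_change` and the in-memory list are hypothetical and only illustrate the intended semantics; this is not how RisingWave would implement such a sink.

```python
from datetime import datetime, timezone


def apply_change(history, row_id, value, ts):
    """Close the currently valid version of row_id (if any), then append the new one."""
    for row in history:
        if row["id"] == row_id and row["__rw_valid_flag"]:
            row["__rw_valid_flag"] = False
            row["__rw_valid_to"] = ts
    history.append({
        "id": row_id,
        "value": value,
        "__rw_valid_flag": True,   # the newest version is the valid one
        "__rw_valid_from": ts,
        "__rw_valid_to": None,     # open-ended until the next change arrives
    })
    return history


# Two changes to the same ID: the first version is closed by the second.
history = []
t1 = datetime(2024, 12, 26, 3, 14, 51, tzinfo=timezone.utc)
t2 = datetime(2024, 12, 26, 3, 14, 54, tzinfo=timezone.utc)
apply_change(history, 1, 2, t1)
apply_change(history, 1, 3, t2)
for row in history:
    print(row)
```

Every version of a row is kept, and at most one version per ID has `__rw_valid_flag` set at any time.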
Describe alternatives you've considered
No response
Additional context
No response
-- your original CDC table
create table t (pk int primary key, v int);
-- transform the CDC table into a stream changelog
-- and add a timestamp column
create table t_changelog (pk int, v int, inserted_at timestamptz as proctime());
-- keep only inserts (changelog_op = 1) and post-update rows (changelog_op = 3)
create sink s into t_changelog as
with c as changelog from t
select pk, v from c
where changelog_op = 1 or changelog_op = 3
with (type = 'append-only');
-- final scd type 2 sink to your system
CREATE SINK s_scd2 AS
select *, end_at is null as is_valid from (
select *, lag(inserted_at) over (partition by pk order by inserted_at desc) as end_at
from t_changelog
)
WITH (...);
With sample data:
dev=> insert into t values (1, 2);
INSERT 0 1
dev=> insert into t values (1, 3);
INSERT 0 1
dev=> insert into t values (1, 4);
INSERT 0 1
dev=> insert into t values (2, 1);
INSERT 0 1
dev=> select *, end_at is null as is_valid from (
select *, lag(inserted_at) over (partition by pk order by inserted_at desc) as end_at
from t_changelog
);
┌────┬───┬───────────────────────────────┬───────────────────────────────┬──────────┐
│ pk │ v │          inserted_at          │            end_at             │ is_valid │
├────┼───┼───────────────────────────────┼───────────────────────────────┼──────────┤
│  1 │ 4 │ 2024-12-26 03:14:55.993+00:00 │ ∅                             │ t        │
│  1 │ 3 │ 2024-12-26 03:14:54.244+00:00 │ 2024-12-26 03:14:55.993+00:00 │ f        │
│  1 │ 2 │ 2024-12-26 03:14:51.744+00:00 │ 2024-12-26 03:14:54.244+00:00 │ f        │
│  2 │ 1 │ 2024-12-26 03:14:59.244+00:00 │ ∅                             │ t        │
└────┴───┴───────────────────────────────┴───────────────────────────────┴──────────┘
(4 rows)
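The window query can also be tried outside RisingWave. The sketch below replays the same four changelog rows in an in-memory SQLite database through Python's standard `sqlite3` module. It is only an approximation under stated assumptions: timestamps are stored as text (they sort correctly here because they share one format), `is_valid` comes back as 1/0 rather than t/f, and the bundled SQLite must be 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for t_changelog; inserted_at is plain text instead of timestamptz.
conn.execute("create table t_changelog (pk int, v int, inserted_at text)")
conn.executemany(
    "insert into t_changelog values (?, ?, ?)",
    [
        (1, 2, "2024-12-26 03:14:51.744+00:00"),
        (1, 3, "2024-12-26 03:14:54.244+00:00"),
        (1, 4, "2024-12-26 03:14:55.993+00:00"),
        (2, 1, "2024-12-26 03:14:59.244+00:00"),
    ],
)

# With rows ordered newest-first per pk, lag() returns the inserted_at of the
# next newer version, which is when the current version stopped being valid.
query = """
select pk, v, inserted_at, end_at, end_at is null as is_valid from (
    select *, lag(inserted_at) over (partition by pk order by inserted_at desc) as end_at
    from t_changelog
)
order by pk, inserted_at desc
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The newest row per `pk` has no newer neighbour, so its `end_at` is NULL and it is the only one flagged as valid.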