I configured a data source by connecting it to the Redpanda topic with the 'startup mode' set as 'latest.' However, I encountered an issue when querying the data. Despite having three days of data in my topic, the queries from the source consistently return data from the earliest records, not the latest ones. I'm puzzled about the purpose of specifying 'latest' as the startup mode in this scenario.
Error message/log
No response
To Reproduce
No response
Expected behavior
No response
How did you deploy RisingWave?
No response
The version of RisingWave
No response
Additional context
Feedback from users.
I don't think we currently support the semantics of 'latest' when a source is queried in batch mode.
If my understanding is correct, in streaming, 'latest' means that we only see the data that arrives after the materialized view is created. E.g. data: 1 | create source | data: 2 | create materialized view | data: 3
The materialized view can only see data 3, because we fetch the partition offset when we actually create the materialized view.
According to #6725, in batch, we fetch the partition offset every time a batch query comes in. We can't directly apply the streaming meaning of 'latest' here, because every query would then start at the current end of the partition and return empty data. So to support 'latest', I think we first need to define its semantics for batch source queries.
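To illustrate why the streaming meaning can't be reused as-is, here is a toy sketch in Python. It is not RisingWave code: the partition is modeled as a plain list, an offset is a list index, and the function names are made up for illustration.

```python
# Toy model (hypothetical, not RisingWave internals):
# a partition is a list of messages and an offset is an index into it.
partition = ["data 1", "data 2", "data 3"]


def batch_query_latest_at_query_time(messages):
    """Resolve 'latest' when the batch query arrives, then read to the end."""
    start = len(messages)  # 'latest' resolves to the current end of the partition
    return messages[start:]


def batch_query_earliest(messages):
    """Read from the beginning, which matches what the user reports seeing."""
    return messages[0:]


result_latest = batch_query_latest_at_query_time(partition)
result_earliest = batch_query_earliest(partition)

print(result_latest)    # []
print(result_earliest)  # ['data 1', 'data 2', 'data 3']
```

Because a batch query reads a bounded snapshot, resolving 'latest' at query time always leaves nothing to read.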
E.g. 'latest' in a batch source query could mean that we only see the data that arrives after create source, which in the example above means data 2 and data 3. data: 1 | create source | data: 2 | create materialized view | data: 3
To support these semantics, maybe we should store the partition offset when we create the source.
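A rough sketch of that proposal, again as a toy Python model rather than actual RisingWave code. The `Source` class, `create_source`, and `batch_query` are hypothetical names used only to show the idea of pinning the offset at source creation.

```python
from dataclasses import dataclass


@dataclass
class Source:
    # Offset captured once, at source creation (hypothetical field).
    start_offset: int


def create_source(messages, startup_mode):
    """Pin the start offset at creation time instead of at query time."""
    start = len(messages) if startup_mode == "latest" else 0
    return Source(start_offset=start)


def batch_query(messages, source):
    """Every batch query reads from the pinned offset to the current end."""
    return messages[source.start_offset:]


# Replay the timeline: data 1 | create source | data 2 | create MV | data 3
partition = ["data 1"]
source = create_source(partition, "latest")
partition.append("data 2")
# (the materialized view would be created here)
partition.append("data 3")

rows = batch_query(partition, source)
print(rows)  # ['data 2', 'data 3']
```

With the offset stored on the source, repeated batch queries return a stable, non-empty result that only grows as new data arrives.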