feat: support all TPC-H queries #796

Merged
merged 94 commits into from
Apr 15, 2024

@wangrunji0408 (Member) commented on Aug 11, 2023

This PR adds support for all remaining TPC-H queries.

The main change is support for correlated subqueries in expressions. The planner and optimizer design follows the article "SQL 子查询的优化" ("Optimizing SQL Subqueries"). Other minor changes include:

  • support COPY of a query result into a file
  • support count(distinct ..) aggregation
  • support {nested loop, hash} x {semi, anti} join
  • optimize HashJoinExecutor so that it no longer collects all input chunks at the beginning
  • fix bugs in predicate and projection pushdown
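
For readers unfamiliar with the semi and anti join variants listed above, here is a minimal sketch of the hash-based version. It is not the code from this PR: `JoinKind`, `hash_semi_anti_join`, and the closure-based key extraction are hypothetical names chosen for illustration, the inputs are plain slices rather than data chunks, and SQL NULL semantics are ignored.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Which left rows to keep. Illustrative names, not RisingLight's types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum JoinKind {
    /// Keep left rows that have at least one match on the right.
    Semi,
    /// Keep left rows that have no match on the right.
    Anti,
}

/// Hash-based semi/anti join on a single equality key.
///
/// `left_key` and `right_key` extract the join key from a row.
/// Each left row is emitted at most once, however many right rows match.
pub fn hash_semi_anti_join<L, R, K>(
    left: &[L],
    right: &[R],
    left_key: impl Fn(&L) -> K,
    right_key: impl Fn(&R) -> K,
    kind: JoinKind,
) -> Vec<L>
where
    L: Clone,
    K: Eq + Hash,
{
    // Build phase: only the set of right-side keys is needed,
    // because no right-side columns appear in the output.
    let build: HashSet<K> = right.iter().map(|r| right_key(r)).collect();

    // Probe phase: test each left row's key for membership.
    left.iter()
        .filter(|l| {
            let matched = build.contains(&left_key(*l));
            match kind {
                JoinKind::Semi => matched,
                JoinKind::Anti => !matched,
            }
        })
        .cloned()
        .collect()
}
```

A semi join of this kind is a typical target for a rewritten `EXISTS` subquery and an anti join for `NOT EXISTS`. The nested-loop variants do the same filtering but evaluate an arbitrary join condition per pair of rows instead of probing a hash set.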

A quick benchmark against DuckDB (note the log scale):

[Figure: risinglight-tpch-duckdb]

Full benchmark result (times in ms):

Query   RisingLight   DuckDB
Q1             1576       45
Q2              404       12
Q3              325       19
Q4              265       32
Q5              577       20
Q6              131        6
Q7             1821       48
Q8             2591       22
Q9              748       63
Q10             546       63
Q11              79        5
Q12             286       15
Q13             408       51
Q14             152       13
Q15             118       17
Q16              90       20
Q17            3947       56
Q18            2459       88
Q19             436       32
Q20            1458       42
Q21            6690       75
Q22              94       16

Signed-off-by: Runji Wang <[email protected]>
but the result is empty, test data needs update

This was referenced Mar 12, 2024
github-merge-queue bot pushed a commit that referenced this pull request Mar 18, 2024
This PR adds support for specifying CTEs using the `WITH` clause. We
treat CTEs just as named subqueries. The implementation simply adds CTE
queries to the binder context and inlines them when referenced.
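
The inlining approach described above can be sketched roughly as follows. This is a simplified illustration, not the binder from this commit: the `Query` enum, `BinderContext`, `add_cte`, and `bind` are hypothetical, and a real binder would also resolve columns and types.

```rust
use std::collections::HashMap;

/// A drastically simplified query tree. Hypothetical, not RisingLight's AST.
#[derive(Clone, Debug, PartialEq)]
pub enum Query {
    /// A base table, or a reference to a CTE by name.
    Table(String),
    /// A filter with an opaque predicate over an input query.
    Filter { predicate: String, input: Box<Query> },
    /// A join of two inputs (join condition omitted for brevity).
    Join(Box<Query>, Box<Query>),
}

/// Holds the CTE definitions visible while binding a statement.
#[derive(Default)]
pub struct BinderContext {
    ctes: HashMap<String, Query>,
}

impl BinderContext {
    /// Registers `WITH name AS (query)`. The definition is bound first,
    /// so it may itself refer to CTEs registered earlier.
    pub fn add_cte(&mut self, name: &str, query: Query) {
        let bound = self.bind(query);
        self.ctes.insert(name.to_string(), bound);
    }

    /// Replaces every reference to a known CTE with a copy of its definition.
    pub fn bind(&self, query: Query) -> Query {
        match query {
            Query::Table(name) => match self.ctes.get(&name) {
                Some(definition) => definition.clone(),
                None => Query::Table(name),
            },
            Query::Filter { predicate, input } => Query::Filter {
                predicate,
                input: Box::new(self.bind(*input)),
            },
            Query::Join(left, right) => {
                Query::Join(Box::new(self.bind(*left)), Box::new(self.bind(*right)))
            }
        }
    }
}
```

Because each reference receives its own copy, a CTE used twice is planned and executed twice in this sketch. That is the usual trade-off of treating CTEs as named subqueries.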

However, due to an optimizer bug (which has been fixed in #796), CTE
support is currently of little use, as most practical use cases lead to
panics.

---------

Signed-off-by: Runji Wang <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Mar 19, 2024
This PR is part of #796. It adds support for creating, querying, and
dropping views in memory.

The key implementation points are:
1. When creating a view, bind the query and store the logical plan with
the view in the catalog.
2. When querying from a view, build executors for all views and then
build other plan nodes on top of them. Given that a view can be consumed
by multiple downstream nodes, we introduce `StreamSubscriber` to allow
multiple consumers of a stream.
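
To illustrate the idea of several consumers sharing one stream, here is a minimal synchronous sketch built on standard-library channels. It is not the `StreamSubscriber` from this commit, whose interface is not shown here: `Broadcast`, `subscribe`, and `run` are hypothetical names, and the sketch assumes unbounded buffering and a blocking iterator as the upstream.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Fans one upstream of items out to any number of consumers.
/// Hypothetical stand-in for the idea behind `StreamSubscriber`.
pub struct Broadcast<T> {
    subscribers: Vec<Sender<T>>,
}

impl<T: Clone> Broadcast<T> {
    pub fn new() -> Self {
        Self { subscribers: Vec::new() }
    }

    /// Registers a new consumer and returns its receiving end.
    pub fn subscribe(&mut self) -> Receiver<T> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Drives the upstream to completion, sending a clone of every item
    /// to each subscriber. Taking `self` by value drops all senders on
    /// return, which signals end of stream to the receivers.
    pub fn run(mut self, upstream: impl IntoIterator<Item = T>) {
        for item in upstream {
            // Forget subscribers whose receiving end has been dropped.
            self.subscribers.retain(|tx| tx.send(item.clone()).is_ok());
        }
    }
}
```

With this shape, the view's executor runs once and every downstream node that reads the view calls `subscribe` before execution starts. An engine with asynchronous executors would likely use async streams and bounded buffers instead of blocking channels.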

Limitations:
1. We don't persist views to disk storage.
2. We don't support inferring the schema from the query. Columns must be
defined explicitly when creating a view.
3. We don't maintain dependency relationships between tables and views.

---------

Signed-off-by: Runji Wang <[email protected]>
@wangrunji0408 wangrunji0408 force-pushed the wrj/correlated-subquery branch from 52ae955 to 9e3a647 Compare March 21, 2024 11:13
@wangrunji0408 wangrunji0408 marked this pull request as ready for review April 15, 2024 11:03
@wangrunji0408 wangrunji0408 force-pushed the wrj/correlated-subquery branch from df4b409 to 1a5bf9b Compare April 15, 2024 13:56
@wangrunji0408 wangrunji0408 added this pull request to the merge queue Apr 15, 2024
Merged via the queue into main with commit c4252aa Apr 15, 2024
4 checks passed
@wangrunji0408 wangrunji0408 deleted the wrj/correlated-subquery branch April 15, 2024 14:15