[Enhancement] adjust agg pushdown strategy for broadcast #54572

Merged

stephen-shelby
Contributor

@stephen-shelby stephen-shelby commented Dec 31, 2024

Why I'm doing:

The current aggregation pushdown only considers the degree of aggregation. However, for a broadcast join with a small right table, the pushed-down aggregation can take longer than the join would cost without pushdown, so the aggregation pushdown strategy needs to be adjusted.

What I'm doing:

Prioritize join cost before considering the aggregation degree.
We don't push down the aggregation under the following conditions:

  1. The join is a broadcast join.
  2. The right table has fewer rows than the session variable cbo_push_down_aggregate_on_broadcast_join_row_count_limit (default: 250,000).
  3. The NDV (number of distinct values) of any column exceeds 100,000.
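
The conditions above can be sketched as a single predicate. This is only an illustration, not the code in PushDownAggregateCollector.java: the class name, method name, and parameters are hypothetical, the thresholds follow the values listed in the description, and it assumes that all three conditions must hold together, which the description does not state explicitly.

```java
import java.util.List;

public final class AggPushdownSketch {
    // Thresholds taken from the PR description; the constant names are hypothetical.
    static final long BROADCAST_ROW_COUNT_LIMIT = 250_000L;
    static final double NDV_LIMIT = 100_000.0;

    /**
     * Returns true when the aggregation should stay above the join (no pushdown).
     * Assumption: the three listed conditions are combined with AND.
     */
    static boolean skipPushdown(boolean broadcastJoin, double rightRowCount,
                                List<Double> columnNdvs, long rowCountLimit) {
        // Condition 2: the right table is smaller than the session limit.
        boolean smallRightTable = rightRowCount < rowCountLimit;
        // Condition 3: at least one column has an NDV above the limit.
        boolean highNdv = columnNdvs.stream().anyMatch(ndv -> ndv > NDV_LIMIT);
        return broadcastJoin && smallRightTable && highNdv;
    }

    public static void main(String[] args) {
        // Broadcast join, small right table, one high-NDV column: pushdown is skipped.
        System.out.println(skipPushdown(true, 10_000, List.of(500.0, 2_000_000.0),
                BROADCAST_ROW_COUNT_LIMIT));
        // Not a broadcast join: the existing aggregation-degree logic would still apply.
        System.out.println(skipPushdown(false, 10_000, List.of(500.0, 2_000_000.0),
                BROADCAST_ROW_COUNT_LIMIT));
    }
}
```

Passing the row-count limit as a parameter mirrors the fact that it is a session variable and can differ per session.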

ssb_100 16 concurrency

query base optimized
sum 114566 76243
Q01 1183 1053
Q02 800 922
Q03 775 839
Q04 13743 9014
Q05 13596 8121
Q06 7256 7161
Q07 16931 11076
Q08 10041 7509
Q09 6948 6958
Q10 937 949
Q11 20064 14907
Q12 6896 4481
Q13 15396 3253
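
As a quick sanity check on the SSB numbers above, the following sketch re-adds the per-query values and derives the relative reduction of the total. The class and method names are hypothetical, and the unit of the measurements is not stated in the description, so the result is reported only as a percentage.

```java
import java.util.Arrays;

public final class SsbSummary {
    // Per-query values for Q01..Q13, copied from the table above.
    static final long[] BASE = {1183, 800, 775, 13743, 13596, 7256, 16931,
                                10041, 6948, 937, 20064, 6896, 15396};
    static final long[] OPTIMIZED = {1053, 922, 839, 9014, 8121, 7161, 11076,
                                     7509, 6958, 949, 14907, 4481, 3253};

    static long sum(long[] values) {
        return Arrays.stream(values).sum();
    }

    /** Relative reduction of the optimized value versus the base value, in percent. */
    static double reductionPercent(long base, long optimized) {
        return 100.0 * (base - optimized) / base;
    }

    public static void main(String[] args) {
        long base = sum(BASE);
        long optimized = sum(OPTIMIZED);
        System.out.printf("base=%d optimized=%d reduction=%.2f%%%n",
                base, optimized, reductionPercent(base, optimized));
    }
}
```

The recomputed totals match the reported sum row (114566 and 76243), which corresponds to a reduction of roughly one third.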

tpcds 1t 1 concurrency

query base optimized
sum 459809 456555
QUERY01 868 887
QUERY02 1186 1288
QUERY03 1671 1102
QUERY04 20347 20313
QUERY05 646 782
QUERY06 219 264
QUERY07 1606 1582
QUERY08 219 339
QUERY09 11029 10981
QUERY10 402 422
QUERY11 11725 11838
QUERY12 99 158
QUERY13 961 997
QUERY14-1 10964 11072
QUERY14-2 10125 10366
QUERY15 704 718
QUERY16 800 828
QUERY17 1431 1469
QUERY18 1308 1358
QUERY19 331 341
QUERY20 123 198
QUERY21 94 123
QUERY22 2967 2945
QUERY23-1 67812 67362
QUERY23-2 67771 67887
QUERY24-1 4271 4215
QUERY24-2 4234 4217
QUERY25 1175 1216
QUERY26 830 830
QUERY27 1171 1170
QUERY28 10910 10889
QUERY29 2289 2278
QUERY30 435 466
QUERY31 3170 3127
QUERY32 155 171
QUERY33 368 398
QUERY34 1067 1068
QUERY35 2118 2088
QUERY36 1068 1070
QUERY37 535 316
QUERY38 6755 6581
QUERY39-1 483 510
QUERY39-2 264 301
QUERY40 181 203
QUERY41 33 106
QUERY42 128 134
QUERY43 697 717
QUERY44 3337 3379
QUERY45 651 663
QUERY46 1970 1975
QUERY47 5180 5019
QUERY48 833 850
QUERY49 790 913
QUERY50 5013 5002
QUERY51 7287 6789
QUERY52 132 158
QUERY53 1026 1056
QUERY54 710 332
QUERY55 127 136
QUERY56 223 272
QUERY57 3698 3703
QUERY58 275 338
QUERY59 4848 4835
QUERY60 412 451
QUERY61 485 516
QUERY62 1062 1072
QUERY63 1014 1030
QUERY64 9918 9826
QUERY65 8445 8347
QUERY66 709 953
QUERY67 52812 50033
QUERY68 684 693
QUERY69 389 406
QUERY70 4490 4493
QUERY71 2122 2094
QUERY72 2856 2716
QUERY73 457 467
QUERY74 10855 10307
QUERY75 12096 11631
QUERY76 2303 2275
QUERY77 358 423
QUERY79 2869 2908
QUERY80 989 1059
QUERY81 700 719
QUERY82 1058 957
QUERY83 193 190
QUERY84 269 289
QUERY85 749 803
QUERY86 1167 1191
QUERY87 6800 6911
QUERY88 16609 16855
QUERY89 1141 1171
QUERY90 975 1005
QUERY91 98 120
QUERY92 98 115
QUERY93 3878 3975
QUERY94 1181 1245
QUERY95 2726 2760
QUERY96 2303 2312
QUERY97 7707 7928
QUERY98 660 812
QUERY99 2327 2386

No performance regression was observed before and after the adjustment on the tpcds_100g, tpch_100g, and ssbflat_100g datasets.

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.4
    • 3.3
    • 3.2
    • 3.1
    • 3.0

@stephen-shelby stephen-shelby requested review from a team as code owners December 31, 2024 11:44
@stephen-shelby stephen-shelby force-pushed the update_aggpushdown_para branch from a33f129 to 0d5523b Compare January 2, 2025 10:27
@stephen-shelby stephen-shelby changed the title [WIP] adjust agg pushdown strategy for broadcast [Enhancement] adjust agg pushdown strategy for broadcast Jan 2, 2025
@stephen-shelby stephen-shelby force-pushed the update_aggpushdown_para branch from 0d5523b to 3220466 Compare January 2, 2025 13:07

sonarqubecloud bot commented Jan 2, 2025

github-actions bot commented Jan 2, 2025

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

github-actions bot commented Jan 3, 2025

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

github-actions bot commented Jan 3, 2025

[FE Incremental Coverage Report]

pass : 10 / 12 (83.33%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/qe/SessionVariable.java 2 4 50.00% [3616, 3617]
🔵 com/starrocks/sql/optimizer/rule/tree/pdagg/PushDownAggregateCollector.java 8 8 100.00% []

@stephen-shelby stephen-shelby merged commit 04ddf66 into StarRocks:main Jan 3, 2025
61 checks passed

github-actions bot commented Jan 3, 2025

@Mergifyio backport branch-3.4

@github-actions github-actions bot removed the 3.4 label Jan 3, 2025
Contributor

mergify bot commented Jan 3, 2025

backport branch-3.4

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Jan 3, 2025
@stephen-shelby stephen-shelby deleted the update_aggpushdown_para branch January 3, 2025 05:06
@stephen-shelby
Contributor Author

@Mergifyio backport branch-3.4.0-rc01

Contributor

mergify bot commented Jan 3, 2025

backport branch-3.4.0-rc01

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Jan 3, 2025