pqarrow: centralize ReadValues when writing a page (#917)
Currently, when converting a parquet page to an arrow record, every writer repeats the same slow path: allocate a parquet.Values slice, read all values into it, and write them to the underlying builder. However, this code already exists one level above, where it is more efficient because it reuses a parquet.Values slice.

This commit removes the repetition from the writers and leaves only the concrete implementation of writing existing values to an arrow builder. Callers can also check whether the ValueWriter implements the PageWriter interface, which can offer a fast path for writing a parquet page directly.

The improvement is especially noticeable in Query/Values, since the slow path would previously fall back to writing all the page values rather than just the dictionary values.

```
                 │  benchmain   │               benchpw               │
                 │    sec/op    │    sec/op     vs base               │
Query/Types-12     109.9m ± 1%    109.7m ± 2%        ~ (p=0.353 n=10)
Query/Labels-12    219.5µ ± 1%    214.1µ ± 2%   -2.46% (p=0.011 n=10)
Query/Values-12   7716.3µ ± 3%    207.7µ ± 4%  -97.31% (p=0.000 n=10)
Query/Merge-12     223.1m ± 2%    220.6m ± 1%   -1.08% (p=0.035 n=10)
Query/Range-12     117.5m ± 1%    116.1m ± 2%        ~ (p=0.218 n=10)
Query/Filter-12    9.888m ± 3%   10.025m ± 4%        ~ (p=0.684 n=10)
geomean            19.08m         10.38m       -45.58%

                 │   benchmain   │               benchpw                │
                 │     B/op      │     B/op      vs base                │
Query/Types-12     254.3Mi ± 1%    252.1Mi ± 3%        ~ (p=0.353 n=10)
Query/Labels-12    400.6Ki ± 0%    400.7Ki ± 0%        ~ (p=0.796 n=10)
Query/Values-12  12644.7Ki ± 0%    853.5Ki ± 0%  -93.25% (p=0.000 n=10)
Query/Merge-12     574.7Mi ± 1%    576.6Mi ± 1%        ~ (p=0.247 n=10)
Query/Range-12     212.0Mi ± 0%    212.0Mi ± 0%        ~ (p=0.190 n=10)
Query/Filter-12    13.52Mi ± 0%    13.52Mi ± 0%        ~ (p=0.739 n=10)
geomean            35.56Mi         22.67Mi       -36.25%

                 │  benchmain  │              benchpw               │
                 │  allocs/op  │  allocs/op   vs base               │
Query/Types-12     64.32k ± 6%   64.30k ± 4%        ~ (p=0.631 n=10)
Query/Labels-12    1.802k ± 0%   1.802k ± 0%        ~ (p=0.840 n=10)
Query/Values-12    3.677k ± 0%   2.192k ± 0%  -40.37% (p=0.000 n=10)
Query/Merge-12     1.435M ± 0%   1.435M ± 0%        ~ (p=0.424 n=10)
Query/Range-12     174.3k ± 0%   174.2k ± 0%   -0.00% (p=0.044 n=10)
Query/Filter-12    4.255k ± 0%   4.255k ± 0%        ~ (p=0.487 n=10)
geomean            27.72k        25.43k        -8.25%

                 │  benchmain   │              benchpw              │
                 │    B/msec    │    B/msec     vs base             │
Query/Filter-12    3.238Mi ± 3%   3.194Mi ± 4%       ~ (p=0.724 n=10)
```
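The dispatch described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: `Value`, `Page`, `writePage`, `sliceWriter`, and `pageAwareWriter` are hypothetical stand-ins, and the method sets given here for `ValueWriter` and `PageWriter` are assumptions rather than the repository's actual interface definitions. The caller first checks for the optional page-level fast path and otherwise reads the values once into a buffer that it reuses across pages.

```go
package main

import "fmt"

// Value and Page are hypothetical stand-ins for the parquet types.
type Value int64

type Page struct{ values []Value }

// ValueWriter writes already-read values to an underlying builder.
// (Assumed shape; the real interface may differ.)
type ValueWriter interface {
	Write(values []Value)
}

// PageWriter is an optional fast path that consumes a page directly.
// (Assumed shape; the real interface may differ.)
type PageWriter interface {
	WritePage(p Page) error
}

// writePage centralizes the read: writers never allocate or read values
// themselves. The (possibly grown) buffer is returned so the caller can
// reuse it for the next page.
func writePage(w ValueWriter, p Page, buf []Value) ([]Value, error) {
	// Fast path: the writer knows how to handle a whole page.
	if pw, ok := w.(PageWriter); ok {
		return buf, pw.WritePage(p)
	}
	// Slow path: read all values into the shared buffer, then write them.
	n := len(p.values)
	if cap(buf) < n {
		buf = make([]Value, n)
	}
	buf = buf[:n]
	copy(buf, p.values)
	w.Write(buf)
	return buf, nil
}

// sliceWriter implements only ValueWriter.
type sliceWriter struct{ out []Value }

func (s *sliceWriter) Write(values []Value) { s.out = append(s.out, values...) }

// pageAwareWriter implements both ValueWriter and PageWriter.
type pageAwareWriter struct {
	sliceWriter
	pagesWritten int
}

func (w *pageAwareWriter) WritePage(p Page) error {
	w.pagesWritten++
	w.out = append(w.out, p.values...)
	return nil
}

func main() {
	page := Page{values: []Value{1, 2, 3}}
	buf := make([]Value, 0, 8) // allocated once, reused for every page

	slow := &sliceWriter{}
	buf, _ = writePage(slow, page, buf)

	fast := &pageAwareWriter{}
	buf, _ = writePage(fast, page, buf)

	fmt.Println(len(slow.out), len(fast.out), fast.pagesWritten, cap(buf))
}
```

Keeping the buffer at the call site is what lets the slow path avoid a fresh allocation per page, while the type assertion keeps the fast path opt-in for writers that can support it.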
Showing 2 changed files with 78 additions and 168 deletions.