Code repo for "NULLS! Revisiting Null Representation in Modern Columnar Formats" DaMoN24

XinyuZeng/NULLS

NULLS

This repo contains the code for the experiments in the DaMoN 2024 paper "NULLS! Revisiting Null Representation in Modern Columnar Formats".

Setup

Tested on Debian 11 with an Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz.

sudo apt-get install libpfm4-dev just build-essential cmake python3

Python dependencies are in requirements.txt.

The code was compiled with g++ 10.2.1.

Steps to reproduce the results

Clone the project recursively.

git clone --recursive git@github.com:XinyuZeng/NULLS.git

First build the target for later use:

just build

The cost of nulls:

just motivate
python3 scripts/motivate.py

Compact-to-Placeholder (C->P) conversion methods:

just bench_dense_to_spaced
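
As a rough illustration of what a C->P conversion does (a minimal sketch, not the implementation benchmarked in this repo): the Compact ("Dense") layout stores only the non-null values, while the Placeholder ("Spaced") layout keeps one slot per row. The function name `dense_to_spaced`, the list-of-bools validity representation, and the `fill` value are assumptions made for illustration.

```python
def dense_to_spaced(dense, validity, fill=0):
    """Expand a compact (dense) value list into a placeholder (spaced) one.

    dense    -- non-null values only, in row order
    validity -- one bool per row; True means the row is non-null
    fill     -- hypothetical placeholder written into null slots
    """
    if sum(validity) != len(dense):
        raise ValueError("validity must mark exactly len(dense) rows as non-null")
    spaced = []
    next_dense = 0  # index of the next unread value in `dense`
    for is_valid in validity:
        if is_valid:
            spaced.append(dense[next_dense])
            next_dense += 1
        else:
            spaced.append(fill)
    return spaced


validity = [True, False, True, True, False]
print(dense_to_spaced([10, 20, 30], validity))  # [10, 0, 20, 30, 0]
```

This scalar loop only shows the data movement; the benchmark target compares actual conversion methods.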

Placeholder-filling strategies:

python3 scripts/placeholder_filling_exp.py
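
Which value goes into the null slots of a Placeholder column is a design choice. The sketch below shows two plausible strategies, a constant zero and repeating the previous non-null value. Both the strategy names and the function `fill_placeholders` are assumptions for illustration and may not match what `scripts/placeholder_filling_exp.py` evaluates.

```python
def fill_placeholders(spaced, validity, strategy="zero"):
    """Return a copy of `spaced` with null slots overwritten.

    strategy -- "zero": write 0 into every null slot
                "last": repeat the most recent non-null value
                        (0 if no non-null value has been seen yet)
    """
    if strategy not in ("zero", "last"):
        raise ValueError(f"unknown strategy: {strategy}")
    out = []
    last = 0  # fallback for leading nulls under the "last" strategy
    for value, is_valid in zip(spaced, validity):
        if is_valid:
            out.append(value)
            last = value
        elif strategy == "zero":
            out.append(0)
        else:
            out.append(last)
    return out


column = [7, None, None, 9, None]
validity = [True, False, False, True, False]
print(fill_placeholders(column, validity, "zero"))  # [7, 0, 0, 9, 0]
print(fill_placeholders(column, validity, "last"))  # [7, 7, 7, 9, 9]
```

Repeating the previous value produces runs of equal values, which may suit run-length or delta-style encodings better than a constant; the experiment script is the place to measure the actual effect.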

Compact vs. Placeholder w/o AVX512 encodings:

python3 scripts/compact_vs_placeholder.py

Compact vs. Placeholder on FLS w/ AVX512:

python3 scripts/fls.py

Compact vs. Placeholder on a vector primitive:

just bench_full
just bench_sv

C->P conversion miniblock size optimization:

just bench_miniblock_size

Limitations

  • Codecs involving FLS currently support only a block size of 1024.
  • Identifiers with a "Dense" prefix or suffix correspond to "Compact" in the paper, and those with "Spaced" correspond to "Placeholder".

Cite

@inproceedings{DBLP:conf/damon/ZengMPMZ24,
  author       = {Xinyu Zeng and
                  Ruijun Meng and
                  Andrew Pavlo and
                  Wes McKinney and
                  Huanchen Zhang},
  editor       = {Carsten Binnig and
                  Nesime Tatbul},
  title        = {NULLS!: Revisiting Null Representation in Modern Columnar Formats},
  booktitle    = {Proceedings of the 20th International Workshop on Data Management
                  on New Hardware, DaMoN 2024, Santiago, Chile, 10 June 2024},
  pages        = {10:1--10:10},
  publisher    = {{ACM}},
  year         = {2024},
  url          = {https://doi.org/10.1145/3662010.3663452},
  doi          = {10.1145/3662010.3663452},
  timestamp    = {Fri, 21 Jun 2024 18:43:53 +0200},
  biburl       = {https://dblp.org/rec/conf/damon/ZengMPMZ24.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
