Releases · mwlon/pcodec

02 Dec 20:33

mwlon

v0.1.1

4ca9841

Improved standalone decompression speed ~5% by storing a size hint for the count of numbers in the whole file.
Thanks to the size hint above, reduced the default chunk size at no performance cost, improving compression ratio.
Improved compression speed ~15% with optimized writer logic.
Substantially increased compression and decompression speed in special cases when steps can be skipped.
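
A minimal sketch of why a whole-file count hint can help decompression, assuming the decoder appends chunk after chunk into one output buffer. This is not the pco standalone format or API: `ToyFile`, `toy_compress`, and `toy_decompress` are hypothetical names, and the "chunks" here are stored uncompressed purely for illustration.

```rust
/// Toy stand-in for a standalone file: a total-count hint plus chunks.
/// Hypothetical structure, not the real pco file layout.
struct ToyFile {
    n_hint: usize,
    chunks: Vec<Vec<i64>>,
}

/// Splits `nums` into chunks of `chunk_size` and records the total count.
fn toy_compress(nums: &[i64], chunk_size: usize) -> ToyFile {
    assert!(chunk_size > 0, "chunk_size must be positive");
    ToyFile {
        n_hint: nums.len(),
        chunks: nums.chunks(chunk_size).map(|c| c.to_vec()).collect(),
    }
}

/// Decodes every chunk into one buffer that is reserved once up front,
/// so many small chunks do not trigger repeated reallocation.
fn toy_decompress(file: &ToyFile) -> Vec<i64> {
    let mut out = Vec::with_capacity(file.n_hint);
    for chunk in &file.chunks {
        out.extend_from_slice(chunk);
    }
    out
}
```

With the buffer reserved once, the number of chunks no longer affects how often the output grows, which is consistent with smaller default chunks costing nothing in speed.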

29 Nov 02:49

mwlon

v0.1.0

2b823d2

Breaking changes
- format: replaced GCD mode with int mult mode. This simplifies the format (int mult is very similar to float mult mode) and is more robust in the ways we care about. However, GCD-encoded data from v0.0.0 is no longer decompressible. This could have been a backward-compatible change, but since v0.0.0 has relatively few downloads and GCD data is rare, I decided it was better to break compatibility than to keep dead code around forever. Int mult gets an 11% better compression ratio on the total_cents bench dataset than GCD did.
- API: Removed GCD-related metadata such as Bin::gcd and replaced configurations with int mult equivalents.
- API: Renamed Progress.finished_page to Progress.finished since it sometimes refers to different units.
Improved decompression speed with SIMD offset reads.
Added standalone::simple_decompress_into.
Fixed a rare bug in compression that caused it to become lossy on nearly-linear sequences of floats with floating point errors.
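
To illustrate the GCD to int mult change, here is a rough sketch of the general idea of splitting each integer into a multiple of a base plus a small adjustment. It is an assumption about the concept, not pco's actual implementation: the function names and the base of 100 are hypothetical. The point is that a single value that is not a multiple of the base collapses the GCD to 1, while the split still round-trips exactly.

```rust
/// Euclid's algorithm on unsigned values.
fn gcd(a: u64, b: u64) -> u64 {
    let (mut a, mut b) = (a, b);
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a
}

/// GCD of the absolute values of all numbers (0 for an empty slice).
fn gcd_of_all(nums: &[i64]) -> u64 {
    nums.iter().fold(0, |g, &x| gcd(g, x.unsigned_abs()))
}

/// Splits x into (mult, adj) such that x == mult * base + adj and
/// 0 <= adj < base. Hypothetical helper; requires base > 0.
fn split_int_mult(x: i64, base: i64) -> (i64, i64) {
    assert!(base > 0, "base must be positive");
    (x.div_euclid(base), x.rem_euclid(base))
}

/// Inverse of `split_int_mult`. Wrapping arithmetic keeps the round trip
/// exact even when mult * base would overflow near i64::MIN.
fn join_int_mult(mult: i64, adj: i64, base: i64) -> i64 {
    mult.wrapping_mul(base).wrapping_add(adj)
}
```

For cent amounts such as 1200, 3500, 799, 100, the GCD is 1 because of the single 799, yet splitting with base 100 still yields small adjustments (mostly 0) alongside the multipliers.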

15 Nov 13:01

mwlon

v0.0.0

1bb464f

Improved decompression performance ~70% on aarch64, ~10% on x86_64.
Supported consuming any BetterBufRead implementation during decompression, rather than only &[u8].
Changed the API for wrapped::PageDecompressor and standalone::ChunkDecompressor to own src, since these parts of the file need to be read in order and contiguously.
Updated docs, including real-world benchmarks on air quality, taxi, and r/place datasets.
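
The ownership change can be pictured with a small sketch in which a decompressor takes its source by value and hands it back when done. Everything here is hypothetical: `ToyPageDecompressor` is not the pco type, `std::io::Read` stands in for BetterBufRead, and the "page" is just raw little-endian u32 values.

```rust
use std::io::{self, Read};

/// Toy decompressor that owns its source, so reads are forced to be
/// in order and contiguous for as long as the decompressor lives.
struct ToyPageDecompressor<R: Read> {
    src: R,
    n_remaining: usize,
}

impl<R: Read> ToyPageDecompressor<R> {
    /// Takes ownership of `src`; `n` is the count of numbers in the page.
    fn new(src: R, n: usize) -> Self {
        Self { src, n_remaining: n }
    }

    /// Fills as much of `dst` as the page still has numbers for and
    /// returns how many numbers were written.
    fn decompress(&mut self, dst: &mut [u32]) -> io::Result<usize> {
        let n = dst.len().min(self.n_remaining);
        for slot in dst[..n].iter_mut() {
            let mut buf = [0u8; 4];
            self.src.read_exact(&mut buf)?;
            *slot = u32::from_le_bytes(buf);
        }
        self.n_remaining -= n;
        Ok(n)
    }

    /// Gives the source back so the caller can continue reading after the page.
    fn into_src(self) -> R {
        self.src
    }
}
```

Because `&[u8]` implements `Read`, a byte slice can be passed as `src`, and whatever follows the page is available again from `into_src`.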

04 Nov 18:54

mwlon

v0.0.0-alpha.3

0729403

With lower-level unit testing, found and fixed 3 serious bugs:

- encoding more than one page per chunk failed; it tried to encode the whole chunk every time
- decoding one batch at a time failed because the code path asserted the reader would be byte aligned
- decoding with most limits through the CLI failed because it created a bad count of numbers for pco
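
As an illustration of the first bug, a sketch of the kind of small helper a lower-level unit test could target: each page should cover only its own slice of the chunk, not the whole chunk. `page_ranges` is a hypothetical function, not part of pco.

```rust
use std::ops::Range;

/// Converts per-page sizes into half-open index ranges into the chunk,
/// so page k encodes only chunk[ranges[k]] rather than the full chunk.
fn page_ranges(page_sizes: &[usize]) -> Vec<Range<usize>> {
    let mut start = 0;
    page_sizes
        .iter()
        .map(|&size| {
            let range = start..start + size;
            start += size;
            range
        })
        .collect()
}
```

A test over such a helper can check that the ranges are disjoint and that together they cover the chunk exactly once.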

29 Oct 03:39

mwlon

v0.0.0-alpha.2

8ecd212

Revamped the API into separate structs for File, Chunk, and Page compressors/decompressors.
Fixed a known bug that caused panics for 32-bit architectures.
Made decompression almost-zero-copy, increasing performance slightly.
Reimplemented standalone as just a minimal wrapped format, with no access to private functionality.

04 Sep 15:43

mwlon

v0.0.0-alpha.1

10a7437

Changed the format to contain tiny batches (256 numbers each) with contiguous 4-way interleaved tANS codes and contiguous offsets. This increased the buffer space needed, but allowed decent CPU utilization during tANS decoding and excellent SIMD utilization during offset decoding, approximately a 30% decompression speedup overall.
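
A sketch of the layout only, not of tANS coding itself: numbers are grouped into batches of 256, and within a batch they are dealt round-robin to 4 independent lanes, one per tANS state. The round-robin assignment (number i uses state i % 4) is an assumption made for illustration, and all names below are hypothetical.

```rust
const BATCH_N: usize = 256; // numbers per batch, as stated in the release note
const N_STATES: usize = 4; // 4-way interleaving

/// Splits the input into batches of up to BATCH_N numbers.
fn split_batches(nums: &[u32]) -> Vec<&[u32]> {
    nums.chunks(BATCH_N).collect()
}

/// Deals the numbers of one batch round-robin across N_STATES lanes.
/// Each lane's codes depend only on that lane's state, so a decoder can
/// advance the 4 states independently instead of in one serial chain.
fn lanes(batch: &[u32]) -> [Vec<u32>; N_STATES] {
    let mut lanes: [Vec<u32>; N_STATES] = Default::default();
    for (i, &x) in batch.iter().enumerate() {
        lanes[i % N_STATES].push(x);
    }
    lanes
}
```

Shortening the dependency chain this way is what leaves room for the CPU to overlap work across lanes, at the cost of holding a full batch in buffers.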

07 Jul 23:47

mwlon

v0.0.0-alpha.0

4b35004

Unveiling the alpha of pco (the library) and pco_cli (the binary) for pcodec, a new format and codec for compressing numerical sequences. Its API is very similar to that of q_compress, but its compression ratios and decompression speeds are much better.
