The current implementation of VectorBinding supports compression/decompression via GZIP and BZIP2. However, it might be useful to support faster compression methods as well, e.g. LZO, LZ4 or Snappy, since I/O and decompression are among the bottlenecks in similarity-intensive applications. Some benchmarking results can be found here: http://catchchallenger.first-world.info/wiki/Quick_Benchmark:_Gzip_vs_Bzip2_vs_LZMA_vs_XZ_vs_LZ4_vs_LZO. At least LZO and Snappy provide Input/OutputStream objects, and their Java implementations are on Maven Central, so they should be relatively easy to integrate. However, some knowledge of those libraries is required to get a good trade-off between compression ratio and speed.
I think that Apache Commons Compress supports most (all?) of the methods you are suggesting and using it here should make it pretty trivial to switch between different algorithms.
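To make the "switch between algorithms" idea concrete, here is a minimal sketch of a pluggable stream codec. It uses only JDK classes, so the only real codecs shown are GZIP and DEFLATE. The `StreamCodec` enum and its method names are hypothetical and not part of the existing VectorBinding code. An LZ4 or Snappy entry would presumably wrap the corresponding library's stream classes in the same way, or the enum could delegate to a name-based lookup such as Commons Compress's `CompressorStreamFactory`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

/**
 * Hypothetical codec abstraction (illustration only, not the project's API).
 * Each constant knows how to wrap a raw stream in its (de)compressing stream.
 */
enum StreamCodec {
    NONE {
        @Override InputStream wrap(InputStream in) { return in; }
        @Override OutputStream wrap(OutputStream out) { return out; }
    },
    GZIP {
        @Override InputStream wrap(InputStream in) throws IOException {
            return new GZIPInputStream(in);
        }
        @Override OutputStream wrap(OutputStream out) throws IOException {
            return new GZIPOutputStream(out);
        }
    },
    DEFLATE {
        @Override InputStream wrap(InputStream in) { return new InflaterInputStream(in); }
        @Override OutputStream wrap(OutputStream out) { return new DeflaterOutputStream(out); }
    };
    // Assumption: an LZ4 or SNAPPY constant would wrap a third-party
    // library's Input/OutputStream classes here in the same way.

    abstract InputStream wrap(InputStream in) throws IOException;
    abstract OutputStream wrap(OutputStream out) throws IOException;

    /** Compresses a byte array with this codec. */
    byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (OutputStream out = wrap(buf)) {
            out.write(data);
        } // closing the wrapper flushes any trailer (e.g. the GZIP footer)
        return buf.toByteArray();
    }

    /** Decompresses a byte array produced by compress() of the same codec. */
    byte[] decompress(byte[] data) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (InputStream in = wrap(new ByteArrayInputStream(data))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
        }
        return buf.toByteArray();
    }
}
```

With something like this, the binding could take the codec as a parameter (for example `StreamCodec.valueOf("GZIP")` read from configuration), so that adding a faster algorithm would not require touching the serialization code itself.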