Decompress Example #312

Closed
EricFecteau opened this issue Aug 23, 2022 · 8 comments

EricFecteau commented Aug 23, 2022

Receiving raw zlib stream data from Python, compressed with the "zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS)" options, is not rare, but the examples folder has no example of how to use the "Decompress" object and the documentation does not mention "Decompress::new_with_window_bits". As far as I understand, there is no other way to inflate the example below with this library, since the second "message" depends on the first.

Could documentation for "new_with_window_bits" be added, and perhaps an example similar to the one below?

Python Code for Generating example:

import zlib

compressobj = zlib.compressobj(
    zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS
)

message = b'{"msgs":[{"msg": "ping"}]}'
compressed = compressobj.compress(message)
compressed += compressobj.flush(zlib.Z_SYNC_FLUSH)
compressed = compressed[:-4]
print([c for c in compressed])

message = b'{"msgs":[{"msg": "lobby_clear"},{"msg": "lobby_complete"}]}'
compressed = compressobj.compress(message)
compressed += compressobj.flush(zlib.Z_SYNC_FLUSH)
compressed = compressed[:-4]
print([c for c in compressed])
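
To see why the second message cannot be inflated on its own, the sender and a receiver can both be modelled in Python. The sketch below is only an illustration: the names `TAIL`, `make_frames`, `inflate_all` and `fresh_decoder_fails` are hypothetical, and it assumes the receiver re-appends the four bytes `00 00 ff ff` that `compressed[:-4]` removes (the same convention as WebSocket permessage-deflate, RFC 7692).

```python
import zlib

# Every Z_SYNC_FLUSH ends with these four bytes (an empty stored block).
TAIL = b"\x00\x00\xff\xff"

MESSAGES = [
    b'{"msgs":[{"msg": "ping"}]}',
    b'{"msgs":[{"msg": "lobby_clear"},{"msg": "lobby_complete"}]}',
]


def make_frames(messages):
    """Compress messages with ONE shared raw-deflate compressor, like the sender."""
    comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS)
    frames = []
    for message in messages:
        raw = comp.compress(message) + comp.flush(zlib.Z_SYNC_FLUSH)
        frames.append(raw[:-4])  # strip the sync-flush tail, as in the code above
    return frames


def inflate_all(frames):
    """Inflate frames in order with ONE shared decompressor, which keeps the window."""
    decomp = zlib.decompressobj(-zlib.MAX_WBITS)
    return [decomp.decompress(frame + TAIL) for frame in frames]


def fresh_decoder_fails(frame, expected):
    """Return True if a brand-new decompressor cannot recover `expected`."""
    try:
        return zlib.decompressobj(-zlib.MAX_WBITS).decompress(frame + TAIL) != expected
    except zlib.error:
        return True


if __name__ == "__main__":
    frames = make_frames(MESSAGES)
    print("shared decoder round-trips:", inflate_all(frames) == MESSAGES)
    print("fresh decoder fails on frame 2:", fresh_decoder_fails(frames[1], MESSAGES[1]))
```

The second frame contains back-references into the text of the first, so only a decompressor that has already seen the first frame can resolve them. The Rust side needs the same property: one long-lived decompression object that is fed every frame in order.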

Rust Code for inflating the above:

use flate2::{Decompress, FlushDecompress};
use std::str;

fn main() {
    // Python ZLIB compressed with options: zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS

    // b'{"msgs":[{"msg": "ping"}]}'
    let msg_vec1 = vec![
        170, 86, 202, 45, 78, 47, 86, 178, 138, 174, 6, 49, 148, 172, 20, 148, 10, 50, 243, 210,
        149, 106, 99, 107, 1, 0, 0, 0, 255, 255,
    ];

    // b'{"msgs":[{"msg": "lobby_clear"},{"msg": "lobby_complete"}]}'
    let msg_vec2 = vec![
        170, 198, 144, 201, 201, 79, 74, 170, 140, 79, 206, 73, 77, 44, 82, 170, 213, 65, 23, 206,
        207, 45, 200, 73, 45, 73, 5, 105, 5, 0,
    ];

    let wbits = 15; // Window bits (passed as -15 inside flate2 because zlib_header = false)
    let bufsize = 32 * 1024;

    let mut decompressor = Decompress::new_with_window_bits(false, wbits);
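    // decompress_vec only writes into the vector's spare capacity and never grows it,
    // so with_capacity is mandatory; otherwise decompression fails with "invalid distance too far back".
    let mut decoded_bytes = Vec::with_capacity(bufsize);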

    decompressor
        .decompress_vec(&msg_vec1[..], &mut decoded_bytes, FlushDecompress::Finish)
        .expect("Failed to decompress");

    println!("{:?}", str::from_utf8(&decoded_bytes).expect("Bad UTF8"));

    let mut decoded_bytes = Vec::with_capacity(bufsize);
    decompressor
        .decompress_vec(&msg_vec2[..], &mut decoded_bytes, FlushDecompress::Finish)
        .expect("Failed to decompress");

    println!("{:?}", str::from_utf8(&decoded_bytes).expect("Bad UTF8"));
}

Cargo.toml must include the following:

flate2 = { version = "1", features = ["zlib-ng"], default-features = false }
@PierreV23
Contributor

PierreV23 commented Aug 1, 2023

Before PR #361 you had to write a decompressor manually if you wanted to use a custom Decompress object (which is needed to specify the header or window_bits values). A custom decompressor would look roughly like this: https://github.com/bend-n/mindus/blob/master/src/data/mod.rs#L190 (line 190 should point to a function called deflate).

After PR #361, decompressing (using read::ZlibDecoder) is as simple as:

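use flate2::{read::ZlibDecoder, Decompress};
use std::io::Read; // brings read_to_string into scope

// `compressed` must implement Read, for example a byte slice (&[u8]).
let mut decompresser = ZlibDecoder::new_with_decompress(
    compressed,
    Decompress::new_with_window_bits(false, 15),
);
let mut decompressed = String::new();
decompresser.read_to_string(&mut decompressed)?;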

Even though this issue is fairly old by now, I hope this helps you or others who come across it.

Byron added the question label Aug 2, 2023
@Byron
Member

Byron commented Aug 2, 2023

@EricFecteau Would this issue be fixed once a new release makes code like the example above work? It seems so to me, but I might be missing something. Thank you.

@EricFecteau
Author

The solution above works for the first msg_vec1 from the first message, but how do I add in msg_vec2? In the example in the first message, the decompressor is created separately and then decompress_vec can be called multiple times with new messages. msg_vec2 depends on msg_vec1, and therefore I can't simply call msg_vec2 the same way. How would I do this using ZlibDecoder::new_with_decompress?

@PierreV23
Contributor

> The solution above works for the first msg_vec1 from the first message, but how do I add in msg_vec2? In the example in the first message, the decompressor is created separately and then decompress_vec can be called multiple times with new messages. msg_vec2 depends on msg_vec1, and therefore I can't simply call msg_vec2 the same way. How would I do this using ZlibDecoder::new_with_decompress?

https://docs.rs/flate2/latest/flate2/read/struct.ZlibDecoder.html#method.reset
You can use ZlibDecoder::reset to reset the decoder and supply it with a new input stream.

@EricFecteau
Author

reset completely resets the decoder and gives me a "corrupt deflate stream" error, as I would expect, since msg_vec2 depends on msg_vec1. I might be missing something obvious, but in the original post I can feed multiple vectors to the decompression object one after the other (see msg_vec1 and msg_vec2):

    decompressor
        .decompress_vec(&msg_vec1[..], &mut decoded_bytes, FlushDecompress::Finish)
        .expect("Failed to decompress");

    println!("{:?}", str::from_utf8(&decoded_bytes).expect("Bad UTF8"));

    let mut decoded_bytes = Vec::with_capacity(bufsize);
    decompressor
        .decompress_vec(&msg_vec2[..], &mut decoded_bytes, FlushDecompress::Finish)
        .expect("Failed to decompress");

    println!("{:?}", str::from_utf8(&decoded_bytes).expect("Bad UTF8"));

With the code you provided, how do I provide a second compressed vector to the decompressor?

@PierreV23
Contributor

Honestly, I don't really know either. I had assumed reset would work, but I didn't realise your second vec was dependent on the first, which I find quite odd, by the way.

Here is something that would work, but is likely not ideal:

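use flate2::{read::ZlibDecoder, Decompress};
use std::io::Read; // brings read_to_string into scope

fn main() {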
    // Python ZLIB compressed with options: zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS

    // b'{"msgs":[{"msg": "ping"}]}'
    let msg_vec1 = vec![
        170, 86, 202, 45, 78, 47, 86, 178, 138, 174, 6, 49, 148, 172, 20, 148, 10, 50, 243, 210,
        149, 106, 99, 107, 1, 0, 0, 0, 255, 255,
    ];

    // b'{"msgs":[{"msg": "lobby_clear"},{"msg": "lobby_complete"}]}'
    let msg_vec2 = vec![
        170, 198, 144, 201, 201, 79, 74, 170, 140, 79, 206, 73, 77, 44, 82, 170, 213, 65, 23, 206,
        207, 45, 200, 73, 45, 73, 5, 105, 5, 0,
    ];

    let mut msg_vec3 = msg_vec1.clone();
    msg_vec3.append(&mut msg_vec2.clone());

    let mut decompresser = ZlibDecoder::new_with_decompress(
        &msg_vec1[..],
        Decompress::new_with_window_bits(false, 15),
    );

    let mut msg1 = String::new();
    decompresser.read_to_string(&mut msg1).unwrap();
    println!("{}", msg1);

    decompresser = ZlibDecoder::new_with_decompress(
        &msg_vec3[..],
        Decompress::new_with_window_bits(false, 15),
    );

    let mut msg2 = String::new();
    decompresser.read_to_string(&mut msg2).unwrap();

    msg2 = msg2[msg1.len()..].to_string();

    println!("{}", msg2);
}
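
For comparison, the replay idea above (inflate everything received so far, then slice off what was already decoded) can be modelled in Python next to a decoder that keeps its state between frames. This is a sketch only: `make_frames`, `replay_latest` and `StreamingInflater` are hypothetical names, and the frames here keep their `00 00 ff ff` sync-flush tail so that concatenating them yields a valid stream.

```python
import zlib


def make_frames(messages):
    """Compress all messages with one shared raw-deflate compressor."""
    comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS)
    return [comp.compress(m) + comp.flush(zlib.Z_SYNC_FLUSH) for m in messages]


def replay_latest(frames):
    """Workaround: re-inflate every frame from the start and return only the newest message."""
    everything = zlib.decompressobj(-zlib.MAX_WBITS).decompress(b"".join(frames))
    earlier = zlib.decompressobj(-zlib.MAX_WBITS).decompress(b"".join(frames[:-1]))
    return everything[len(earlier):]


class StreamingInflater:
    """Keeps one decompressor alive so that each frame is inflated exactly once."""

    def __init__(self):
        self._decomp = zlib.decompressobj(-zlib.MAX_WBITS)

    def feed(self, frame):
        # The decompressor remembers its sliding window between calls.
        return self._decomp.decompress(frame)


if __name__ == "__main__":
    messages = [
        b'{"msgs":[{"msg": "ping"}]}',
        b'{"msgs":[{"msg": "lobby_clear"},{"msg": "lobby_complete"}]}',
    ]
    frames = make_frames(messages)
    inflater = StreamingInflater()
    for count in range(1, len(frames) + 1):
        print(inflater.feed(frames[count - 1]) == replay_latest(frames[:count]))
```

Both variants return the same message, but the replay variant has to keep every earlier frame and inflates the whole history again for each new message, so its cost grows with the length of the session.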

@PierreV23
Contributor

PierreV23 commented Sep 13, 2023

Also, I recommend using Python's zlib.compress: its return value can be passed as a single parameter to ZlibDecoder::new_with_decompress(...) instead of spreading your object(s) over two strings.

I took another look at your Python code: you are supposed to create a new compressobj for each message if you want the messages to be readable separately (unless there are methods I am not aware of).
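
On the sender side, zlib does offer one way to make each message readable on its own without creating a new compressor: flushing with `Z_FULL_FLUSH` instead of `Z_SYNC_FLUSH`, which resets the compressor's history at every flush point. The Python sketch below illustrates the difference. It only helps when the sender can be changed, and `frames_with_flush` and `decodes_alone` are hypothetical helper names.

```python
import zlib

MESSAGES = [
    b'{"msgs":[{"msg": "ping"}]}',
    b'{"msgs":[{"msg": "lobby_clear"},{"msg": "lobby_complete"}]}',
]


def frames_with_flush(messages, flush_mode):
    """Compress messages with one shared raw-deflate compressor and the given flush mode."""
    comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS)
    return [comp.compress(m) + comp.flush(flush_mode) for m in messages]


def decodes_alone(frame, expected):
    """Return True if a brand-new raw decompressor recovers `expected` from `frame`."""
    try:
        return zlib.decompressobj(-zlib.MAX_WBITS).decompress(frame) == expected
    except zlib.error:
        return False


if __name__ == "__main__":
    for name, mode in (("Z_SYNC_FLUSH", zlib.Z_SYNC_FLUSH), ("Z_FULL_FLUSH", zlib.Z_FULL_FLUSH)):
        frames = frames_with_flush(MESSAGES, mode)
        print(name, [decodes_alone(f, m) for f, m in zip(frames, MESSAGES)])
```

The trade-off is compression ratio: after a full flush the compressor can no longer refer back to earlier messages, so repeated text such as `{"msgs":[{"msg": "` is encoded again in every frame.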

@EricFecteau
Author

EricFecteau commented Sep 13, 2023

Thanks, but this would not work either. My Python example comes from a program I don't have access to (so I can't modify it), and it sends the data to me through a websocket (so I can't simply append it all together, as I don't know what msg_vec3 will be until I respond to the websocket based on the info in msg_vec1 and msg_vec2). All the messages I receive depend on the previous ones, even if they have not yet been created when the previous ones are decoded.

Looking around the other issues, I suspect I have the same problem as #276 -- thankfully my first post does solve this, even if it's a bit clunkier than it could be, so I will close this issue!
