Update samples/python/multinomial_causal_lm/README.md
Co-authored-by: Zlobin Vladimir <[email protected]>
pavel-esir and Wovchena authored Aug 5, 2024
1 parent c789717 commit 1ae9083
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion samples/python/multinomial_causal_lm/README.md
@@ -26,7 +26,7 @@ See https://github.com/openvinotoolkit/openvino.genai/blob/master/src/README.md#

## Streaming

- This Python example demonstrates custom detokenization with bufferization. The streamer receives integer tokens corresponding to each word or subword, one by one. If tokens are decoded individually, because of detokenize(tokenize(" a")) == "a" the resulting text will miss necessary spaces.
+ This Python example demonstrates custom detokenization with bufferization. The streamer receives integer tokens corresponding to each word or subword, one by one. If tokens are decoded individually, the resulting text misses necessary spaces because of detokenize(tokenize(" a")) == "a".

To address this, the detokenizer needs a larger context. We accumulate tokens in a tokens_cache buffer and decode multiple tokens together, adding the text to the streaming queue only when a complete decoded chunk is ready. We run a separate thread to print all new elements arriving in this queue from the generation pipeline. Each generated chunk of text is put into a synchronized queue, ensuring that all put and get operations are thread-safe and blocked until they can proceed.
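
The buffering idea described above can be sketched roughly as follows. This is a minimal illustration and not the sample's actual code: `VOCAB`, `toy_decode`, `BufferedStreamer`, and `stream_tokens` are hypothetical names, and the toy decoder only imitates a detokenizer that drops the leading space of whatever sequence it is given. Treating a trailing U+FFFD replacement character as the sign of an incomplete chunk is also an assumption.

```python
import queue
import threading

# Hypothetical toy vocabulary; a real tokenizer would supply the decoding.
VOCAB = {0: " Hello", 1: " wor", 2: "ld", 3: "!"}


def toy_decode(tokens):
    """Imitate a detokenizer that strips the leading space of the sequence."""
    return "".join(VOCAB[t] for t in tokens).lstrip(" ")


class BufferedStreamer:
    """Accumulate tokens and enqueue only the newly decoded text."""

    def __init__(self, decode):
        self.decode = decode
        self.tokens_cache = []           # all tokens seen so far
        self.print_len = 0               # characters already enqueued
        self.text_queue = queue.Queue()  # synchronized, blocking queue

    def put(self, token):
        self.tokens_cache.append(token)
        text = self.decode(self.tokens_cache)
        # Assumption: a trailing replacement character means the last
        # character is still incomplete, so wait for more tokens.
        if text.endswith("\ufffd"):
            return
        chunk = text[self.print_len:]
        self.print_len = len(text)
        if chunk:
            self.text_queue.put(chunk)

    def end(self):
        # None is used here as an end-of-stream marker.
        self.text_queue.put(None)


def stream_tokens(tokens, decode=toy_decode):
    """Feed tokens to the streamer while a separate thread drains the queue."""
    streamer = BufferedStreamer(decode)
    received = []

    def consumer():
        while True:
            chunk = streamer.text_queue.get()  # blocks until a chunk arrives
            if chunk is None:
                break
            received.append(chunk)

    worker = threading.Thread(target=consumer)
    worker.start()
    for token in tokens:
        streamer.put(token)
    streamer.end()
    worker.join()
    return received


if __name__ == "__main__":
    tokens = [0, 1, 2, 3]
    print("".join(toy_decode([t]) for t in tokens))  # per-token decoding
    print("".join(stream_tokens(tokens)))            # buffered decoding
```

In this sketch, decoding each token on its own loses the space before "wor", while decoding the growing buffer and emitting only the new suffix keeps it. The consumer thread collects chunks into a list so the result is easy to inspect; a real printer thread would write each chunk to standard output instead.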
