Commit baa3b22

[docs] add a comment that offloading requires CUDA GPU (huggingface#35055)

* add comment to offloading

* Update docs/source/en/kv_cache.md

Co-authored-by: Steven Liu <[email protected]>

faaany and stevhliu authored Dec 4, 2024
1 parent 1da1e0d commit baa3b22
1 changed file: docs/source/en/kv_cache.md (2 additions, 1 deletion)
@@ -180,7 +180,7 @@ Fun fact: The shortest war in history was between Britain and Zanzibar on August

 <Tip warning={true}>

-Cache offloading requires a GPU and can be slower than dynamic KV cache. Use it if you are getting CUDA out of memory errors.
+Cache offloading requires a CUDA GPU and can be slower than dynamic KV cache. Use it if you are getting CUDA out of memory errors.

 </Tip>

@@ -261,6 +261,7 @@ This will use the [`~OffloadedStaticCache`] implementation instead.
 >>> tokenizer.batch_decode(out, skip_special_tokens=True)[0]
 "Hello, my name is [Your Name], and I am a [Your Profession] with [Number of Years] of"
 ```
+Cache offloading requires a CUDA GPU.


### Sliding Window Cache
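The behavior the added note documents, KV blocks parked in CPU memory and brought back to the GPU one layer at a time, can be sketched with a toy cache. This is only a minimal illustration of the offload idea, not the transformers `OffloadedStaticCache` implementation; all names below are invented, and plain Python objects stand in for device and host tensors.

```python
# Toy illustration of KV-cache offloading (NOT the transformers
# implementation): only one layer's KV block is "resident" at a time,
# the rest live in host (CPU) memory, trading copy time for GPU memory.
class ToyOffloadedCache:
    def __init__(self, num_layers):
        self.host = [None] * num_layers  # stand-in for CPU-side storage
        self.on_device = None            # index of the resident layer, if any

    def fetch(self, layer):
        """Make one layer's KV block resident, evicting the previous one."""
        if self.on_device is not None and self.on_device != layer:
            pass  # a real cache would copy the evicted block device -> CPU here
        self.on_device = layer
        return self.host[layer]

    def update(self, layer, kv_block):
        """Store a layer's new KV block; it becomes the resident layer."""
        self.host[layer] = kv_block      # a real cache would copy CPU <- device
        self.on_device = layer

cache = ToyOffloadedCache(num_layers=2)
cache.update(0, "kv-layer-0")
cache.update(1, "kv-layer-1")
print(cache.fetch(0))    # prints "kv-layer-0"; layer 0 becomes resident
print(cache.on_device)   # prints 0
```

Because only one block is resident at a time, peak device memory stays roughly one layer's worth of KV regardless of model depth, which is why the real cache needs a CUDA GPU (for async host/device copies) yet can still be slower than a fully on-device dynamic cache.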
