Issues: microsoft/VPTQ
- Question about fine-tuning (#136) · labels: algorithm, enhancement, question · opened Dec 3, 2024 by kimwin2
- How to dequantize a model with 4 groups and centroids greater than 4096? (#128) · labels: bug, question · opened Nov 29, 2024 by ShawnzzWu
- How to Generate a 2-bit Quantized Meta-Llama-3.1-8B-Instruct Model? (#126) · labels: question · opened Nov 21, 2024 by ForAxel
- Enhance the implementation of the CUDA inference kernel (#118) · labels: features · opened Nov 13, 2024 by haruhi55
- Custom Model support (#113) · labels: new models, question · opened Nov 7, 2024 by huangtingwei9988
- Add Ollama/llama.cpp/ggml support (#77) · labels: inference · opened Oct 20, 2024 by YangWang92
- Add VLM/Multimodality support (#76) · labels: algorithm, new models · opened Oct 20, 2024 by YangWang92
- Does not work in Oobabooga (#53) · labels: inference, question · opened Oct 8, 2024 by Kaszebe
- stdout captures and injects UserWarnings into TextStreamer (#42) · labels: question · opened Oct 5, 2024 by JoeHelbing