
GGUF #3

Open
maxim-saplin opened this issue Jul 26, 2024 · 8 comments

Comments


maxim-saplin commented Jul 26, 2024

Any chance 2bit models can be used with llama.cpp? Would be great to get LLama 3.1 (8B and 70B) converted to GGUF to try them out locally.

Thanks for the great research work!

ChenMnZ (Collaborator) commented Jul 27, 2024

Hi, thanks for your interest in our work.

Unfortunately, llama.cpp does not currently support the GPTQ quantization format (see ggerganov/llama.cpp#4165 for details).

Therefore, converting our 2-bit models to GGUF is not straightforward.


kaleid-liner commented Aug 11, 2024

> Any chance 2bit models can be used with llama.cpp? Would be great to get LLama 3.1 (8B and 70B) converted to GGUF to try them out locally.
>
> Thanks for the great research work!

T-MAC has supported GPTQ format through llama.cpp GGUF integrated with its own highly optimized kernels, and already tested with Llama-3-8b-instruct-w4-g128/Llama-3-8b-instruct-w2-g128 from EfficientQAT. You can try it.

ChenMnZ (Collaborator) commented Aug 11, 2024

Thanks for the reminder, I will give it a try.

ChenMnZ (Collaborator) commented Aug 11, 2024

@kaleid-liner Does T-MAC support w2g64? I have uploaded a w2g64 Mistral-Large-Instruct to Hugging Face, which is popular on Reddit.

I think it would be interesting if T-MAC also supported w2g64.

@kaleid-liner

Sure. T-MAC supports any group size by setting `--group_size`. But I'm not sure if the convert script supports Mistral. I need to test it.
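For illustration, a conversion invocation might look something like the sketch below. Only the `--group_size` flag is confirmed in this thread; the script name, paths, and remaining flags are placeholders, not T-MAC's actual interface — check the T-MAC repository for the real entry point.

```shell
# Hypothetical sketch: converting an EfficientQAT w2g64 checkpoint.
# Only --group_size is confirmed above; the script name and other
# arguments are illustrative placeholders.
python convert.py \
    --model-dir ./Mistral-Large-Instruct-w2g64 \
    --group_size 64 \
    --outfile ./mistral-large-w2g64.gguf
```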

@brownplayer

> Sure. T-MAC supports any group size by setting `--group_size`. But I'm not sure if the convert script supports Mistral. I need to test it.

Hi, how is the test going? Does it support Mistral?

@kaleid-liner

@ChenMnZ @brownplayer Sure. It supports Mistral.


brownplayer commented Aug 20, 2024 via email
