GGUF #3
Hi, thanks for your interest in our work. Unfortunately, llama.cpp does not support the GPTQ quantization format at the moment (see ggerganov/llama.cpp#4165 for details). Therefore, converting our 2-bit model into GGUF is not straightforward.
T-MAC supports the GPTQ format through llama.cpp's GGUF, integrated with its own highly optimized kernels, and it has already been tested with Llama-3-8b-instruct-w4-g128 and Llama-3-8b-instruct-w2-g128 from EfficientQAT. You can try it.
Thanks for the reminder, I will give it a try.
@kaleid-liner Does T-MAC support w2g64? I have uploaded a w2g64 Mistral-Large-Instruct model to Hugging Face, which is popular on Reddit. I think it would be interesting if T-MAC also supported w2g64.
Sure. T-MAC supports any group size by setting …
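For readers unfamiliar with the notation: w2g64 means weights quantized to 2 bits with one scale and zero point per group of 64 values. A minimal NumPy sketch of asymmetric group quantization, purely illustrative and not EfficientQAT's exact scheme:

```python
import numpy as np

def quantize_w2(w, group_size=64):
    """Asymmetric 2-bit group quantization: one (scale, zero) pair per group."""
    w = w.reshape(-1, group_size)
    lo = w.min(axis=1, keepdims=True)
    hi = w.max(axis=1, keepdims=True)
    scale = (hi - lo) / 3.0            # 2 bits -> 4 levels: 0..3
    scale[scale == 0] = 1.0            # avoid divide-by-zero on constant groups
    q = np.clip(np.round((w - lo) / scale), 0, 3).astype(np.uint8)
    return q, scale, lo

def dequantize_w2(q, scale, lo):
    """Reconstruct approximate weights from codes, scales, and zero points."""
    return q * scale + lo

rng = np.random.default_rng(0)
w = rng.standard_normal(256).astype(np.float32)
q, scale, lo = quantize_w2(w)
w_hat = dequantize_w2(q, scale, lo).reshape(-1)
# Per-element rounding error is bounded by half a quantization step
print(np.abs(w - w_hat).max())
```

Any group size works as long as it divides the row length; smaller groups trade extra scale/zero storage for lower reconstruction error.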
Hi, how is the test going? Does it support Mistral?
@ChenMnZ @brownplayer Sure. It supports Mistral. |
OK, thank you for your reply. May I ask what command is used to run the model for the first time after it is downloaded? I'm using a GPTQ-format model.
Any chance 2-bit models can be used with llama.cpp? It would be great to get Llama 3.1 (8B and 70B) converted to GGUF to try them out locally.
Thanks for the great research work!