Looking at the data for 4 bits per tensor in the results table for optimizing the OPT model, the values for c4 Perplexity and wikitext2 Perplexity seem unusually high compared to the other configurations. Are these values correct, or is there a typo?

Code link here: coremltools/docs-guides/source/opt-opt1_3.md, line 74 (commit c3ea4cf).
My understanding is that 4-bit per-tensor quantization hurts that model too much, so the perplexity explodes. With more fine-grained quantization (per-channel, per-block), the perplexity returns to normal.
(cc @aseemw to confirm it's not a typo)
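To illustrate why a single per-tensor scale is the likely culprit, here is a standalone NumPy sketch (not the coremltools implementation, and the weight statistics are made up): with one outlier channel stretching the range, a shared 4-bit scale rounds most weights to zero, while per-channel scales keep the error small.

```python
import numpy as np

def quantize_dequantize(w, scales):
    """Symmetric 4-bit round-trip: w -> int4 grid [-8, 7] -> float."""
    q = np.clip(np.round(w / scales), -8, 7)
    return q * scales

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(64, 64))   # typical small weights
w[0] *= 50                               # one outlier channel stretches the range

# Per-tensor: one scale shared by the whole matrix, dominated by the outlier row.
scale_tensor = np.abs(w).max() / 7
err_tensor = np.abs(w - quantize_dequantize(w, scale_tensor)).mean()

# Per-channel: one scale per output channel (row), so normal rows keep resolution.
scale_channel = np.abs(w).max(axis=1, keepdims=True) / 7
err_channel = np.abs(w - quantize_dequantize(w, scale_channel)).mean()

print(f"mean abs error, per-tensor:  {err_tensor:.5f}")
print(f"mean abs error, per-channel: {err_channel:.5f}")
```

The same effect, amplified across every layer of the network, is what shows up as the exploded perplexity in the per-tensor row of the table.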
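For reference, the difference between those table rows comes down to the granularity passed to the weight quantizer. A rough sketch of how the per-tensor and per-block 4-bit variants might be configured, assuming the coremltools 8 `ct.optimize.coreml` post-training API (exact parameter names and the `mlmodel` variable are assumptions; check the guide itself):

```python
import coremltools.optimize as cto

# Per-tensor: a single scale per weight tensor (the configuration in question).
per_tensor_cfg = cto.coreml.OptimizationConfig(
    global_config=cto.coreml.OpLinearQuantizerConfig(
        mode="linear_symmetric", dtype="int4", granularity="per_tensor"
    )
)

# Per-block: one scale per block of 32 weights along a channel, which is the
# kind of finer granularity that recovers the perplexity in the other rows.
per_block_cfg = cto.coreml.OptimizationConfig(
    global_config=cto.coreml.OpLinearQuantizerConfig(
        mode="linear_symmetric", dtype="int4", granularity="per_block", block_size=32
    )
)

# `mlmodel` is an already-converted mlprogram model (hypothetical variable here).
mlmodel_int4_per_tensor = cto.coreml.linear_quantize_weights(mlmodel, per_tensor_cfg)
mlmodel_int4_per_block = cto.coreml.linear_quantize_weights(mlmodel, per_block_cfg)
```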