Looking at the data for 4 bits per tensor in the results table for optimizing the OPT model, the values for c4 Perplexity and wikitext2 Perplexity seem unusually high compared to the other configurations. Are these values correct, or is there a typo?

Code link here: coremltools/docs-guides/source/opt-opt1_3.md, line 74 (commit c3ea4cf).
My understanding is that 4-bit per-tensor quantization hurts that model too much, so the perplexity explodes. With more fine-grained quantization (per-channel, per-block), the perplexity returns to normal.
(cc @aseemw to confirm it's not a typo)
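To illustrate why a single per-tensor scale is the likely culprit, here is a standalone NumPy sketch (not the coremltools implementation, and the weight statistics are made up): with one outlier channel stretching the range, a shared 4-bit scale rounds most weights to zero, while per-channel scales keep the error small.

```python
import numpy as np

def quantize_dequantize(w, scales):
    """Symmetric 4-bit round-trip: w -> int4 grid [-8, 7] -> float."""
    q = np.clip(np.round(w / scales), -8, 7)
    return q * scales

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(64, 64))   # typical small weights
w[0] *= 50                               # one outlier channel stretches the range

# Per-tensor: one scale shared by the whole matrix, dominated by the outlier row.
scale_tensor = np.abs(w).max() / 7
err_tensor = np.abs(w - quantize_dequantize(w, scale_tensor)).mean()

# Per-channel: one scale per output channel (row), so normal rows keep resolution.
scale_channel = np.abs(w).max(axis=1, keepdims=True) / 7
err_channel = np.abs(w - quantize_dequantize(w, scale_channel)).mean()

print(f"mean abs error, per-tensor:  {err_tensor:.5f}")
print(f"mean abs error, per-channel: {err_channel:.5f}")
```

The same effect, amplified across every layer of the network, is what shows up as the exploded perplexity in the per-tensor row of the table.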
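For reference, the difference between those table rows comes down to the granularity passed to the weight quantizer. A rough sketch of how the per-tensor and per-block 4-bit variants might be configured, assuming the coremltools 8 `ct.optimize.coreml` post-training API (exact parameter names and the `mlmodel` variable are assumptions; check the guide itself):

```python
import coremltools.optimize as cto

# Per-tensor: a single scale per weight tensor (the configuration in question).
per_tensor_cfg = cto.coreml.OptimizationConfig(
    global_config=cto.coreml.OpLinearQuantizerConfig(
        mode="linear_symmetric", dtype="int4", granularity="per_tensor"
    )
)

# Per-block: one scale per block of 32 weights along a channel, which is the
# kind of finer granularity that recovers the perplexity in the other rows.
per_block_cfg = cto.coreml.OptimizationConfig(
    global_config=cto.coreml.OpLinearQuantizerConfig(
        mode="linear_symmetric", dtype="int4", granularity="per_block", block_size=32
    )
)

# `mlmodel` is an already-converted mlprogram model (hypothetical variable here).
mlmodel_int4_per_tensor = cto.coreml.linear_quantize_weights(mlmodel, per_tensor_cfg)
mlmodel_int4_per_block = cto.coreml.linear_quantize_weights(mlmodel, per_block_cfg)
```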