I noticed that there's an evaluation script for perplexity. How can I replicate the results in Tables 1 and 2?
Are there any instructions for this?
I think running `CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python examples/eval_long_ppl.py --model_name_or_path lmsys/vicuna-13b-v1.3` should work.
However, I still ran into a problem when running it: the script raises `RuntimeError: stack expects a non-empty TensorList` at the line `ppl = torch.exp(torch.stack(nlls).mean())` in eval_long_ppl.py, so the implementation still seems to have a bug.
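For reference, that line computes perplexity as the exponential of the mean per-token negative log-likelihood, and `torch.stack` raises exactly this error when it is given an empty list, so `nlls` was presumably empty when the line ran. Below is a minimal sketch of the same computation in plain Python with an explicit guard. The helper name `perplexity_from_nlls` is hypothetical and not part of the repository, and the sketch does not establish why the list ends up empty in the script.

```python
import math


def perplexity_from_nlls(nlls):
    """Return exp(mean(nlls)) for per-token negative log-likelihoods.

    Hypothetical helper for illustration only; it mirrors
    torch.exp(torch.stack(nlls).mean()) using plain Python floats.
    """
    if not nlls:
        # An empty list means no tokens were scored, so fail with a
        # message that points at the likely cause instead of a stack error.
        raise ValueError("no NLL values collected; check that the evaluation loop ran")
    return math.exp(sum(nlls) / len(nlls))


# A mean NLL of 0 corresponds to a perplexity of exactly 1.
print(perplexity_from_nlls([0.0, 0.0]))
```

A guard like this would not fix the underlying problem, but it makes it clearer that the evaluation loop scored no tokens at all, which is the thing to investigate.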
Hope that helps. I'm also looking forward to the eval code.