Question about the model and running the code #48

ccoverflow · 2024-12-24T11:38:26Z

Thank you for your great work on this project. I have been reading your paper and trying to reproduce the results using the provided code. However, I have encountered a few questions:

In the paper, it is mentioned that the Mixtral 8x7B model was used. However, in the config.ini file provided in the code, the specified model is Mixtral 8x22B. Could you clarify which model should be used for reproducing the results? Was this a change after the paper publication or an updated configuration?

Do I need to download all the specified files from the Hugging Face repository before running the code? Are there any additional steps I need to take to ensure the code runs smoothly?

I appreciate your time and help in clarifying these points. Thank you!

Vamsi-Kommineni · 2024-12-25T16:43:16Z

@ccoverflow
There are two versions of the code available.

First Version: This version is based on the methodology we initially published in our arXiv paper. The code in this version uses the Mixtral 8x7B model.
Second Version: After adapting our methodology, we presented the updated approach at a workshop, which is reflected in this workshop paper. In the second version of the code, we incorporated four models, including the Mixtral 8x22B model.

You can find the respective code versions here: First version, Second version

When you run the code for the first time, the Hugging Face library will automatically download and cache the required model files, so you won’t need to handle the downloads manually.
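To check whether a first run will trigger a download, you can look for the model in the local Hugging Face cache. The sketch below uses only the standard library and assumes the default cache layout described in the Hugging Face Hub documentation (`HF_HUB_CACHE`, else `HF_HOME/hub`, else `~/.cache/huggingface/hub`, with one `models--<org>--<name>` folder per repository). The helper names are hypothetical, and the repository ID `mistralai/Mixtral-8x7B-Instruct-v0.1` is an assumption for illustration; use whichever model ID the config.ini of your code version specifies.

```python
import os
from pathlib import Path


def hub_cache_dir(env=None) -> Path:
    """Return the directory where Hugging Face Hub files are cached.

    Assumes the documented precedence: HF_HUB_CACHE, then HF_HOME/hub,
    then $XDG_CACHE_HOME/huggingface/hub, then ~/.cache/huggingface/hub.
    """
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        hf_home = Path(env["HF_HOME"]).expanduser()
    else:
        xdg_cache = env.get("XDG_CACHE_HOME") or "~/.cache"
        hf_home = Path(xdg_cache).expanduser() / "huggingface"
    return hf_home / "hub"


def cached_repo_dir(repo_id: str, env=None) -> Path:
    """Map a model repo ID such as 'org/name' to its cache folder."""
    return hub_cache_dir(env) / ("models--" + repo_id.replace("/", "--"))


def is_cached(repo_id: str, env=None) -> bool:
    """True if at least a 'snapshots' folder exists for this repo.

    This is a rough check only: it does not verify that every weight
    file finished downloading.
    """
    return (cached_repo_dir(repo_id, env) / "snapshots").is_dir()


# Assumed model ID, for illustration only.
MODEL_ID = "mistralai/Mixtral-8x7B-Instruct-v0.1"
print(cached_repo_dir(MODEL_ID), "cached:", is_cached(MODEL_ID))
```

If the check reports that the model is not cached, the first run will need network access and enough free disk space for the weights, which are large for the Mixtral models.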
