Thank you for your great work on this project. I have been reading your paper and trying to reproduce the results using the provided code, and I have a few questions:
The paper states that the Mixtral 8x7B model was used. However, the config.ini file provided with the code specifies Mixtral 8x22B. Could you clarify which model should be used to reproduce the results? Was the model changed after the paper was published, or is this simply an updated configuration?
Do I need to download all the specified files from the Hugging Face repository before running the code? Are there any additional steps I need to take to ensure the code runs smoothly?
I appreciate your time and help in clarifying these points. Thank you!
@ccoverflow
There are two versions of the code available.
First Version: This version is based on the methodology we initially published in our arXiv paper. The code in this version uses the Mixtral 8x7B model.
Second Version: After adapting our methodology, we presented the updated approach at a workshop, which is reflected in this workshop paper. In the second version of the code, we incorporated four models, including the Mixtral 8x22B model.
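To confirm which Mixtral variant a given checkout is configured for, a small check like the one below may help. It is only a sketch: the section name `model`, the key `name`, and the sample model ID are assumptions for illustration, since the actual layout of config.ini is not shown in this thread.

```python
import configparser

# Hypothetical config layout: the section "model", the key "name", and the
# model ID below are assumptions, not the repository's actual config.ini.
SAMPLE_CONFIG = """
[model]
name = mistralai/Mixtral-8x22B-Instruct-v0.1
"""


def read_model_name(config_text: str) -> str:
    """Return the model identifier stored under [model] name."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser["model"]["name"]


def mixtral_variant(model_name: str) -> str:
    """Classify a model identifier as '8x7B', '8x22B', or 'unknown'."""
    lowered = model_name.lower()
    if "8x22b" in lowered:
        return "8x22B"
    if "8x7b" in lowered:
        return "8x7B"
    return "unknown"


model_name = read_model_name(SAMPLE_CONFIG)
print(model_name, "->", mixtral_variant(model_name))
```

If the result is 8x22B, the checkout corresponds to the second (workshop) version described above; to reproduce the arXiv paper, the first version with Mixtral 8x7B is the one to use.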
When you run the code for the first time, Hugging Face will automatically download the required model files in the background, so you won’t need to handle the downloads manually.
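To see whether that automatic download has already happened, you can look for the model in the local Hugging Face cache. The sketch below uses only the standard library and mirrors the cache conventions as I understand them (`HF_HUB_CACHE`, then `HF_HOME/hub`, then `~/.cache/huggingface/hub`, with one `models--<org>--<name>` folder per repository). The repository ID is an assumption; substitute whatever your config.ini actually specifies.

```python
import os
from pathlib import Path


def hub_cache_dir(env=None) -> Path:
    """Resolve the Hugging Face hub cache directory from environment settings."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Simplified default; other overrides (e.g. XDG_CACHE_HOME) are ignored here.
    return Path.home() / ".cache" / "huggingface" / "hub"


def model_cache_path(repo_id: str, env=None) -> Path:
    """Return the folder where a model repository would be cached."""
    return hub_cache_dir(env) / ("models--" + repo_id.replace("/", "--"))


# Assumed repository ID for illustration; use the one from your config.ini.
REPO_ID = "mistralai/Mixtral-8x7B-Instruct-v0.1"

path = model_cache_path(REPO_ID)
status = "already cached" if path.is_dir() else "not downloaded yet"
print(f"{path}: {status}")
```

The Mixtral checkpoints are large, so it is worth checking free disk space at the cache location before the first run; setting `HF_HOME` to a bigger drive is one way to move the cache.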