Llama3 #1

Open
zf0x00 opened this issue May 14, 2024 · 3 comments

Comments

@zf0x00

zf0x00 commented May 14, 2024

Does this work with the stock Llama 3 model, or does a hypernetwork need to be trained for it first?

@bminixhofer
Owner

I am in the process of training a hypernetwork for Llama3!

@zf0x00
Author

zf0x00 commented May 15, 2024

nice ❤️
Also, could you share some info about training, such as how long it takes? I tried to train one myself, but most notebook environments don't support Python 3.11.

@bminixhofer
Owner

Here is the first version of a Llama3 hypernet: benjamin/zett-hypernetwork-Meta-Llama-3-8B-experimental.
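For anyone who wants to fetch the checkpoint, here is a minimal sketch. It assumes the `huggingface_hub` package is installed and that the repository is still public under this ID; the helper names `hub_url` and `download_hypernet` are hypothetical and not part of this project.

```python
# Repo ID as given in the comment above.
REPO_ID = "benjamin/zett-hypernetwork-Meta-Llama-3-8B-experimental"


def hub_url(repo_id: str) -> str:
    """Return the Hugging Face Hub page URL for a model repo ID."""
    return f"https://huggingface.co/{repo_id}"


def download_hypernet(repo_id: str = REPO_ID) -> str:
    """Download all files of the repo and return the local cache path.

    Assumption: `huggingface_hub` is installed. The import is kept inside
    the function so that the rest of this snippet works without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


if __name__ == "__main__":
    print(hub_url(REPO_ID))
    # Uncomment to actually download (needs network access and disk space):
    # print(download_hypernet())
```

This only downloads the files. How to apply the hypernetwork to a tokenizer is covered by the project's own instructions, not by this sketch.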

It seems to underperform on code, though. I haven't found the reason yet but will look into it later, so I'm keeping this issue open.

Training took ~4 days on a TPUv4-32 pod.
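As a rough back-of-the-envelope reading of that figure, the sketch below converts it to chip-hours. It assumes the usual TPU v4 naming, where the suffix in `v4-32` counts TensorCores and each chip has two of them, and it treats "~4 days" as exactly 96 hours, so the result is only an estimate.

```python
# Assumption: "v4-32" means 32 TensorCores, with 2 TensorCores per TPU v4 chip.
TENSORCORES = 32
TENSORCORES_PER_CHIP = 2

# Assumption: "~4 days" is rounded to exactly 4 days of wall-clock time.
DAYS = 4

chips = TENSORCORES // TENSORCORES_PER_CHIP
wall_clock_hours = DAYS * 24
chip_hours = chips * wall_clock_hours

print(f"{chips} chips x {wall_clock_hours} h = {chip_hours} chip-hours")
```

Under these assumptions, reproducing the run takes on the order of 1,500 TPU v4 chip-hours, which is well beyond what a free notebook environment offers.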
