CLA fixes and improvements #2

Open
wants to merge 7 commits into
base: main

Conversation

@KeremTurgutlu KeremTurgutlu commented Oct 25, 2024

  • FSDP does not support scalar (0-dim) tensors, so make k_scale and v_scale 1-dim tensors.
  • Add a CLA test to ensure parameters are updated as expected.
  • Add cla_kv_detached as a new config param; by default it applies detach() to the shared KV states.
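
The first and third changes above can be illustrated with a minimal PyTorch sketch. It is not the code from this PR: `SharedKVScales`, `share_kv`, and `producer_grad` are hypothetical names made up for illustration, and only `k_scale`, `v_scale`, and `cla_kv_detached` come from the description. It assumes the scales are learnable parameters and that a later layer reuses the K/V states computed by an earlier one.

```python
import torch
import torch.nn as nn


class SharedKVScales(nn.Module):
    """Hypothetical holder for the K/V scale parameters."""

    def __init__(self):
        super().__init__()
        # Shape (1,) instead of shape (): per the PR description,
        # FSDP does not support scalar tensors.
        self.k_scale = nn.Parameter(torch.ones(1))
        self.v_scale = nn.Parameter(torch.ones(1))

    def forward(self, k, v):
        # A 1-dim tensor of length 1 broadcasts exactly like a scalar.
        return k * self.k_scale, v * self.v_scale


def share_kv(k, v, cla_kv_detached=True):
    """Return the K/V states that a later layer would reuse (hypothetical helper)."""
    if cla_kv_detached:
        # Cut the autograd graph: the consuming layer's loss no longer
        # sends gradients back into the layer that produced k and v.
        return k.detach(), v.detach()
    return k, v


def producer_grad(cla_kv_detached):
    """Gradient reaching the producing projection through the shared K/V only."""
    torch.manual_seed(0)
    proj = nn.Linear(4, 4)       # stands in for the producing layer
    consumer = nn.Linear(4, 1)   # stands in for the consuming layer
    scales = SharedKVScales()
    x = torch.randn(2, 4)
    k, v = scales(proj(x), proj(x))
    k_shared, v_shared = share_kv(k, v, cla_kv_detached)
    consumer(k_shared + v_shared).sum().backward()
    return proj.weight.grad
```

With `cla_kv_detached=True`, `producer_grad` returns `None` because no gradient flows back through the shared states; with `False`, the producing projection receives a gradient tensor. A training test like the one in the second bullet could assert on exactly this kind of difference.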

@KeremTurgutlu KeremTurgutlu changed the title from "scalar not supported by FSDP, add CLA training test" to "CLA fixes and improvements" Oct 25, 2024
@austinvhuang austinvhuang self-assigned this Oct 26, 2024

2 participants