More Chat Loss Masking Strategies #2214

Are there plans to add more loss masking strategies for chat data? E.g., a very common loss masking strategy for multi-turn conversations is to mask everything but the last assistant response. However, `train_on_input=False` right now will compute the loss on all assistant turns, not just the last one. Is it possible to add this feature to torchtune?
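
For reference, the behavior described above corresponds to building a chat dataset roughly like this (a minimal sketch; the dataset source, column name, and tokenizer path are hypothetical):

```python
from torchtune.datasets import chat_dataset
from torchtune.models.llama3 import llama3_tokenizer

# Hypothetical local tokenizer path, for illustration only.
tokenizer = llama3_tokenizer("/tmp/Meta-Llama-3-8B-Instruct/original/tokenizer.model")

# With train_on_input=False, user and system turns are masked out of the
# loss, but every assistant turn still contributes to it; there is no
# built-in option to train on only the last assistant response.
ds = chat_dataset(
    tokenizer=tokenizer,
    source="my_org/my_chat_data",         # hypothetical Hugging Face dataset
    conversation_column="conversations",  # column holding the message list
    conversation_style="sharegpt",
    train_on_input=False,
)
```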
Comments

If you are using a custom dataset with a custom message transform, you can manually mask the messages you need to in the transform by setting the `masked` attribute on each `Message`.
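
As a minimal sketch of what such a transform could look like (the `conversations` column name and its `role`/`content` keys are assumptions about the raw data, not torchtune API):

```python
from typing import Any, Mapping

from torchtune.data import Message


class LastTurnMessages:
    """Message transform that computes loss only on the final assistant turn."""

    def __call__(self, sample: Mapping[str, Any]) -> Mapping[str, Any]:
        turns = sample["conversations"]
        # Index of the last assistant turn in the conversation.
        last_assistant = max(
            i for i, t in enumerate(turns) if t["role"] == "assistant"
        )
        messages = [
            Message(
                role=t["role"],
                content=t["content"],
                # masked=True excludes this turn's tokens from the loss,
                # so only the final assistant response is trained on.
                masked=(i != last_assistant),
            )
            for i, t in enumerate(turns)
        ]
        return {"messages": messages}
```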
I just saw a similar request in #2207, so this might be worth enabling.
Nice to "see" you again Rafi! Thanks for the quick response.
Masking the last turn only is a very (most?) common masking strategy, so it could be a nice feature to provide users out of the box.
Any pointers / examples for how to do this?
Glad to see you on the torchtune repo Eugen :) Yes, see this page for an example. If your conversation is stored in a column, you can just query that column in the custom message transform and manually create `Message` objects with `masked=True` for every turn you want excluded from the loss. Let me know if there's any confusion on this.
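
Putting the pieces together, the transform sketched above could be wired into a dataset builder along these lines (the builder name and `source` are hypothetical; `SFTDataset` is torchtune's generic SFT building block):

```python
from torchtune.datasets import SFTDataset


def last_turn_chat_dataset(model_transform, source: str = "my_org/my_chat_data"):
    # Hypothetical builder: pairs the masking transform with the model's
    # tokenizer so only the final assistant turn contributes to the loss.
    return SFTDataset(
        source=source,
        message_transform=LastTurnMessages(),
        model_transform=model_transform,  # typically the model tokenizer
    )
```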