
Support masking of partial dialogue in multi-turn chat datasets #2207

Open
jiatong-yu opened this issue Dec 25, 2024 · 2 comments

Comments

@jiatong-yu

According to the documentation, torchtune currently supports training on either all content in a multi-turn conversation (both “assistant” and “user” turns) or on all “assistant” turns only. However, a common use case is training on a specific subset of responses, such as only the most recent “assistant” response in a conversation.

What is the recommended approach for achieving this with Torchtune?

@felipemello1 (Contributor) commented Dec 26, 2024

Hey @jiatong-yu, you should be able to write your own custom message_transform / dataset.

Here is our wiki: https://pytorch.org/torchtune/main/basics/message_transforms.html

Take a look at how it's done in the chat dataset builder (chat_dataset), e.g. this flag:

train_on_input: bool = False,
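The flag above is all-or-nothing: it only toggles whether non-assistant turns contribute to the loss. A rough sketch of its semantics, using plain dicts in place of torchtune's Message objects (whose `masked` field plays this role) — an illustration, not torchtune code:

```python
# Sketch of what train_on_input controls. Plain dicts stand in for
# torchtune.data.Message; `masked=True` means the turn is excluded
# from the training loss.

def apply_train_on_input(messages, train_on_input=False):
    """Mask non-assistant turns unless train_on_input is True."""
    for m in messages:
        m["masked"] = (m["role"] != "assistant") and not train_on_input
    return messages

convo = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
]
apply_train_on_input(convo, train_on_input=False)
print([m["masked"] for m in convo])  # [True, False]
```

Note that every assistant turn stays unmasked here, which is why a custom transform is needed to train on only some of them.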

Then, in your config, you can pass:

tune run <recipe> <config> --config dataset._component_:path.to.my.custom.dataset
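For the use case in the question (training only on the most recent assistant response), the custom message transform pointed at by the config could set `masked` per turn. A minimal sketch, again with plain dicts standing in for torchtune's Message objects, and with the `conversations` key of the raw sample as an assumed input format:

```python
# Hypothetical custom message transform that trains only on the final
# assistant turn. Dicts stand in for torchtune.data.Message, and the
# "conversations" key of the raw sample is an assumed format.

class MaskAllButLastAssistant:
    def __call__(self, sample):
        messages = [dict(turn) for turn in sample["conversations"]]
        # Index of the last assistant turn, or None if there is none.
        last = max(
            (i for i, m in enumerate(messages) if m["role"] == "assistant"),
            default=None,
        )
        for i, m in enumerate(messages):
            m["masked"] = i != last  # only the last assistant turn is trained on
        return {"messages": messages}

sample = {"conversations": [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
    {"role": "user", "content": "and now?"},
    {"role": "assistant", "content": "train on me"},
]}
out = MaskAllButLastAssistant()(sample)
print([m["masked"] for m in out["messages"]])  # [True, True, True, False]
```

The same pattern generalizes to any turn-selection rule (e.g. the last N assistant responses) by changing which indices are left unmasked.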

@calvinpelletier (Contributor) commented:

Here's some additional info: #2111 (comment)

Labels: none · Projects: none · No branches or pull requests · 3 participants