Uncontrollable prediction output with finetune_version #679
Comments
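What is the loss at step 100?
That doesn't matter much, though of course the larger your batch size, the better the robustness and generalization.
At the current loss the model should be able to answer the question normally. Roughly how much relevant data do you have, 500 to 1000 samples?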
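Yes, the dataset has 600 samples.
Yuxuan Zhang <***@***.***> wrote on Tuesday, December 31, 2024 at 21:31:
… At the current loss the model should be able to answer the question normally. Roughly how much relevant data do you have, 500 to 1000 samples?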
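Try training for more epochs and see whether that achieves a similar result. Aim to bring the loss down to roughly 0.1, and keep greedy decoding at inference time.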
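To illustrate why greedy decoding is suggested here, the following is a minimal sketch using made-up numbers, not the project's inference code. It assumes the fine-tuned model puts about 80% of the probability mass on the token that starts the short answer and about 20% on the token that starts a generic description; the names `greedy_pick` and `sample_pick` and the two-entry "vocabulary" are hypothetical.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Greedy decoding: always take the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_pick(logits, rng, temperature=1.0):
    """Stochastic decoding: draw a token according to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical first-token scores (assumed, not measured):
# index 0 starts the short fine-tuned answer,
# index 1 starts a long generic image description.
logits = [math.log(0.8), math.log(0.2)]

rng = random.Random(0)
n = 10_000
sampled = [sample_pick(logits, rng) for _ in range(n)]
greedy = [greedy_pick(logits) for _ in range(n)]

sampled_off_target = sampled.count(1) / n
greedy_off_target = greedy.count(1) / n

print(f"sampling: off-target rate = {sampled_off_target:.3f}")
print(f"greedy:   off-target rate = {greedy_off_target:.3f}")
```

Under these assumptions sampling goes off-target about one time in five, while greedy decoding never does. If inference goes through Hugging Face `transformers`, greedy decoding corresponds to calling `generate` with `do_sample=False` and `num_beams=1`; whether this project's inference script exposes those options is not stated in the thread.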
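The training data looks like this:
Question: What is the weather like in the picture?
Image: image path
Answer: It will not rain.
Question: What is the weather like in the picture?
Image: image path
Answer: It will rain.
The answer is always one of two strings: "It will rain." or "It will not rain."
There are about 1000 training samples.
After 5K training steps, asking "What is the weather like in the picture?" returns:
Answer: It will rain. However, roughly 20% of the time the following happens instead:
Inference question: What is the weather like in the picture?
Answer: The weather in the picture is clear, with no dark clouds ... followed by a large amount of generic descriptive text, rather than the answer we want (It will not rain.).
What might be the cause: insufficient training, too little data, answer text that is too short, a need for multi-turn dialogue, or the LoRA fine-tuning parameter settings? Thanks!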
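To compare causes such as training length or decoding settings, it helps to measure the off-target rate on a held-out set rather than estimate it by eye. The sketch below is one possible way to do that. It assumes the model outputs have already been collected into a list of strings; the `ALLOWED` set, the function names, and the sample outputs are illustrative and not taken from the project.

```python
from collections import Counter

# The two allowed answers from the training data (translated here;
# use the exact strings that appear in your own dataset).
ALLOWED = {"It will rain.", "It will not rain."}


def categorize(output: str) -> str:
    """Return the answer itself if it is allowed, otherwise 'off-target'."""
    text = output.strip()
    return text if text in ALLOWED else "off-target"


def off_target_rate(outputs):
    """Return (fraction of off-target outputs, per-category counts)."""
    counts = Counter(categorize(o) for o in outputs)
    total = sum(counts.values())
    rate = counts["off-target"] / total if total else 0.0
    return rate, counts


# Made-up outputs for illustration only.
outputs = [
    "It will rain.",
    "It will not rain.",
    "The weather in the picture is clear, with no dark clouds ...",
    "It will rain.",
    " It will not rain. ",
]

rate, counts = off_target_rate(outputs)
print(f"off-target rate: {rate:.2f}")
for label, count in counts.items():
    print(f"{label}: {count}")
```

Running the same check before and after each change (more steps, greedy decoding, different LoRA settings) shows which change actually reduces the free-form answers.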