Hello, I'd like to ask: where can I find how the metric for each task is calculated? Is there any official documentation or an up-to-date paper?
Thanks!
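The Agent benchmark follows the OPEN benchmark: the model under test plays head-to-head matches against a representative international model, and a win rate is computed. Specifically, the model under test is matched against 3.5. Each win scores 3 points, each tie 1 point, and each loss 0 points; the points are summed into a total score, which is then normalized. In short, the result is a relative score measured against the same baseline model.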
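The scoring described above could be sketched roughly as follows. This is only an illustration, not the benchmark's official code: the function name `relative_score`, the outcome labels, and in particular the normalization (dividing by the maximum attainable points and scaling to 0-100) are assumptions, since the reply does not say how the total is normalized.

```python
from collections import Counter

# Points per match outcome, as stated in the reply: win 3, tie 1, loss 0.
POINTS = {"win": 3, "tie": 1, "loss": 0}


def relative_score(outcomes):
    """Score a tested model against one fixed baseline model.

    `outcomes` holds one label per match ("win", "tie" or "loss"),
    seen from the tested model's side.

    ASSUMPTION: normalization divides the total by the maximum attainable
    points (all wins) and scales to 0-100. The actual normalization used
    by the benchmark is not specified in this thread.
    """
    if not outcomes:
        raise ValueError("at least one match outcome is required")

    counts = Counter(outcomes)
    unknown = set(counts) - set(POINTS)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")

    total = sum(POINTS[label] * n for label, n in counts.items())
    max_total = POINTS["win"] * len(outcomes)
    return 100.0 * total / max_total


if __name__ == "__main__":
    # Two wins, one tie, one loss: (3 + 3 + 1 + 0) out of a possible 12 points.
    print(relative_score(["win", "tie", "loss", "win"]))
```

Because every tested model is matched against the same baseline, scores computed this way are comparable across models, but they are relative to that baseline rather than absolute.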
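Hello Teacher Xu, are the win, tie, and loss scores assigned by human raters? My understanding is that in a match both models answer the same question, but how is it decided which answer is better?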