This folder contains the datasets used in the project. The datasets are structured as follows:
The dataset contains span-level annotations for errors in long-form answers, according to our defined evaluation criteria:
- Question misconception: False assumptions made within the given question.
- Factuality: Accuracy and correctness of the answer with respect to verifiable facts.
- Relevance: Specificity and meaningfulness of the answer.
- Completeness: Answer comprehensiveness ensuring all question aspects are addressed.
- References: (Un)helpful examples, analogies, and external references (websites or links) in the answer.
A subset of the dataset is shown below. Given a question and two possible answers (human and GPT-4), the `{evaluation_criteria}_span` column indicates the error spans in the answer for the respective evaluation criterion, and the `{evaluation_criteria}_reason` column gives the justification for each error.
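For reference, here is a minimal sketch of how these paired columns might be inspected with pandas. The file name `error_annotations.csv`, the CSV format, and the exact snake_case criterion keys are assumptions; substitute the actual files and schema in this folder.

```python
import pandas as pd

# Hypothetical file name; replace with the actual annotation file in this folder.
df = pd.read_csv("error_annotations.csv")

# The five evaluation criteria; each is assumed to have a *_span and *_reason column.
CRITERIA = [
    "question_misconception",
    "factuality",
    "relevance",
    "completeness",
    "references",
]

row = df.iloc[0]
print(row["question"])
for criterion in CRITERIA:
    spans = row[f"{criterion}_span"]      # annotated error spans in the answer
    reasons = row[f"{criterion}_reason"]  # justifications for those spans
    print(criterion, spans, reasons)
```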
This dataset consists of question-answer pairs with expert span-level annotations for the completeness aspect, along with justifications.
A subset of the dataset is shown below. The `instruction` column contains the task instruction, the `input` column contains the question-answer pair to evaluate, and the `output` column contains the sentence-level tag [Complete/Incomplete] for the answer, along with a justification for any incompleteness.
Note: This dataset is used to train the error-feedback model.
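A minimal sketch of how a training example might be assembled from these three columns is below. The file name `completeness_feedback.csv` and the prompt/target framing are assumptions for illustration, not the project's actual training code.

```python
import pandas as pd

# Hypothetical file name; replace with the actual fine-tuning file in this folder.
df = pd.read_csv("completeness_feedback.csv")

def to_example(row: pd.Series) -> dict:
    """Join the task instruction and the question-answer input into one prompt;
    the output column (sentence-level [Complete/Incomplete] tags plus
    justification) becomes the training target."""
    return {
        "prompt": f"{row['instruction']}\n\n{row['input']}",
        "target": row["output"],
    }

examples = [to_example(row) for _, row in df.iterrows()]
print(examples[0]["prompt"])
print(examples[0]["target"])
```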
The preference dataset consists of a question with two possible answers: one from humans and the other from GPT-4. Expert annotators choose the better answer based on our defined evaluation criteria.
A subset of the dataset is shown below. For each question, the preferred responses appear in the `preferred_response` column and the rejected responses in the `rejected_response` column.
Note: This dataset is used for preference optimization (DPO) of the refinement model.
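A minimal sketch of converting these columns into the (prompt, chosen, rejected) records that common DPO trainers (e.g., TRL's `DPOTrainer`) expect. The file name `preference_data.csv` and the `question` column name are assumptions; the project's actual training setup may differ.

```python
import pandas as pd

# Hypothetical file name; replace with the actual preference file in this folder.
df = pd.read_csv("preference_data.csv")

# DPO expects one record per question with a chosen and a rejected response.
dpo_records = [
    {
        "prompt": row["question"],
        "chosen": row["preferred_response"],
        "rejected": row["rejected_response"],
    }
    for _, row in df.iterrows()
]
print(dpo_records[0])
```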
The datasets are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.