The ArGPT dataset consists of argumentative essays generated with ChatGPT 3.5 and annotated for the following tasks (a sketch of a possible annotated record follows the list):
- Argument Mining: defined by three sub-tasks, i.e., Span Detection, Component Classification, and Relation Classification;
- Automatic Essay Scoring: using correction criteria from real-world argumentative essay grading;
- Argument Quality: an essay is labeled good if it defends a true claim using sound argumentation, bad if the argumentation is flawed, or ugly if the argumentation is sound but the claim it justifies is false. The evaluation criteria were:
- criteria_0: Clearly states a major claim;
- criteria_1: Introduces the theme;
- criteria_2: Develops the arguments throughout the text;
- criteria_3: Recapitulates the arguments in the conclusion;
- criteria_4: Adherence to standard language norms;
- criteria_5: Correct use of argumentative connectives;
- criteria_6: Adherence to the theme;
- criteria_7: No repetition of arguments;
- criteria_8: No contradictions;
- criteria_9: No beating around the bush;
- criteria_10: States true or plausible arguments.
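As a rough illustration, a single annotated essay might be represented as a record like the following. This is a minimal sketch: the field and label names (`spans`, `relations`, `criteria_scores`, `MajorClaim`, `Support`, etc.) are assumptions for illustration, not the dataset's actual schema.

```python
# Hypothetical record layout covering the three annotation layers.
# All field and label names here are illustrative assumptions.
example = {
    "essay_id": 42,
    "text": "Social media has reshaped public debate. ...",
    # Argument Mining: annotated spans (Span Detection) with component
    # labels (Component Classification) and relations between components
    # (Relation Classification)
    "spans": [
        {"id": 0, "start": 0, "end": 41, "label": "MajorClaim"},
        {"id": 1, "start": 42, "end": 120, "label": "Premise"},
    ],
    "relations": [
        {"source": 1, "target": 0, "label": "Support"},
    ],
    # Automatic Essay Scoring: one score per correction criterion
    "criteria_scores": {f"criteria_{i}": 1 for i in range(11)},
    # Argument Quality: overall label
    "quality": "good",  # one of "good", "bad", "ugly"
}
```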
The dataset contains 168 texts annotated by a single annotator and 172 texts whose annotation is the consensus of two annotators.
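Assuming the dataset is distributed as a JSON Lines file with a field recording how many annotators produced each annotation (a hypothetical layout; the file name and field name below are not from the official release), the two subsets could be separated like this:

```python
import json

# Hypothetical loading sketch: the path and the "num_annotators"
# field are assumptions, not the dataset's documented format.
with open("argpt.jsonl", encoding="utf-8") as f:
    records = [json.loads(line) for line in f]

single = [r for r in records if r.get("num_annotators") == 1]     # expected: 168
consensus = [r for r in records if r.get("num_annotators") == 2]  # expected: 172
print(len(single), len(consensus))
```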
If you use this dataset for research purposes, please cite the following paper: