To load the evaluation datasets with the get_data.py script, some datasets must be downloaded first. From the data/sts_str folder, run the following commands:
git clone https://gitlab.com/tigi.cz/cross-lingual-sts.git
The datasets should now be in data/sts_str/cross-lingual-sts.
git clone https://github.com/lksenel/Kardes-NLU.git
The datasets should now be in data/sts_str/Kardes-NLU.
git clone https://github.com/semantic-textual-relatedness/Semantic_Relatedness_SemEval2024.git
The datasets should now be in data/sts_str/Semantic_Relatedness_SemEval2024.
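To confirm that the three clones landed where expected before running get_data.py, a small check along these lines may help. This is a minimal sketch, not part of the repository: the helper name missing_datasets and the EXPECTED_DIRS list are hypothetical, and the script assumes it is run from the repository root so that data/sts_str resolves correctly.

```python
from pathlib import Path

# Directory names created by the three git clone commands above.
EXPECTED_DIRS = [
    "cross-lingual-sts",
    "Kardes-NLU",
    "Semantic_Relatedness_SemEval2024",
]


def missing_datasets(base_dir="data/sts_str"):
    """Return the expected dataset directories not found under base_dir.

    Hypothetical helper for illustration; base_dir is assumed to be
    relative to the repository root.
    """
    base = Path(base_dir)
    return [name for name in EXPECTED_DIRS if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing datasets:", ", ".join(missing))
    else:
        print("All datasets found.")
```

The check only verifies that each directory exists; it does not validate the contents of the cloned repositories.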