diff --git a/README.md b/README.md
index 57d187f..3ebe2ff 100644
--- a/README.md
+++ b/README.md
@@ -14,42 +14,44 @@ The performance of SNCSE on STS task with different encoders is:
 To reproduct above results, please [download](https://pan.baidu.com/s/1fkvNRxu-ytbVbtxQhNF4Gw?pwd=9y7y) the files and unzip it to replace the original file folder. Then [download](https://pan.baidu.com/s/10KpCU2v_Wk36OxEBSdykiQ?pwd=0wot) the models, modify the file path variables and run:
-
+```
 python bert_prediction.py
 python roberta_prediction.py
-
+```
 To train SNCSE, please [download](https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/blob/main/wiki1m_for_simcse.txt) the training file, and put it at /SNCSE/data. You can either run:
-
+```
 python generate_soft_negative_samples.py
-
-to generate soft negative samples, or use our files in /Files/soft_negative_samples.txt. Then you may modify and run train_SNCSE.sh.
+```
+to generate soft negative samples, or use our files in `/Files/soft_negative_samples.txt`. Then you may modify and run `train_SNCSE.sh`.
 To evalute the checkpoints saved during traing on the development set of STSB task, please run:
-
+```
 python bert_evaluation.py
 python roberta_evaluation.py
-
+```
 Feel free to contact the authors at wanghao2@sensetime.com for any questions.
 Please cite SNCSE as
+```
 {
-Hao Wang, Yangguang Li, Zhen Huang, Yong Dou, Lingpeng Kong, Jing Shao.
+ Hao Wang, Yangguang Li, Zhen Huang, Yong Dou, Lingpeng Kong, Jing Shao.
-SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples.
+ SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples.
-CoRR, abs/2201.05979, 2022.
+ CoRR, abs/2201.05979, 2022.
 }
+```