This tool uses the SBERT SentenceTransformer and NLTK Punkt Sentence Tokenizer to compare two texts for similarity.
It outputs a sorted list of high-scoring sentence pairs with their scores, the most similar pair, and a similarity index (the average similarity).
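The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names (`cosine`, `summarize`, `compare_texts`), the `all-MiniLM-L6-v2` model, the `0.5` threshold, and the choice to average over all sentence pairs are assumptions made for the example. See text_similarity_checker.py for what the tool really does.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def summarize(sents_a, sents_b, vecs_a, vecs_b, threshold=0.5):
    """Score every sentence pair and summarize the result.

    Returns (high_pairs, best_pair, similarity_index). Averaging over all
    pairs is an assumption; the real script may define the index differently.
    """
    scored = [
        (cosine(va, vb), sa, sb)
        for sa, va in zip(sents_a, vecs_a)
        for sb, vb in zip(sents_b, vecs_b)
    ]
    # Highest-scoring pairs first.
    scored.sort(key=lambda item: item[0], reverse=True)
    high_pairs = [item for item in scored if item[0] >= threshold]
    best_pair = scored[0] if scored else None
    index = sum(score for score, _, _ in scored) / len(scored) if scored else 0.0
    return high_pairs, best_pair, index


def compare_texts(text_a, text_b, model_name="all-MiniLM-L6-v2", threshold=0.5):
    """Split both texts with Punkt, embed the sentences with SBERT, then summarize."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    import nltk
    from sentence_transformers import SentenceTransformer

    nltk.download("punkt", quiet=True)  # newer NLTK releases may need "punkt_tab"
    sents_a = nltk.sent_tokenize(text_a)
    sents_b = nltk.sent_tokenize(text_b)
    model = SentenceTransformer(model_name)  # assumed model, downloaded on first use
    vecs_a = model.encode(sents_a).tolist()
    vecs_b = model.encode(sents_b).tolist()
    return summarize(sents_a, sents_b, vecs_a, vecs_b, threshold)
```

Calling `compare_texts(text_a, text_b)` with two strings would return the high-scoring pairs, the best pair, and the average score. The scoring is kept separate from the embedding step so it can be tried with small hand-made vectors before downloading a model.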
Install the required dependencies:
pip install -r requirements.txt
Run the text_similarity_checker script from the command line:
python3 text_similarity_checker.py
Note
By default, the script compares Bob Dylan's rather infamous Nobel lecture to its alleged source. To compare your own texts, add them as Python strings to the project's root directory, then modify text_similarity_checker slightly to use them. See the script for details.
This project is still in development.