A script that checks texts for similarity using the SBERT and NLTK libraries.

merobi-hub/text-similarity-checker

Text similarity checker using SBERT and NLTK

This tool uses the SBERT SentenceTransformer and NLTK Punkt Sentence Tokenizer to compare two texts for similarity.

It outputs a sorted list of high-scoring sentence pairs, each with its score, along with the most similar pair and the similarity index (the average similarity).
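The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names, the 0.5 threshold, the `all-MiniLM-L6-v2` model name, and the choice to average over all sentence pairs are assumptions made for the example. The SBERT and NLTK imports are kept inside helper functions so the scoring logic can be read and run on its own.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Pair = Tuple[float, str, str]  # (score, sentence from text A, sentence from text B)


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def compare_sentences(
    sents_a: List[str],
    sents_b: List[str],
    embed: Callable[[List[str]], Sequence[Vector]],
    threshold: float = 0.5,  # assumed cut-off for "high-scoring"
) -> Tuple[List[Pair], Pair, float]:
    """Return (high-scoring pairs sorted by score, best pair, average score)."""
    if not sents_a or not sents_b:
        raise ValueError("both texts must contain at least one sentence")
    emb_a, emb_b = embed(sents_a), embed(sents_b)
    # Score every sentence of text A against every sentence of text B.
    scored = [
        (cosine(ea, eb), sa, sb)
        for sa, ea in zip(sents_a, emb_a)
        for sb, eb in zip(sents_b, emb_b)
    ]
    high = sorted(
        (p for p in scored if p[0] >= threshold),
        key=lambda p: p[0],
        reverse=True,
    )
    best = max(scored, key=lambda p: p[0])
    # Assumption: the similarity index is the mean over all pairs.
    average = sum(p[0] for p in scored) / len(scored)
    return high, best, average


def split_sentences(text: str) -> List[str]:
    """Split text with NLTK's Punkt tokenizer (its data must be downloaded first)."""
    from nltk.tokenize import sent_tokenize
    return sent_tokenize(text)


def sbert_embed(sentences: List[str]):
    """Embed sentences with an SBERT model (the model name is an assumption)."""
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(sentences)
```

With both libraries installed, a call would look like `compare_sentences(split_sentences(text_a), split_sentences(text_b), sbert_embed)`, where `text_a` and `text_b` stand in for your two strings.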

To use

Install the required dependencies:

pip install -r requirements.txt

Run the text_similarity_checker script from the command line:

python3 text_similarity_checker.py

Note

By default, the script compares Bob Dylan's rather infamous Nobel lecture to its alleged source. To compare your own texts, add them to the project's root directory as Python strings, then modify text_similarity_checker.py slightly to use them. See the script for details.

Status

In development.
