I have been trying, without success, to get sumy to work in a Jupyter Notebook, but it always throws an error from the Tokenizer.
Here is my Jupyter Notebook code:
!python -c "import nltk; nltk.download('stopwords')"
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lsa import LsaSummarizer
text = "Your long text here..."
parser = PlaintextParser.from_string(text, Tokenizer("english"))
summarizer = LsaSummarizer()
summary = summarizer(parser.document, 3) # Summarize to 3 sentences
for sentence in summary:
    print(sentence)
When I run this code I get the following error:
UnpicklingError Traceback (most recent call last)
Cell In[22], line 6
3 from sumy.summarizers.lsa import LsaSummarizer
5 text = "Your long text here..."
----> 6 parser = PlaintextParser.from_string(text, Tokenizer("english"))
7 summarizer = LsaSummarizer()
8 summary = summarizer(parser.document, 3) # Summarize to 3 sentences
File ~/Desktop/sample_project/env/lib/python3.10/site-packages/sumy/nlp/tokenizers.py:160, in Tokenizer.__init__(self, language)
157 self._language = language
159 tokenizer_language = self.LANGUAGE_ALIASES.get(language, language)
--> 160 self._sentence_tokenizer = self._get_sentence_tokenizer(tokenizer_language)
161 self._word_tokenizer = self._get_word_tokenizer(tokenizer_language)
File ~/Desktop/sample_project/env/lib/python3.10/site-packages/sumy/nlp/tokenizers.py:172, in Tokenizer._get_sentence_tokenizer(self, language)
170 try:
171 path = to_string("tokenizers/punkt/%s.pickle") % to_string(language)
--> 172 return nltk.data.load(path)
173 except (LookupError, zipfile.BadZipfile) as e:
174 raise LookupError(
175 "NLTK tokenizers are missing or the language is not supported.\n"
176 """Download them by following command: python -c "import nltk; nltk.download('punkt')"\n"""
177 "Original error was:\n" + str(e)
178 )
What can I do to fix this issue?
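For reference, the LookupError message in the traceback suggests downloading NLTK's punkt tokenizer data, which the code above never does (it only downloads 'stopwords'). A minimal sketch of that suggestion, run before constructing the Tokenizer (note: 'punkt_tab' is only available on recent NLTK releases, and I have not verified that this alone resolves the UnpicklingError):

import nltk

# Sentence tokenizer data that sumy's Tokenizer loads from tokenizers/punkt/<language>.pickle.
nltk.download('punkt')
# Recent NLTK releases also ship the punkt data in a non-pickle format as 'punkt_tab';
# requesting it on older versions is harmless (the downloader just reports it as not found).
nltk.download('punkt_tab')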