First, the list contains a lot of old or literary conjugations of the auxiliary verbs. That is a lot of computation for words rarely used in modern French. But the real problem is that some entries are wrong. "été" is the past participle of "être" all right, but it is also the noun "summer", so you probably don't want it as a stopword. As a past participle, "été" is invariable, so the words "étée" and "étées" do not exist, and "étés" exists only as the plural of "summer". It's almost the same for the present participle "étant": it is invariable but also used as an adjective and as a noun in philosophy, so each inflected form either does not exist or is one you don't want to delete. "as" and "fut" are nouns too. Edit: I forgot some other polysemous entries: "suis", "est", "sommes" and "avions".
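As a rough illustration of the cleanup suggested above, here is a minimal sketch that prunes the flagged entries from a stopword collection. The names `AMBIGUOUS`, `NONEXISTENT` and `prune_stopwords` are hypothetical, and the sample list is a small stand-in rather than the actual NLTK list; only the words named in this issue are included.

```python
# Entries flagged in this issue as homographs of nouns or adjectives.
AMBIGUOUS = {"été", "étés", "étant", "as", "fut",
             "suis", "est", "sommes", "avions"}
# Inflected participle forms that, per this issue, do not exist in French.
NONEXISTENT = {"étée", "étées"}


def prune_stopwords(stopwords, drop=AMBIGUOUS | NONEXISTENT):
    """Return a set of stopwords without the entries listed in `drop`."""
    return {word for word in stopwords if word not in drop}


# Small stand-in sample, not the real NLTK French list.
sample = ["le", "la", "été", "étée", "étées", "étés", "étant", "suis", "et"]
pruned = prune_stopwords(sample)
print(sorted(pruned))  # ['et', 'la', 'le']
```

The same function could presumably be applied to whatever list a project actually loads; whether every flagged word should be dropped depends on the corpus.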
NLTK's stopwords lists come from the Snowball project, but someone added aberrant forms like "ayantes" to the French list. An easy solution could be to just go back to the original list.
A definitive list is not likely, because the criteria vary according to the purpose of the analysis: sometimes you don't want to entirely discard "to be or not to be".
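One way to make a list adaptable to the purpose of the analysis is to let the caller protect specific words from removal. The sketch below assumes a hypothetical helper `remove_stopwords` with a `keep` parameter, and uses a tiny made-up English list purely for illustration.

```python
def remove_stopwords(tokens, stopwords, keep=frozenset()):
    """Drop stopwords from `tokens`, except the words listed in `keep`."""
    effective = set(stopwords) - set(keep)
    return [tok for tok in tokens if tok.lower() not in effective]


tokens = "to be or not to be".split()
tiny_list = {"to", "be", "or", "not"}  # hypothetical stand-in list

# With the plain list, the whole phrase disappears.
print(remove_stopwords(tokens, tiny_list))  # []

# Protecting a few words keeps part of the phrase for this analysis.
print(remove_stopwords(tokens, tiny_list, keep={"be", "not"}))
# ['be', 'not', 'be']
```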
When asked if a definitive list can ever exist, it explains that even though they may not be definitive, these lists serve as a practical tool, and that they often need to be adapted to their purpose: fr-stopwords-4o-exist.txt