I noticed that `\\p{Emoji_Presentation}` does not match the emoji 💪️ in a test when the pattern is anchored to the whole token (`^...+$`), because the token also contains the variation selector "\ufe0f".
```r
require(quanteda.textstats)
#> Loading required package: quanteda.textstats
require(quanteda)
#> Loading required package: quanteda
#> Package version: 4.0.0
#> Unicode version: 14.0
#> ICU version: 70.1
#> Parallel computing: 4 of 4 threads used.
#> See https://quanteda.io for tutorials and examples.

txt <- "£ € 👏 Rock on❗ 💪️🎸"
toks <- tokens(txt)
toks
#> Tokens consisting of 1 document.
#> text1 :
#> [1] "£" "€" "👏" "Rock" "on" "❗" "💪️" "🎸"

tokens_select(toks, "^\\p{Emoji_Presentation}+$", valuetype = "regex")
#> Tokens consisting of 1 document.
#> text1 :
#> [1] "👏" "❗" "🎸"

tokens_select(toks, "\\p{Emoji_Presentation}", valuetype = "regex")
#> Tokens consisting of 1 document.
#> text1 :
#> [1] "👏" "❗" "💪️" "🎸"

stringi::stri_extract_all_regex(txt, "\\p{Emoji_Presentation}")
#> [[1]]
#> [1] "👏" "❗" "💪" "🎸"

stringi::stri_escape_unicode(types(toks))
#> [1] "\\u00a3" "\\u20ac" "\\U0001f44f"
#> [4] "Rock" "on" "\\u2757"
#> [7] "\\U0001f4aa\\ufe0f" "\\U0001f3b8"
```
Created on 2023-10-18 with reprex v2.0.2
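
A possible workaround, sketched here as an assumption rather than anything the maintainers have confirmed: since the 💪️ token is `\U0001f4aa` followed by `\ufe0f`, the anchored pattern can either tolerate an optional variation selector after each emoji, or the selector can be stripped before matching. The vector `emoji_types` and the names `pattern_vs` and `cleaned` are illustrative only; the code points are copied from the `stri_escape_unicode()` output in the reprex.

```r
library(stringi)

# Illustrative token types, using the code points shown in the reprex.
# "\U0001f4aa\ufe0f" is the flexed biceps followed by the variation selector.
emoji_types <- c("\U0001f44f", "\u2757", "\U0001f4aa\ufe0f", "\U0001f3b8", "Rock")

# Option 1: allow an optional U+FE0F after each Emoji_Presentation character.
pattern_vs <- "^(?:\\p{Emoji_Presentation}\\uFE0F?)+$"
match_with_selector <- stri_detect_regex(emoji_types, pattern_vs)

# Option 2: remove U+FE0F first, then apply the original anchored pattern.
cleaned <- stri_replace_all_fixed(emoji_types, "\ufe0f", "")
match_after_strip <- stri_detect_regex(cleaned, "^\\p{Emoji_Presentation}+$")

print(match_with_selector)
print(match_after_strip)
```

Because `tokens_select()` accepts regular expressions via `valuetype = "regex"`, the first pattern could presumably be passed there in place of `"^\\p{Emoji_Presentation}+$"`, though I have not checked that against quanteda 4.0.0.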
Temporary workardound for #64
58a1f03