Error when using lunr.zh.js 'nodejieba.cut is not a function' #91

Open
RogerBlasco opened this issue Nov 10, 2022 · 2 comments

Comments


RogerBlasco commented Nov 10, 2022

The error is:
Uncaught TypeError: nodejieba.cut is not a function
at lunr.zh.tokenizer (lunr.zh.js:98:1)
at lunr.Builder.add (lunr.js:2479:1)
at lunr.Builder. (xxx)
at lunr (lunr.js:53:1)
at XMLHttpRequest. (xxx)

The offending line in lunr.zh.tokenizer is:

nodejieba.cut(str, true).forEach(function(seg) {
        tokens = tokens.concat(seg.split(' '))
      })

I'm afraid I'm not able to dive in and resolve this myself right now, but if someone could review it, or let me know exactly what I would need to do to handle it, I would much appreciate it.


knubie commented Feb 23, 2023

Are you trying to run this in a browser environment? The zh tokenizer requires Node to run, because it uses C++ addons (nodejieba). I opened an issue (#90) where I talk about how you can use the built-in Intl.Segmenter instead to segment Chinese (and other) languages quite easily. Here is a fork where I switched the zh module to using Intl.Segmenter.
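
For reference, here is a minimal sketch of how Intl.Segmenter can segment Chinese text without any native addon. This is only an illustration, not the exact code from the fork; the segmentChinese helper name is made up for this example.

// Minimal sketch: word segmentation with the built-in Intl.Segmenter
// (available in modern browsers and Node 16+), in place of nodejieba.cut.
// The segmentChinese helper is illustrative, not the fork's actual code.
function segmentChinese(str) {
  var segmenter = new Intl.Segmenter('zh', { granularity: 'word' });
  var tokens = [];
  for (var part of segmenter.segment(str)) {
    // Keep only word-like segments; skip punctuation and whitespace.
    if (part.isWordLike) {
      tokens.push(part.segment);
    }
  }
  return tokens;
}

In an environment that supports Intl.Segmenter, calling segmentChinese on a Chinese string returns an array of word segments, which can then be passed on to lunr in place of the nodejieba output.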

@greylantern

@knubie this is awesome and solved the issue!
