Stream scalability issues #76

Open
rubensworks opened this issue Jul 20, 2021 · 12 comments

Comments

@rubensworks
Owner

rubensworks commented Jul 20, 2021

This is a master issue that acts as an overview of the following issues:

@rubensworks
Owner Author

For reference, this issue will require switching to a new internal JSON parser.
My tests indicate that stream-json is probably the way to go: https://github.com/rubensworks/test-performance-json-parse-stream

@thadguidry

@ohler55 has some nice, performant JSON parsers in his repos: https://github.com/ohler55

@rubensworks
Owner Author

@wouterbeek @LaurensRietveld Do you still want to go forward with this bounty? @Tpt would be available to take up this issue.

@wouterbeek

wouterbeek commented Aug 16, 2022

@rubensworks Definitely; thanks @Tpt for looking into this!

@Tpt
Contributor

Tpt commented Aug 16, 2022

As discussed with @rubensworks, I will work on this issue via the Comunica Association.

@Tpt
Contributor

Tpt commented Aug 17, 2022

Sadly, stream-json does not work out of the box in web browsers: uhop/stream-json#91

So, we might have to pick another parser.

@rubensworks
Owner Author

rubensworks commented Aug 17, 2022

> Sadly, stream-json does not work out of the box in web browsers: uhop/stream-json#91
>
> So, we might have to pick another parser.

Ah, that's too bad...

Any suggestions for alternatives?
Perhaps some web-compatible forks exist of that package? Or just some fully different implementations?

If no alternatives exist, we could propose a PR that makes that lib web-compatible. I would guess the effort for that would be limited (but we'd first have to be certain that this lib really does everything we need).

@Tpt
Contributor

Tpt commented Aug 17, 2022

> Ah, that's too bad...
>
> Any suggestions for alternatives?
> Perhaps some web-compatible forks exist of that package? Or just some fully different implementations?

Thank you! I have opened #100 to discuss it.

@thadguidry

Streaming is a memory- and IO-bound operation. The best approach will likely be WASM + Go, using Peter Ohler's OjG as a base: http://www.ohler.com/

You can email him; he's very responsive and can answer any questions. The work will be around compiling the Go code into a .wasm file that you can launch with wasm_exec.js. See https://binx.io/2022/04/22/golang-webassembly/

Essentially, compile Peter's code into a .wasm file that JavaScript executes for the heavy lifting of stream parsing JSON.

@rubensworks
Owner Author

Thanks for the input @thadguidry.
While I think WASM is a very interesting direction, it would increase the scope of this issue quite significantly, and we unfortunately don't have the bandwidth (or budget) to take this up.

@thadguidry

Ah, you want to stay in JavaScript, no worries.

@rubensworks
Owner Author

Unfortunately, @Tpt is no longer able to continue working on this bounty, so it will be unlocked again for other developers.

The current status of this bounty is that a new library has been created to parse JSON in a streaming (memory-safe) manner: https://github.com/comunica/json-event-parser.js

The next step in this bounty is to plug it into this parser, and handle parsing in a spec-compliant manner.
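
For readers new to the thread, here is a rough TypeScript sketch of what parsing JSON "in a streaming manner" can look like. It is not the API of json-event-parser.js or stream-json. `ArrayElementSplitter` and `splitChunks` are hypothetical names made up for this illustration. The sketch assumes the input is a top-level JSON array whose elements are objects or arrays (scalar elements are skipped for brevity). It keeps its state across chunk boundaries and buffers only the element currently being read, so memory is bounded by the largest element rather than by the whole document.

```typescript
// Hypothetical illustration only; not part of json-event-parser.js or stream-json.
class ArrayElementSplitter {
  private depth = 0;        // current nesting depth of {...} and [...]
  private inString = false; // true while inside a JSON string literal
  private escaped = false;  // true right after a backslash inside a string
  private current = "";     // text of the element currently being buffered

  // Feed one chunk of text; returns the elements completed by this chunk.
  feed(chunk: string): unknown[] {
    const done: unknown[] = [];
    for (let i = 0; i < chunk.length; i++) {
      const ch = chunk.charAt(i);
      if (this.inString) {
        // Brackets inside strings must not affect the depth counter.
        if (this.depth >= 2) this.current += ch;
        if (this.escaped) this.escaped = false;
        else if (ch === "\\") this.escaped = true;
        else if (ch === '"') this.inString = false;
        continue;
      }
      if (ch === '"') {
        this.inString = true;
        if (this.depth >= 2) this.current += ch;
      } else if (ch === "{" || ch === "[") {
        this.depth++;
        // Depth 1 is the outer array itself; buffering starts at depth 2.
        if (this.depth >= 2) this.current += ch;
      } else if (ch === "}" || ch === "]") {
        if (this.depth >= 2) this.current += ch;
        this.depth--;
        if (this.depth === 1) {
          // A direct child of the outer array just closed: emit it, free the buffer.
          done.push(JSON.parse(this.current));
          this.current = "";
        }
      } else if (this.depth >= 2) {
        this.current += ch;
      }
    }
    return done;
  }
}

// Convenience wrapper: run all chunks through one splitter and collect the results.
function splitChunks(chunks: string[]): unknown[] {
  const splitter = new ArrayElementSplitter();
  const out: unknown[] = [];
  for (const chunk of chunks) {
    for (const element of splitter.feed(chunk)) out.push(element);
  }
  return out;
}

// The chunk boundaries fall in the middle of an object and in the middle of a string.
console.log(JSON.stringify(splitChunks(['[{"a":', '1},{"b":"x]', '\\"y"},', '[2,3]]'])));
```

A real streaming parser also has to validate the input, decode values incrementally instead of calling `JSON.parse` on a buffered element, and handle scalars, which is why the thread points to a dedicated library rather than a hand-rolled scanner like this one.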
