
Streaming exports #1828

Closed
wants to merge 14 commits
Conversation

ballPointPenguin
Contributor

I reverted this PR and re-opened it.
Somehow these changes caused the Heroku deploy to be unable to connect to Postgres.

samskivert and others added 14 commits September 19, 2024 13:27
And add a "streaming" API for making database queries, which streams the
results from the database to Node as they are generated by Postgres.

This allows Node to process the rows one by one (and garbage collect in
between), which is much easier on the VM when we need to do big queries that
summarize data (or just format it and incrementally spit it out in an HTTP
response).
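The row-by-row processing described here can be sketched with a Node object-mode Transform. In production the row source would presumably be a pg-query-stream QueryStream; the row shape, CSV columns, and names below are assumptions for illustration, not the PR's actual code:

```typescript
import { Transform } from "node:stream";

// Hypothetical row shape for the votes export; the real query's columns may differ.
interface VoteRow {
  timestamp: number;
  pid: number; // participant id
  tid: number; // comment id
  vote: number;
}

// Object-mode Transform that formats one row at a time as a CSV line, so each
// row can be garbage-collected as soon as it has been written downstream.
function csvLines(): Transform {
  return new Transform({
    objectMode: true,
    transform(row: VoteRow, _encoding, done) {
      done(null, `${row.timestamp},${row.pid},${row.tid},${row.vote}\n`);
    },
  });
}

// A sketch of how the pieces would be wired together (illustrative only):
//   const rows = client.query(new QueryStream("SELECT ...", [zid]));
//   res.write("timestamp,voter-id,comment-id,vote\n");
//   rows.pipe(csvLines()).pipe(res);
```

Because each row flows through the pipeline independently, memory use stays flat regardless of how many rows the query returns.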
This moves the handle_GET_reportExport route into its own file, which
necessitated refactoring some other things (zinvite and pca) out of server.ts
as well. Chipping away at the monolith.

This also converts the votes.csv report to use the streaming query from
Postgres, which is mostly a smoke test. It seems to work, so next I'll convert
it to stream the results incrementally to the HTTP response as well.
There was actually a bug in the old SQL that aggregated votes from _all_
conversations instead of just the conversation in question, which is why it
took 30 seconds to run. With that bug fixed, even the super slow "do a full
subquery for each comment row" was actually quite fast. But this is way
cheaper/faster.
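The scoping bug described above is the kind where an aggregate query omits a conversation filter. A hedged sketch of the corrected shape (table and column names are assumptions based on Polis conventions, where zid is a conversation id and tid a comment id; this is not the PR's actual SQL):

```typescript
// The crucial predicate is `WHERE votes.zid = $1`, which scopes the counts to
// one conversation instead of aggregating votes from every conversation.
const countsByComment = `
  SELECT tid,
         COUNT(*) FILTER (WHERE vote < 0) AS agrees,    -- raw -1 means agree
         COUNT(*) FILTER (WHERE vote > 0) AS disagrees  -- raw 1 means disagree
  FROM votes
  WHERE votes.zid = $1
  GROUP BY tid`;
```

A single grouped aggregation like this also replaces the per-comment correlated subquery, which is why it is much cheaper.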
And add a "streaming" API for making database queries, which streams the
results from the database to Node as they are generated by Postgres.

This allows Node to process the rows one by one (and garbage collect in
between), which is much easier on the VM when we need to do big queries that
summarize data (or just format it and incrementally spit it out an HTTP
response).
This moves the handle_GET_reportExport route into its own file, which
necessitated refactoring some other things (zinvite and pca) out of server.ts
as well. Chipping away at the monolith.

This also converts the votes.csv report to use the streaming query from
Postgres, which is mostly a smoke test. It seems to work, so next I'll convert
it to stream the results incrementally to the HTTP response as well.
There was actually a bug in the old SQL that aggregated votes from _all_
conversations instead of just the conversation in question, which is why it
took 30 seconds to run. With that bug fixed, even the super slow "do a full
subquery for each comment row" was actually quite fast. But this is way
cheaper/faster.
In the raw votes table, -1 means agree and 1 means disagree, so we need to
count things correctly. And when exporting votes in participant votes, we flip
the sign so that 1 means agree and -1 means disagree.
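The sign convention amounts to a one-line flip at export time; a minimal sketch (the function name is illustrative, not the PR's code):

```typescript
// Raw storage convention: -1 = agree, 1 = disagree, 0 = pass.
// Export convention:       1 = agree, -1 = disagree, 0 = pass.
function toExportVote(rawVote: -1 | 0 | 1): number {
  return -rawVote;
}
```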
@ballPointPenguin
Contributor Author

ballPointPenguin commented Oct 24, 2024

from heroku log:

2024-10-24T03:51:12.271040+00:00 app[web.1]: error: pg_connect_pool_fail no pg_hba.conf entry for host "3.80.159.156", user "[REDACTED]", database "[REDACTED]", no encryption {"code":"28000","file":"auth.c","length":172,"line":"559","name":"error","routine":"ClientAuthentication","service":"polis-api-server","severity":"FATAL","stack":"error: no pg_hba.conf entry for host "3.80.159.156", user "[REDACTED]", database "[REDACTED]", no encryption\n at Parser.parseErrorMessage (/app/node_modules/pg-protocol/src/parser.ts:369:69)\n at Parser.handlePacket (/app/node_modules/pg-protocol/src/parser.ts:188:21)\n at Parser.parse (/app/node_modules/pg-protocol/src/parser.ts:103:30)\n at Socket. (/app/node_modules/pg-protocol/src/index.ts:7:48)\n at Socket.emit (node:events:517:28)\n at Socket.emit (node:domain:489:12)\n at addChunk (node:internal/streams/readable:368:12)\n at readableAddChunk (node:internal/streams/readable:341:9)\n at Socket.Readable.push (node:internal/streams/readable:278:10)\n at TCP.onStreamRead (node:internal/stream_base_commons:190:23)\n at TCP.callbackTrampoline (node:internal/async_hooks:128:17)","timestamp":"2024-10-24 03:51:12 +00:00"}

@ballPointPenguin
Contributor Author

Closing in favor of #1829

@ballPointPenguin ballPointPenguin deleted the streaming-exports branch November 15, 2024 07:00
3 participants