Ideal database #13
Comments
IMO, let's just start building it as memory-resident; then, if we find a compelling reason to move to a database, let's drive the decision of which one to choose off of the particular challenge that comes up.
It cannot be purely memory-resident; the tracked bills must be persistent, and the custom JSON option is the best for the medium term. We are also considering building a flag into the code to limit it to one request per server. This way we resolve the concurrency issues and also reduce the load on the bot.
Sorry, I really just meant that the comparisons are done against an object held in memory. I didn't mean we shouldn't dump that to a .json file. So, what I was suggesting is: [new data] -> add to memory -> dump memory to .json file; on initial load -> load .json file to memory.
Oh sorry, I must have misread it, but I was still clearing up doubts about the other options. We will test how it performs; we don't expect this bot to be very resource-heavy, but if you look at the code, this is basically how it is being done right now: loading the tracked bills from the file, checking/updating their status from the API, updating the file on disk, and then messaging the channels if there are updates.
Oh, I see... sorry, I have not had a chance to get up to speed on the way it's working now. Are there any challenges with the current implementation? Any reason you might expect this to need to change soon?
The first change we foresaw with this approach was concurrency: if two requests to update the file came in at once, the resulting file would be written incorrectly. With that problem in mind, we first thought about requesting a lock for the file manipulation, but then we had an even better idea: limit the requests to one per server. That way we keep the bot's load down and resolve the concurrency issues. And given the scope of the project, there won't be a need for any database, because we are not looking to store all the information and status changes for the bills.
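The "one request per server" flag mentioned above could look something like the sketch below. This is an assumption about how such a guard might be implemented, not the project's code; `requestUpdate`, the server IDs, and the `inFlight` set are all illustrative names:

```javascript
// Sketch: at most one in-flight update per server (all names are illustrative).
const inFlight = new Set(); // server IDs with an update currently running

async function requestUpdate(serverId, doUpdate) {
  if (inFlight.has(serverId)) {
    return false; // reject: an update for this server is already running
  }
  inFlight.add(serverId);
  try {
    await doUpdate();
    return true;
  } finally {
    inFlight.delete(serverId); // always release the flag, even on error
  }
}
```

Because the flag is checked and set synchronously before the first `await`, two overlapping calls for the same server can't both pass the check in a single-threaded Node process.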
I see what you mean about the concurrency problem... but that's actually an issue that extends beyond merely writing to file - it exists before the data touches the physical storage layer. Said differently, if you have a concurrency problem when writing, you probably have a concurrency bug in your code. Probably the easiest way to address this is by implementing a simple FIFO queue, such as this: https://stackoverflow.com/questions/1590247/how-do-you-implement-a-stack-and-a-queue-in-javascript Sorry, I'm commenting before actually analyzing the code; I just felt that was an important thing to point out since it may impact your current architecture.
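The FIFO-queue idea above could be sketched as follows: every write is funneled through one queue and performed strictly in arrival order, so two concurrent updates can never interleave. This is a minimal illustration under assumed names (`enqueueWrite`, `drain`), not a drop-in for the bot:

```javascript
// Minimal FIFO write queue (sketch): tasks are () => Promise.
const writeQueue = [];
let draining = false;

function enqueueWrite(task) {
  writeQueue.push(task); // enqueue at the tail
  if (!draining) drain(); // start the drain loop if idle
}

async function drain() {
  draining = true;
  while (writeQueue.length > 0) {
    const task = writeQueue.shift(); // FIFO: oldest task first
    await task(); // finish one write completely before starting the next
  }
  draining = false;
}
```

Each file update would be wrapped in a task and passed to `enqueueWrite`, so the serialization happens in one place instead of being scattered through the code.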
Why is a DB layer not viable at this point? I am not sure I understand that part of the equation; if I understood why it isn't viable, then I would better understand the problem.
There is something nice about the entire database/config being stored in a human-readable JSON file. |
As much as I love/prefer working with traditional SQL, the only benefit I see to using it would be the aforementioned solution to concurrency problems. Not all concurrency issues are bugs; if they were, enterprise storage arrays wouldn't exist.

Anyway, there is very little opportunity to de-duplicate fields within the JSON file. We would get some savings from the states (breaking them into their own table; I mean MA, TX, TN, etc.), and a lot of savings from not having the same keywords repeated 1,000 times. But we could get around that problem far more easily by just passing the resulting JSON file through LZMA before we write it to disk, and we'd get nearly the same space savings as we'd get from a normal database.

If we don't already, it would be a good idea to keep a separate database file (JSON or otherwise) of timers for when to refresh API objects. The API doesn't provide a cache timer within the JSON result (or at least I didn't see it when I checked), so we would need to write down our own 'cached until' fields so we'd know when to re-check for data.
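The 'cached until' bookkeeping suggested above could be as small as a map from bill ID to an expiry timestamp, persisted alongside (or inside) the main JSON file. A minimal sketch, assuming illustrative names (`cacheTimers`, `markFetched`, `needsRefresh`) and a caller-chosen TTL since the API gives us none:

```javascript
// billId -> epoch ms after which we must re-fetch from the API.
const cacheTimers = {};

function markFetched(billId, ttlMs) {
  // Record our own 'cached until' time, since the API result carries none.
  cacheTimers[billId] = Date.now() + ttlMs;
}

function needsRefresh(billId) {
  // Refresh anything we have never fetched, or whose timer has expired.
  const until = cacheTimers[billId];
  return until === undefined || Date.now() >= until;
}
```

On each polling pass, the bot would only hit the API for bills where `needsRefresh` returns true, then call `markFetched` after a successful update.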
I would recommend LokiJS because it saves the data as regular JSON files, is synchronous, supports "collections", and is lightweight and easy to use.
I get that it being human-readable is nice, but I have been told it is not viable. Document DBs are just as readable as JSON files, so that doesn't seem to be a valid reason to stay away from them. And double-clicking a file vs. double-clicking a collection doesn't seem to add complexity either. So if I understood the reason that DBs are not a viable solution, I think I could help out with coming up with one.
Please discuss here.