Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Purchase heatmap #352

Merged
merged 68 commits into from
Sep 30, 2023
Merged

Conversation

krestenlaust
Copy link
Member

@krestenlaust krestenlaust commented Sep 3, 2023

The goal of this PR is to add a heatmap over purchase activity, similar to Githubs activity heatmap, to userpages in Stregsystemet.

Plan:

  • Mix day-cell colors based on what products are purchased on a given day, red=beer, green=energy, blue=soda (example categories).
  • The brightness of a cell is proportional to money spent in relation to the day with most money spent.

Decisions to make:

  • Should there be a way to collapse the statistics (and store it on the user)? (to enable privacy for users not using multibuy)

  • Should cell color mixing be based on value sum or product count, in each category?
    Have multiple views/modes, one for mixing cells, one for value sum, and one for product count.

  • Maybe make brightness regard item count as opposed to money, to have more common brightness values as opposed to all kinds of percentages.
    Have multiple views/modes.

  • What page to put the statistics on. This will probably be dependent on how slow the queries are going to be, otherwise I'd prefer it be on a non-obscure page like usermenu (as opposed to statistics-page).
    Performance should be okay for it to be included directly in user menu.

  • Consider what to do with products outside of a category
    Simply disregard them, since it's only relevant to a single of the heatmap modes.

Resources:

TO-DO:

  • Replace mockup with generated html (Finished by da2d722)
  • Make functions for calculating color based on data (Finished by 8907d1f)
  • Query database for actual data instead of using mockup data (Finished by 485ff25)
  • Finish basic heatmap using database lookup (Finished by 2bb08be)
  • Finish abstraction of heatmap color modes (Finished by 8328017)
  • Improve queries (There's a bunch of similar queries to db) (Finished by 25e3beb. Selects related fields immediately)
  • Need a way to display what products have been bought on a particular day (Finished by d59af25)
  • Add money-sum heatmap color mode (Similar to general, but in relation to money used instead of product count) (Finished by aee0aed)
  • Implements tests (suggestions: empty user, correct date on data) (Finished by accee71)
  • Polish (make more interactive, and possibly interpolate colors instead of just using RGB values) (Finished by 155e227)
  • User evaluation & feedback (Did a few tests with different people, they were able to find the different modes of the heatmap, and were able to successfully navigate the heatmap in terms of dates and weeks. The category view was confusing to the user because the colors are off)
  • Rephrase/remove comments, and improve code (Finished by ad06c7e)
  • Category view doesn't mix colors in the expected way. Awaiting feedback from DVML student (Skipped)
  • (Maybe later?) Add admin settings to control defaults since categories need to be assigned somehow (Not finished :shipit:)

This PR closes #328.

Finished demonstration:
HeatmapDemo

@codecov-commenter
Copy link

codecov-commenter commented Sep 3, 2023

Codecov Report

Merging #352 (85805b1) into next (7b604a0) will increase coverage by 0.56%.
The diff coverage is 94.40%.

❗ Current head 85805b1 differs from pull request most recent head 6c8edc6. Consider uploading reports for the commit 6c8edc6 to get more accurate results

@@            Coverage Diff             @@
##             next     #352      +/-   ##
==========================================
+ Coverage   53.33%   53.90%   +0.56%     
==========================================
  Files          32       33       +1     
  Lines        4273     4508     +235     
  Branches      237      265      +28     
==========================================
+ Hits         2279     2430     +151     
- Misses       1943     2028      +85     
+ Partials       51       50       -1     
Files Coverage Δ
stregsystem/tests.py 76.81% <100.00%> (+0.97%) ⬆️
stregsystem/views.py 31.80% <100.00%> (-0.04%) ⬇️
stregsystem/purchase_heatmap.py 57.28% <92.68%> (ø)

@krestenlaust
Copy link
Member Author

The handwritten HTML has been replaced with generated HTML. Mockup data is randomly generated:

Screenshot of current layout

@krestenlaust
Copy link
Member Author

For some reason the product instances don't have any categories assigned 🤔
Debugger view

Even though they clearly are assigned:
Admin view

This is how I retrieve the product instances:
Code

@falkecarlsen have you got any idea why this is happening, or any debugging advice for how to move on ?

@falkecarlsen
Copy link
Member

You should look into https://docs.djangoproject.com/en/4.2/topics/db/queries/#lookups-that-span-relationships

On mobilepay-fixture I created a category which product pk=1 has, then the following filter works:
Product.objects.filter(categories__pk=1)

The Django ORM sometimes seems like it hallucinates completely unrelated predicates into filters, but that's just how it resolves SQL-join operations.

@falkecarlsen
Copy link
Member

I think it's a nice nod to our users that it emulates the Github activity graph, which starts on a Sunday. This also makes the weekday padded by Sat and Sun in top and bottom, and therefore centered.

I see the issue with week-numbers, but then again; do we need or want the week-numbers? Having months for reference seems a good tradeoff:
image

@krestenlaust krestenlaust marked this pull request as ready for review September 21, 2023 09:41
@krestenlaust
Copy link
Member Author

Focus points for review:

  • Making use of Q-objects in queries (?)
    I don't have any intuition in determining when to use Q-objects, so I've just been using regular queries.
  • Performance
    I have been considering performance while developing, but I can't deny that it takes ~300 ms extra to load user-menu page now. I'd really prefer the chart stays on user-menu page, because of my efforts in making it.
  • Category chart doesn't take proportions into consideration and looks a bit crude because of it.
    I'll admit. I couldn't figure out a way to make each color component scale according to proportion AND do so inversely, (lower proportions = lighter colors, like the other charts). It feels like a super simple problem, but I've been pulling my hair out for long enough now. A possible solution to the crude look would be to simply take an average of three (nicer) colors, and not including a color in the average if it isn't represented (e.g. beer and soda day = blue and red average).

If you're able to boot up a version (If you're TTS isn't too high), it would be great to make sure we're on the same page about everything! I've included a fixture. I've updated it since, so I'll try to reupload it.

@falkecarlsen I see the issue with week-numbers, but then again; do we need or want the week-numbers? Having months for reference seems a good tradeoff

For now I think I prefer weeks. They're still more precise than months (and they don't require me to make semi-large changes to the code😅)

:fritfit:

Copy link
Contributor

@jonasKjellerup jonasKjellerup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a few minor suggestions. There doesn't seem to be all that much in the way low hanging fruit for performance improvements, but I wouldn't worry all that much about it.

With that said, you could consider trying to reduce the number of round-trips that you are doing to the database. I supsect that for your category lookups, the lookup time probably isn't much longer than the time to establish the connection and build the query. By changing get_category_objects to take a list of category names and doing a query akin to this:

select category_name, product_name
from Categories
inner join Products on Categories.id = Products.category
where category_name in ('category 1', 'category 2')
order by category_name

you could do the lookup in one query instead of 6. Not sure what this maps to in Django ORM land, but the general idea is just to reduce the number of lookups you are doing.

Additionally you can add an index on category_name and maybe save a milisecond or two on the lookup.


Edit: Also, to better convey the proportional magnitude for each day as whole, when in category view. You can reserve part of the color range and use that. Basically, instead of doing ratio * 255 you would do something like ratio * 200 + (magnitude * 55), where a magnitude closer to 1 makes lighter and a value closer to 0 darker.

stregsystem/purchase_heatmap.py Outdated Show resolved Hide resolved
stregsystem/purchase_heatmap.py Outdated Show resolved Hide resolved
stregsystem/templates/stregsystem/purchase_heatmap.html Outdated Show resolved Hide resolved
@krestenlaust
Copy link
Member Author

With that said, you could consider trying to reduce the number of round-trips that you are doing to the database. I supsect that for your category lookups, the lookup time probably isn't much longer than the time to establish the connection and build the query. By changing get_category_objects to take a list of category names and doing a query akin to this:

I've been looking into it, but I haven't been able to make it quite as good as your sql example. Unfortunately, I'm not that versed in Django ORM, and I keep finding the dynamics/behavior of their many-to-many fields (a feature categories are using) ... odd.

I've managed to reduce it from 6 queries to 3 queries, by combining individual category lookups with product.id selections.

I'll continue looking into the other points you mentioned

@krestenlaust
Copy link
Member Author

I did some performance improvements since last review, but I haven't been able to optimize it completely (which I'm fine with). I've also adjusted the category chart to a satisfactory state.

@krestenlaust
Copy link
Member Author

No pressure

@krestenlaust
Copy link
Member Author

krestenlaust commented Sep 30, 2023

@falkecarlsen can you indicate whether you intend on doing a review.

Edit: If you're able to do it within the next week if you intend on doing it

Copy link
Member

@falkecarlsen falkecarlsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :shipit:

@falkecarlsen falkecarlsen merged commit 46d63d6 into f-klubben:next Sep 30, 2023
4 checks passed
@krestenlaust krestenlaust deleted the purchase-heatmap branch February 29, 2024 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make github activity heatmap-like statistics for purchases
5 participants