Enhance assets processing reliability and logs #96

Merged (1 commit) on Dec 6, 2024

benoit74 (Contributor) commented Nov 26, 2024

Fix #84
Fix #89

Changes:

  • revisit all logs to display as much information as possible
  • enrich assets metadata (in memory) to be able to log where they are used
  • catch all exceptions raised while processing an asset, not only RequestsException (if a serious bug slips through, the asset threshold is intended to catch it anyway)
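
To illustrate the three changes listed above, here is a minimal sketch, not the code from this PR. The names `AssetDetails`, `AssetProcessor`, `used_by`, `bad_assets_threshold` and the `fetch` callable are hypothetical and chosen only for illustration. The sketch assumes the threshold is a simple count of failed assets. It shows in-memory metadata recording where an asset is used, a deliberately broad `except Exception`, and a threshold acting as the safety net.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("assets_sketch")


@dataclass
class AssetDetails:
    """In-memory metadata for one asset (hypothetical structure)."""

    url: str
    # Pages referencing this asset, kept only so that logs can mention them.
    used_by: set[str] = field(default_factory=set)


class AssetProcessor:
    """Processes assets, tolerating failures up to a threshold (assumed to be a count)."""

    def __init__(self, bad_assets_threshold: int) -> None:
        self.bad_assets_threshold = bad_assets_threshold
        self.bad_assets_count = 0

    def process(self, asset: AssetDetails, fetch) -> bool:
        """Return True on success, False if the asset was skipped after a failure."""
        try:
            fetch(asset.url)
            return True
        except Exception as exc:  # deliberately broad, not only network errors
            self.bad_assets_count += 1
            logger.warning(
                "Failed to process asset %s (used by: %s): %s",
                asset.url,
                ", ".join(sorted(asset.used_by)) or "unknown",
                exc,
            )
            # Safety net: too many failures probably means a real bug.
            if self.bad_assets_count > self.bad_assets_threshold:
                raise RuntimeError(
                    f"Too many bad assets: {self.bad_assets_count}"
                ) from exc
            return False
```

With a threshold of 1, the first failing asset is logged and skipped, and the second one aborts processing. A real scraper might well express the threshold differently, for example as a ratio of failed assets.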

benoit74 self-assigned this Nov 26, 2024
benoit74 marked this pull request as ready for review November 26, 2024 15:20
benoit74 requested a review from rgaudin November 26, 2024 15:20

codecov bot commented Nov 26, 2024

Codecov Report

Attention: Patch coverage is 18.18182% with 27 lines in your changes missing coverage. Please review.

Project coverage is 49.20%. Comparing base (1399a4a) to head (de9e0ee).
Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| scraper/src/mindtouch2zim/processor.py | 6.25% | 15 Missing ⚠️ |
| scraper/src/mindtouch2zim/asset.py | 25.00% | 12 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #96      +/-   ##
==========================================
- Coverage   49.42%   49.20%   -0.23%     
==========================================
  Files          17       17              
  Lines        1050     1063      +13     
  Branches      147      149       +2     
==========================================
+ Hits          519      523       +4     
- Misses        520      529       +9     
  Partials       11       11              

rgaudin (Member) left a comment

No actual change request but I'd like you to explain it to me (live)

scraper/src/mindtouch2zim/processor.py (outdated, resolved)
scraper/src/mindtouch2zim/asset.py (outdated, resolved)
scraper/src/mindtouch2zim/asset.py (resolved)
benoit74 force-pushed the user_agent branch 2 times, most recently from 7b748cb to bc18cd0 December 6, 2024 10:15
Base automatically changed from user_agent to main December 6, 2024 10:22
benoit74 force-pushed the assets_reliability branch 2 times, most recently from d935c41 to 67f0e73 December 6, 2024 12:25
benoit74 merged commit 1ef9614 into main Dec 6, 2024
10 checks passed
benoit74 deleted the assets_reliability branch December 6, 2024 12:28
Issues that merging this pull request may close:

  • Ignore bad images
  • Display page where asset is used in debug mode
2 participants