Data augmentation #141
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #141      +/-   ##
==========================================
+ Coverage   37.05%   37.91%   +0.85%
==========================================
  Files          20       20
  Lines        1414     1440      +26
==========================================
+ Hits          524      546      +22
- Misses        890      894       +4
a1c0a7c to 90da5cb
Cool stuff! I'm happy with this: it runs well, and I also tried commenting out some of the variables. The only thing is that I think we need to add a guide, or a section in the README, for this. Since we have never tested this properly, the defaults in the config will not necessarily give the best results, and some transforms might even harm the model. But this will be great for anyone who wants to start an ablation study on data augmentation.
Aah, good call! I think having a guide on how to run a study like this could be helpful, so I opened an issue. Thanks Nik!
* Move checkpoint type computation to utils
* Refactor checkpointing in training script
* Get ckpt type if ckpt is passed
* optionally apply a data augmentation method (WIP)
* fix config syntax in code
* add data augmentation notebook
* notebook to explore params of individual transformations
* add transforms from config
* Add keywords to datamodule params
* Optionally skip data augmentation
* If data augmentation key in config, apply
* Update tests
* Change tests to read default config
* Refactor transform functions and clean up
* update notebook
* Fix data augmentation default config
* Optionally log data augmentation transforms as artifacts
* Rename skip to 'no_data_augmentation'
Rebase after #203 is merged
This PR adds a few data augmentation transforms that we think could be helpful.
Specifically,
--no_data_augmentation to skip all data augmentation during training,
--log_data_augmentation to log the data augmentations linked to the datamodule as MLflow artefacts,
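
For anyone skimming this thread, here is a rough sketch of how the two flags could interact with the config. It is not the code in this PR: it assumes both flags are plain boolean switches, that the config is a dict, and that the transforms sit under a hypothetical `data_augmentation` key. The helper names `parse_args` and `select_augmentation` and the example transform are made up for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse only the two flags discussed here (other training args omitted).

    Assumption: both flags are simple on/off switches.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--no_data_augmentation",
        action="store_true",
        help="skip all data augmentation during training",
    )
    parser.add_argument(
        "--log_data_augmentation",
        action="store_true",
        help="log the data augmentation transforms as MLflow artefacts",
    )
    return parser.parse_args(argv)


def select_augmentation(config, args):
    """Return the augmentation settings to apply, or None to apply none.

    Assumption: the transforms live under a hypothetical
    "data_augmentation" key; the real key name may differ.
    """
    if args.no_data_augmentation:
        # The flag wins over whatever the config contains.
        return None
    # Returns None when the key is absent, so nothing is applied.
    return config.get("data_augmentation")


# Hypothetical config; the transform name and parameter are invented.
config = {"data_augmentation": {"gaussian_blur": {"kernel_size": 5}}}

args = parse_args(["--no_data_augmentation"])
print(select_augmentation(config, args))  # None

args = parse_args([])
print(select_augmentation(config, args))  # the settings from the config
```

In this sketch the flag takes precedence over the config, so an ablation run can switch augmentation off without editing the config file, and a config without the key simply trains without augmentation.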