Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support in pipelines for Unsupervised models for which target_in_fit is true #984

Merged
merged 4 commits into from
Jul 2, 2024

Conversation

ablaom
Copy link
Member

@ablaom ablaom commented Jun 27, 2024

This PR resolves JuliaAI/MLJ.jl#1134.

An Unsupervised model that wants to buy into the new functionality must overload MLJModelInterface.target_in_fit to hold true. Here's an example from tests added in this PR.

@EssamWisam Can you please confirm this resolves your issue with target encoding?

I have tested this PR with a local run of MLJ's integration tests.

@ablaom
Copy link
Member Author

ablaom commented Jun 27, 2024

@EssamWisam A review would also be nice, if you have time to study the Pipeline code base enough to do that.

@EssamWisam
Copy link

I haven't studied the Pipeline code but looked closely at the changes (which are not that significant) and was able to make some sense out of them. Likewise, the tests seems to intuitive enough for me. I think this is on point.

@ablaom
Copy link
Member Author

ablaom commented Jun 30, 2024

@EssamWisam Can you please confirm this PR resolves your issue with target encoding?

@EssamWisam
Copy link

@ablaom I tried to clone this branch of MLJBase and to dev it and MLJTransforms (after modifying it as needed) to further confirm but that was nontrivial for me:

Precompiling MLJTransforms
        Info Given MLJTransforms was explicitly requested, output will be shown live 
WARNING: Method definition target_in_fit(Type) in module StatisticalTraits at [/Users/essamwisam/.julia/packages/StatisticalTraits/4sp0J/src/StatisticalTraits.jl:149](https://file+.vscode-resource.vscode-cdn.net/Users/essamwisam/.julia/packages/StatisticalTraits/4sp0J/src/StatisticalTraits.jl:149) overwritten in module MLJTransforms at [/Users/essamwisam/Documents/GitHub/MLJTransforms/src/target_encoding/interface_mlj.jl:118](https://file+.vscode-resource.vscode-cdn.net/Users/essamwisam/Documents/GitHub/MLJTransforms/src/target_encoding/interface_mlj.jl:118).
ERROR: Method overwriting is not permitted during Module precompilation. Use `__precompile__(false)` to opt-out of precompilation.
  ? MLJTransforms
[ Info: Precompiling MLJTransforms [23777cdb-d90c-4eb0-a694-7c2b83d5c1d6]
WARNING: Method definition target_in_fit(Type) in module StatisticalTraits at [/Users/essamwisam/.julia/packages/StatisticalTraits/4sp0J/src/StatisticalTraits.jl:149](https://file+.vscode-resource.vscode-cdn.net/Users/essamwisam/.julia/packages/StatisticalTraits/4sp0J/src/StatisticalTraits.jl:149) overwritten in module MLJTransforms at [/Users/essamwisam/Documents/GitHub/MLJTransforms/src/target_encoding/interface_mlj.jl:118](https://file+.vscode-resource.vscode-cdn.net/Users/essamwisam/Documents/GitHub/MLJTransforms/src/target_encoding/interface_mlj.jl:118).
ERROR: Method overwriting is not permitted during Module precompilation. Use `__precompile__(false)` to opt-out of precompilation.
[ Info: Skipping precompilation since __precompile__(false). Importing MLJTransforms [23777cdb-d90c-4eb0-a694-7c2b83d5c1d6].

Thus, I can only say I believe this would solve the issue but can't be absolutely certain.

@ablaom
Copy link
Member Author

ablaom commented Jul 2, 2024

@EssamWisam To help me diagnose, what exactly do you have at line 118 of MLJTransforms/src/target_encoding/interface_mlj.jl ? This will be your overloading of target_in_fit, the source of the complaint.

@EssamWisam
Copy link

EssamWisam commented Jul 2, 2024

I passed an incorrect argument instead of MMI.target_in_fit(::Type{<:TargetEncoder}) = true after fixing it works as expected. That is, I can form a pipeline of target encoder and random forest in my tutorial.

Copy link

codecov bot commented Jul 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.21%. Comparing base (8cb6f26) to head (63344b7).
Report is 23 commits behind head on dev.

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #984      +/-   ##
==========================================
+ Coverage   88.13%   88.21%   +0.08%     
==========================================
  Files          28       28              
  Lines        2587     2588       +1     
==========================================
+ Hits         2280     2283       +3     
+ Misses        307      305       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ablaom ablaom merged commit b44e7cf into dev Jul 2, 2024
5 checks passed
@ablaom ablaom deleted the pipelines-with-supervised-transformers branch July 2, 2024 20:48
This was referenced Jul 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add pipeline support for Unsupervised models that have a target in fit
2 participants