Add support in pipelines for `Unsupervised` models for which `target_in_fit` is `true` #984

ablaom · 2024-06-27T06:15:02Z

An `Unsupervised` model that wants to buy into the new functionality must overload `MLJModelInterface.target_in_fit` to return `true`. Here's an example from tests added in this PR.

@EssamWisam Can you please confirm this resolves your issue with target encoding?

I have tested this PR with a local run of MLJ's integration tests.

ablaom · 2024-06-27T06:41:11Z

@EssamWisam A review would also be nice, if you have time to study the Pipeline code base enough to do that.

EssamWisam · 2024-06-28T15:59:55Z

I haven't studied the Pipeline code, but I looked closely at the changes (which are not that significant) and was able to make sense of them. Likewise, the tests seem intuitive enough to me. I think this is on point.

ablaom · 2024-06-30T20:41:04Z

@EssamWisam Can you please confirm this PR resolves your issue with target encoding?

EssamWisam · 2024-07-02T01:00:53Z

@ablaom I tried to clone this branch of MLJBase and to `dev` it and MLJTransforms (after modifying it as needed) to further confirm, but that was nontrivial for me:

Precompiling MLJTransforms
        Info Given MLJTransforms was explicitly requested, output will be shown live 
WARNING: Method definition target_in_fit(Type) in module StatisticalTraits at /Users/essamwisam/.julia/packages/StatisticalTraits/4sp0J/src/StatisticalTraits.jl:149 overwritten in module MLJTransforms at /Users/essamwisam/Documents/GitHub/MLJTransforms/src/target_encoding/interface_mlj.jl:118.
ERROR: Method overwriting is not permitted during Module precompilation. Use `__precompile__(false)` to opt-out of precompilation.
  ? MLJTransforms
[ Info: Precompiling MLJTransforms [23777cdb-d90c-4eb0-a694-7c2b83d5c1d6]
WARNING: Method definition target_in_fit(Type) in module StatisticalTraits at /Users/essamwisam/.julia/packages/StatisticalTraits/4sp0J/src/StatisticalTraits.jl:149 overwritten in module MLJTransforms at /Users/essamwisam/Documents/GitHub/MLJTransforms/src/target_encoding/interface_mlj.jl:118.
ERROR: Method overwriting is not permitted during Module precompilation. Use `__precompile__(false)` to opt-out of precompilation.
[ Info: Skipping precompilation since __precompile__(false). Importing MLJTransforms [23777cdb-d90c-4eb0-a694-7c2b83d5c1d6].

Thus, I can only say I believe this would solve the issue but can't be absolutely certain.

ablaom · 2024-07-02T06:56:41Z

@EssamWisam To help me diagnose, what exactly do you have at line 118 of `MLJTransforms/src/target_encoding/interface_mlj.jl`? This will be your overloading of `target_in_fit`, the source of the complaint.

EssamWisam · 2024-07-02T17:12:02Z

I had passed an incorrect argument; the definition should have been `MMI.target_in_fit(::Type{<:TargetEncoder}) = true`. After fixing that, it works as expected. That is, I can form a pipeline of a target encoder and a random forest in my tutorial.
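
For reference, a minimal sketch of the overload pattern discussed in this thread, assuming MLJModelInterface 1.11 or later is installed. `MyTargetEncoder` is a hypothetical stand-in, not the actual `TargetEncoder` from MLJTransforms. The commented-out line is only one definition that would be consistent with the `target_in_fit(Type)` signature reported in the warning; the code originally at line 118 is not shown here.

```julia
import MLJModelInterface
const MMI = MLJModelInterface

# Hypothetical transformer, for illustration only.
mutable struct MyTargetEncoder <: MMI.Unsupervised end

# Restrict the method to this model type (and its subtypes), so the
# generic fallback is left untouched.
MMI.target_in_fit(::Type{<:MyTargetEncoder}) = true

# By contrast, a method on the unrestricted `Type` would replace the
# fallback defined in StatisticalTraits, which Julia does not permit
# during module precompilation:
# MMI.target_in_fit(::Type) = true   # do not do this

# Query the trait on the type.
MMI.target_in_fit(MyTargetEncoder)
```

Restricting the argument with `Type{<:MyTargetEncoder}` adds a new, more specific method instead of overwriting the existing one, so precompilation should proceed normally.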

codecov · 2024-07-02T20:16:25Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.21%. Comparing base (8cb6f26) to head (63344b7).
Report is 23 commits behind head on dev.

Additional details and impacted files

@@            Coverage Diff             @@
##              dev     #984      +/-   ##
==========================================
+ Coverage   88.13%   88.21%   +0.08%     
==========================================
  Files          28       28              
  Lines        2587     2588       +1     
==========================================
+ Hits         2280     2283       +3     
+ Misses        307      305       -2

ablaom added 3 commits June 27, 2024 13:35

bump [compat] MLJModelInterface = "1.11", StatisticalTraits = "3.4"

aa613cb

bump 1.6

cb15208

make pipelines support Unsupervised with target in fit

4d37eed

tweak docstring

63344b7

ablaom merged commit b44e7cf into dev Jul 2, 2024
5 checks passed

ablaom deleted the pipelines-with-supervised-transformers branch July 2, 2024 20:48

This was referenced Jul 2, 2024

For a 1.6 release #987

Merged

Issue to trigger releases #345

Closed
