Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make test of iterator(...) more robust #972

Merged
merged 1 commit into from
Apr 23, 2024
Merged

Make test of iterator(...) more robust #972

merged 1 commit into from
Apr 23, 2024

Conversation

ablaom
Copy link
Member

@ablaom ablaom commented Apr 23, 2024

Resolves #971

I am perplexed about the sudden appearance of the fail in #971. It seems like an RNG issue, but we use StableRNGs in the testing and the git blame shows no change to the relevant MLJBase code in > 4 years. Interestingly, #969, which has passed tests, now fails for me locally also. Tests are failing locally for me in Julia 1.6 and Julia 1.10.

The test currently failing relies completely on the stability of the RNG used. In this PR I replace the test with a more robust one, which still uses StableRNGs but only checks that the RNG-generated counts are within their expected 95% confidence intervals, instead of checking their precise values. This is a more correct test anyhow.

I played around locally to ensure the tests only fails about 5% of the time, and then fixed the seed to ensure a pass. So if the "stable" RNG behaviour changes again unexpectantly, we have a good chance the test will survive the change 😄 .

@OkonSamuel
Copy link
Member

OkonSamuel commented Apr 23, 2024

It looks like an issue with StableRNG to me.

image

But Distributions.jl just released a new minor version few days back v0.25.108 with changes to the underlying implementation for AliasTable which is the sampler for Categorical distribution and what the MLJBase.sampler function called in the test code here returns.

Now when the rand function from this line in the of the test is hit, it calls this line of code which calls these lines which when traced leads to the newly added line of code from Distributions package.
I suspect this could also be the issue.

EDIT
The new release from Distributions.jl is in fact the cause of the problem. I tested locally pinning the versions to 0.25.107 and 0.25.108.
Tests passed for 0.25.107 but failed for 0.25.108.

Test with [email protected]

(@v1.10) pkg> activate --temp
  Activating new project at `~\Temp\jl_MOIlDh`

(jl_MOIlDh) pkg> add StableRNGs Distributions@0.25.107
   Resolving package versions...
    Updating 
⌃ [31c24e10] + Distributions v0.25.107
  [860ef19b] + StableRNGs v1.0.2
    Updating ...
  [49dc2e85] + Calculus v0.5.1
  [34da2185] + Compat v4.14.0
  [9a962f9c] + DataAPI v1.16.0
  [864edb3b] + DataStructures v0.18.20
⌃ [31c24e10] + Distributions v0.25.107
  [ffbed154] + DocStringExtensions v0.9.3
  [fa6b7ba4] + DualNumbers v0.6.8
  [1a297f60] + FillArrays v1.10.1
  [34004b35] + HypergeometricFunctions v0.3.23
  [92d709cd] + IrrationalConstants v0.2.2
  [692b3bcd] + JLLWrappers v1.5.0
  [2ab3a3ac] + LogExpFunctions v0.3.27
  [e1d29d7a] + Missings v1.2.0
  [77ba4419] + NaNMath v1.0.2
  [bac558e1] + OrderedCollections v1.6.3
  [90014a1f] + PDMats v0.11.31
  [21216c6a] + Preferences v1.4.3
  [1fd47b50] + QuadGK v2.9.4
  [189a3867] + Reexport v1.2.2
  [79098fc4] + Rmath v0.7.1
  [a2af1166] + SortingAlgorithms v1.2.1
  [276daf66] + SpecialFunctions v2.3.1
  [860ef19b] + StableRNGs v1.0.2
  [82ae8749] + StatsAPI v1.7.0
  [2913bbd2] + StatsBase v0.34.3
  [4c63d2b9] + StatsFuns v1.3.1
  [efe28fd5] + OpenSpecFun_jll v0.5.5+0
  [f50d1b31] + Rmath_jll v0.4.0+0
  [0dad84c5] + ArgTools v1.1.1
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching
  [b77e0a4c] + InteractiveUtils
  [b27032c2] + LibCURL v0.6.4
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.10.0
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays v1.10.0
  [10745b16] + Statistics v1.10.0
  [4607b0f0] + SuiteSparse
  [fa267f1f] + TOML v1.0.3
  [a4e569a6] + Tar v1.10.0
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v1.1.0+0
  [deac9b47] + LibCURL_jll v8.4.0+0
  [e37daf67] + LibGit2_jll v1.6.4+0
  [29816b5a] + LibSSH2_jll v1.11.0+1
  [c8ffd9c3] + MbedTLS_jll v2.28.2+1
  [14a3606d] + MozillaCACerts_jll v2023.1.10
  [4536629a] + OpenBLAS_jll v0.3.23+4
  [05823500] + OpenLibm_jll v0.8.1+2
  [bea87d4a] + SuiteSparse_jll v7.2.1+1
  [83775a58] + Zlib_jll v1.2.13+1
  [8e850b90] + libblastrampoline_jll v5.8.0+1
  [8e850ede] + nghttp2_jll v1.52.0+1
  [3f19e933] + p7zip_jll v17.4.0+2
        Info Packages marked with ⌃ have new versions available and may be upgradable.
julia> using StableRNGs, Distributions, Serialization

julia> rng = StableRNG(600);

julia> d = Categorical([0.1, 0.2, 0.7]);

julia> v1 = rand(rng, d, 1000);

julia> serialize("oldversion", v1);

julia> exit()

Test with [email protected]

(@v1.10) pkg> activate --temp
  Activating new project at `~\Temp\jl_FjXzfS`

(jl_FjXzfS) pkg> add StableRNGs Distributions@0.25.108
   Resolving package versions...
    Updating 
  [31c24e10] + Distributions v0.25.108
  [860ef19b] + StableRNGs v1.0.2
    Updating 
  [66dad0bd] + AliasTables v1.0.0
  [49dc2e85] + Calculus v0.5.1
  [34da2185] + Compat v4.14.0
  [9a962f9c] + DataAPI v1.16.0
  [864edb3b] + DataStructures v0.18.20
  [31c24e10] + Distributions v0.25.108
  [ffbed154] + DocStringExtensions v0.9.3
  [fa6b7ba4] + DualNumbers v0.6.8
  [1a297f60] + FillArrays v1.10.1
  [34004b35] + HypergeometricFunctions v0.3.23
  [92d709cd] + IrrationalConstants v0.2.2
  [692b3bcd] + JLLWrappers v1.5.0
  [2ab3a3ac] + LogExpFunctions v0.3.27
  [e1d29d7a] + Missings v1.2.0
  [77ba4419] + NaNMath v1.0.2
  [bac558e1] + OrderedCollections v1.6.3
  [90014a1f] + PDMats v0.11.31
  [21216c6a] + Preferences v1.4.3
  [1fd47b50] + QuadGK v2.9.4
  [189a3867] + Reexport v1.2.2
  [79098fc4] + Rmath v0.7.1
  [a2af1166] + SortingAlgorithms v1.2.1
  [276daf66] + SpecialFunctions v2.3.1
  [860ef19b] + StableRNGs v1.0.2
  [82ae8749] + StatsAPI v1.7.0
  [2913bbd2] + StatsBase v0.34.3
  [4c63d2b9] + StatsFuns v1.3.1
  [efe28fd5] + OpenSpecFun_jll v0.5.5+0
  [f50d1b31] + Rmath_jll v0.4.0+0
  [0dad84c5] + ArgTools v1.1.1
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching
  [b77e0a4c] + InteractiveUtils
  [b27032c2] + LibCURL v0.6.4
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.10.0
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays v1.10.0
  [10745b16] + Statistics v1.10.0
  [4607b0f0] + SuiteSparse
  [fa267f1f] + TOML v1.0.3
  [a4e569a6] + Tar v1.10.0
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v1.1.0+0
  [deac9b47] + LibCURL_jll v8.4.0+0
  [e37daf67] + LibGit2_jll v1.6.4+0
  [29816b5a] + LibSSH2_jll v1.11.0+1
  [c8ffd9c3] + MbedTLS_jll v2.28.2+1
  [14a3606d] + MozillaCACerts_jll v2023.1.10
  [4536629a] + OpenBLAS_jll v0.3.23+4
  [05823500] + OpenLibm_jll v0.8.1+2
  [bea87d4a] + SuiteSparse_jll v7.2.1+1
  [83775a58] + Zlib_jll v1.2.13+1
  [8e850b90] + libblastrampoline_jll v5.8.0+1
  [8e850ede] + nghttp2_jll v1.52.0+1
  [3f19e933] + p7zip_jll v17.4.0+2
julia> using StableRNGs, Distributions, Serialization

julia> rng = StableRNG(600);

julia> d = Categorical([0.1, 0.2, 0.7]);

julia> v2 = rand(rng, d, 1000);

julia> v1 = deserialize("oldversion");

julia> v1 == v2
false

@ablaom
Copy link
Member Author

ablaom commented Apr 23, 2024

Yes, you are right. Somehow I missed the dependence on Distributions and thought this was Random. Great diagnosis, thank you.

With the source of the issue put to rest, any objections to my solution?

Copy link
Member

@OkonSamuel OkonSamuel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!!!

@ablaom ablaom merged commit f8b4c2c into dev Apr 23, 2024
3 checks passed
@ablaom ablaom deleted the fix-test branch April 23, 2024 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tests failing on dev
2 participants