
Adaptive proposals #39

Open · wants to merge 34 commits into master
Conversation


@arzwa arzwa commented Nov 22, 2020

Hi, I implemented some very basic adaptive Metropolis-Hastings proposals that may be of interest for the AdvancedMH.jl package. These are simple symmetric random-walk proposals whose scale is subject to diminishing adaptation (see e.g. section 3 here). They are mainly useful in Metropolis-within-Gibbs-like algorithms, and could also be of interest in Turing. For instance, for a simple test case:

using AdvancedMH, Distributions, MCMCChains

data = rand(Normal(2., 1.), 500)
m1 = DensityModel(x -> loglikelihood(Normal(x, 1), data))
p1 = RandomWalkProposal(Normal())
p2 = AdaptiveProposal(Normal())
c1 = sample(m1, MetropolisHastings(p1), 10000; chain_type=Chains)
c2 = sample(m1, MetropolisHastings(p2), 10000; chain_type=Chains)
julia> ess(c1)
ESS
  parameters        ess      rhat 
      Symbol    Float64   Float64 

     param_1   605.4018    1.0013


julia> ess(c2)
ESS
  parameters         ess      rhat 
      Symbol     Float64   Float64 

     param_1   2222.6857    1.0006
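For reference, the diminishing-adaptation rule described above can be sketched as follows (a minimal illustration based on the PR description and the linked tutorial; the function and argument names here are hypothetical, not the PR's actual code):

```julia
# Diminishing adaptation: the adjustment δ shrinks as sampling progresses,
# so the proposal scale settles down and ergodicity is preserved.
function adapt_scale(logσ::Float64, accepted::Int, total::Int;
                     target=0.44, δmax=0.2)
    δ = min(δmax, 1 / sqrt(total))   # vanishing adaptation step
    rate = accepted / total          # empirical acceptance rate
    return rate > target ? logσ + δ : logσ - δ
end

# After adaptation the proposal becomes Normal(0, exp(logσ′)).
```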

Also, this is my first PR to a collaborative project, and I am taking it as an opportunity to learn how to contribute to projects using git (hopefully with something useful). In other words, I'm not sure if I'm doing this right :)

If I find some time, and this is considered useful, I'd like to port some other related stuff I've used in other projects to AdvancedMH as well.

@arzwa arzwa changed the title Adaptive Adaptive proposals Nov 22, 2020
Member

@cpfiffer cpfiffer left a comment

Wow, what a lovely surprise to receive on a Sunday morning! This is great stuff -- really an excellent PR. It came with tests, the code is succinct and easy to review, and the feature is one I've wanted for a long time. Excellently done.

I left a handful of comments, but I think most of them are suggestions you should be able to commit directly by pressing "Commit suggestion". They're primarily style changes, but one docstring could use a little more info.

Thanks for the PR! I'm going to be really happy to have this merged in.

src/adaptive.jl Outdated
Comment on lines 10 to 20
Adaptor(; tune=25, target=0.44, bound=10., δmax=0.2) =
Adaptor(0, 0, tune, target, bound, δmax)

"""
AdaptiveProposal{P}

An adaptive Metropolis-Hastings proposal. In order for this to work, the
proposal kernel should implement the `adapted(proposal, δ)` method, where `δ`
is the increment/decrement applied to the scale of the proposal distribution
during adaptation (e.g. for a Normal distribution the scale is `log(σ)`, so
that after adaptation the proposal is `Normal(0, exp(log(σ) + δ))`).
Member

Can you briefly describe the default Adaptor in the docstring for AdaptiveProposal, and notify everyone what the default keywords are for the adaptor?

arzwa and others added 9 commits November 22, 2020 19:58
Co-authored-by: Cameron Pfiffer <[email protected]>
Member

@devmotion devmotion left a comment

Thanks for the PR!

I left some additional comments.

src/adaptive.jl Outdated
Comment on lines 1 to 4
mutable struct Adaptor
accepted::Integer
total ::Integer
tune ::Integer # tuning interval
Member

If possible one should avoid non-concrete fields:

Suggested change
-mutable struct Adaptor
-    accepted::Integer
-    total   ::Integer
-    tune    ::Integer # tuning interval
+mutable struct Adaptor
+    accepted::Int
+    total::Int
+    tune::Int # tuning interval

On a more general level, I'm not completely sure if it is useful to have a separate Adaptor struct, it seems it could just be integrated into AdaptiveProposal.

On an even more general level, I think it would be better to make this part of the state of the sampler using the AbstractMCMC interface instead of a field of the proposal. With the current design, the proposal will be mutated in every step. However, this (IMO preferred) design requires implementing AbstractMCMC.step instead of just adding the accept! call.
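To make the suggested design concrete, here is a rough sketch of how the adaptation counters could live in the sampler state rather than in the proposal (AdaptiveState, propose_and_accept, and maybe_adapt are hypothetical names for illustration, not the actual API of this PR):

```julia
# Hypothetical sketch: carry the counters and the (possibly rescaled)
# proposal in the state, so nothing is mutated between steps.
struct AdaptiveState{P}
    proposal::P
    accepted::Int
    total::Int
end

function AbstractMCMC.step(rng, model, sampler, state::AdaptiveState; kwargs...)
    # propose_and_accept is a hypothetical helper performing one MH step
    transition = propose_and_accept(rng, model, sampler, state.proposal)
    accepted = state.accepted + (transition.accepted ? 1 : 0)
    total = state.total + 1
    # maybe_adapt is a hypothetical helper returning a rescaled proposal
    proposal = maybe_adapt(state.proposal, accepted, total)
    return transition, AdaptiveState(proposal, accepted, total)
end
```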

Author

@arzwa arzwa Nov 22, 2020

Yes of course, Int <-> Integer confusion...

The Adaptor struct may indeed be superfluous, although I found it a bit clearer separated that way. Also, I considered implementing adaptation for multivariate Normal proposals, which uses different machinery under the hood, and my initial thought was to implement that as an AdaptiveProposal with a different Adaptor type. Of course, it could also be implemented as a separate proposal struct altogether.

I think I understand your preferred design at the step level conceptually, although at the moment my insight into how AbstractMCMC works is insufficient to see how that should be done; currently, mutating the proposals is the most intuitive way for me to implement adaptation. The accept! call seemed like a very simple, but admittedly somewhat hacky, way to enable adaptive proposals.

Member

I think we should punt this problem to a later date. I would like to include accept/reject as a field in the Transition struct, which would make it very easy to count the number of previous acceptances by just adding one to the total_acceptances field in a Transition. Currently AdvancedMH doesn't track that internally, but I can modify this code to remove the mutation later.

Member

It's not only about the number of accepted/rejected steps here though, the state would have to include the updated proposal etc as well, so it won't be solved by including the stats in Transition.

However, I'm fine with postponing this refactoring. Probably best to open an issue so we don't forget it.

Member

In my thinking, you'd add an extra field to Transitions that just accumulates the total number of acceptances, which is easier to get when you have individual acceptances for each draw. I'll open an issue up.

Member

Yes, I understand (and I think that's a good addition). My point was just that it is not sufficient here.

Member

I've opened an issue (#40).

src/adaptive.jl Outdated
"""
mutable struct AdaptiveProposal{P} <: Proposal{P}
proposal::P
adaptor ::Adaptor
Member

As written above, one might want to consider moving the fields from Adaptor here and maybe even better making it part of a state that is passed to the proposal in every sampling step. Also the updated proposal would be part of a separate state.

@devmotion
Member

The test failures seem to be real, the sampled values do not match the expected ones. As a side remark, probably it would be helpful for debugging to use sample(...; progress = false) in the tests.

@arzwa
Author

arzwa commented Nov 23, 2020

The test failures seem to be real, the sampled values do not match the expected ones. As a side remark, probably it would be helpful for debugging to use sample(...; progress = false) in the tests.

Where do you see this? As far as I can tell, the three failed runs show the following results:

Test Summary:                            | Pass  Fail  Error  Total
AdvancedMH                               |   20     2      1     23
  StaticMH                               |    4                   4
  RandomWalk                             |    4                   4
  Adaptive random walk                   |    4                   4
  Compare adaptive to simple random walk |    1                   1
  parallel sampling                      |    2     2             4
  Proposal styles                        |    4                   4
  Initial parameters                     |    1                   1
  EMCEE                                  |    4                   1

(which is the same as what I have locally). Maybe I'm looking in the wrong place.

@devmotion
Member

devmotion commented Nov 23, 2020

Where do you see this? As far as I can tell, the three failed runs show the following results:

I looked at the raw logs, e.g., https://pipelines.actions.githubusercontent.com/1iyxZmDwGspc6UbQKDLTZ8y3DHazaAGCEXSw02G4avGy4Eell5/_apis/pipelines/1/runs/7302/signedlogcontent/15?urlExpires=2020-11-23T10%3A22%3A22.7581222Z&urlSigningMethod=HMACV1&urlSignature=%2FUDiYzMH2mBCIhIroN0uueS%2FjYsHAv7zGwfFj%2FCie90%3D. The nicely formatted output of the Github Actions is truncated since progress bars are printed (which is why I suggested to use sample(...; progress = false)).

The raw logs are difficult to read but according to the final statement there are two test failures somewhere. And actually, if you scroll up a bit you can find


  Expression: ≈(mean(chain1["μ"]), 0.0, atol = 0.1)
   Evaluated: 1.9981419463187078 ≈ 0.0 (atol=0.1)
Stacktrace:
 [1] macro expansion @ ~/work/AdvancedMH.jl/AdvancedMH.jl/test/runtests.jl:87 [inlined]
 [2] macro expansion @ /buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1146 [inlined]
 [3] macro expansion @ ~/work/AdvancedMH.jl/AdvancedMH.jl/test/runtests.jl:83 [inlined]
 [4] macro expansion @ /buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1146 [inlined]
 [5] top-level scope @ ~/work/AdvancedMH.jl/AdvancedMH.jl/test/runtests.jl:11
┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/lGdBK/src/sample.jl:204
Sampling (1 threads) 100%|██████████████████████████████| Time: 0:00:00
parallel sampling: Test Failed at /home/runner/work/AdvancedMH.jl/AdvancedMH.jl/test/runtests.jl:93
  Expression: ≈(mean(chain2["μ"]), 0.0, atol = 0.1)
   Evaluated: 1.9910977829961962 ≈ 0.0 (atol=0.1)

which I was referring to. Apparently the sample average of the samples of μ does not match the expected value 0 (and actually differs quite significantly).

I just checked locally, and all tests pass with the latest release and with master. So it seems this PR breaks something.

test/runtests.jl Outdated
@arzwa
Author

arzwa commented Nov 23, 2020


OK, thanks, I saw that too but was confused since I don't think I changed anything related to parallel sampling, and I thought you were referring to the AdaptiveProposal tests. I'll have a look.

Co-authored-by: David Widmann <[email protected]>
end

@testset "Compare adaptive to simple random walk" begin
data = rand(Normal(2., 1.), 500)
Member

@arzwa This might be the problem: you redefine data and hence, I assume, implicitly change the density model, which is defined as a closure over data.

You could just rename the variable here but actually I think the better approach might be to "fix" the data in the model to avoid any such surprises in the future.

I guess this can be achieved by defining

density = let data = data
  θ -> insupport(θ) ? sum(logpdf.(dist(θ), data)) : -Inf
end

Member

But I haven't tested it, so make sure it actually fixes the problem 😄

Author

Yes, I just saw this too, thanks. I'll check and push an updated test suite. (Actually, we could just as well test against the same data defined above in the test suite, but I find testing against a mean different from 0 a bit more reassuring, since the sampler actually has to 'move' somewhere from its starting point.)

Comment on lines +34 to +36
kwargs = (progress=false, chain_type=StructArray, param_names=["μ", "σ"])
chain1 = sample(model, spl1, 100000; kwargs...)
chain2 = sample(model, spl2, 100000; kwargs...)
Member

Can we maybe just define

_sample(args...; kwargs...) = sample(args...; progress = false, kwargs...)

at the top of the test suite and just replace all occurrences of sample with _sample? I think the kwargs definitions are a bit uncommon...

Author

Sure, I often define a named tuple for kwargs that are used repeatedly, but that could be an idiosyncratic thing :)

@arzwa
Author

arzwa commented Dec 8, 2020

Hi there, a question related to this PR (feel free to move it to the issue section, I'm not sure what the best place to drop this is).

I'm currently facing an application where I would really like to use adaptive proposals like those defined in this PR in a Metropolis-within-Gibbs setting (i.e. we have a parameter vector x, for each parameter we have an adaptive univariate proposal, and in each iteration of the MCMC sampler we update each component of the parameter vector conditional on the others using a Metropolis-Hastings step). The Turing way to go would seem to be to use the machinery implemented in AdvancedMH in a Turing composite Gibbs sampler (something roughly like Gibbs(AdaptiveMH(:p1), AdaptiveMH(:p2), ...), where p1, p2, ... are the parameter vector components)? I think this is generally worthwhile for low-dimensional applications where the gradient of the log-likelihood is really costly or unavailable. I wonder what would be the best way to proceed to allow this? Thanks for any hints!
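For what it's worth, a sketch of the composition being described, using today's MH-within-Gibbs style as a stand-in (the AdaptiveMH sampler named above does not exist yet; this only shows where it would slot in):

```julia
using Turing

# Toy model with two scalar parameters, each updated conditionally on the other.
@model function demo(x)
    p1 ~ Normal(0, 1)
    p2 ~ Normal(0, 1)
    x .~ Normal(p1, exp(p2))
end

# Existing composition with plain MH components; an adaptive variant would
# hypothetically replace MH(:p1) with AdaptiveMH(:p1), etc.
chain = sample(demo(randn(100)), Gibbs(MH(:p1), MH(:p2)), 1_000)
```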

@cpfiffer
Member

cpfiffer commented Dec 8, 2020

Yeah, this should probably be a separate issue -- can you copy it over and I'll drop my comments there?

It looks like there's only one thing to change in this PR -- if that is updated I'd be happy to merge this one in.

@arzwa
Author

arzwa commented Dec 11, 2020

OK, I had implemented multivariate normal adaptive proposals and had them working, so I thought to add them to this PR. Not sure how the test pipeline got messed up?

Comment on lines +58 to +59
# Adaptive proposals are only defined for symmetric proposal distributions
is_symmetric_proposal(::AdaptiveProposal) = true
Member

If this is the case, it would be good to check this property in an inner constructor.

Comment on lines +81 to +84
function q(proposal::AdaptiveProposal, t, t_cond)
return logpdf(proposal, t - t_cond)
end

Member

Is this actually needed? It seems like the default for Proposal should cover this.

Author

Currently there seems to be no default q for Proposal. However it's also unnecessary because these adaptive proposals are always symmetric.

src/adaptivemvnormal.jl Outdated
Comment on lines +56 to +59
F = cholesky(Hermitian(Σ), check=false)
if rank(F.L) == p.d
p.adaptive = MvNormal(PDMat(Σ, F))
end
Member

Also here, can't we just use the default constructors of MvNormal which call PDMat implicitly?

Suggested change
-F = cholesky(Hermitian(Σ), check=false)
-if rank(F.L) == p.d
-    p.adaptive = MvNormal(PDMat(Σ, F))
-end
+p.adaptive = MvNormal(Σ)

Author

I implemented it this way (inspired by the Mamba implementation) to check whether the empirical estimate of the covariance matrix is a proper covariance matrix. If it is, we can adapt the proposal distribution without additional computation by feeding the Cholesky decomposition to MvNormal this way. How would you do it otherwise? I guess you could use try/catch, but that doesn't seem nicer.

Member

No, I don't think a try-catch block would be any better. IMO it is not good to just suppress the errors here and use an incorrect Cholesky decomposition with possibly negative eigenvalues. I would suggest the following:

  • If possible, update the Cholesky decomposition (or some other parametrization of the covariance matrix) iteratively. Usually this is more efficient than recomputing the factorization in every step, and it avoids the numerical problems.
  • Often calling Matrix(Symmetric(Sigma)) is sufficient to avoid problems with matrices that are not completely symmetric and for which the Cholesky decomposition therefore fails.
  • Alternatively, use PositiveFactorizations.

IMO the first alternative would clearly be the best, if possible.

Member

I just had a quick look at Haario et al. and the update formula of the covariance matrix in eq. (3). It seems that the update could indeed be performed with only d + 3 rank-1 up- and downdates, each of which takes O(d^2) operations. Actually, if eps = 0, then only 3 rank-1 up- and downdates are needed, so the iterative update of the Cholesky decomposition would be much cheaper than recomputing the decomposition in O(d^3) operations. The rank-1 up- and downdates are implemented in LinearAlgebra.lowrankupdate! and LinearAlgebra.lowrankdowndate! (and their corresponding out-of-place methods).

I think this would be the preferable method for updating the Cholesky decomposition. Probably it would also be cheaper to save just the Cholesky decomposition and not wrap it in a MvNormal object since it is trivial to sample from the multivariate normal distribution with the Cholesky decomposition.
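As a minimal illustration of the rank-1 update machinery being suggested (not the full Haario et al. recursion):

```julia
using LinearAlgebra

# Keep a Cholesky factorization of the proposal covariance and apply a
# rank-1 update in O(d^2) instead of refactorizing in O(d^3).
d = 3
Σ = Matrix(1.0I, d, d)
F = cholesky(Σ)
v = randn(d)
F = lowrankupdate(F, v)          # F now factorizes Σ + v * v'
@assert F.L * F.L' ≈ Σ + v * v'

# Sampling from MvNormal(μ, Σ′) directly with the factor, without wrapping:
μ = zeros(d)
x = μ + F.L * randn(d)
```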

Author

@arzwa arzwa Dec 12, 2020

That sounds interesting and worthwhile; I'll have to look at it. FWIW, I just based my code on the description in the Roberts & Rosenthal paper and the Mamba implementation, since I simply wanted to use this for a problem I'm having.

I had the impression from the code that there is a preference for using Distributions.jl distributions for sampling; that's why I wrapped them in MvNormal objects instead of using the Cholesky factor directly (when both the full matrix and the Cholesky factor are available, it is just as cheap to put them in an MvNormal). When updating the Cholesky factor directly, I agree it would be better to do as you suggest.

Comment on lines +70 to +73
function propose(rng::Random.AbstractRNG, proposal::AdaptiveMvNormal, m::DensityModel)
return rand(rng, proposal)
end

Member

Is this not covered by the default for Proposal?

src/mh-core.jl Outdated
test/Project.toml Outdated
@@ -18,7 +18,7 @@
Random.seed!(100)
sampler = Ensemble(1_000, StretchProposal([InverseGamma(2, 3), Normal(0, 1)]))
chain = sample(model, sampler, 1_000;
-                   param_names = ["s", "m"], chain_type = Chains)
+                   param_names = ["s", "m"], chain_type = Chains, progress = false)
Member

@yebai
Member

yebai commented Jan 28, 2021

Anything missing from this PR? If there is nothing major standing in the way, maybe we should revisit and get it merged asap.

@arzwa
Author

arzwa commented Jan 28, 2021

Hi, I should have a look at it again; @devmotion had some good suggestions on the multivariate adaptive normal kernel, which I have not yet looked into further as I got busy with other stuff. The univariate adaptive proposals should be fine, I think.

@cnrrobertson
Contributor

Is this still of interest/relevant? I'd definitely be interested in using this and may be able to help finish it up.

@yebai
Member

yebai commented Sep 13, 2023

Is this still of interest/relevant? I'd definitely be interested in using this and may be able to help finish it up.

It's still relevant; please feel free to take this over.


5 participants