Efficient MCMC for cosmological parameters. Antony Lewis Institute of Astronomy, Cambridge http://cosmologist.info/. collaborator: Sarah Bridle. CosmoMC:. http://cosmologist.info/cosmomc Lewis, Bridle: astro-ph/0205436 http://cosmologist.info/notes/cosmomc.ps.gz. Introduction.
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
Institute of Astronomy, Cambridge
collaborator: Sarah Bridle
CMB PolarizationBaryon oscillations
Galaxy power spectrum
Cluster gas fraction
WMAP Science Team
Generate samples from P
Samples in6D parameterspace
Chain distribution converges to P if all points can be reached (ergodic) andT leaves the distribution invariant
for all θ’
for (nearly) arbitrary proposal function q and
Simplest if q chosen to be symmetric
If proposed position has lower probability: move to x2 with probability P(x2)/P(x1)
otherwise take another sample at x1
Procedure:- Chose priors and parameterization
- Chose a proposal distribution- Start chain in some random position (or several in different starting positions)- Iterate large sequence of chain steps storing chain positions (correlated samples)- Remove ‘some’ samples from the beginning (‘burn in’)
- Optionally thin the samples- Check for convergence
- Estimate desired quantities from samples (or just publish list of samples)
Potential Problems: - Proposal width low or directions bad: slow sqrt(N) random walk through space
- Proposal width too large: low probability of proposing similar likelihood, low acceptance rate- Proposal distribution must be able to reach all points in parameter space
Can never cross gap!
From single chain:
From multiple chains (e.g. MPI):
Gelman-Rubin (last half chains): - variance of means - diagonalized variance of means - diagonalized variance of limits
All necessary but not sufficient!
Essentially no method is guaranteed to give correct answers in reasonable time!
Kosowsky et al. astro-ph/0206014
Proposal distribution: changing variables, transform to near-isotropic Gaussian
No change in priors
Equivalent to proposing alongeigenvector directions
Easily done automatically from estimate of covariance matrix
Proposal distribution: dynamically adjusting parameters near-isotropic Gaussian
Beware of non-Markovian updates…OK methods:
BAD methods (sample from wrong distribution, e.g. astro-ph/0307219): - Estimate local covariance and use for proposal distribution
- Dynamically adjust proposal width based on local acceptance rate - Propose change to some parameters then adjust other parameters to find good fit, then take new point as final proposal
The onus is on you to prove any new method leaves the target distribution invariant!
Symmetric Metropolis proposal distributions near-isotropic Gaussian
- Most common choice is a n-D Gaussian (n may be a sub-space dimension)
- in 2D
- Scale with to n get same acceptance rate in any dimension
- In general corresponds to choosing random direction, and moving normalizeddistance r in that direction with probability
‘Best’ scaling is r→r / 2.4 if posterior is GaussianDunkley et al astro-ph/0405462
Q: Is this really best, or just shortest autocorrelation length??
Is using a large- near-isotropic Gaussiann Gaussian proposal a good idea?e.g. consider Pn(r) as a proposal in 1D
– be careful how you assess performance!
Higher n, shorter correlation length, worse Raftery & Lewis
Definitely not in 1D… maybe not in general
Want robust to inaccurate size estimates near-isotropic Gaussian: broaden tail, make non-zero at zero:
e.g. mixture with exponential
1/3 exponential, n=2, 5 (n=2 used by CosmoMC)
Fast and Slow Parameters near-isotropic Gaussian
How do we use the fast/slow distinction efficiently?
Usual choice of near-isotropic Gaussianproposal directions:
‘Directional Gridding’ near-isotropic Gaussian
Slow grid width
chosen randomlyat start of eachgridding step: fullgridding step leavesdistribution invariant
Other methods: e.g. tempered transitions etc. may work for very fast parameters
Variation of: Neal 1996; Statistics and Computing 6, 353
Neal 2005, math.ST/0502099
e.g. parameterize reionization history in a set of bins of xe. What is your prior?
Lewis et al. 2006, in prep
Exact result is piecewise polynomial
Total optical depth prior ~ Gaussian (strongly peaked away from zero).Low optical depth is a priori very unlikely - is this really what you meant??
If you don’t have very good data worth thinking carefully about your prior!
Another example: reconstructing primordial power spectrum in n bins
Bridle et al. astro-ph/0302306
with flat uncorrelated priors on bins?
Differ by an effective prior on optical depth
Also, do we really mean to allow arbitrarily wiggly initial power spectra?
One sensible solution: Gaussian process prior (imposes smoothness) - result then converges for large n
Random directions? about your prior! Don’t really want to head back in same direction…
Helps in low dimensions
Also good for slice sampling (sampling_method=2,3) (Neal: physics/0009028)
Using fast-slow basis about your prior!:
- Less efficient per step because basis not optimal- Can compensate by doing more ‘fast’ transitions at near-zero cost- Ideally halves the number of ‘slow’ calculations
Can be generalized to n-dimensions: - make several proposals in fast subspace - make proposals in best slow eigendirections - save lots of time if many fast parameters
This is CosmoMC default