GEOGG121: Methods Monte Carlo methods, revision


GEOGG121: Methods - Monte Carlo methods, revision
Dr. Mathias (Mat) Disney
UCL Geography
Office: 113, Pearson Building
Tel: 7670 0592
Email: mdisney@ucl.geog.ac.uk
www.geog.ucl.ac.uk/~mdisney

Very brief intro to Monte Carlo
- Brute-force method(s) for integration / parameter estimation / sampling
- Powerful BUT essentially a last resort, as it involves random sampling of parameter space
- Time consuming: more samples give a better approximation
- Errors tend to reduce as 1/N^(1/2): N = 100 -> error down by a factor of 10; N = 1,000,000 -> error down by a factor of 1000
- Fast computers can solve complex problems
- Applications: numerical integration (e.g. the radiative transfer equation), Bayesian inference (sampling the posterior), computational physics, sensitivity analysis, etc.
- Numerical Recipes in C, ch. 7, p. 304: http://apps.nrbook.com/c/index.html
- http://en.wikipedia.org/wiki/Monte_Carlo_method
- http://en.wikipedia.org/wiki/Monte_Carlo_integration

Basics: MC integration
- Pick N random points x_1, x_2, ..., x_N in a multidimensional volume V
- MC integration approximates the integral of a function f over the volume V as
  \int f \, dV \approx V \langle f \rangle \pm V \sqrt{ \frac{ \langle f^2 \rangle - \langle f \rangle^2 }{ N } }
  where \langle f \rangle = \frac{1}{N} \sum_{i=1}^{N} f(x_i) and \langle f^2 \rangle = \frac{1}{N} \sum_{i=1}^{N} f^2(x_i); the +/- term is the 1 SD error, which falls off as 1/N^(1/2) (a sketch follows below)
- [Figure: choose random points in an area A; the integral is the fraction of points under the curve, times A. From http://apps.nrbook.com/c/index.html]
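
To make the estimator above concrete, here is a minimal Python sketch (the function name and the example integrand are illustrative choices, not from the course material): it integrates f(x) = x^2 over [0, 2] with N random points and reports the 1 SD error term, which should shrink roughly as 1/N^(1/2).

    import numpy as np

    def mc_integrate(f, a, b, n, seed=0):
        """Plain Monte Carlo estimate of the integral of f over [a, b]."""
        rng = np.random.default_rng(seed)
        x = rng.uniform(a, b, n)   # N random points in the volume (here an interval)
        fx = f(x)
        vol = b - a
        mean = fx.mean()
        # 1 SD error of the estimator, falling off as 1/sqrt(N)
        err = vol * np.sqrt((np.mean(fx**2) - mean**2) / n)
        return vol * mean, err

    # Example: the integral of x^2 over [0, 2] is 8/3 ~ 2.667
    est, err = mc_integrate(lambda x: x**2, 0.0, 2.0, 100_000)
    print(est, "+/-", err)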

Basics: MC integration
- Why not choose a grid? Error falls as N^(-1) (quadrature approach) BUT we need to choose the grid spacing in advance. With random sampling we keep sampling until we have a 'good enough' approximation
- Is there a middle ground? Pick points sort of at random BUT in such a way as to fill the space more quickly (avoiding local clustering)?
- Yes: quasi-random sampling. Space filling, i.e. points are "maximally avoiding of each other" (see the sketch below)
- FROM: http://en.wikipedia.org/wiki/Low-discrepancy_sequence
- [Figure: Sobol method vs pseudorandom, 1000 points]
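
A small sketch of the pseudorandom vs Sobol comparison in the figure above; it assumes SciPy >= 1.7 for the scipy.stats.qmc module, and uses the discrepancy statistic as one (of several possible) measures of how evenly points fill the unit square (lower is more even).

    import numpy as np
    from scipy.stats import qmc   # assumes SciPy >= 1.7

    n = 1024   # a power of 2, which suits Sobol sequences
    # Pseudorandom points in the unit square
    pseudo = np.random.default_rng(0).random((n, 2))
    # Quasi-random (low-discrepancy) Sobol points: space filling by design
    sobol = qmc.Sobol(d=2, scramble=True, seed=0).random(n)

    print("pseudorandom discrepancy:", qmc.discrepancy(pseudo))
    print("Sobol discrepancy:       ", qmc.discrepancy(sobol))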

MC approximation of Pi?
A simple example of MC methods in practice

MC approximation of Pi?
A simple example of MC methods in practice. In Python:

    import numpy as np
    a = np.random.rand(10, 2)        # 10 random points in the unit square
    np.sum(a*a, 1) < 1               # which points fall inside the quarter circle?
    # array([ True,  True, False, False,  True, False,  True, False,  True,  True], dtype=bool)
    4 * np.mean(np.sum(a*a, 1) < 1)  # fraction inside * 4 estimates pi
    # 2.3999999999999999
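
As a follow-up (not from the slides): wrapping the same estimator in a function and increasing the number of samples should show the estimate settling towards pi at roughly the 1/N^(1/2) rate discussed earlier; the function name and seed are illustrative.

    import numpy as np

    def estimate_pi(n, seed=42):
        rng = np.random.default_rng(seed)
        pts = rng.random((n, 2))                # n points in the unit square
        inside = np.sum(pts**2, axis=1) < 1     # inside the quarter circle?
        return 4.0 * inside.mean()              # area ratio = pi / 4

    for n in (100, 10_000, 1_000_000):
        print(n, estimate_pi(n))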

Markov Chain Monte Carlo (MCMC)
- Integration / parameter estimation / sampling
- From the 1980s on: "It was rapidly realised that most Bayesian inference could be done by MCMC, whereas very little could be done without MCMC" (Geyer, 2010)
- Formally, MCMC methods sample from a probability distribution (e.g. a posterior) by constructing a Markov Chain that has the desired distribution as its equilibrium distribution (the distribution the chain tends to)
- Markov Chain: a system of random transitions in which the next state depends only on the current state, not on the preceding chain (i.e. no "memory" of how we got here)
- Many implementations of MCMC, including Metropolis-Hastings, the Gibbs sampler, etc.
- From: http://homepages.inf.ed.ac.uk/imurray2/teaching/09mlss/slides.pdf
- See also: http://www.mcmchandbook.net/HandbookChapter1.pdf

MCMC: Metropolis-Hastings
- Initialise: pick a state x at random
- Pick a new candidate state x' at random from a proposal distribution g(x' | x) (the conditional probability of proposing state x', given x)
- Accept based on the acceptance criterion
  A(x', x) = \min \left( 1, \frac{ P(x') \, g(x \mid x') }{ P(x) \, g(x' \mid x) } \right)
  where A is the acceptance distribution and P is the target distribution; together A and g define the transition probability for x -> x'
- If accepted, the state transitions to x'; if not accepted, the state stays at x (no change)
- Repeat N times, saving the state at each step
- Repeat the whole process (a sketch follows below)
- From: http://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm
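
A minimal sketch of a random-walk Metropolis sampler, the special case of Metropolis-Hastings with a symmetric proposal (so the g terms cancel and the ratio reduces to P(x')/P(x)); the target density, step size, chain length and burn-in below are illustrative assumptions, not course code.

    import numpy as np

    def metropolis(log_target, x0, n_steps=10_000, step=0.5, seed=0):
        """Random-walk Metropolis with a Gaussian proposal."""
        rng = np.random.default_rng(seed)
        x = x0
        samples = np.empty(n_steps)
        for i in range(n_steps):
            x_prop = x + step * rng.standard_normal()   # symmetric proposal
            # accept with probability min(1, P(x') / P(x)), done in log space
            if np.log(rng.random()) < log_target(x_prop) - log_target(x):
                x = x_prop
            samples[i] = x
        return samples

    # Example target: an un-normalised standard normal density
    log_target = lambda x: -0.5 * x**2
    chain = metropolis(log_target, x0=5.0)
    # After discarding a burn-in, mean ~ 0 and SD ~ 1
    print(chain[1000:].mean(), chain[1000:].std())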

Revision: key topics, points
Model inversion - why?
- Forward model: the model predicts system behaviour from a given set of parameter values (the system state vector), f(x)
- BUT we usually want to observe the system and INFER the parameter values
- Inversion: f^(-1)(x), i.e. estimate the parameter values (system state) that give rise to the observed values
- Forward modelling is useful for understanding the system, sensitivity analysis, etc.
- The inverse model allows us to estimate the system state

Revision: key topics, points
Model inversion - how?
- Linear: pros and cons? Can be done using linear algebra (matrices); very fast but ...
- Non-linear: pros and cons? Many approaches, all based around minimising some cost function, e.g. RMSE, the difference between MODEL and OBS for a given parameter set (see the sketch below)
- Iterative: based on getting to the minimum as quickly as possible OR as robustly as possible OR with the fewest function evaluations
- Gradient descent (L-BFGS); simplex, Powell (no gradient needed); LUT (brute force); simulated annealing; genetic algorithms; artificial neural networks, etc.
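
As referenced in the list above, a minimal sketch of non-linear inversion by minimising an RMSE cost with L-BFGS via scipy.optimize; the forward model, synthetic observations, noise level and starting guess are all illustrative assumptions, not the course model.

    import numpy as np
    from scipy.optimize import minimize

    def forward(params, x):
        """Illustrative non-linear forward model f(x; a, b) = a * exp(-b * x)."""
        a, b = params
        return a * np.exp(-b * x)

    def cost(params, x, obs):
        """RMSE between model prediction and observations."""
        return np.sqrt(np.mean((forward(params, x) - obs) ** 2))

    # Synthetic observations from "true" parameters a = 2.0, b = 0.5, plus noise
    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 10.0, 50)
    obs = forward([2.0, 0.5], x) + 0.02 * rng.standard_normal(x.size)

    result = minimize(cost, x0=[1.0, 1.0], args=(x, obs), method="L-BFGS-B")
    print(result.x)   # should land close to [2.0, 0.5]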

Revision: key topics, points
Model inversion - application
- Linear kernel-driven BRDF modelling: the requirement for a global, near real-time satellite data product means inversion MUST be FAST
- MODIS BRDF product, 3-parameter model: Isotropic (brightness) + Geometric-Optic (shadowing) + Volumetric (volume scattering)
- Two of the kernels are (severe) approximations to radiative transfer models, dependent only on the view/illumination angles (a least-squares sketch follows below)
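
As referenced above, a minimal sketch of why the linear kernel-driven case is fast: with reflectance modelled as R = f_iso + f_vol * K_vol + f_geo * K_geo, inversion is a single linear least-squares solve. The kernel values, noise and parameter values here are synthetic placeholders, not MODIS data.

    import numpy as np

    # Synthetic kernel values for 7 observations (in practice these come from
    # the view/illumination geometry of each satellite observation)
    rng = np.random.default_rng(2)
    k_vol = rng.uniform(-0.2, 0.6, 7)
    k_geo = rng.uniform(-1.5, 0.0, 7)

    # Forward model: R = f_iso + f_vol * K_vol + f_geo * K_geo (+ noise)
    f_true = np.array([0.3, 0.05, 0.02])               # iso, vol, geo weights
    K = np.column_stack([np.ones(7), k_vol, k_geo])
    refl = K @ f_true + 0.001 * rng.standard_normal(7)

    # Inversion is just linear least squares: fast enough for global products
    f_hat, *_ = np.linalg.lstsq(K, refl, rcond=None)
    print(f_hat)   # close to [0.3, 0.05, 0.02]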

Revision: key topics, points
Analytical v numerical
- Analytical: we can write down equations for f^(-1)(x), so inversion can be done fast
- Numerical: no written expression for f^(-1)(x), or perhaps even for f(x); parts of it need to be approximated numerically; hard to differentiate (for inversion by gradient descent)

Don't forget:
- Course feedback
- Short MC practical (now)
Thanks! And have a great Christmas and New Year