" The Maximum Likelihood Problem and Fitting the Sagittarius Dwarf Tidal Stream " Matthew Newby Astronomy Seminar RPI Oct. 22, 2009 1.


2 Overview: Introduction; The Sagittarius Stream; SDSS; Locating Maximum Likelihood; Methods (Differential Evolution, Monte-Carlo Markov-Chain, Gradient Descent, Genetic Search, Particle Swarm); Revisit the Sagittarius Stream; BOINC Overview; Current and Future Work.

3 Introduction: Modern astronomy no longer means staring through a telescope. Automated surveys produce large data sets. Measurements contain errors, so statistical methods are needed. Fast and accurate computer routines are needed in order to analyze this information! (Images: NASA.gov; Wikimedia Commons. Slide caption: "computer$ go faster_")

4 The Sloan Digital Sky Survey (SDSS): 230+ million objects. 8,400 square degrees of sky. Large percentage of the north Galactic cap. Very little data in the Galactic plane (too much dust). Several hundred thousand stars. (Image: sdss.org)

5 The Sagittarius Dwarf Tidal Stream: The Sagittarius Dwarf Galaxy is merging with the Milky Way. The dwarf is being tidally disrupted by the Milky Way, creating long tails. Mapping the tidal stream will provide information on the matter distribution in the Milky Way and provide constraints on the Galactic halo. (Image above: Ibata et al. 1997, AJ. Image left: David Martinez-Delgado (MPIA) & Gabriel Perez (IAC).)

6 The Milky Way (diagram labels): halo, bulge, thin disk, thick disk, Sun, Sagittarius Dwarf Galaxy, tidal stream, data wedge. Scale: ~30 kiloparsecs (100,000 light-years).

7 Data Stripe: Stripe 82 (southern Galactic cap). F-turnoff stars on the H-R diagram. (Image: Newberg & Yanny 2006, JoP Conference Series, modified by N. Cole)

8 Sag. Stream: Model (Cole, N.): Assume the stream is a cylinder, with the radial drop-off given by a Gaussian distribution. 2 background parameters: r0, q. 6 parameters per stream: ε, μ, r, θ, φ, σ. At least 8 parameters in the search, so the solution space is at least 8-dimensional! Background distribution: [equation not preserved]
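The cylinder-with-Gaussian-drop-off idea can be sketched as follows. This is only an illustration, not the model from the talk: the function names are hypothetical, the mapping of θ and φ to an axis direction is an assumed spherical-angle convention, and the normalisation, stream weight ε, cylinder length, and background term are all left out.

```python
import numpy as np

def axis_direction(theta, phi):
    """Unit vector along the cylinder axis (assumed spherical-angle convention)."""
    return np.array([
        np.sin(theta) * np.cos(phi),
        np.sin(theta) * np.sin(phi),
        np.cos(theta),
    ])

def stream_density(points, center, theta, phi, sigma):
    """Unnormalised density: Gaussian in perpendicular distance from the axis."""
    a = axis_direction(theta, phi)
    rel = np.atleast_2d(points) - center      # positions relative to the axis point
    along = rel @ a                           # component along the axis
    perp = rel - np.outer(along, a)           # component perpendicular to the axis
    d2 = np.sum(perp ** 2, axis=1)            # squared distance from the axis
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

A star on the axis gets density 1, and the density falls off with a width set by σ as the star moves away from the axis.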

9 Maximum Likelihood: Bayesian Method. Must assume a prior: a model explaining the data. Find the parameters that are the most likely in a data set, given the prior. Law of large numbers: can assume that large data sets have normally distributed data points. Find the probability that each data point lies in the given distribution. Then you can get the likelihood: $L(Q|D) = \prod_i \mathrm{DataPointProb}_i$
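As a minimal sketch of the likelihood as a product of per-point probabilities, the example below assumes a one-dimensional Gaussian model with parameters (mu, sigma). That model and the function names are illustrative choices, not the stream model from the talk.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Probability density of a single data point under a normal distribution."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def likelihood(params, data):
    """L(Q|D): product of the per-point probabilities."""
    mu, sigma = params
    return math.prod(gaussian_pdf(x, mu, sigma) for x in data)

def log_likelihood(params, data):
    """Sum of log-probabilities; has the same maximum as the product."""
    mu, sigma = params
    return sum(math.log(gaussian_pdf(x, mu, sigma)) for x in data)
```

For large data sets the product of many small numbers underflows, so in practice the sum of logarithms is usually the quantity that gets maximised.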

10 Computational Algorithms Overview: Set up the problem. Parameter space: all allowed values of the parameters. Likelihood evaluator for given parameters. Evaluation method: moves through parameter space in an efficient way. End conditions: stop when the change in the best likelihood falls below a limit, or when a predefined number of iterations is reached. Problems: the likelihood calculation is usually time-consuming; need to avoid local maxima and find the global maximum. What is the best method?
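The two end conditions could be checked with a small helper like the one below. It is a sketch only: the name `should_stop`, the default tolerance, and the iteration cap are assumptions made for illustration.

```python
def should_stop(best_history, tolerance=1e-6, max_iterations=1000):
    """Return True when either end condition is met.

    best_history holds the best likelihood found after each iteration.
    """
    # End condition 1: a predefined number of iterations has been reached.
    if len(best_history) >= max_iterations:
        return True
    # End condition 2: the change in the best likelihood is below a limit.
    if len(best_history) >= 2:
        return abs(best_history[-1] - best_history[-2]) < tolerance
    return False
```

For random methods the best value can stay unchanged for several iterations before improving again, so a real search might compare against a value from several iterations back rather than only the previous one.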

11 Computational Methods: No Free Lunch (David H. Wolpert, William G. Macready). Poor students: Rosencrantz, Ophelia, Guildenstern. Diets: only eats meat; vegetarian; low-carb diet. Local eateries, same menus, random prices: Burger Palace, Gourmet Salads, No Carbs at All. Prices differ by restaurant! Not everyone can eat cheaply! One restaurant cannot be the best solution for every person (problem)! One solution method (or algorithm) will not be ideal for all problems! Need to choose the best solution for the job at hand!

12 Conjugate Gradient Descent (CGD): Calculates the gradient of the surface for each parameter. Moves towards the best likelihood using a line search. Conjugate gradient uses the gradient of the previous step to converge faster. Requires many likelihood calculations per move. Unfortunately, it may end at a local maximum, so it needs to be run from several different directions in order to find the global best. (Figure: gradient descent, 1-dimensional case; likelihood vs. position, with labels for location, gradient, best solution, and local maximum.) The gradient, G: [equation not preserved], where L = likelihood function, Q = parameter (i or j), h_i = step size for the ith parameter.
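A numerical gradient built from a likelihood function L, parameters Q, and per-parameter step sizes h_i can be sketched with a central finite difference. Central differencing is an assumption here, since the slide's own formula did not survive, and `numerical_gradient` is a hypothetical name.

```python
def numerical_gradient(likelihood, q, h):
    """Central-difference estimate of dL/dQ_i for every parameter i.

    likelihood: function taking a list of parameters and returning a float
    q: current parameter values
    h: step size for each parameter (same length as q)
    """
    grad = []
    for i in range(len(q)):
        up, down = list(q), list(q)
        up[i] += h[i]        # step forward in parameter i only
        down[i] -= h[i]      # step backward in parameter i only
        grad.append((likelihood(up) - likelihood(down)) / (2.0 * h[i]))
    return grad
```

This form costs two likelihood evaluations per parameter, which illustrates why the method needs many likelihood calculations per move.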

13 Line Search: Evaluates two points in the direction of the gradient: one a distance d away, the other 2d away. d is usually related to the gradient (slope). If the middle point does not have a better likelihood than the end points, d is doubled and the process is repeated. If the middle point is higher, the middle point becomes the starting point for another CGD iteration. Line search lets the algorithm reach the best likelihood efficiently. Line search example (left; figure labels: starting point, first middle point, first end point, next middle point, next end point): The first search does not find a better likelihood for the middle point (yellow), so the distance is doubled. This time, the new middle point (red) has the best likelihood. The next iteration of CGD will start at this point.
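The doubling rule described on this slide can be sketched as below. The function name, the `max_doublings` cap, and the fallback of returning the starting point are assumptions; a fuller implementation would probably also shrink d when the first step overshoots the peak.

```python
def line_search(likelihood, start, direction, d, max_doublings=20):
    """Search along `direction` from `start`, doubling d until the middle point wins."""
    def point_at(t):
        return [s + t * g for s, g in zip(start, direction)]

    l_start = likelihood(start)
    for _ in range(max_doublings):
        middle, end = point_at(d), point_at(2 * d)
        l_middle = likelihood(middle)
        # The middle point must beat both end points of the bracket.
        if l_middle > l_start and l_middle > likelihood(end):
            return middle            # becomes the start of the next iteration
        d *= 2                       # otherwise double the distance and retry
    return start                     # no better point found within the cap
```

With the likelihood -(x - 4)^2, a start at x = 0, direction +1, and d = 1, the first two brackets fail and the third places the middle point at x = 4.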

14 Monte-Carlo Markov-Chain (MCMC): A random walk method. Samples parameter space well. Automatically produces an error distribution. Easy to code. Sensitive to running time and step size. Never truly converges. Metropolis-Hastings: Take a step in each direction (parameter). Step size and direction are random, drawn from a normal distribution. If the new location has a better likelihood, move to it. If the new location has a worse likelihood, there is still a chance of moving to it. (Figure: the trajectory of a 1000-step MCMC straight-line fit (top) and the distribution in b (bottom).)
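A minimal Metropolis sampler following these steps might look like the sketch below. It works with the log-likelihood and accepts a worse point with probability equal to the likelihood ratio, which is the standard Metropolis rule for a symmetric proposal. The function name, the fixed seed, and the single shared step size are illustrative assumptions.

```python
import math
import random

def metropolis(log_likelihood, start, step_size, n_steps, seed=0):
    """Random-walk Metropolis sampler; returns the list of visited positions."""
    rng = random.Random(seed)
    current = list(start)
    current_ll = log_likelihood(current)
    chain = [list(current)]
    for _ in range(n_steps):
        # Random step in every parameter, drawn from a normal distribution.
        proposal = [x + rng.gauss(0.0, step_size) for x in current]
        proposal_ll = log_likelihood(proposal)
        # Always accept a better point; accept a worse one with
        # probability L_new / L_old.
        if proposal_ll >= current_ll or rng.random() < math.exp(proposal_ll - current_ll):
            current, current_ll = proposal, proposal_ll
        chain.append(list(current))
    return chain
```

The spread of the chain values for each parameter gives the error distribution mentioned on the slide. The early part of the chain is usually discarded before estimating it.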

15 Genetic Search: Inspired by natural selection. Start with multiple individuals (positions) in parameter space. Evaluate the likelihood for each individual. Remove the individuals with the worst likelihoods. Replace the removed individuals with children of the remaining individuals (parents). Parents can be chosen randomly or from the best likelihoods. Create children through crossover and mutation. Crossover: a child inherits the parameters of multiple parents, either by averaging the parents' parameters or by inheriting select parameters from each parent. Mutation: replace a parameter with a new, randomly generated one. Repeat until the end conditions are met.
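The loop described above can be sketched as follows, using averaging crossover and random parent selection from the survivors. The population size, number of survivors, mutation rate, fixed generation count, and all names are assumptions chosen for illustration.

```python
import random

def genetic_search(likelihood, bounds, pop_size=30, n_keep=10,
                   generations=100, mutation_rate=0.1, seed=0):
    """Keep the best individuals each generation and refill with their children."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by likelihood and remove the worst individuals.
        population.sort(key=likelihood, reverse=True)
        parents = population[:n_keep]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)                    # two random parents
            child = [(x + y) / 2.0 for x, y in zip(a, b)]    # averaging crossover
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:             # mutation
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        population = parents + children
    return max(population, key=likelihood)
```

Because the surviving parents are carried over unchanged, the best likelihood in the population never gets worse from one generation to the next.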

16 Differential Evolution: An individual moves according to the weighted difference between the locations of two parent individuals. If the new position has a worse likelihood, the individual does not move. Parents may be random or chosen from the population best. Multiple pairs of parents may also be used (averaging over the differences). (Figure labels: difference vector; change in position; X, no change; center is the global best.)
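A simplified version of this move rule is sketched below: each individual tries a step equal to a weighted difference of two randomly chosen parents and keeps it only if the likelihood improves. This follows the slide's description rather than the full textbook algorithm, which also uses a separate base vector and a crossover step. The weight of 0.8 and the other defaults are assumptions.

```python
import random

def differential_evolution(likelihood, bounds, pop_size=20, weight=0.8,
                           generations=200, seed=0):
    """Move individuals along weighted parent differences; reject worse moves."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [likelihood(p) for p in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick two distinct random parents other than the individual itself.
            a, b = rng.sample([j for j in range(pop_size) if j != i], 2)
            trial = [x + weight * (pa - pb)
                     for x, pa, pb in zip(population[i], population[a], population[b])]
            trial_score = likelihood(trial)
            if trial_score > scores[i]:          # worse likelihood: no move
                population[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]
```

The step length adapts on its own: as the population contracts around a maximum, the difference vectors shrink with it.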

17 Particle-Swarm Optimization: Physically intuitive, based on animal behavior. Particles have velocities. Forces pull each particle towards its personal best and towards the global best particle. (Figure: parameter space, with labels for global best, personal best, velocity, to global best, and to personal best.) Position (x) change at step t: [equation not preserved], where w, c1, c2 are weighting parameters, p is the personal best, g is the global best, and rand() is a random number.
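The sketch below uses the commonly quoted particle-swarm update, v = w*v + c1*rand()*(p - x) + c2*rand()*(g - x), followed by x = x + v. It is assumed, not confirmed, that this matches the equation on the slide. The values w = 0.7 and c1 = c2 = 1.5 are illustrative defaults, and the function name is hypothetical.

```python
import random

def particle_swarm(likelihood, bounds, n_particles=20, w=0.7, c1=1.5, c2=1.5,
                   steps=200, seed=0):
    """Particle swarm search; returns the global best position and its likelihood."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p = [list(xi) for xi in x]                      # personal best positions
    p_score = [likelihood(xi) for xi in x]
    g_index = max(range(n_particles), key=p_score.__getitem__)
    g, g_score = list(p[g_index]), p_score[g_index]  # global best
    for _ in range(steps):
        for i in range(n_particles):
            for d in range(dim):
                # Inertia plus random pulls towards personal and global bests.
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (p[i][d] - x[i][d])
                           + c2 * rng.random() * (g[d] - x[i][d]))
                x[i][d] += v[i][d]
            score = likelihood(x[i])
            if score > p_score[i]:
                p[i], p_score[i] = list(x[i]), score
                if score > g_score:
                    g, g_score = list(x[i]), score
    return g, g_score
```

Larger w keeps particles moving and exploring, while larger c1 and c2 pull them more strongly towards known good positions.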

18 BOINC: Berkeley Open Infrastructure for Network Computing. Project stats (Total / Active): Users 37,251 / 16,010; Hosts 79,023 / 25,101; Teams 1,[remainder not preserved]; Countries [not preserved]. Total credit: 9,302,434,280. Recent average credit (RAC): 52,731,529. Average floating point operations per second: 527,315.3 GigaFLOPS / [value not preserved] TeraFLOPS. Users volunteer spare processor / graphics card time to the project. Massively parallel. Graphics processor technology has created a large increase in processing power. The project is now the #2 ranked BOINC project. You can help, too:

19 Separation: Stripe 82 (figure panels: Sgr Stream Stars; Non-Sgr Stream Stars; Sgr Stream Stars)

20 Conclusions: Modern astronomy produces large data sets. The Maximum Likelihood method is ideal for analyzing this data. Powerful computer algorithms exist to perform MLE. Mapping the Sagittarius Stream is possible by using these methods.

21 Credits: The Sloan Digital Sky Survey. BOINC.com. Prof. Heidi Newberg, Rensselaer Polytechnic Institute. Nathan Cole, Maximum Likelihood Fitting of Tidal Streams with Applications to the Sagittarius Dwarf Tidal Tails (PhD thesis, Rensselaer Polytechnic Institute, 2008). Travis Desell, Asynchronous Global Optimization for Massively Distributed Computing (PhD candidacy document, 2009). Shakespeare, et al., Hamlet.

22 3-stream search: