Materials Process Design and Control Laboratory A NONLINEAR DIMENSION REDUCTION STRATEGY FOR GENERATING DATA DRIVEN STOCHASTIC INPUT MODELS Baskar Ganapathysubramanian.


A NONLINEAR DIMENSION REDUCTION STRATEGY FOR GENERATING DATA DRIVEN STOCHASTIC INPUT MODELS
Baskar Ganapathysubramanian and Nicholas Zabaras
Materials Process Design and Control Laboratory, Sibley School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, NY

TRANSPORT IN HETEROGENEOUS MEDIA
- Thermal and fluid transport in heterogeneous media is ubiquitous.
- Applications range from large-scale systems (geothermal systems) to the small scale.
- Most critical devices and applications use heterogeneous, polycrystalline, or functionally graded materials.
- Properties depend on the distribution of material/microstructure.
- However, only limited information about the microstructure/property distribution is available.
Incorporate the limited information into a stochastic analysis:
- worst-case scenarios
- variations in physical properties
Examples: hydrodynamic transport through heterogeneous permeable media; thermal transport through polycrystalline and functionally graded materials.

PROBLEM OF INTEREST
- Interested in modeling diffusion through heterogeneous random media.
- Aim: develop a procedure to predict the statistics of properties of heterogeneous materials undergoing diffusion-based transport.
- What is given: realistically, one usually has access to only a few experimental 2D images of the microstructure. Statistics of the heterogeneous microstructure can be extracted from these images. This is our starting point.
- Account for the uncertainties in the topology of the heterogeneous media.

FRAMEWORK FOR ANALYSIS OF HETEROGENEOUS MEDIA
1. Property extraction: extract properties P1, P2, ..., Pn that the structure satisfies. These properties are usually statistical: volume fraction, two-point correlation, autocorrelation.
2. Microstructure/property reconstruction: reconstruct realizations of the structure satisfying the correlations.
3. Reduced model: construct a reduced stochastic model of property variations from the data. This model must be able to approximate the class of structures.
4. Stochastic analysis: solve the heterogeneous property problem in the reduced stochastic space to compute property variations.
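The property-extraction step can be illustrated with a minimal sketch, assuming a binary (two-phase) image and periodic boundaries. The function names, the random test image, and the FFT-based estimator are illustrative choices, not taken from the slides.

```python
import numpy as np

def volume_fraction(phase):
    """Fraction of pixels that belong to phase 1 in a binary image."""
    return float(np.mean(phase))

def two_point_correlation(phase):
    """Periodic two-point probability S2 for phase 1.

    S2[dx, dy] is the probability that a pixel and the pixel shifted by
    (dx, dy) are both in phase 1. It is computed as a circular
    autocorrelation via the FFT (periodicity is an assumption here).
    """
    f = np.fft.fftn(phase.astype(float))
    autocorr = np.fft.ifftn(f * np.conj(f)).real
    return autocorr / phase.size

# Hypothetical stand-in for an experimental image: random two-phase field.
rng = np.random.default_rng(0)
image = (rng.random((64, 64)) < 0.3).astype(int)

vf = volume_fraction(image)
s2 = two_point_correlation(image)
print("volume fraction:", vf)
print("S2 at zero shift:", s2[0, 0])
```

At zero shift the two-point probability reduces to the volume fraction, which gives a quick consistency check on the estimator.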

DEVELOPING INPUT STOCHASTIC MODELS
- Data-driven techniques encode the variability in properties into a viable, finite-dimensional stochastic model.
- Related advances: Bayesian modeling, random domain decomposition.
- Aim: create a seamless technique that utilizes the tools of the mature field of property/microstructure reconstruction.
- First investigations into constructing data-driven reduced-order representations of topological/material/property distributions used a Principal Component Analysis (PCA/POD/KLE) based approach.
- Generate 3D samples from the microstructure space and apply PCA to them, so that each sample is written as $a_1 \phi_1 + a_2 \phi_2 + \dots + a_n \phi_n$, where the $\phi_i$ are the PCA basis images and the $a_i$ are coefficients.
- Convert the variability of property/microstructure to variability of the coefficients. Not all combinations are allowed.
- Developed a subspace-reducing methodology [1] to find the space of allowable coefficients that reconstruct plausible microstructures.
Reference:
1. B. Ganapathysubramanian, N. Zabaras, Modelling diffusion in random heterogeneous media: Data-driven models, stochastic collocation and the variational multi-scale method, J. Comp. Physics, in press.
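A minimal sketch of the linear (PCA) representation follows, assuming each sample is flattened into a row vector. The synthetic data, the 99% energy threshold, and all variable names are assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_pixels = 50, 400          # hypothetical sizes

# Synthetic stand-in for microstructure samples: 3 latent modes plus noise.
modes = rng.standard_normal((3, n_pixels))
coeffs_true = rng.standard_normal((n_samples, 3))
samples = coeffs_true @ modes + 0.01 * rng.standard_normal((n_samples, n_pixels))

# PCA through the SVD of the mean-centered sample matrix.
mean = samples.mean(axis=0)
centered = samples - mean
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

# Keep the smallest number of modes capturing 99% of the variance
# (the threshold is an assumption).
energy = np.cumsum(s**2) / np.sum(s**2)
k = int(np.searchsorted(energy, 0.99) + 1)

basis = Vt[:k]                         # basis images phi_1 ... phi_k
a = centered @ basis.T                 # coefficients a_1 ... a_k per sample
reconstruction = mean + a @ basis
rel_err = np.linalg.norm(samples - reconstruction) / np.linalg.norm(samples)
print("modes kept:", k, "relative reconstruction error:", rel_err)
```

The variability of the samples is now carried by the coefficient rows of `a`. As the slide notes, not every coefficient combination corresponds to a plausible microstructure.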

INPUT STOCHASTIC MODELS: LINEAR APPROACH
Further related issues:
- How can the approach be generalized to other properties/structures? Can PCA be applied to other classes of microstructures, say, polycrystals?
- How do convergence and computational cost change as the amount of information increases?
- PCA-based approaches find the smallest coordinate representation of the data, but assume that the data lie in a linear vector space. What is the result when the data lie in a nonlinear space?
- As the number of input samples increases, PCA-based approaches tend to overestimate the dimensionality of the reduced representation, which becomes computationally challenging.
- PCA is only guaranteed to discover the true structure of data lying on a linear subspace of the high-dimensional input space.
[Figure: number of eigenvectors versus number of samples]
NONLINEAR APPROACHES TO MODEL REDUCTION: IDEAS FROM IMAGE PROCESSING, PSYCHOLOGY

NONLINEAR REDUCTION: THE KEY IDEA
- Set of images: different images of the same object, with changes in up-down (UD) and left-right (LR) poses.
- Each image = 64x64 = 4096 pixels, so each image is a point in 4096-dimensional space.
- But all the images are related (they are pictures of the same object in different poses). That is, all these images lie on a unique curve (manifold) in $\mathbb{R}^{4096}$.
- Can we get a parametric representation of this curve?
- Problem: can the parameters that define this manifold be extracted, given ONLY these images (points in $\mathbb{R}^{4096}$)?
- Solution: each image can be uniquely represented as a point in 2D space (UD, LR).
- Strategy: based on the 'manifold learning' problem.

NONLINEAR REDUCTION: EXTENSION TO INPUT MODELS
- Given some experimental correlation that the microstructure/property variation satisfies, construct several plausible 'images' of the microstructure/property (different microstructure realizations satisfying the experimental correlations).
- Each of these 'images' consists of, say, n pixels, so each image is a point in n-dimensional space.
- But all the 'images' are related. That is, all these images lie on a unique curve (manifold) in $\mathbb{R}^n$.
- Can a low-dimensional parameterization of this curve be computed?
- Strategy: based on a variant of the 'manifold learning' problem.

A FORMAL DEFINITION OF THE PROBLEM
- State the problem as a parameterization problem (also called the manifold learning problem): given a set of N unordered points belonging to a manifold $\mathcal{M}$ embedded in a high-dimensional space $\mathbb{R}^n$, find a low-dimensional region $\mathcal{A} \subset \mathbb{R}^d$ that parameterizes $\mathcal{M}$, where $d \ll n$.
- Classical methods in manifold learning include Principal Component Analysis (PCA) and multidimensional scaling (MDS). These methods have been shown to extract optimal mappings when the manifold is embedded linearly or almost linearly in the input space.
- In most cases of interest, the manifold is nonlinearly embedded in the input space, making the classical methods of dimension reduction highly approximate.
- Two approaches have been developed that can extract nonlinear structures while maintaining the computational advantage offered by PCA [1,2].
References:
1. J. B. Tenenbaum, V. de Silva, J. C. Langford, A global geometric framework for nonlinear dimensionality reduction, Science 290 (2000).
2. S. Roweis, L. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (2000).

AN INTUITIVE PICTURE OF THE STRATEGY
- Attempt to reduce dimensionality while preserving the geometry at all scales.
- Ensure that nearby points on the manifold map to nearby points in the low-dimensional space, and that faraway points map to faraway points in the low-dimensional space.
[Figure panels: 3D data; PCA (linear approach); nonlinear approach: unraveling the curve]

KEY CONCEPT
1) Geometry can be preserved if the distances between the points are preserved: isometric mapping.
2) The geometry of the manifold is reflected in the geodesic distance between points.
3) The first step towards a reduced representation is to construct the geodesic distances between all the sample points.
[Figure: points A and B on a curved manifold, comparing the Euclidean distance with the geodesic distance]

THE NONLINEAR MODEL REDUCTION ALGORITHM
1) Given the N unordered sample points (microstructures, property maps, ...).
2) Compute the geodesic distance $\mathcal{D}_{\mathcal{M}}(i,j)$ between each pair of samples.
3) Given the pairwise distance matrix between the N objects, compute the locations of N points $\{\xi_i\}$ in $\mathbb{R}^d$ such that the distances between these points are arbitrarily close to the given distance matrix $\mathcal{D}_{\mathcal{M}}$. This is the basic premise of a group of statistical methods called multidimensional scaling (MDS) [1].
Flow: given N unordered samples -> compute pairwise geodesic distances -> perform MDS on this distance matrix -> N points in a low-dimensional space.
Reference:
1. T. F. Cox, M. A. A. Cox, Multidimensional Scaling, Chapman and Hall, 1994.

MATHEMATICAL DETAILS
1. Properties of the manifold $\mathcal{M} \subset \mathbb{R}^n$
- How to compute the geodesic distance? Sum over short hops. This needs a notion of distance between samples.
- There is flexibility in defining the distance measure. The distance measure defines the properties of the manifold that the samples lie on.
- The distance measure $\mathcal{D}$ is based on how much the microstructures vary: it is defined as the difference in statistical correlation between two microstructures.
- Observations [1]: (1) $(\mathcal{M}, \mathcal{D})$ is a metric space; (2) $(\mathcal{M}, \mathcal{D})$ is a compact metric space.
Reference:
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.

MATHEMATICAL DETAILS
2. Mapping a compact manifold to a low-dimensional set
- There is no notion of the geometry of the manifold to start with, so true geodesic distances cannot be constructed.
- Approximate the geodesic distance using the concept of graph distance $\mathcal{D}_G(i,j)$: the distance between faraway points is computed as a sequence of small hops.
- This approximation, $\mathcal{D}_G$, asymptotically matches the actual geodesic distance $\mathcal{D}_{\mathcal{M}}$ in the limit of a large number of samples [1,2] (Theorem 4.5 in [1]).
References:
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
2. M. Bernstein, V. de Silva, J. C. Langford, J. B. Tenenbaum, Graph approximations to geodesics on embedded manifolds, Dec 2000.
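The graph-distance idea can be sketched as follows, assuming a symmetric k-nearest-neighbor graph and all-pairs shortest paths (Floyd-Warshall). The function name, the choice of k, and the use of Euclidean distance between points on a semicircle are illustrative assumptions; the slides define the sample distance through statistical correlations instead.

```python
import numpy as np

def graph_geodesic(dist, k):
    """Approximate geodesic distances by shortest paths on a k-NN graph.

    dist : (N, N) matrix of pairwise distances between samples.
    k    : number of nearest neighbors joined to each sample (assumed).
    """
    n = dist.shape[0]
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    # Column 0 of the argsort is the sample itself (distance zero).
    nearest = np.argsort(dist, axis=1)[:, 1:k + 1]
    for i in range(n):
        graph[i, nearest[i]] = dist[i, nearest[i]]
        graph[nearest[i], i] = dist[i, nearest[i]]   # keep the graph symmetric
    # Floyd-Warshall: allow paths that pass through intermediate sample m.
    for m in range(n):
        graph = np.minimum(graph, graph[:, m:m + 1] + graph[m:m + 1, :])
    return graph

# Points on a unit semicircle: a curved 1D manifold embedded in 2D.
theta = np.linspace(0.0, np.pi, 200)
points = np.column_stack([np.cos(theta), np.sin(theta)])
euclid = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
geo = graph_geodesic(euclid, k=2)

print("Euclidean distance between end points:", euclid[0, -1])
print("graph distance between end points:   ", geo[0, -1])
```

For the two end points the straight-line distance is 2, while the sum of short hops approaches the arc length pi as the sampling becomes denser.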

MATHEMATICAL DETAILS
3. MDS and choosing the dimensionality of the reduced space
- Perform MDS on the geodesic distance matrix, i.e. perform an eigenvalue decomposition of the double-centered squared geodesic distance matrix. The eigenvectors belonging to the d largest eigenvalues, scaled by the square roots of those eigenvalues, give the coordinates of the N points.
- The manifold has an intrinsic dimensionality. How to choose the correct value of d? (This is related to issues of accuracy and computational effort.)
- Estimate the dimensionality of the manifold using a geometrical probability approach (developed by A. Hero et al.), based on ideas from graph theory: the rate of convergence of the length functional L of the minimal spanning tree of the geodesic distance matrix is related to the dimensionality d [1,2].
References:
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
2. J. A. Costa, A. O. Hero, Geodesic entropic graphs for dimension and entropy estimation in manifold learning, IEEE Trans. on Signal Processing 52 (2004).
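The MDS step can be sketched with classical (Torgerson) MDS, shown here under the assumption that a full N x N distance matrix is available. The function name and the planar test points are illustrative; in the strategy above the input would be the geodesic distance matrix.

```python
import numpy as np

def classical_mds(dist, d):
    """Classical MDS: embed N points in R^d from an (N, N) distance matrix.

    Returns the (N, d) coordinates and the full spectrum in descending order.
    """
    n = dist.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (dist ** 2) @ J               # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d]        # indices of the d largest
    lam = np.clip(eigvals[order], 0.0, None)     # guard against tiny negatives
    coords = eigvecs[:, order] * np.sqrt(lam)    # scale eigenvectors by sqrt(eigenvalue)
    return coords, eigvals[::-1]

# Sanity check on points that really are 2D: the distances must be reproduced.
rng = np.random.default_rng(2)
x = rng.standard_normal((30, 2))
dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)

xi, spectrum = classical_mds(dist, d=2)
dist_xi = np.linalg.norm(xi[:, None, :] - xi[None, :, :], axis=-1)
print("max distance mismatch:", np.abs(dist - dist_xi).max())
print("leading eigenvalues:", spectrum[:4])
```

The coordinates come from the eigenvectors, not the eigenvalues; the eigenvalues only set the scale of each axis, and a sharp drop in the spectrum hints at the embedding dimension.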

THE REDUCED ORDER STOCHASTIC MODEL
- Given N unordered samples in $\mathcal{M} \subset \mathbb{R}^n$, the procedure results in N points in a low-dimensional space $\mathcal{A} \subset \mathbb{R}^d$.
- The geodesic distance + MDS step (Isomap algorithm [1]) results in a low-dimensional, convex, connected space [2], $\mathcal{A} \subset \mathbb{R}^d$.
- The reduced space $\mathcal{A}$ is constructed from the N samples and serves as the surrogate space for $\mathcal{M}$. Access the variability in $\mathcal{M}$ by sampling over $\mathcal{A}$.
- BUT so far only the $\mathcal{M} \to \mathcal{A}$ map is available; the $\mathcal{A} \to \mathcal{M}$ map is needed too.
References:
1. J. B. Tenenbaum, V. de Silva, J. C. Langford, A global geometric framework for nonlinear dimensionality reduction, Science 290 (2000).
2. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.

THE REDUCED ORDER STOCHASTIC MODEL
- Only N pairs are available to construct the $\mathcal{A} \to \mathcal{M}$ map.
- Various possibilities exist, depending on the specific problem at hand, but computational effort and efficiency must be kept in mind.
- Three such possibilities are illustrated; error bounds can be computed [1]:
  1. Nearest neighbor map
  2. Local linear interpolation
  3. Local linear interpolation with projection
[Figure: three schematic maps from $\mathcal{A} \subset \mathbb{R}^d$ to $\mathcal{M} \subset \mathbb{R}^n$, one per method]
Reference:
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
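The first two maps can be sketched as below, given the N known pairs of low-dimensional points and high-dimensional samples. The function names, the neighbor count, and in particular the inverse-distance weighting are assumptions for illustration, since the slides do not state how the local interpolation weights are chosen. The third variant (with projection) is not attempted here.

```python
import numpy as np

def nearest_neighbor_map(xi_query, xi_samples, x_samples):
    """Return the high-dimensional sample whose low-dimensional point is closest."""
    idx = np.argmin(np.linalg.norm(xi_samples - xi_query, axis=1))
    return x_samples[idx]

def local_linear_map(xi_query, xi_samples, x_samples, k=3, eps=1e-12):
    """Blend the k nearest samples with inverse-distance weights (assumed scheme)."""
    dists = np.linalg.norm(xi_samples - xi_query, axis=1)
    idx = np.argsort(dists)[:k]
    if dists[idx[0]] < eps:            # query coincides with a known point
        return x_samples[idx[0]]
    w = 1.0 / dists[idx]
    w /= w.sum()                       # weights sum to one
    return w @ x_samples[idx]

# Hypothetical pairs: 1D reduced coordinates and 2-pixel "images".
xi_samples = np.array([[0.0], [1.0], [2.0]])
x_samples = np.array([[0.0, 0.0], [1.0, 10.0], [2.0, 20.0]])

print(nearest_neighbor_map(np.array([0.9]), xi_samples, x_samples))
print(local_linear_map(np.array([0.5]), xi_samples, x_samples, k=2))
```

The nearest neighbor map is the cheapest but is piecewise constant. The interpolating map varies continuously between samples, at the cost of a neighbor search and a weighted sum for every query point.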

THE LOW DIMENSIONAL STOCHASTIC MODEL
The algorithm consists of two parts:
1) Compute the low-dimensional representation of a set of N unordered sample points belonging to a high-dimensional space: given N unordered samples in $\mathcal{M} \subset \mathbb{R}^n$ -> compute pairwise geodesic distances -> perform MDS on this distance matrix -> N points in a low-dimensional space $\mathcal{A} \subset \mathbb{R}^d$.
2) To use this model in a stochastic collocation framework, points must be sampled in $\mathcal{A}$. For an arbitrary point $\xi \in \mathcal{A}$, find the corresponding point $x \in \mathcal{M}$, i.e. compute the mapping $\mathcal{A} \to \mathcal{M}$.

NUMERICAL EXAMPLE
- Given: an experimental image of a two-phase metal-metal composite (silver-tungsten composite) [1].
- Goal: find the variability in temperature arising from the uncertainty in the knowledge of the exact 3D material distribution of the specified microstructure.
Problem strategy:
- Extract pertinent statistical information from the experimental image.
- Reconstruct a dataset of plausible 3D microstructures.
- Construct a low-dimensional parametrization of this space of microstructures.
- Solve the SPDE for temperature evolution using this input model in a stochastic collocation framework.
[Figure: computational domain with imposed boundary temperatures; one label reads T = -0.5]
Reference:
1. S. Umekawa, R. Kotfila, O. D. Sherby, Elastic properties of a tungsten-silver composite above and below the melting point of silver, J. Mech. Phys. Solids 13 (1965).

TWO PHASE MATERIAL
[Figure panels: experimental image; experimental statistics; GRF statistics; realizations of 3D microstructure]

NON LINEAR DIMENSION REDUCTION
- The developments detailed above are applied to find a low-dimensional representation of these 1000 microstructure samples.
- The optimal representation of these points is a 9-dimensional region.
- It can be shown theoretically that these points in 9D space form a convex region in $\mathbb{R}^9$.
- This convex region now represents the low-dimensional stochastic input space.
- Use sparse grid collocation strategies to sample this space.

COMPUTATIONAL DETAILS
- Computational domain of each deterministic problem: 65x65x65 pixels
- Construction of the stochastic solution: sparse grid collocation, with a level 5 interpolation scheme
- Number of deterministic problems solved:
- Computational platform: 50 nodes on local Linux cluster (x2 3.2 GHz)
- Total time: 210 minutes
- Total number of dof's: $65^3 \times 26017 \approx 7 \times 10^9$

MEAN TEMPERATURE PROFILE
[Figure panels: (a) temperature contour; (b-d) temperature isocontours; (e-g) temperature slices]

HIGHER ORDER TEMPERATURE STATISTICS
[Figure panels: (a) temperature contour; (b) temperature isocontours; (c) PDF of temperature; (d-f) temperature slices]

CONCLUSIONS
1. Developed an efficient data-driven nonlinear model reduction technique for converting experimental statistics into viable stochastic input models.
2. The technique meshes seamlessly with any reconstruction method.
3. Showcased the framework by constructing a reduced model of the topology of a two-phase material given limited statistical data.
4. This methodology has significant applications to problems where working in high-dimensional spaces is computationally intractable: visualizing property evolution, process-property maps, searching and contouring.
RELATED PUBLICATIONS:
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
2. B. Ganapathysubramanian and N. Zabaras, "Modelling diffusion in random heterogeneous media: Data-driven models, stochastic collocation and the variational multi-scale method", Journal of Computational Physics, in press.