Foundations of Data Science: Mathematics

Foundations of Data Science: Mathematics
Raphy Coifman, Depts. of Mathematics & Computer Science, Yale University
Eric Kolaczyk, Dept. of Mathematics & Statistics, Boston University

View @ 10K ft
It is common to group the key players of data science into:
- "Computational" sciences: e.g., computer science, engineering, & statistics (previous talks)
- "Domain" sciences: e.g., genomics, neuroscience, text analysis, etc.
Mathematics plays a critical, but sometimes-differentiated, role in support of both (i.e., often in the sense of `infrastructure').

Mathematical Infrastructure: General
The "computational" side has traditionally been supported by, e.g.:
- Linear algebra
- Numerical analysis
- Graph theory
as well as by supporting aspects of statistics, signal processing, etc.

Math Infrastructure: Domain
Support for the "domain" side is frequently domain/problem-specific.
- The most representative examples come from the physical sciences.
- Provides shortcuts and enabling tools for processing and information extraction, based on mathematical physical models.
- Ideas about representation and data features are mostly based on tools from mathematical analysis.

Math Infrastructure: Domain (cont)
Example: Linear(ized) inverse problems
- Seismology/geophysics (Abel xform & related)
- Medical imaging (Radon xform, etc.)
- Radar
- Weather
Note: These all historically had "big" data long before it became common in other domains.
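As a concrete instance (an added illustration, not part of the original slides), the Radon transform behind tomographic imaging is a linear operator mapping a function to its line integrals; recovering the function from noisy, sampled values of this transform is the prototypical linear inverse problem.

```latex
% Radon transform of f : R^2 -> R, integrating along the line at signed
% distance s from the origin with unit normal (cos\theta, sin\theta).
(Rf)(\theta, s) = \int_{-\infty}^{\infty}
    f\bigl(s\cos\theta - t\sin\theta,\; s\sin\theta + t\cos\theta\bigr)\, dt
```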

Cartoon Version of Data Science Goals
Massive Data → Knowledge, via:
- Self-organization / clustering
- Dimension reduction / compression
- Modeling
- Etc.
Mathematics can contribute both theoretical models/structure and a corresponding "calculus".

Looking Back: Core Mathematical Activities
The following core activities were essential in the past for modeling our world and, with some adaptation and modification, are expected to remain essential in current and future environments.
- Linear algebra: the basics, high-dimensional settings, and effective numerical analysis.
- Analysis: PDEs (both stochastic and deterministic), harmonic analysis and approximation theory, functional analysis.

Core Mathematical Activities (cont)
- Geometries: Riemannian geometry, metric geometry, differential geometry, some topology.
- Optimization: dynamic programming, convex optimization, relaxation methods.
- Etc.

Key Point
In the current and future environment for data science, we are lacking theoretical models, as well as the related calculus.

Illustration: Linear Algebra
Linear algebra -- more precisely, linear algebra in high dimensions -- is the main computational toolbox for all data analysis. Superficially, it has changed little in recent memory. However, the reality is that computational numerical linear algebra in very high dimensions requires the integration of all the classical fields mentioned above (i.e., analysis, geometry, optimization, probability, etc.).

Extending Linear Algebra
The challenges of linear algebra in, e.g., high-dimensional discretized function spaces require a good deal of understanding of analysis, randomization, and stochastic processes, since very high-dimensional linear algebra requires the reorganization of massive matrices. Extensions to tensor or multilinear algebra are an essential ingredient of combined source processing.
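As one hedged illustration of the role of randomization (added here, not taken from the slides), a randomized range-finder approximates the dominant subspace of a massive matrix by sketching it against a random Gaussian test matrix; the sizes and oversampling below are illustrative assumptions.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, rng=None):
    """Approximate truncated SVD of a large matrix A via a random sketch.

    Sketch A with a Gaussian test matrix, orthonormalize the sketch to get an
    approximate basis Q for the range of A, then take a small exact SVD of the
    projected matrix Q^T A (a Halko-Martinsson-Tropp-style construction).
    """
    rng = np.random.default_rng(rng)
    m, n = A.shape
    k = min(rank + oversample, n)
    Omega = rng.standard_normal((n, k))          # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)               # orthonormal basis for the sketch
    B = Q.T @ A                                  # small (k x n) projected matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank, :]

# Illustrative usage on a synthetic low-rank matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 50)) @ rng.standard_normal((50, 1000))
U, s, Vt = randomized_svd(A, rank=50)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))  # small relative error
```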

Extending Linear Algebra: Beyond Linear
The moment we exit the linear regime, all expectations crumble. The simplest bilinear transformations, such as the pointwise product of two functions, become discontinuous in the weak topology. This can cause major disruptions due to noise, which in the linear regime usually cancels but builds up in nonlinear regimes. This is not academic: serious, deep mathematical analysis is required to understand empirical observations and structures in nonlinear regimes.
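A minimal numerical sketch (my own illustration, not from the slides) of the contrast described above: averaging many noisy copies of a signal cancels zero-mean noise, while the pointwise product of two noisy signals picks up cross terms that do not cancel.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, sigma = 1000, 200, 0.5
t = np.linspace(0.0, 1.0, n)
f = np.sin(2 * np.pi * 3 * t)          # clean signal
g = np.cos(2 * np.pi * 5 * t)          # second clean signal

# Linear operation: average noisy copies of f; the noise shrinks ~ 1/sqrt(trials).
noisy_copies = f + sigma * rng.standard_normal((trials, n))
avg_err = np.abs(noisy_copies.mean(axis=0) - f).max()

# Bilinear operation: pointwise product of two noisy signals.
# Cross terms f*noise_g + g*noise_f + noise_f*noise_g do not average away pointwise.
prod_err = np.abs((f + sigma * rng.standard_normal(n)) *
                  (g + sigma * rng.standard_normal(n)) - f * g).max()

print(f"max error after linear averaging:      {avg_err:.3f}")
print(f"max error of noisy pointwise product:  {prod_err:.3f}")
```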

Mathematical Conceptualization of the Modern Data Science Challenge
- A cloud of points in high-dimensional Euclidean space (a database) needs to be characterized or modeled.
- The density of the points can be estimated, and the geometry or configuration of the points described (manifold learning, graphs/networks, probabilistic generative models, metric spaces, etc.).
- Various functions on the data, such as low-dimensional embeddings, classifiers, or "features", may need to be regressed or "learned".
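As a hedged sketch of the "point cloud + geometry" picture (a standard Laplacian-eigenmaps-style construction, not the authors' specific method), one can build a kernel graph on the points and use low eigenvectors of the graph Laplacian as low-dimensional coordinates; the bandwidth and dimensions below are illustrative choices.

```python
import numpy as np

def spectral_embedding(X, n_components=2, bandwidth=1.0):
    """Embed a point cloud X (n_samples x n_features) via graph Laplacian eigenvectors.

    Build a Gaussian-kernel affinity graph, form the symmetric normalized Laplacian,
    and use the eigenvectors of the smallest nontrivial eigenvalues as coordinates.
    """
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2.0 * bandwidth ** 2))        # affinity matrix
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)                  # ascending eigenvalues
    # Skip the trivial eigenvector with eigenvalue ~ 0.
    return eigvecs[:, 1:1 + n_components]

# Illustrative usage: points on a noisy circle embedded into 2 coordinates.
rng = np.random.default_rng(2)
theta = rng.uniform(0, 2 * np.pi, 300)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.standard_normal((300, 2))
Y = spectral_embedding(X, n_components=2, bandwidth=0.3)
print(Y.shape)
```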

The Main Point w.r.t. Data Science Education
This activity involves a blend of all the (sub)fields listed above. Importantly, the corresponding mathematical abstractions, usually treated in the mathematics curriculum as independent courses (e.g., differential geometry, partial differential equations, probability and stochastic processes, dynamical systems, optimization, Fourier analysis, etc.), all come to life and blend in many instances of the modern setting, providing insights in all directions. This will require a blended, integrative curriculum in which all these tools are explained and jointly motivated.

Illustration: Comparing Two Random Samples
A fundamental problem in statistics is to assess whether two sets of observations are random samples (a) from the same probability distribution or (b) governed by different distributions. A canonical solution is, e.g., the two-sample Kolmogorov-Smirnov test. Fundamentally, this requires the creation of histogram bins. The main challenge in high dimensions is to match the bins to the geometry of the data, to its precision, and to the reason for comparing the two populations.
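A minimal usage sketch of the canonical test mentioned above (the data here are synthetic and purely illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(loc=0.0, scale=1.0, size=500)   # sample from N(0, 1)
y = rng.normal(loc=0.3, scale=1.0, size=500)   # sample from a shifted distribution

# Two-sample Kolmogorov-Smirnov test: compares the empirical CDFs of x and y.
statistic, p_value = stats.ks_2samp(x, y)
print(f"KS statistic = {statistic:.3f}, p-value = {p_value:.4f}")
```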

Illustration (cont)
Currently, soft bins are created (implicitly) through kernel functions or through the use of various smoothness classes (e.g., Lipschitz functions). Unfortunately, these heuristic methods are not informed by the intrinsic geometry/statistics of the data. Attempts to use deep nets to define appropriate binning might work better when driven by statistical analytic tools. At this point, kernel methods and smoothness-class methods assume conventional geometries whether they fit or not; what is required is an appropriate analytic approximation theory, governed by statistics as well as by filtering needs.
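As a hedged sketch of "soft bins through kernel functions" (a standard kernel two-sample statistic, not a method proposed in the slides), the maximum mean discrepancy (MMD) with a Gaussian kernel compares two samples without explicit histogram bins; the bandwidth below is an illustrative choice rather than a data-driven one.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def mmd2(X, Y, bandwidth=1.0):
    """Biased estimate of the squared maximum mean discrepancy between X and Y."""
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    return Kxx.mean() + Kyy.mean() - 2.0 * Kxy.mean()

# Illustrative usage: same distribution vs. shifted distribution in 5 dimensions.
rng = np.random.default_rng(4)
X = rng.standard_normal((300, 5))
Y_same = rng.standard_normal((300, 5))
Y_shifted = rng.standard_normal((300, 5)) + 0.5
print(f"MMD^2 (same distribution):    {mmd2(X, Y_same):.4f}")
print(f"MMD^2 (shifted distribution): {mmd2(X, Y_shifted):.4f}")
```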

Summary / Discussion
Mathematics provides:
- A conceptual framework
- A language
- A "calculus" for data science
The traditional framework has been found to extend well beyond what might have been expected during its development, but we appear to be reaching a point where additional fundamental development will prove crucial.

Summary / Discussion
- How best to evolve the mathematics curriculum, both foundational and interdisciplinary, to meet data science needs?
- How can we better foster integrative teaching/learning of topics from across traditionally diverse areas, both within mathematics (e.g., linear algebra, geometry, analysis) and across the mathematical sciences (e.g., mathematics, statistics, probability)?
- How best to facilitate a more integrated development of theoretical error bounds/guarantees and computational/data-analysis advances?