
Engineering Data Analysis & Modeling
Practical Solutions to Practical Problems
Dr. James McNames
Biomedical Signal Processing Laboratory
Electrical & Computer Engineering
Portland State University

Course Overview
- Key question: How to extract useful information from data?
- Some theory
- Mostly methods & applications
- Problem oriented, not technology focused
- Project course

Talk Overview
- Problem definitions
- Applications
- Project ideas
- Course specifics

Problem Definitions
- Preprocessing (briefly)
  - Variable selection
  - Dimension reduction
- Decision theory (hypothesis testing)
- Density estimation
- Nonlinear optimization
- Pattern recognition/Classification (very briefly)
- Nonlinear modeling (univariate & multivariate)

Variable Selection
- Many algorithms fail if there are too many inputs
- Often fewer inputs are sufficient due to
  - Redundant inputs
  - Irrelevant inputs
- Goal: Find a subset of inputs that maximizes model accuracy
- Is Greenspan’s BP relevant?
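
One way to make the variable-selection goal concrete is greedy forward selection: start with no inputs and keep adding the single input that most reduces an estimate of prediction error. The sketch below is only an illustration, not the method used in the course. The 1-nearest-neighbor predictor, the leave-one-out error estimate, the function names, and the synthetic data are all assumptions chosen to keep the example short.

```python
import random

def loo_error(X, y, cols):
    """Leave-one-out mean squared error of a 1-nearest-neighbor predictor
    that measures distance using only the input columns in `cols`."""
    n = len(X)
    total = 0.0
    for i in range(n):
        best_d, best_j = float("inf"), None
        for j in range(n):
            if j == i:
                continue
            d = sum((X[i][c] - X[j][c]) ** 2 for c in cols)
            if d < best_d:
                best_d, best_j = d, j
        total += (y[i] - y[best_j]) ** 2
    return total / n

def forward_select(X, y):
    """Greedily add the input that most reduces the error; stop when none helps."""
    remaining = list(range(len(X[0])))
    chosen, best_err = [], float("inf")
    while remaining:
        err, col = min((loo_error(X, y, chosen + [c]), c) for c in remaining)
        if err >= best_err:
            break
        chosen.append(col)
        remaining.remove(col)
        best_err = err
    return chosen, best_err

# Synthetic data (illustrative only): x0 drives the output,
# x1 is irrelevant noise, x2 is a redundant rescaled copy of x0.
random.seed(0)
X = []
for _ in range(40):
    x0 = random.random()
    X.append([x0, random.random(), 0.5 * x0])
y = [2.0 * row[0] for row in X]

chosen, err = forward_select(X, y)
print("selected inputs:", chosen, "leave-one-out error:", err)
```

On this toy data the search keeps only the relevant input and rejects both the irrelevant and the redundant one. Greedy search does not guarantee the best subset in general; it is simply cheaper than trying every subset.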

Dimension Reduction
- Redundant inputs can also be combined into a smaller composite set
  - Improves accuracy
  - Reduces computation
- If done well, minimal information is lost
- Used for signal compression
- Principal component analysis is most common
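
As a rough illustration of principal component analysis, the sketch below finds the first principal component of a small data set by power iteration on the sample covariance matrix, then projects each point onto it to get one composite input. The function name, the iteration count, and the synthetic two-input data are assumptions made for this example; a numerical library would normally be used instead.

```python
import math
import random

def first_principal_component(X, iters=200):
    """Return the unit direction of largest variance and the projected scores."""
    n, p = len(X), len(X[0])
    mean = [sum(row[k] for row in X) / n for k in range(p)]
    Z = [[row[k] - mean[k] for k in range(p)] for row in X]  # centered data
    # Sample covariance matrix (p x p).
    C = [[sum(z[a] * z[b] for z in Z) / (n - 1) for b in range(p)]
         for a in range(p)]
    # Power iteration: repeated multiplication by C converges to the
    # dominant eigenvector, provided the start is not orthogonal to it.
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(z[k] * v[k] for k in range(p)) for z in Z]
    return v, scores

# Synthetic data (illustrative only): two nearly redundant inputs,
# the second roughly twice the first.
random.seed(1)
X = []
for _ in range(200):
    t = random.gauss(0.0, 1.0)
    X.append([t + random.gauss(0.0, 0.05), 2.0 * t + random.gauss(0.0, 0.05)])

direction, scores = first_principal_component(X)
print("first principal direction:", direction)
```

The two redundant inputs are replaced by the single column `scores`, which is the sense in which little information is lost when the discarded directions carry only small variance.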

Dimension Reduction Example 1

Dimension Reduction Example 2

Nonlinear Optimization
- Find the vector a such that E(a) is minimized
- Many algorithms have parameters that must be “fit” to the data
- Usually “fit” by minimizing an error measure
- Sometimes subject to a constraint G(a) = 0
- Unconstrained optimization is more common
- Very widely used
- Many engineering applications
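
The unconstrained case can be sketched with plain gradient descent: repeatedly step the parameter vector a against the gradient of E(a). The example below is a minimal sketch under stated assumptions. The error function E is a made-up quadratic bowl, the gradient is approximated by central differences, and the step size and iteration count were picked by hand for this particular E; none of this comes from the slides.

```python
def numerical_gradient(E, a, h=1e-6):
    """Central-difference approximation of the gradient of E at a."""
    grad = []
    for k in range(len(a)):
        up = list(a)
        up[k] += h
        down = list(a)
        down[k] -= h
        grad.append((E(up) - E(down)) / (2.0 * h))
    return grad

def gradient_descent(E, a0, step=0.04, iters=500):
    """Fixed-step gradient descent starting from a0."""
    a = list(a0)
    for _ in range(iters):
        g = numerical_gradient(E, a)
        a = [ak - step * gk for ak, gk in zip(a, g)]
    return a

def E(a):
    # Hypothetical error measure with its minimum at a = (1, -2).
    return (a[0] - 1.0) ** 2 + 10.0 * (a[1] + 2.0) ** 2

a_min = gradient_descent(E, [0.0, 0.0])
print("estimated minimizer:", a_min)
```

A fixed step works here only because the curvature of this E is known. Practical optimizers adapt the step size or use second-order information.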

Pattern Recognition
- Closely related to nonlinear modeling
- Goal is to identify most likely category given an input vector
- Equivalent to drawing decision boundaries
- Following example
  - Crab data
  - Four categories
  - Two composite inputs
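
To show how a classifier amounts to drawing decision boundaries, the sketch below uses a nearest-centroid rule: each category is summarized by the mean of its training points, and a new input is assigned to the category with the closest mean. The boundaries are then the perpendicular bisectors between centroids. This is one simple classifier chosen for illustration, not necessarily the one behind the crab example, and the tiny two-category data set is invented.

```python
def fit_centroids(X, labels):
    """Compute the mean input vector of each category."""
    sums, counts = {}, {}
    for x, c in zip(X, labels):
        s = sums.setdefault(c, [0.0] * len(x))
        for k, v in enumerate(x):
            s[k] += v
        counts[c] = counts.get(c, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def classify(x, centroids):
    """Assign x to the category whose centroid is nearest (squared Euclidean)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )

# Invented training data with two inputs and two categories.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [4, 4], [5, 5], [4, 5], [5, 4]]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

centroids = fit_centroids(X, labels)
print(classify([1, 2], centroids))  # prints "A"
print(classify([4, 3], centroids))  # prints "B"
```

Nearest-centroid picks the most likely category only under strong assumptions (equal priors and equal spherical spread in every category); it is a baseline, not a general solution.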

Crabs Data Set

Biomedical Application
- Goal: identify brain cell types from microrecordings
- Current research project
- 5 categories of cell types
- Created metrics to characterize signals
- Following scatterplot shows 2 of these metrics

Neurosurgery Example

Nonlinear Modeling
- Given many examples of observed variables, create a model that can predict the output
- No other assumed knowledge
- Observed variables
  - Quantitative
  - Measurable

Nonlinear Modeling
- Observed variables may not be causal
- Not all causal effects are observed
- Model will not be perfect
- How do you measure how good the model is?
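
One common answer to the last question is to measure error on data the model did not see during fitting. The sketch below estimates the mean squared prediction error of a straight-line model by 5-fold cross-validation. The model, the fold assignment, the function names, and the synthetic data are assumptions for illustration; the slides do not say which measure the course uses.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def cv_mse(xs, ys, k=5):
    """k-fold cross-validated mean squared error of the line model."""
    n = len(xs)
    sq_err = 0.0
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point is held out
        train = [i for i in range(n) if i not in test_idx]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        sq_err += sum((ys[i] - (slope * xs[i] + intercept)) ** 2
                      for i in test_idx)
    return sq_err / n

# Synthetic data (illustrative only): a line plus Gaussian noise.
random.seed(2)
xs = [i / 10.0 for i in range(30)]
ys = [3.0 * x + 1.0 + random.gauss(0.0, 0.1) for x in xs]

mse = cv_mse(xs, ys)
print("cross-validated MSE:", mse)
```

Because every point is predicted by a model that was fit without it, the estimate is not flattered by overfitting the way training error is.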

Smoothing
- For single-input single-output (SISO) systems, can plot the data
- Problem is to estimate a curve that most accurately predicts future points
- Could draw a smooth curve by hand
- More difficult to implement automatically
- More than one curve may be reasonable
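
A simple automatic smoother is a kernel-weighted average: the estimate at a point is the average of nearby outputs, with closer points weighted more heavily. The sketch below uses a Gaussian kernel and two bandwidths to show how different settings give different but arguably reasonable curves. The function name, the bandwidth values, and the noisy sine data are assumptions for this example.

```python
import math
import random

def kernel_smooth(xs, ys, x0, bandwidth):
    """Gaussian-kernel weighted average of ys around x0 (Nadaraya-Watson form).

    Assumes x0 lies close enough to the data that the weights do not all
    underflow to zero."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# Synthetic SISO data (illustrative only): a sine wave plus noise.
random.seed(3)
xs = [i / 20.0 for i in range(60)]
ys = [math.sin(2.0 * x) + random.gauss(0.0, 0.2) for x in xs]

# Two candidate curves: a narrow bandwidth follows the data closely,
# a wide bandwidth gives a smoother but more biased curve.
narrow = [kernel_smooth(xs, ys, x, 0.1) for x in xs]
wide = [kernel_smooth(xs, ys, x, 0.5) for x in xs]
print(narrow[:3], wide[:3])
```

Choosing the bandwidth is the automatic counterpart of deciding how smooth a hand-drawn curve should be; held-out prediction error is one way to pick it.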

Smoothing Example

Multiple “Reasonable” Solutions

Nonlinear Modeling
- Many methods do not work well
- Usually much more difficult
  - Noise
  - Multiple inputs
  - Time-varying system
  - Small data sets
- Still an active area of research
- Will discuss “tried and true” solutions

Overview of Course
- Introduction & review
- Linear models
- Univariate smoothing
- Optimization algorithms
- Nonlinear modeling
- Pattern recognition & classification

Application Areas
- Engineering
  - Controls (system identification)
  - Signal processing (estimation & prediction)
  - Communications (channel equalization)
- Statistics
- Mathematics
- Computer science
- Systems science

Application Examples
- Time series prediction
  - Aircraft carrier landing systems
- Spatial wafer patterns
- Fault detection
- Machinery health monitoring
- Automated, objective credit rating
- Fraud detection

Time Series Prediction

Spatial Wafer Patterns

Wafer Components

Estimation (Regression) Results

Fault Detection in Semiconductor Manufacturing

Aircraft Carrier Landing System
- Can be very hard
  - Limited visibility
  - Rough seas
  - Night
- Predict location at touchdown
  - Flight deck
  - Aircraft
- Is rocking of flight deck predictable?
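
Whether a motion is predictable can be probed by fitting a predictor to past samples and checking how well it forecasts the next one. The sketch below fits a two-term linear autoregressive model by least squares. It is a toy under loud assumptions: a pure sinusoid stands in for deck motion, which real sea states certainly are not, and the model order, names, and frequency are invented for illustration.

```python
import math

def fit_ar2(x):
    """Least-squares fit of x[t] ~ c1 * x[t-1] + c2 * x[t-2],
    solving the 2x2 normal equations by Cramer's rule."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(x)):
        p1, p2 = x[t - 1], x[t - 2]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        b1 += p1 * x[t]
        b2 += p2 * x[t]
    det = s11 * s22 - s12 * s12
    c1 = (b1 * s22 - s12 * b2) / det
    c2 = (s11 * b2 - s12 * b1) / det
    return c1, c2

# Hypothetical stand-in for a rocking motion: a noise-free sinusoid.
omega = 0.3
x = [math.sin(omega * t) for t in range(200)]

c1, c2 = fit_ar2(x)
prediction = c1 * x[-1] + c2 * x[-2]  # one-step-ahead forecast of x[200]
print("coefficients:", c1, c2, "forecast:", prediction)
```

A pure sinusoid satisfies x[t] = 2 cos(omega) x[t-1] - x[t-2] exactly, so the fit recovers those coefficients and the forecast is essentially perfect. With noise or a time-varying system the forecast error grows, which is what makes the real problem hard.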

Machinery Health Monitoring
- Cost of machinery failure can be very high
- Recent growth in real-time monitoring
  - Health and Usage Monitoring Systems (HUMS)
  - Condition Based Maintenance (CBM)
- Reduce costs
- Increase safety

Fraud Detection
- Credit card fraud cost $864 million in 1992
- How quickly can fraud be detected?
- The companies have amassed large databases
- What are the patterns of fraud?
- Active area of research

Past Projects
- Many past projects
  - See reports & slides on the web
- Many time series applications
  - Need not be time series related
- Many have resulted in conference and journal publications
- Expect improved quality this term

Project Ideas
- It is up to you to identify a project
- Preferred
  - Data readily available (no new instrumentation or study design)
  - Independent samples (not time series data)
  - Engineering related
  - High likelihood of success (no financial forecasting)

Course Logistics
- Project oriented
  - Project reports
  - Must meet IEEE journal requirements
  - May be encouraged to publish
  - Brief oral slide presentation at end of term
- Most projects are applied
- May also create new methods or compare existing methods

Prerequisites
- Helpful
  - Random processes (ECE 565)
  - Signal processing (ECE 566)
  - Proficient at MATLAB or similar
- Required
  - Calculus
  - Probability & statistics (STAT 451)
  - Linear algebra (MTH 343)
  - Proficiency at programming