Prof. Ramin Zabih (CS) Prof. Ashish Raj (Radiology) CS5540: Computational Techniques for Analyzing Clinical Data

Today’s topic
 How do we decide whether an ECG is shockable or not?
 – Key to project 1 (out next week)
 Simple versions first
 – There are whole courses on this area: machine learning
 – We’re going to do it correctly, but informally

Simple example
 Suppose you have a single number that summarizes an ECG
 – Examples: beats per minute; “normalcy”
 How can you make a decision on whether to shock or not? (A trivial rule is sketched below.)
 – Depending on the number, it might be impossible
 – More subtly, you’d like to figure this out…
 We’ll assume that the number has a reasonable amount of useful information
 – A real-valued “feature” of the ECG
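To make the single-feature decision concrete, here is a minimal sketch of the simplest possible rule: compare the feature to a fixed cutoff. The function name and the 0.5 cutoff are illustrative assumptions, not anything the slides specify; the later slides develop better-founded rules.

```python
def shock_decision(feature, cutoff=0.5):
    """Toy rule for one real-valued ECG feature: shock iff it exceeds a cutoff.

    The 0.5 cutoff is an illustrative assumption; choosing it well is
    exactly the problem the following slides address.
    """
    return 1 if feature > cutoff else 0  # 1 = shock, 0 = don't shock
```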

Problem setup
 We’ll give you some ECGs that are labeled as shockable or not (“training set”)
 – Assume 1 = shock, 0 = don’t shock
 – Presumably your shockable examples will tend to have a value around 1, and your non-shockable ones a value around 0
 Your job: get the right answer on a new set of ECGs (“test set”)
 Q1: how can we do this?
 Q2: how can we be confident we’re right?

Procedural approaches
 There are some very simple techniques for solving these problems, which even scale up to long feature vectors
 – I.e., several numbers per ECG
 Main example: nearest neighbor
 Algorithm: find the training point whose value is most similar to the input, and return its label (sketched below)
 – Variant: take a majority vote among the k nearest neighbors
 Advantages: simple, fast
 Disadvantages?
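A minimal sketch of the nearest-neighbor rule and its k-neighbor majority variant, assuming Euclidean distance over feature vectors; the names and toy data are illustrative, not from the course materials.

```python
import math
from collections import Counter

def knn_classify(train, query, k=1):
    """Label `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; k=1 gives plain
    nearest-neighbor. Euclidean distance is an assumption here.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    neighbors = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy usage with one feature per ECG: 1 = shock, 0 = don't shock
train = [((0.9,), 1), ((1.1,), 1), ((0.1,), 0), ((0.05,), 0)]
print(knn_classify(train, (0.8,), k=3))  # -> 1
```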

Validation
 How can we tell we have a good answer?
 – In the limit we can’t, since the training data might be very different from the testing data
 – In practice, the two are usually similar
 Too little training data is a problem
 Estimate confidence via leave-one-out cross-validation (sketched below)
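A sketch of leave-one-out cross-validation: hold out each labeled point in turn, classify it from the rest, and report the error rate. It is written to accept any classifier with the signature of the knn_classify sketch above; that interface is an assumption for illustration.

```python
def leave_one_out_error(train, classify, k=1):
    """Fraction of labeled points misclassified when each is held out
    and predicted from the remaining training data."""
    mistakes = 0
    for i, (features, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]  # everything except point i
        if classify(rest, features, k) != label:
            mistakes += 1
    return mistakes / len(train)

# e.g. leave_one_out_error(train, knn_classify, k=3) with the earlier sketch
```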

Sparsity
 Especially in high dimensions, we will never have very dense data
 – Suppose you have 100 numbers to summarize your ECG
 Why is this bad for k-NN classification? (See the experiment below.)
 Sometimes you have some idea about the overall shape of the solution
 – For example, shockable and non-shockable points should form blobs in “feature space”
 – This isn’t really what k-NN does
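A small illustrative experiment (not from the slides) of why sparsity hurts k-NN: for random points in the unit cube, as the dimension grows the nearest and farthest distances from a query become nearly equal, so the “nearest” neighbor carries less and less information.

```python
import math
import random

def distance_spread(dim, n=100):
    """(farthest - nearest) / nearest distance from the origin to n
    random points in [0,1]^dim; the ratio shrinks as dim grows."""
    points = [[random.random() for _ in range(dim)] for _ in range(n)]
    dists = [math.sqrt(sum(x * x for x in p)) for p in points]
    return (max(dists) - min(dists)) / min(dists)

for dim in (1, 10, 100):
    print(dim, round(distance_spread(dim), 3))  # spread collapses with dim
```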

Statistical classification
 Most work on such problems relies on statistical methods
 These tend to produce clean, general, and well-motivated solutions
 – Cost: intellectual and practical complexity
 Simplest example: suppose that for shockable ECGs we get a value around 1 and for non-shockable ones a value around 0
 – With an infinite amount of data we’d get Gaussians centered at 1 and 0

Examples
(figure slide; images not preserved in the transcript)

Estimate, then query
 Strategy: figure out the Gaussians from the data, then use them to decide (sketched below)
 Let’s look at the decision part first
 – Pretend we know the Gaussians
 – Then decide what to do
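A minimal sketch of “estimate, then query” under the one-Gaussian-per-class assumption from the previous slide: fit a mean and standard deviation to each class’s training values, then label a new value by whichever class density is higher. The helper names and toy data are illustrative.

```python
import math

def fit_gaussian(values):
    """Maximum-likelihood (mean, std) of a 1-D Gaussian from samples."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)

def gaussian_pdf(x, mean, std):
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

def classify(x, params0, params1):
    """1 = shock if the class-1 Gaussian explains x better, else 0."""
    return 1 if gaussian_pdf(x, *params1) > gaussian_pdf(x, *params0) else 0

# Toy usage: shockable values cluster near 1, non-shockable near 0
p0 = fit_gaussian([0.05, 0.1, -0.02, 0.2])   # don't-shock class
p1 = fit_gaussian([0.9, 1.1, 0.95, 1.05])    # shock class
print(classify(0.8, p0, p1))  # -> 1
```

Note that comparing raw densities implicitly assumes the two classes are equally likely a priori; weighting each density by a class prior would give the usual Bayes decision rule.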

Decisions