8/22/2019 Exercise 1 In the ISwR data set alkfos, do a PCA of the placebo and Tamoxifen groups separately, then together. Plot the first two principal.


Exercise 1
In the ISwR data set alkfos, do a PCA of the placebo and Tamoxifen groups separately, then together. Plot the first two principal components of the whole group, color-coded for the treatment and control subjects.
Conduct a linear discriminant analysis of the two groups using the 7 variables. How well can you predict the treatment? Is this the usual kind of analysis you would see?
Use logistic regression to predict the group based on the measurements. Compare the in-sample error rates.
Use cross-validation with repeated training subsamples of 38 of the 43 subjects and test sets of the remaining 5. What can you now conclude about the two methods?
May 26, 2015 SPH 247 Statistical Analysis of Laboratory Data
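
One possible way to set up Exercise 1 in R is sketched below. It is a starting point under stated assumptions, not the course's solution. The helper name holdout_error is hypothetical. The sketch assumes that rows with missing measurements are dropped with na.omit(), that the variables are standardized before PCA, and that grp is coded 1 = placebo, 2 = Tamoxifen as described on the ISwR help page. If dropping incomplete rows leaves fewer than 43 subjects, the hold-out split becomes 5 test subjects against however many remain.

```r
# Sketch for Exercise 1. Assumes MASS is installed; the alkfos part only
# runs if the ISwR package is available.
library(MASS)  # lda()

# Hypothetical helper: repeated hold-out error rates for LDA and logistic
# regression. `d` needs a two-level factor `grp` plus numeric predictors.
holdout_error <- function(d, n_test = 5, n_rep = 200) {
  lev <- levels(d$grp)
  err <- replicate(n_rep, {
    test_idx <- sample(nrow(d), n_test)   # random test set of size n_test
    train <- d[-test_idx, ]
    test  <- d[test_idx, ]
    lda_pred <- predict(lda(grp ~ ., data = train), test)$class
    # Small samples can separate perfectly; glm() then warns but still fits.
    glm_fit  <- suppressWarnings(glm(grp ~ ., data = train, family = binomial))
    p        <- suppressWarnings(predict(glm_fit, test, type = "response"))
    glm_pred <- factor(ifelse(p > 0.5, lev[2], lev[1]), levels = lev)
    c(lda = mean(lda_pred != test$grp), logistic = mean(glm_pred != test$grp))
  })
  rowMeans(err)  # average test error over all repetitions
}

if (requireNamespace("ISwR", quietly = TRUE)) {
  data(alkfos, package = "ISwR")
  alk <- na.omit(alkfos)  # assumption: keep complete cases only
  # Assumption: grp is coded 1 = placebo, 2 = Tamoxifen.
  alk$grp <- factor(alk$grp, levels = 1:2, labels = c("placebo", "Tamoxifen"))
  meas <- setdiff(names(alk), "grp")  # the measurement columns

  # PCA for each group separately, then for all subjects together.
  pca_placebo <- prcomp(alk[alk$grp == "placebo", meas], scale. = TRUE)
  pca_tamox   <- prcomp(alk[alk$grp == "Tamoxifen", meas], scale. = TRUE)
  pca_all     <- prcomp(alk[, meas], scale. = TRUE)
  plot(pca_all$x[, 1:2], col = as.integer(alk$grp), pch = 19)
  legend("topright", legend = levels(alk$grp), col = 1:2, pch = 19)

  # In-sample error rates for LDA and logistic regression.
  lda_fit <- lda(grp ~ ., data = alk)
  glm_fit <- suppressWarnings(glm(grp ~ ., data = alk, family = binomial))
  glm_cls <- ifelse(fitted(glm_fit) > 0.5, "Tamoxifen", "placebo")
  print(c(lda = mean(predict(lda_fit)$class != alk$grp),
          logistic = mean(glm_cls != alk$grp)))

  # Repeated hold-out estimate with test sets of 5 subjects.
  set.seed(1)
  print(holdout_error(alk))
}
```

Comparing the printed in-sample rates with the hold-out rates is the comparison the exercise asks for; the number of repetitions (200 here) is an arbitrary choice.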

Exercise 2
In the ISwR data set alkfos, cluster the data based on the 7 measurements using hclust(), kmeans(), and Mclust(). Compare the 2-group clustering with the placebo/Tamoxifen classification.
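
A minimal sketch of how the three clusterings in Exercise 2 might be compared with the treatment groups follows. The helper name compare_clusters is hypothetical, and several choices are assumptions rather than part of the exercise: complete cases only, standardized measurements, Euclidean distance with hclust()'s default linkage, and 25 random starts for kmeans(). The Mclust() step runs only if the mclust package is installed.

```r
# Sketch for Exercise 2. mclust is optional; it is attached with require()
# so that Mclust() can find its internal helper functions.
has_mclust <- suppressWarnings(require(mclust, quietly = TRUE))

# Hypothetical helper: cross-tabulate 2-group cluster labels against `grp`.
compare_clusters <- function(x, grp) {
  xs <- scale(x)  # assumption: standardize each measurement
  out <- list(
    hclust = cutree(hclust(dist(xs)), k = 2),
    kmeans = kmeans(xs, centers = 2, nstart = 25)$cluster
  )
  if (has_mclust) {
    out$Mclust <- Mclust(xs, G = 2)$classification
  }
  lapply(out, function(cl) table(cluster = cl, group = grp))
}

if (requireNamespace("ISwR", quietly = TRUE)) {
  data(alkfos, package = "ISwR")
  alk <- na.omit(alkfos)  # assumption: keep complete cases only
  set.seed(1)             # kmeans() uses random starting centers
  print(compare_clusters(alk[, names(alk) != "grp"], alk$grp))
}
```

Cluster labels are arbitrary, so agreement with the placebo/Tamoxifen groups shows up as counts concentrated on either diagonal of each table.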