Ungraded quiz Unit 5.

Presentation transcript:

Ungraded quiz Unit 5

Show me your fingers

Do not shout out the answer, or your classmates will follow what you said. Use your fingers:
One finger (the right finger) = A
Two fingers = B
Three fingers = C
Four fingers = D
No finger = I don’t know. I didn’t study.

When there are too many variables, what can be done?
A. Use principal component analysis to combine variables into a few composite scores
B. Use stepwise regression to toss out unwanted variables
C. Use predictor screening to toss out unwanted variables
D. All of the above
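
For readers unfamiliar with the first option, here is a minimal sketch of how principal component analysis combines several variables into a few composite scores. It is an illustration only, not the method used in the course: the simulated data, the helper name `pca_scores`, and the choice to standardize the variables first are all assumptions.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Return component scores and explained-variance ratios (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable so that scale differences do not dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    # Each score column is a weighted combination of the original variables.
    scores = Z @ Vt[:n_components].T
    return scores, explained[:n_components]


# Simulated data (an assumption for illustration): six observed variables
# driven by two hidden factors plus a little noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(300, 2))
loadings = np.repeat(np.eye(2), 3, axis=1)  # variables 1-3 follow factor 1, 4-6 follow factor 2
X = factors @ loadings + 0.3 * rng.normal(size=(300, 6))

scores, explained = pca_scores(X, n_components=2)
print(scores.shape)        # one row per observation, one column per composite score
print(explained.round(3))  # share of total variance captured by each component
```

In this simulated case two composite scores carry most of the variance of the six original variables, which is the sense in which PCA "combines" them.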

Coplot can be implemented in _____
A. JMP
B. Tableau
C. Both

Which of the following statements is true? In Tableau:
A. Dimensions are continuous variables and Measures are ordinal.
B. Dimensions are categorical and Measures are ordinal.
C. Dimensions are categorical and Measures are continuous.
D. Both Dimensions and Measures can accept any data type.

Dr. April Fu used a box plot and a 95% density ellipse to detect and remove outliers, and then he ran a multivariate analysis with 4 variables. Which of the following statements is true?
A. Using a box plot is inappropriate because this method is for detecting one-dimensional outliers.
B. Using a density ellipse is inappropriate because this method is for detecting bivariate outliers.
C. He should use Mahalanobis distance and/or normal contour ellipsoids.
D. All of the above
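
As background for the Mahalanobis distance mentioned in this question, the sketch below computes it for every row of a 4-variable data set. It is a hedged illustration, not the procedure used in JMP: the simulated data, the planted point, and the helper name `mahalanobis_distances` are assumptions made for the example.

```python
import numpy as np


def mahalanobis_distances(X):
    """Mahalanobis distance of each row from the sample mean (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    # Quadratic form diff_i^T * inv_cov * diff_i, evaluated for every row i.
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))


# Simulated data (an assumption): four variables that share one common factor,
# so they are strongly positively correlated.
rng = np.random.default_rng(0)
common = rng.normal(size=(200, 1))
data = common + 0.3 * rng.normal(size=(200, 4))

# Planted point: moderate on each variable taken alone, but it breaks the
# positive correlation pattern shared by the other rows.
data[0] = [2.0, -2.0, 2.0, -2.0]

z_scores = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
distances = mahalanobis_distances(data)
most_extreme = int(np.argmax(distances))

print(np.abs(z_scores[0]).max())              # per-variable view of the planted row
print(most_extreme, distances[most_extreme])  # multivariate view
```

The planted row is unremarkable when each variable is inspected separately, yet it has the largest Mahalanobis distance, because the distance takes the correlations among all four variables into account.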

Which of the following graphs can be used for clustering of a 3-dimensional data set?
A. Scatterplot matrix
B. Coplot
C. Ternary plot
D. All of the above

I want to include a graph in a poster presentation. The graph size will be 20 X 30. Which graphical format should I use?
A. JPEG
B. PNG
C. GIF
D. Vector-based image

I want to find out what is the optimal combination of various conditions to obtain the desirable outcome. Which visualization method should I use?
A. Linking and brushing
B. Color map
C. Prediction profiler
D. Mesh surface

I used Mathematica to create a movie showing 4 dimensions. When I played the movie, the change from one frame to the next was almost unnoticeable. What is happening?
A. You didn’t pay for the software license.
B. Your computer microprocessor is not powerful enough to display the animation.
C. There is no 4-way interaction.