CS 594: Empirical Methods in HCC Experimental Research in HCI (Part 2)

CS 594: Empirical Methods in HCC
Experimental Research in HCI (Part 2)
Dr. Debaleena Chattopadhyay
Department of Computer Science
debchatt@uic.edu
debaleena.com
hci.cs.uic.edu

Agenda
- Discuss course project details
- Revisiting Parametric Statistics
- Non-Parametric Statistics
- Categorical Data

Course Project Details
- Part 1 (15%): Research proposal. Research design and conceptualization of a chosen research topic. Due 9/26.
- Part 2 (25%): Data analysis. Results and Discussion. Due mid term and finals.

Course Project Details (cont.)
- You may use the same research topic for Part 1 and Part 2: conceptualize, collect, and analyze data.
- You may use different topics for Part 1 and Part 2. For example, you may use data that you collected before but have not analyzed (you must get instructor approval beforehand).

Course Project Details (cont.)
Part 1
- Scope of research
- Conceptualization: research questions
- Operationalization
- Metrics: what would you measure? Validity and reliability?
- Hypotheses
- How would you collect data?
- How would you analyze data?
- Why is this methodology suitable? Explain how the data collected and the anticipated results will help you answer the research questions.

Course project; Part 1
- Research scope must not be trivial
- NO simple usability tests
Your proposal will be evaluated on the following:
- Correctness of operationalization and RQs
- Quality of metrics
- Quality of data collection plan
- Correctness of rationale for the chosen empirical method
- Degree of difficulty of the research proposal (10%)
Proposals will be evaluated in two dimensions: degree of difficulty and execution appraisal.

Revisiting Parametric Statistics

Q1 What does a significant test statistic tell us?
a) There is an important effect.
b) The null hypothesis is false.
c) There is an effect in the population of sufficient magnitude to warrant interpretation.
d) All of the above.

Q2 A Type II error is when:
a) We conclude that there is an effect in the population when in fact there is not.
b) We conclude that there is not an effect in the population when in fact there is.
c) We conclude that the test statistic is significant when in fact it is not.
d) The data we have entered in R are different from the data collected.

Q3 Which of these statements about statistical power is not true?
a) Power is the ability of a test to detect an effect, given that an effect of a certain size exists in the population.
b) We can use power to determine how big a sample is required to detect an effect of a certain size.
c) Power is linked to the probability of making a Type II error.
d) All of the above are true.

Q4 Which of the following are assumptions underlying the use of parametric tests (based on the normal distribution)?
a) Some feature of the data should be normally distributed.
b) The samples being tested should have approximately equal variances.
c) Your data should be at least interval level.
d) All of the above.

Q5 The Shapiro-Wilk test can be used to test:
a) Whether data are normally distributed.
b) Whether group variances are equal.
c) Whether scores are measured at the interval level.
d) Whether group means differ.

Q6 The correlation between two variables A and B is .12 with a significance of p < .01. What can we conclude?
a) That there is a substantial relationship between A and B.
b) That there is a small relationship between A and B.
c) That variable A causes variable B.
d) All of the above.

Normality

Homogeneity of variance

T-test (independent)

T-test (dependent)
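
The four slides above survive only as titles, so their original output is not recoverable. As a rough sketch of what such checks can look like in base R, the block below runs a Shapiro-Wilk test, a variance comparison, and both kinds of t-test on simulated scores. The data, the variable names, and the choice of `var.test` for homogeneity of variance are assumptions for illustration, not the course's own example.

```r
# Simulated scores for two hypothetical groups (assumption: not course data)
set.seed(594)
groupA <- rnorm(20, mean = 50, sd = 10)
groupB <- rnorm(20, mean = 55, sd = 10)

# Normality: Shapiro-Wilk test on each group
normA <- shapiro.test(groupA)
normB <- shapiro.test(groupB)

# Homogeneity of variance: F test comparing the two group variances
varCheck <- var.test(groupA, groupB)

# Independent t-test (R applies Welch's correction by default)
indModel <- t.test(groupA, groupB, paired = FALSE)

# Dependent t-test, treating the two vectors as paired scores
depModel <- t.test(groupA, groupB, paired = TRUE)

print(c(shapiroA = normA$p.value, shapiroB = normB$p.value))
print(varCheck$p.value)
print(indModel)
print(depModel)
```

The F test is itself sensitive to non-normality; Levene's test (for example `leveneTest` in the `car` package) is a common alternative if that package is available.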

Non-Parametric Statistics

When to use non-parametric tests?
- Data are not normally distributed.
- Data are not measured at interval level.
Non-parametric tests are sometimes referred to as distribution-free tests, on the grounds that they make no assumptions about the distribution of the data. Technically, this isn’t true: they do make distributional assumptions (e.g., the tests covered here all assume a continuous distribution), but these are less restrictive than those of their parametric counterparts.

Common non-parametric tests in use
- Wilcoxon rank-sum test / Mann–Whitney test (similar to independent t-test)
- Wilcoxon signed-rank test (similar to dependent t-test)
- Friedman’s test (similar to repeated-measures ANOVA)
- Kruskal–Wallis test (similar to one-way ANOVA)

Comparing two independent conditions: the Wilcoxon rank-sum test
When you want to test differences between two conditions and different participants have been used in each condition, you have two choices:
- Wilcoxon rank-sum test
- Mann–Whitney test

Wilcoxon rank-sum test
If you have the data for different groups stored in a single column:
    newModel <- wilcox.test(outcome ~ predictor, data = dataFrame, paired = FALSE)
If you have the data for different groups stored in two columns:
    newModel <- wilcox.test(scoresGroup1, scoresGroup2, paired = FALSE)
- outcome is a variable that contains the scores for the outcome measure (in this case sundayBDI or wedsBDI).
- predictor is a variable that tells us to which group a score belongs (in this case drug).
- scoresGroup1 is a variable that contains the scores for the first group.
- scoresGroup2 is a variable that contains the scores for the second group.
- paired = FALSE requests the rank-sum test for independent groups; with two score vectors, paired = TRUE requests the Wilcoxon signed-rank test instead.

Example output For example, a neurologist might collect data to investigate the depressant effects of certain recreational drugs. She tested 20 clubbers in all: 10 were given an ecstasy tablet to take on a Saturday night and 10 were allowed to drink only alcohol. Levels of depression were measured using the Beck Depression Inventory (BDI) the day after and midweek.
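
The example output itself did not survive in this transcript. A minimal sketch of how the analysis described above could be run is given below; the BDI values are made up for illustration (they are not the study's data), and the data frame name `drugData` is an assumption.

```r
# Hypothetical BDI scores for 20 clubbers (made-up values, not the study's data)
drugData <- data.frame(
  drug = factor(rep(c("Ecstasy", "Alcohol"), each = 10)),
  sundayBDI = c(16, 30, 18, 20, 21, 19, 28, 17, 15, 22,
                14, 13, 19, 16, 15, 12, 17, 18, 16, 15),
  wedsBDI = c(29, 34, 33, 25, 38, 31, 28, 30, 35, 36,
              6, 7, 26, 9, 10, 8, 5, 16, 4, 11)
)

# One rank-sum test per measurement day. exact = FALSE uses the normal
# approximation, which avoids warnings when some scores are tied.
sunModel <- wilcox.test(sundayBDI ~ drug, data = drugData, exact = FALSE)
wedModel <- wilcox.test(wedsBDI ~ drug, data = drugData, exact = FALSE)

print(sunModel)
print(wedModel)
```

R reports the test statistic as `W`, and each returned object also carries the p-value in `$p.value`.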

Wilcoxon signed-rank test Used in situations in which there are two sets of scores to compare, but these scores come from the same participants. As such, think of it as the nonparametric equivalent of the dependent t-test.
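
As a hedged illustration, the sketch below compares two sets of scores from the same ten hypothetical participants. The values and variable names are invented for the example.

```r
# Hypothetical BDI scores from the same 10 participants on two days
sundayBDI <- c(16, 30, 18, 20, 21, 19, 28, 17, 15, 22)
wedsBDI   <- c(29, 34, 33, 25, 38, 31, 27, 30, 35, 36)

# paired = TRUE requests the signed-rank test on the within-person differences;
# exact = FALSE avoids warnings caused by tied differences
signedModel <- wilcox.test(sundayBDI, wedsBDI, paired = TRUE, exact = FALSE)

print(signedModel)
```

For the paired test R labels the statistic `V`: the sum of the ranks of the positive differences.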

Kruskal–Wallis test
The one-way independent ANOVA has a non-parametric counterpart called the Kruskal–Wallis test. When the data are collected using different participants in each group, we input the data using a coding variable, so the data editor will have two columns of data. The first column is a factor (the coding variable that identifies each participant's group); the second holds the scores. The one-way ANOVA is also called a single-factor analysis of variance because there is only one independent variable, or factor.
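
A minimal sketch of the two-column layout described above, passed to base R's `kruskal.test`, follows. The group labels and scores are hypothetical.

```r
# Two columns: a coding variable (factor) and the outcome scores.
# All values are invented for illustration.
groupData <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 6)),
  score = c(12, 15, 11, 18, 14, 13,
            22, 19, 25, 21, 24, 20,
            30, 28, 35, 27, 33, 31)
)

# Kruskal-Wallis test of score across the three independent groups
kwModel <- kruskal.test(score ~ group, data = groupData)

print(kwModel)
```

The degrees of freedom reported are the number of groups minus one, here 2.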

Kruskal–Wallis test (example output)

Differences between several related groups: Friedman’s ANOVA
Used for testing differences between conditions when there are more than two conditions and the same participants have been used in all conditions. If you have violated some assumption of parametric tests, this test can be a useful way around the problem.
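
As a sketch under stated assumptions, base R's `friedman.test` accepts a matrix with one row per participant and one column per condition. The scores and condition names below are hypothetical.

```r
# Rows are participants, columns are the repeated conditions.
# All values are invented for illustration.
scores <- matrix(
  c(64, 60, 58,
    70, 66, 61,
    58, 59, 55,
    75, 70, 68,
    62, 63, 57,
    68, 64, 60,
    71, 69, 66,
    66, 61, 59),
  nrow = 8, byrow = TRUE,
  dimnames = list(NULL, c("start", "month1", "month2"))
)

# Friedman's ANOVA across the three related conditions
friedModel <- friedman.test(scores)

print(friedModel)
```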

Friedman’s ANOVA (example)

Categorical Data

Chi-square test; contingency table
There is one problem with the chi-square test, which is that the sampling distribution of the test statistic has only an approximate chi-square distribution. The larger the sample is, the better this approximation becomes, and in large samples the approximation is good enough to not worry about the fact that it is an approximation. However, in small samples the approximation is not good enough, making significance tests of the chi-square distribution inaccurate.

This is why you often read that to use the chi-square test the expected frequencies in each cell must be greater than 5 (see section 18.5). When the expected frequencies are greater than 5, the sampling distribution is probably close enough to a perfect chi-square distribution for us not to worry. However, when the expected frequencies are too low, it probably means that the sample size is too small and that the sampling distribution of the test statistic is too deviant from a chi-square distribution to be of any use.

Fisher came up with a method for computing the exact probability of the chi-square statistic that is accurate when sample sizes are small. This method is called Fisher’s exact test.

The test compares the observed frequencies with ‘expected frequencies’. One simple way to estimate the expected frequencies would be to say ‘well, we’ve got 200 cats in total, and four categories, so the expected value is simply 200/4 = 50’.
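
Dividing 200 by 4 ignores the row and column totals. The usual expected frequency for a cell in a contingency table is (row total × column total) / n. The sketch below computes that by hand, compares it with what `chisq.test` reports, and runs Fisher's exact test on the same table. The counts are hypothetical: they were chosen to sum to 200 and are not the original data.

```r
# Hypothetical 2 x 2 table of counts (sums to 200; values are invented)
catTable <- matrix(
  c(30, 10,
    50, 110),
  nrow = 2, byrow = TRUE,
  dimnames = list(training = c("Food", "Affection"),
                  dance = c("Yes", "No"))
)

# Expected frequency for each cell: row total * column total / n
expectedManual <- outer(rowSums(catTable), colSums(catTable)) / sum(catTable)

# Pearson chi-square test without the continuity correction
chiModel <- chisq.test(catTable, correct = FALSE)

# Fisher's exact test, useful when expected frequencies are small
fisherModel <- fisher.test(catTable)

print(expectedManual)
print(chiModel$expected)
print(chiModel)
print(fisherModel$p.value)
```

Checking `chiModel$expected` before interpreting the result is a quick way to see whether any cell falls below the rule-of-thumb threshold of 5.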

Upcoming:
- Proposal due Sep 26, 11:59pm
- Start working on your annotated bibliography
- Post your slides on Piazza after class presentations