Ungraded quiz Unit 1
Show me your fingers
Do not shout out the answer, or your classmates will follow what you said. Use your fingers:
One finger (the right finger) = A
Two fingers = B
Three fingers = C
Four fingers = D
No finger = I don’t know. I didn’t study
What is the logic of conventional hypothesis testing?
A. Given the data, what is the best hypothesis or hypotheses that can explain the data or the phenomenon?
B. Given the hypothesis, what is the probability of observing the data at hand without invoking a sampling distribution?
C. Given the hypothesis, what is the probability of observing the data in the long run, as expressed in terms of a sampling distribution?
D. Given the data, what is the pattern and the magnitude of the effect?
What is the fallacy of “if A then B; B is observed, A is proven”?
A. Empirical adequacy
B. Affirming the consequent
C. Absence of evidence is equated with evidence of absence
D. Confirmation bias
The current form of hypothesis testing is a fusion of the research traditions introduced by:
A. R. A. Fisher and Egon Pearson
B. R. A. Fisher, Jerzy Neyman, and Egon Pearson
C. R. A. Fisher and Jerzy Neyman
D. R. A. Fisher, Jerzy Neyman, and Karl Pearson
Which of the following is considered a shortcoming of conventional hypothesis testing?
A. It is easy to reject the null hypothesis
B. The alpha level is arbitrary
C. Results lack reproducibility
D. All of the above
Which of the following is not an advantage of data mining?
A. Most data mining methods are non-parametric
B. It is data-driven and exploratory, and thus it reduces confirmation bias
C. It uses resampling-based ensemble methods rather than counting on one single analysis
D. Unlike the alpha level, the absolute cut-off of data mining methods (e.g. AIC, BIC) is more objective
What are the shortcomings of conventional data?
A. Sampling bias (e.g. WEIRD)
B. In experimental settings, participants might behave according to social desirability (e.g. the dictator game)
C. In surveys, respondents might not be able to recall the exact information
D. All of the above
Which of the following statements is untrue?
A. By utilizing Google search-term data, the author of “Everybody Lies” found that many Americans lied about their criteria for voting
B. By utilizing Google search-term data, Google researchers developed a highly accurate model for predicting the outbreak of influenza
C. Two researchers claimed that a personality model based on Facebook data can accurately predict what people like
D. The FTC is investigating Facebook because Facebook handed over data to Cambridge Analytica without notifying Facebook users
Which of the following is not a component of data science?
A. Data architecture
B. Data acquisition
C. Data analysis
D. Data ethics
Which of the following is not a characteristic of big data?
A. High volume
B. High velocity
C. High variety
D. High vectority
Which of the following about data mining is untrue?
A. Data mining includes mining structured numeric data and unstructured textual data
B. Data mining utilizes machine learning
C. Data mining necessitates pattern seeking
D. Data mining is so named because the insight is buried and thus it takes filtering and extraction to learn about the data