Archival research: Ungraded review questions

1 Archival research: Ungraded review questions

2 Can you explain your answer?
Use your fingers to indicate your answer: 1=A, 2=B, 3=C, 4=D. For “check all that apply,” use both hands. After viewing the question, show me your answer within 15 seconds. Then turn to your neighbor; you have one minute to convince him or her that you are right.

3 Every year SAS Institute, the world’s largest software company in data analytics, holds a student competition at the SAS Global Forum. Contestants may use only public archival data, not their own data sets. Why?
A. SAS does not believe that students are capable of collecting accurate data.
B. SAS makes the competition harder in order to limit the number of submissions.
C. The competition aims to test the ability to analyze big data, and big data are usually archival data.

4 In the SAS Global Forum student paper, some required components are not found in a typical academic paper. What are they?
A. Data analysis and generalization
B. Data source and research problem
C. Data cleaning and visualization
D. Programming syntax and suggestions for future study

5 Peter, Paul, and Mary downloaded the test scores of the Programme for International Student Assessment (PISA) and compared the test performance of US and UK students. This is an example of:
A. Archival research
B. Meta-analysis
C. Survey research
D. Quasi-experiment

6 Pam Anne downloaded thirty research articles about Cognitive-Behavioral Therapy (CBT) from the library. She synthesized the findings of these studies in order to obtain a global view of the effectiveness of CBT. This is an example of:
A. Archival research
B. Meta-analysis
C. Observational study
D. Literature review

7 Which of the following is NOT an advantage of archival research?
A. It saves time and money in data collection.
B. You don’t need IRB approval.
C. The data are accurate and data cleaning is not needed.
D. It provides a basis for comparing the local sample against a national or even worldwide sample.

8 Which of the following is a shortcoming of archival research?
A. Data are inconsistent if different sources are used (e.g. different organizations might define wellbeing differently).
B. The sample size is extremely large, and most statistical software packages, such as JMP, cannot handle too many observations.
C. The data are collected at different times (e.g. WVS has six waves and PIAAC has two rounds), and comparison across time is difficult.

9 Which of the following graphing methods can be used to study a trend-based data set?
A. Bubble plot
B. GIS map
C. Histogram
D. Boxplot

10 Which of the following is NOT an archival data set for studying education and skill levels?
A. TIMSS
B. PISA
C. PIAAC
D. HPI

11 Which of the following is NOT an archival data set for studying wellbeing?
A. United Nations Human Development Programme
B. Gallup Global Wellbeing
C. Happy Planet Index
D. Pew Research

12 Which of the following is NOT an archival data set for studying opinions and values?
A. EVS
B. WVS
C. NORC
D. CCMH

13 Which of the following is NOT assessed by PISA?
A. Math
B. Science
C. Reading
D. Technology-based problem-solving

14 Which of the following is NOT assessed by PIAAC?
A. Numeracy
B. Technological proficiency
C. Ethical values
D. Literacy

15 When the sample size is very large (counted in the thousands), how can we compare groups?
A. T-test
B. ANOVA
C. Report the confidence intervals
D. All of the above

16 Why shouldn’t we use regression analysis when there are too many predictors (e.g. 10 or more)?
A. The statistical power is too high, and a very trivial effect might be misidentified as significant.
B. Multicollinearity: predictors are strongly correlated, and the result may be inaccurate.
C. The least-squares criterion was discovered in 1805/1809, and today we have better algorithms.
D. B and C

17 Why shouldn’t we use regression analysis when there are too many subjects (e.g. counted in the thousands)?
A. The statistical power is too high, and a very trivial effect might be misidentified as significant.
B. Multicollinearity: predictors are strongly correlated.
C. The least-squares criterion was discovered in 1805/1809, and today we have better algorithms.
D. A and C
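
The link between sample size and statistical significance can be checked with a little arithmetic. The following is a minimal sketch, not part of the original slides: it assumes a two-sided two-sample z-test with equal group sizes and a known common standard deviation, and the function name and example numbers are hypothetical.

```python
import math


def two_sample_z_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means.

    Assumes equal group sizes and a known, common standard deviation,
    so the normal (z) approximation applies.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # P(|Z| > |z|) for a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2.0))


# The same tiny effect (0.02 standard deviations) at two sample sizes.
p_small_sample = two_sample_z_p_value(mean_diff=0.02, sd=1.0, n_per_group=100)
p_large_sample = two_sample_z_p_value(mean_diff=0.02, sd=1.0, n_per_group=100_000)
```

Under these assumptions the first p-value is far above 0.05 and the second is far below it, although the effect size is identical in both cases.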

18 Which of the following is/are an ensemble method(s)?
A. Dragging
B. Bagging
C. Boosting (bootstrap forest)
D. B and C

19 In bagging the big sample is partitioned into subsets in a systematic way.
True False
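
As a rough illustration of how bagging (bootstrap aggregating) forms its subsets, the sketch below draws random resamples with replacement and averages one estimate per resample. It is not taken from the slides: the function names and the data are hypothetical, and a plain mean stands in for the model that would normally be fitted to each resample.

```python
import random


def bootstrap_samples(data, n_samples, seed=0):
    """Draw n_samples resamples, each the same size as data, with replacement."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return [[rng.choice(data) for _ in data] for _ in range(n_samples)]


def bagged_mean(data, n_samples=200, seed=0):
    """Aggregate one estimate per resample (here simply the mean)."""
    estimates = [sum(s) / len(s) for s in bootstrap_samples(data, n_samples, seed)]
    return sum(estimates) / len(estimates)


scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up data, for illustration only
resamples = bootstrap_samples(scores, n_samples=3)
estimate = bagged_mean(scores)
```

Because each draw is random and made with replacement, a resample can contain some observations several times and omit others entirely.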

20 In boosting, the observations that were misclassified (not predicted correctly) by the previous model are selected again for the next model.
True False
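
One common way boosting directs attention to hard cases is by reweighting observations between rounds. The sketch below assumes an AdaBoost-style update, which the slides do not name; the function `reweight` and the example weights are hypothetical.

```python
import math


def reweight(weights, misclassified):
    """One AdaBoost-style update: raise weights of misclassified observations.

    weights       -- current observation weights, summing to 1
    misclassified -- booleans, True where the previous model was wrong
    """
    error = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    if not 0.0 < error < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    alpha = 0.5 * math.log((1.0 - error) / error)
    updated = [
        w * math.exp(alpha if wrong else -alpha)
        for w, wrong in zip(weights, misclassified)
    ]
    total = sum(updated)
    return [w / total for w in updated]  # renormalize so the weights sum to 1


# Four equally weighted observations; the previous model got only the first wrong.
new_weights = reweight([0.25, 0.25, 0.25, 0.25], [True, False, False, False])
```

With this update the misclassified observation ends up carrying half of the total weight, so the next model is far more likely to sample it or be penalized for missing it.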

21 In the ensemble method, the algorithm can rank the predictors by importance through vote counting, but regression analysis cannot.
True False

22 What is/are the advantage(s) of the ensemble method?
A. The traditional approach runs only a single analysis, but the ensemble method replicates the same study and verifies the result through repeated analyses.
B. The ensemble method can be used with both small and big samples.
C. In classical parametric procedures the data structure must meet certain assumptions (e.g. normality), but the ensemble method does not require this.
D. All of the above
