Course Review: questions will not all be on one topic, i.e. a question may have parts covering more than one area.


1 Course Review
Questions will not all be on one topic, i.e. a question may have parts covering more than one area.

2 Overall
You could be given output from R and asked to interpret it; you should be able to spot which type of output it is. You could be given a set of data: do you use a t-test or an ANOVA, and why? Pick the test and explain why you chose it. Give your own examples if possible; otherwise the examples in the notes are fine, provided you can understand and explain them. There will be no manual calculation questions in the exam. It does say on the paper that you are allowed to use a calculator; extra marks will be given to those who actually manage to incorporate the use of a calculator into the writing of their exam script.

3 Thoughts when Analysing Data
What tests would you use? What conclusions do you draw? The explanation is what matters most. How would you write this up in a paper, etc.?

4 Fisher F-test
Equality of variance: why look for it? Assumptions behind the test. Two cases to consider: equal variance and unequal variance. When would you test for equality of variance? Give an example.
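As a rough illustration of what the F-test compares, the sketch below computes the ratio of two sample variances and its degrees of freedom. It uses only the Python standard library; the data are made up, the helper name `variance_ratio` is hypothetical, and putting the larger variance on top is just one common convention. In R the test itself is run with `var.test`.

```python
from statistics import variance

def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator), larger sample variance on top."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Made-up measurements: group_b is visibly more spread out than group_a.
group_a = [4.1, 5.0, 5.9, 5.2, 4.8]
group_b = [2.0, 8.1, 5.5, 9.7, 0.6]

f_stat, df1, df2 = variance_ratio(group_a, group_b)
print(f"F = {f_stat:.2f} on {df1} and {df2} degrees of freedom")
```

An F ratio close to 1 is consistent with equal variances; a large ratio suggests using a test that does not assume equal variance.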

5 Multiple Linear Regression
What do you understand about it? What does it do? Give a real-world example. How would you explain it to a customer? Explain what a model is, how one is built, etc. You will not be asked to build a model.

6 Multicollinearity
Arises in multiple linear regression. Looking for correlations between the predictor variables. We do not want high correlations between the x values, because they interfere with prediction and with building a model. With perfect correlation, statistics packages may not be able to build a suitable model.
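A minimal sketch of checking two predictors for collinearity, assuming made-up data and hypothetical helper names. With exactly two predictors the variance inflation factor reduces to 1 / (1 - r^2), where r is their Pearson correlation; with more predictors it needs a regression of each predictor on the others, which this sketch does not do.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Variance inflation factor for a model with exactly two predictors."""
    denom = 1.0 - pearson_r(x1, x2) ** 2
    # Perfect correlation leaves nothing to estimate: report infinity.
    return float("inf") if denom <= 1e-12 else 1.0 / denom

# Made-up example: the same heights recorded in cm and (rounded) inches.
height_cm = [150.0, 160.0, 165.0, 172.0, 180.0, 191.0]
height_in = [59.0, 63.0, 65.0, 68.0, 71.0, 75.0]

print(round(pearson_r(height_cm, height_in), 4))
print(round(vif_two_predictors(height_cm, height_in), 1))
```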

7 ANOVA
How do you formulate the null and alternative hypotheses? The null hypothesis is that all group means are equal. The alternative is not that all means are different; it is that at least two are. Which groups differ? Post hoc testing: Tukey HSD test (TukeyHSD in R). Look at the p-values. Why is this test done? Because ANOVA does not tell you which groups differ.
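To make the hypotheses concrete, here is a sketch of the one-way ANOVA F statistic computed by hand: between-group variability divided by within-group variability. The data and the function name `one_way_anova_f` are made up for illustration; turning F into a p-value needs the F distribution, which the Python standard library does not provide, so that step is left out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k

# Made-up scores for three groups; the third group sits clearly higher.
f_stat, df_between, df_within = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_between, df_within)
```

A large F says that at least two means differ, but not which ones; that is the job of the post hoc test.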

8 2-Way ANOVA
Extension of 1-way ANOVA: a second factor is added, along with an interaction effect. How do you interpret it? Draw examples using plots: crossing lines suggest an interaction; if the lines are parallel there is no interaction. Explore and explain interactions and interaction effects, and focus on the interactions if they are significant. Why do a 2-way ANOVA?

9 Anscombe's quartet
No specific questions, but there may be a sub-question. E.g. if the correlation is high, what might a plot of the data still reveal?
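The point of the quartet can be sketched with two of its four datasets: they share the same x values and almost exactly the same correlation, yet one is roughly linear and the other is a smooth curve. The values below are the published Anscombe data for sets I and II; the helper `pearson_r` is a hypothetical name.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

r1, r2 = pearson_r(x, y1), pearson_r(x, y2)
print(round(r1, 3), round(r2, 3))  # nearly identical, despite different shapes
```

Only a plot shows that a straight line is a reasonable summary for the first set and a poor one for the second.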

10 Quantile-Quantile (QQ)
Shown a plot, interpret it. What is a QQ plot and what is it used for? Testing for normality of a distribution. Shapiro-Wilk test: what is the null hypothesis? It is different from most null hypotheses we have looked at, because here the null is that the data are normally distributed, so we are looking for a high p-value as opposed to a low p-value, etc.
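What a QQ plot actually draws can be sketched as pairs of (theoretical normal quantile, ordered sample value). The plotting position (i + 0.5) / n used here is one common choice among several, and `qq_points` is a hypothetical helper; in R the plot is usually produced with `qqnorm` and `qqline`.

```python
from statistics import NormalDist

def qq_points(sample):
    """Pair each ordered observation with a standard normal quantile."""
    ordered = sorted(sample)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Made-up sample; if it were normal the points would fall near a straight line.
for q, value in qq_points([5.1, 4.8, 6.3, 5.0, 5.6, 4.4, 5.9]):
    print(f"{q:6.2f}  {value:4.1f}")
```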

11 Non-Parametric
Name non-parametric alternatives to the usual parametric tests, e.g. ANOVA vs Kruskal-Wallis. What are the alternatives used for? E.g. the Mann-Whitney test: what are its advantages and disadvantages? Assumption of normal distribution. Assumption that population variances are equal. Heterogeneity and homogeneity of variance.

12 Factor Analysis & Principal Component Analysis
What is a latent (hidden) variable? What is a component and what is a factor? Theories behind factor analysis. Explain factor analysis: which questions relate to each other, etc.? What is a scree plot (number of components or factors)? Eigenvalues. Principal component analysis with regard to dimension reduction. Write out the stages of exploratory factor analysis.

13 Sampling Distribution
The sampling distribution represents what we expect to see if the null hypothesis is true. Calculating statistics from samples. Sampling error.

14 Confidence Interval
What is a confidence interval? Explain why the interval (low to high) gets wider as the confidence level increases.
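The widening can be seen directly in a small sketch. It uses a normal approximation for the interval around a sample mean; for small samples a t quantile would normally replace the z value. The data and the function name `mean_confidence_interval` are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence):
    """Normal-approximation confidence interval for the mean of a sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for 95%
    half_width = z * stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - half_width, centre + half_width

sample = [12.1, 9.8, 11.4, 10.6, 13.0, 10.2, 11.9, 9.5, 12.4, 11.1]

for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(sample, level)
    print(f"{level:.0%}: {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

To be more confident that the interval captures the true mean, it has to cover more ground, so a higher confidence level means a larger multiplier and a wider interval.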

15 Effect Size
The difference between the means of two groups divided by a standard deviation: Cohen's d. Why is this important? What influence does sample size have? What influence does effect size have?
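A minimal sketch of Cohen's d, assuming the pooled standard deviation of the two groups is used as the denominator (other choices of standard deviation exist). The function name and the numbers are illustrative.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Made-up scores: the means differ by 1 and both groups have a standard deviation of 2.
print(cohens_d([2, 4, 6], [1, 3, 5]))
```

Unlike a p-value, d does not shrink or grow just because more data were collected, which is why it is reported alongside significance.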

16 Power
What is it? We usually assume that the null hypothesis is true. What if the alternative hypothesis is true, and what does this mean for the likelihood of finding something significant? Before carrying out statistical work, given the effect size and sample size, what is the power? You may be unable to find what you want or anything significant; if the potential for success is low, you may need to change the sample size, etc.
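How power depends on effect size and sample size can be sketched with a normal approximation for a two-sided, two-sample comparison with equal group sizes. This is an approximation, not the exact t-based calculation (R's `power.t.test` does that), and `approx_power` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test (normal approximation)."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic if the alternative is true.
    shift = effect_size * sqrt(n_per_group / 2)
    return standard_normal.cdf(shift - z_crit) + standard_normal.cdf(-shift - z_crit)

for n in (10, 20, 64, 100):
    print(n, round(approx_power(0.5, n), 2))
```

With a medium effect (d = 0.5), small groups leave a low chance of detecting anything, which is the case for increasing the sample size before starting.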

17 Machine Learning
What is machine learning? For each algorithm, know its definition and an example of where to use it, with sample R for more marks: linear regression, logistic regression, k-means clustering, k-nearest neighbours (KNN) classification, naive Bayes classification, decision trees, support vector machine (SVM), artificial neural network (ANN), Apriori, AdaBoost.

18 Time Series
Reference: "A little book of time series in R". Trend component. Seasonal component. Irregular component. Additive model. Normal distribution. Box test.
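The additive model (series = trend + seasonal + irregular) can be sketched by building a synthetic series and recovering its trend with a centred moving average. Everything here is made up for illustration: the series, the quarterly period of 4, and the function name. The irregular component is set to zero so the recovery is exact; in R, `decompose` performs this kind of decomposition.

```python
def centred_moving_average(series, period):
    """Centred moving average for an even period; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        # Half weight on the two end points so each season counts exactly once.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend

seasonal = [3.0, -1.0, -4.0, 2.0]                 # quarterly effects, summing to zero
true_trend = [10.0 + 0.5 * t for t in range(16)]  # steady upward trend
series = [true_trend[t] + seasonal[t % 4] for t in range(16)]

estimated_trend = centred_moving_average(series, period=4)
print(estimated_trend[2:6], true_trend[2:6])
```

Subtracting the estimated trend from the series leaves the seasonal and irregular parts, which is how the remaining components are separated in the additive model.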

