Genome Sciences 373 Genome Informatics Quiz Section 5 April 28, 2015.



Bonferroni corrections
The motivation is to minimize the probability of even a single false-positive test (Type I error).
We often define alpha = 0.05, i.e. a 5% chance that we reject the null hypothesis even when it is true.
The more tests we do, the more quickly the probability of at least one false positive grows!
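To see how quickly that probability grows, here is a small sketch. It assumes the tests are independent, which the slide does not state; under that assumption the chance of at least one false positive in n tests is 1 - (1 - alpha)^n. The function name familywise_error_rate is made up for this illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n_tests tests.

    Assumption (not from the slides): the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# With alpha = 0.05 the chance of at least one false positive climbs fast.
for n_tests in (1, 10, 100):
    print(n_tests, round(familywise_error_rate(0.05, n_tests), 3))
```

With alpha = 0.05 this comes out to roughly 40% for 10 tests and over 99% for 100 tests.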

Bonferroni corrections
Bonferroni is very conservative: we will lose some “true signal” in order to avoid false positives.

Bonferroni corrections
Input: list of p-values and alpha for one test
Output: list of p-values below a corrected threshold of significance
Procedure:
- Count ALL of the p-values
- Divide alpha by the total count
- Output each p-value if it is less than new_alpha
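The procedure above can be sketched as a short function. This is one possible way to write it, not the course's reference solution; the function name bonferroni and the example p-values are made up for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the p-values that are below the Bonferroni-corrected threshold."""
    if not p_values:
        return []  # nothing to test, nothing to report
    # Count ALL of the p-values and divide alpha by that count.
    new_alpha = alpha / len(p_values)
    # Keep each p-value that is less than the corrected threshold.
    return [p for p in p_values if p < new_alpha]


# Hypothetical p-values: four tests, so new_alpha = 0.05 / 4 = 0.0125.
print(bonferroni([0.001, 0.02, 0.04, 0.0004], alpha=0.05))
```

Note that 0.02 and 0.04 would pass an uncorrected alpha of 0.05 but are dropped after correction.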

Some python caveats
You have two sequences (s1, s2).
You want to look for a gap ("-") in the alignment.

This version is wrong:

    for index in range(len(s1)):
        if s1[index] or s2[index] == "-":
            print("found a gap!")

Python reads the condition as s1[index] or (s2[index] == "-"). Any non-empty string counts as true, so the test succeeds at every position. Compare each sequence to "-" separately:

    for index in range(len(s1)):
        if s1[index] == "-" or s2[index] == "-":
            print("found a gap!")
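An alternative way to write the same check, offered as a sketch rather than the version from class: walk both sequences together with zip and test membership with "in". The function name gap_positions and the example sequences are made up for illustration.

```python
def gap_positions(s1, s2):
    """Return the indices where either aligned sequence has a gap character."""
    # "-" in (a, b) is True when a == "-" or b == "-".
    return [i for i, (a, b) in enumerate(zip(s1, s2)) if "-" in (a, b)]


# Hypothetical alignment with one gap in each sequence.
print(gap_positions("AC-GT", "ACTG-"))
```

Here the membership test avoids repeating the comparison for each sequence, which is where the original mistake crept in.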

Another python caveat
You are reading two sequences (s1, s2) from an alignment file.
You want to check each position of the alignment for some condition.

This version is wrong:

    my_open_file = open(sys.argv[1])
    s1 = my_open_file.readline()
    s2 = my_open_file.readline()
    for index in range(len(s1)):
        if s1[index] == s2[index]:
            number_of_matches += 1

readline() keeps the newline at the end of each line, so the trailing "\n" is treated as one more alignment position (and counted as a match when both lines end with one). Strip each line first:

    import sys

    my_open_file = open(sys.argv[1])
    s1 = my_open_file.readline().strip()
    s2 = my_open_file.readline().strip()
    my_open_file.close()

    number_of_matches = 0  # must be initialized before using +=
    for index in range(len(s1)):
        if s1[index] == s2[index]:
            number_of_matches += 1
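A quick way to see the hidden newline is to print the line with repr(). The sketch below assumes Python 3 and uses io.StringIO as a stand-in for a real alignment file; the file contents and the helper name count_matches are made up for illustration.

```python
import io

# Stand-in for an opened alignment file (hypothetical contents).
fake_file = io.StringIO("ACGT-A\nACCTTA\n")

raw = fake_file.readline()
print(repr(raw))          # shows the newline still attached to the line
print(repr(raw.strip()))  # shows the line with the newline removed


def count_matches(s1, s2):
    """Count positions where two equal-length aligned sequences agree."""
    return sum(1 for a, b in zip(s1, s2) if a == b)


s1 = raw.strip()
s2 = fake_file.readline().strip()
print(count_matches(s1, s2))
```

Printing with repr() is a handy habit whenever a string comparison behaves unexpectedly, because it makes whitespace characters visible.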

Functions in Python: a brief overview
Functions are: reusable pieces of code that take zero or more arguments, perform some actions, and return one or more values.

Conceptually:
function "sum" takes arguments a, b
adds a and b
returns sum

In python:

    def sum(a, b):
        total = a + b
        return total

    # later in the program
    my_sum = sum(2, 5)
    # my_sum is now 7

Functions in Python: a brief overview
Stuff that happens inside the function is invisible outside of the function:

    def sum(a, b):
        total = a + b   # total exists only inside sum()
        return total

    # later in the program
    my_sum = sum(2, 5)
    print(total)  # this won't work!
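To watch the failure happen without crashing the program, the sketch below catches the error Python raises. The function is named add here (a name chosen for this illustration) so that it does not replace Python's built-in sum.

```python
def add(a, b):
    total = a + b  # 'total' is local to add()
    return total


my_sum = add(2, 5)
print(my_sum)  # the returned value is available under the name we assigned

try:
    print(total)  # 'total' was never defined outside the function
except NameError as err:
    print("NameError:", err)
```

The value leaves the function only through return; the name total stays behind.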

function to find Jukes-Cantor distance
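The code from this slide did not survive in the transcript, so what follows is only a sketch of how such a function might look. It uses the standard Jukes-Cantor formula d = -(3/4) ln(1 - (4/3) p), where p is the fraction of compared sites that differ. Skipping gapped positions and returning infinity when p reaches 0.75 are assumptions made here, and the function name is made up for illustration.

```python
import math


def jukes_cantor_distance(s1, s2):
    """Estimate the Jukes-Cantor distance between two aligned DNA sequences.

    Assumption: positions with a gap ("-") in either sequence are skipped.
    """
    compared = 0
    mismatches = 0
    for a, b in zip(s1, s2):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    p = mismatches / float(compared)  # fraction of differing sites
    if p >= 0.75:
        # Assumption: the logarithm is undefined here, so report infinity.
        return float("inf")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Hypothetical pair differing at 1 of 8 sites (p = 0.125).
print(jukes_cantor_distance("ACGTACGT", "ACGTACGA"))
```

The corrected distance is slightly larger than p itself, because the formula accounts for multiple substitutions at the same site.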

In-class example: Write a function to calculate the factorial of an integer
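One possible solution, sketched with a loop; the in-class answer may have looked different. Rejecting negative input with a ValueError is a choice made here, not something the slide specifies.

```python
def factorial(n):
    """Return n! for a non-negative integer n (0! is defined as 1)."""
    if n < 0:
        raise ValueError("factorial is not defined for negative integers")
    result = 1
    # Multiply 2 * 3 * ... * n; the loop body never runs for n = 0 or 1.
    for k in range(2, n + 1):
        result *= k
    return result


print(factorial(5))  # 5 * 4 * 3 * 2 * 1 = 120
```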