Genome 540: Discussion Section Week 3


Genome 540: Discussion Section Week 3 Eliah Overbey

Agenda
  HW1 quick recap
  HW2 questions?
  HW3 introduction
  Programming languages: what do people like? What do people use?

HW1 comments
  Testing
    Test cases with known output
    Built-in checks (e.g. match locations for a string and its reverse complement should be at the same position, on opposite strands)
  Working together
    It's okay to compare final output, just not code
  Formatting
    Submit a plain text file (not RTF or Word)
    Include your name in the filename
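
The built-in check mentioned above can be sketched in Python. This is only an illustration under assumptions: the function names (`reverse_complement`, `find_all`, `strand_check`) are hypothetical, matching is exact substring search, and the sequence contains only A, C, G, and T. It searches the forward strand for the query, searches the reverse strand for the query's reverse complement, maps those hits back to forward coordinates, and confirms the two position lists agree.

```python
# Hypothetical helper names; exact matching over an A/C/G/T alphabet is assumed.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_all(text, pattern):
    """Return 0-based start positions of every match, overlapping ones included."""
    hits = []
    i = text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits


def strand_check(genome, query):
    """True if forward-strand hits and mapped reverse-strand hits coincide."""
    forward_hits = find_all(genome, query)
    rc_hits = find_all(reverse_complement(genome), reverse_complement(query))
    # A hit starting at j on the reverse strand covers forward positions
    # len(genome) - (j + len(query)) onward.
    mapped = sorted(len(genome) - (j + len(query)) for j in rc_hits)
    return forward_hits == mapped
```

A check like this will not prove a search routine correct, but a mismatch between the two lists usually points to an off-by-one error in the coordinate handling.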

HW2 questions?
Notes:
  Assume that any input graph text file lists the vertices in depth order
  Write your representation of the graph image in depth order
  Make sure you write the sequence graph file in depth order
  How do you find the vertex at the beginning of the path?
  You can write separate functions for parts 1, 2, and 3
  You can round any floating point numbers, but do include at least 2 decimal places!
  What if there are multiple highest weighted paths?
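
One possible way to approach the "beginning of the path" question is to store a back-pointer for each vertex and follow the pointers from the best end vertex. The Python sketch below is not the assignment's required design. It assumes that vertices arrive in depth order, that only edges carry weights, that a path may start and end at any vertex, and that ties are broken by keeping the first maximum found. The name `highest_weight_path` and the `(src, dst, weight)` edge format are hypothetical.

```python
def highest_weight_path(vertices, edges):
    """vertices: names in depth order; edges: (src, dst, weight) tuples.

    Assumption: a path may begin at any vertex, so every vertex starts
    with score 0 and no predecessor.
    """
    best = {v: 0.0 for v in vertices}    # best score of a path ending at v
    back = {v: None for v in vertices}   # predecessor of v on that path
    incoming = {v: [] for v in vertices}
    for src, dst, weight in edges:
        incoming[dst].append((src, weight))

    # Depth order guarantees each predecessor is final before it is used.
    for v in vertices:
        for src, weight in incoming[v]:
            if best[src] + weight > best[v]:
                best[v] = best[src] + weight
                back[v] = src

    # max() returns the first maximum, so ties go to the earliest vertex.
    end = max(vertices, key=lambda v: best[v])
    path = [end]
    while back[path[-1]] is not None:    # walk back-pointers to the start
        path.append(back[path[-1]])
    path.reverse()
    return best[end], path


score, path = highest_weight_path(
    ["A", "B", "C", "D"],
    [("A", "B", -1.0), ("B", "C", 3.0), ("C", "D", 2.0), ("A", "C", 1.0)],
)
print(f"{score:.2f}", path)
```

In this toy graph the path begins at B rather than A, because extending back through the negative edge A to B would lower the score.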

Maximal segment vs. maximal D-segment
  Maximal segment:
    No subsegment has a higher score
    No segment properly containing the segment satisfies the above condition
  Maximal D-segment:
    No subsegment has score < D, where D is the dropoff value
    No D-segment properly containing the D-segment satisfies the above condition
    The segment score must be >= S, where S >= -D

[Figure: cumulative score plotted against sequence position, with the thresholds D and S marked. Exercise: find the maximal D-segments.]

Pseudo-code for the D-segment algorithm: [slide image; not captured in this transcript]

Worked example: D = -3, S = 3

position         1     2     3     4     5     6     7     8     9     10    11    12    13    14
# read starts    0     0     0     0     1     2     0     4     1     2     0     0     0     0
score           -0.5  -0.5  -0.5  -0.5   0.52  1.1  -0.5   1.7   0.52  1.1  -0.5  -0.5  -0.5  -0.5

State after each position is processed:

position    max     start   end    cumul
(initial)   0       1       1      0
1           0       2       2      0
2           0       3       3      0
3           0       4       4      0
4           0       5       5      0
5           0.52    5       5      0.52
6           1.62    5       6      1.62
7           1.62    5       6      1.12
8           2.82    5       8      2.82
9           3.34    5       9      3.34
10          4.44    5       10     4.44
11          4.44    5       10     3.94
12          4.44    5       10     3.44
13          4.44    5       10     2.94
14          4.44    5       10     2.44

D-segment: 5, 10, 4.44 (start, end, max)
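
The walkthrough above is consistent with a single left-to-right pass, sketched here in Python. This is a reconstruction from the trace, not the slide's own pseudo-code. The function name `maximal_d_segments` is hypothetical, positions are 1-based and inclusive, and the rule that a tie (`cumul >= best`) extends the segment end is an assumption.

```python
def maximal_d_segments(scores, D, S):
    """Return (start, end, score) triples; D is the negative dropoff, S the
    minimum segment score. Positions are 1-based and inclusive."""
    segments = []
    cumul = 0.0        # cumulative score since the current start
    best = 0.0         # highest cumulative score seen since the current start
    start = end = 1
    n = len(scores)
    for i, s in enumerate(scores, start=1):
        cumul += s
        if cumul >= best:          # new high point: extend the segment end
            best = cumul
            end = i
        # Close the candidate when the score falls to zero, drops by |D|
        # from its high point, or the sequence ends.
        if cumul <= 0 or cumul <= best + D or i == n:
            if best >= S:
                segments.append((start, end, best))
            cumul = best = 0.0
            start = end = i + 1
    return segments


scores = [-0.5] * 4 + [0.52, 1.1, -0.5, 1.7, 0.52, 1.1] + [-0.5] * 4
for seg_start, seg_end, seg_score in maximal_d_segments(scores, D=-3, S=3):
    print(seg_start, seg_end, round(seg_score, 2))
```

On the 14-position example this reports one segment from position 5 to position 10 with a score of about 4.44. The segment is closed by the end of the sequence rather than by the dropoff, since the cumulative score only falls to 2.44.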

HW3
  Due 11:59pm on Sunday, Feb 3
  Assignment: use the D-segment algorithm to identify sequence segments with high copy number.
  Input:
    Count file reporting the number of read starts at each location
    Scoring scheme
  Output:
    Number of normal and elevated copy-number segments
    List of elevated copy-number segments (start, end, score)
    Annotations for the first three segments (look up using the UCSC Genome Browser)
    Histograms of read-start counts (i.e. number of positions with 0, 1, 2, and >=3 read starts) for non-elevated and elevated segments
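
The histogram output can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the counts are already loaded into a list, elevated segments are given as 1-based inclusive `(start, end)` pairs, and every position outside an elevated segment counts as non-elevated. The function name `read_start_histograms` and the bin labels are hypothetical, and the actual file and template formats are not covered.

```python
from collections import Counter


def read_start_histograms(counts, elevated_segments):
    """counts: read starts per position, position 1 first.
    elevated_segments: (start, end) pairs, 1-based and inclusive."""
    # Mark which positions fall inside any elevated segment.
    elevated = [False] * len(counts)
    for start, end in elevated_segments:
        for pos in range(start, end + 1):
            elevated[pos - 1] = True

    hist = {"elevated": Counter(), "non_elevated": Counter()}
    for count, is_elevated in zip(counts, elevated):
        label = ">=3" if count >= 3 else str(count)   # bins: 0, 1, 2, >=3
        hist["elevated" if is_elevated else "non_elevated"][label] += 1
    return hist


counts = [0, 0, 0, 0, 1, 2, 0, 4, 1, 2, 0, 0, 0, 0]
hist = read_start_histograms(counts, [(5, 10)])
for group in ("non_elevated", "elevated"):
    print(group, [hist[group][label] for label in ("0", "1", "2", ">=3")])
```

A `Counter` returns 0 for bins that never occur, so all four bins can be printed without special cases.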

Match the template exactly!
  When testing your code on the example, run 'diff' between your output and the sample output:
    > diff your_output.txt example_output.txt
  The only differences should be in the header.
  This assignment has a lot of output, so points will be deducted if you do not follow the template exactly!

Diff Example
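
The original slide's example is not captured in this transcript. As a stand-in, here is a small Python sketch of the same kind of comparison using the standard library's `difflib`, for cases where the `diff` command is not available. The sample lines and the helper name `changed_lines` are made up for illustration and do not reflect the real HW3 template.

```python
import difflib


def changed_lines(expected, yours):
    """Return only the added/removed lines of a unified diff.

    Both arguments are lists of lines without trailing newlines.
    """
    diff = list(difflib.unified_diff(
        expected, yours,
        fromfile="example_output.txt", tofile="your_output.txt",
        lineterm="",
    ))
    # The first two lines are the '---' / '+++' file headers; skip them
    # and keep lines marked as removed ('-') or added ('+').
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Made-up sample content: only the header line differs.
expected = ["Name: Example Student", "Segments: 1", "5 10 4.44"]
yours = ["Name: Your Name", "Segments: 1", "5 10 4.44"]
for line in changed_lines(expected, yours):
    print(line)
```

If the only lines printed come from the header, the body of the output matches the sample.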

Which Programming Language Should I Use?
  C++ was good for the first assignment because of its speed
  Should I use C++ for the rest of the class?
  What are common and well-liked programming languages?

Video Link: https://www.youtube.com/watch?v=cowtgmZuai0