Week 10.

Homework 9: Use the D-segment algorithm to find copy number variants (CNVs). Input: the number of read starts at each genomic position (1, 2, >=3). Use a Poisson model of read counts given copy number.

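Poisson distribution: Probability of observing k counts given a mean of λ counts: $P(k \mid \lambda) = \frac{\lambda^k e^{-\lambda}}{k!}$. Probability of observing 3 or more counts: $P(k \ge 3 \mid \lambda) = 1 - e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2}\right)$.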

Score Emission probability of r reads: Score associated with being in CNV given r observed reads:
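The formulas on this slide did not survive in the transcript. As a rough sketch only, the emission probabilities and scores could be computed as below, assuming the score is a log-likelihood ratio of the Poisson emission probability under a CNV rate versus a background rate. The log-ratio form, the base-2 logarithm, the inclusion of a zero-count bin, and the names `emission_probs`, `llr_scores`, `lam_cnv`, and `lam_background` are all assumptions, not taken from the slides.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k counts given mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def emission_probs(lam: float) -> dict:
    """Emission probabilities for the count bins 0, 1, 2 and '3 or more'.

    Key 3 stands for the '>=3' bin (an assumed encoding).
    """
    probs = {k: poisson_pmf(k, lam) for k in range(3)}
    probs[3] = 1.0 - sum(probs.values())
    return probs


def llr_scores(lam_cnv: float, lam_background: float) -> dict:
    """Assumed score: log2 of P(r | CNV rate) / P(r | background rate)."""
    p_cnv = emission_probs(lam_cnv)
    p_bg = emission_probs(lam_background)
    return {r: math.log2(p_cnv[r] / p_bg[r]) for r in p_cnv}


if __name__ == "__main__":
    # Hypothetical rates, chosen only for illustration.
    for r, s in llr_scores(lam_cnv=1.5, lam_background=1.0).items():
        print(r, round(s, 3))
```

With a CNV rate above the background rate, low counts get negative scores and high counts get positive scores, which is the shape the D-segment algorithm expects.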

D-segment algorithm (S = minimum segment score, D = dropoff, a negative number):
  cumul = max = 0; start = 1
  for (i = 1..N) {
    cumul += score[i]
    if (cumul ≥ max) { max = cumul; end = i }
    if (cumul ≤ 0) or (cumul ≤ max + D) or (i == N) {
      if (max ≥ S) { output(start, end, max) }
      max = cumul = 0; start = end = i + 1
    }
  }
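A minimal Python sketch of the pseudocode above, assuming the segment is closed when the cumulative score falls to `max + D` or below, where `D` is a negative dropoff and `S` is the minimum score for reporting. The function name `d_segments` and the choice to return a list instead of printing are illustrative.

```python
from typing import List, Tuple


def d_segments(scores: List[float], S: float, D: float) -> List[Tuple[int, int, float]]:
    """Return (start, end, score) triples, with 1-based inclusive positions.

    S is the minimum score a segment needs to be reported.
    D is the dropoff (a negative number) that ends a segment.
    """
    segments = []
    cumul = best = 0.0
    start = end = 1
    n = len(scores)
    for i in range(1, n + 1):
        cumul += scores[i - 1]
        if cumul >= best:
            # New maximum: extend the candidate segment to position i.
            best = cumul
            end = i
        if cumul <= 0 or cumul <= best + D or i == n:
            # Segment is finished: report it if it scores high enough.
            if best >= S:
                segments.append((start, end, best))
            best = cumul = 0.0
            start = end = i + 1
    return segments


if __name__ == "__main__":
    # Toy scores, made up for illustration.
    print(d_segments([-1, 2, 3, -1, 2, -6, -1, 4, 1], S=3, D=-5))
```

On the toy input, the first segment is closed when the cumulative score returns to zero at position 6, and the second is closed by reaching the end of the sequence.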

How to organize a computational biology project

Principles: Someone unfamiliar with your project should be able to understand what you did and why. Everything you do, you will have to do over again.

How not to organize a project: source/ <big, complicated program> tests/

Files and directories

Carrying out a single experiment: A single driver script should carry out a full experiment. The driver script should take no arguments. Avoid editing intermediate files by hand. Store all file and directory paths in the driver script. Use relative paths. Make the script restartable: if (<output file does not exist>) then <perform operation>.
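One way the restartable pattern could look in a Python driver script is sketched below. The file names, the two toy steps, and the helper `run_step` are hypothetical and stand in for real experiment steps.

```python
from pathlib import Path
from typing import Callable

# All paths live in the driver script and are relative to its directory.
RESULTS = Path("results")
COUNTS = RESULTS / "read_counts.txt"   # hypothetical intermediate file
SUMMARY = RESULTS / "summary.txt"      # hypothetical final output


def run_step(output: Path, operation: Callable[[Path], None]) -> bool:
    """Run operation(output) only if output does not exist; return True if it ran."""
    if output.exists():
        return False
    output.parent.mkdir(parents=True, exist_ok=True)
    operation(output)
    return True


def make_counts(out: Path) -> None:
    # Placeholder for a real data-generating step.
    out.write_text("1\n0\n2\n3\n")


def make_summary(out: Path) -> None:
    counts = [int(line) for line in COUNTS.read_text().split()]
    out.write_text(f"total={sum(counts)}\n")


def main() -> None:
    # The driver takes no arguments; rerunning it skips finished steps.
    run_step(COUNTS, make_counts)
    run_step(SUMMARY, make_summary)


if __name__ == "__main__":
    main()
```

Deleting one output file and rerunning the driver regenerates only that file, which is what makes the script restartable.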

Handling errors: Check for errors whenever possible. When an error occurs, abort. Create each output file using a temporary name, then rename the file when it is complete.
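The temporary-name-then-rename advice could be sketched in Python roughly as follows. The helper name `write_atomically` is illustrative. The sketch relies on `os.replace`, which renames a file and overwrites any existing target.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(path: Path, text: str) -> None:
    """Write text to a temporary file next to path, then rename it into place."""
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        # The final name appears only once the content is complete.
        os.replace(tmp_name, path)
    except BaseException:
        # On any error, remove the partial file and let the error abort the script.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise


if __name__ == "__main__":
    write_atomically(Path("result.txt"), "done\n")
```

Because the output file only ever exists in its complete form, a check of the kind "output file does not exist" stays reliable after a crash.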

File and directory names: <id>_<date>_<brief description>. Example: 05_2015-03-12_logistic_regression
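A small helper for building names in this pattern might look like the sketch below. The two-digit zero padding and the ISO date format are inferred from the example above, and the function name `experiment_dir_name` is hypothetical.

```python
import datetime
import re


def experiment_dir_name(exp_id: int, date: datetime.date, description: str) -> str:
    """Build '<id>_<date>_<brief description>' with an underscore-separated description."""
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
    return f"{exp_id:02d}_{date.isoformat()}_{slug}"


if __name__ == "__main__":
    print(experiment_dir_name(5, datetime.date(2015, 3, 12), "logistic regression"))
```

Leading with the id and an ISO date means a plain alphabetical directory listing is also in chronological order.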

The information about a file is carried by both its name and its path, so do not repeat in the filename what the path already says.
Bad: predict_gene_expression/predict_gene_expression_using_logistic_regression/predict_gene_expression_using_logistic_regression_test_using_alpha=1
Good: predict_gene_expression/logistic_regression/alpha=1

Source directories: Include only mature code with a defined specification. Bad: predict_gene_expression(histone mods). Okay: optimize_logistic_regression_using_gradient_descent(features, labels). Don't be afraid to copy/paste code between experiment directories.

Version control: Check in every hour or so, so you can roll back bad changes. Check in every file that you have edited by hand, and only those files.