Welcome to Introduction to Bioinformatics Computing aka BIC1.

Team taught by
Rhys Price Jones, Ph.D.
  – Bldg. 7B-2250
  – Office Hours: Monday, Wednesday, Friday am
Anne R. Haake, Ph.D.
  – Bldg
  – Office Hours: Tuesday 2-4 p.m.; Friday 10-noon

The Focus of Bioinformatics
Using computers to answer biological questions
  – Storage
  – Visualization
  – Analysis
Using computers to figure out which biological questions to ask

What is this course about?
We will focus on analysis:
  – We will study techniques for quickly and effectively commandeering computing resources for the solution of problems raised in the realm of biology
  – We will study algorithms (more on this later...) that underlie many of the popular bioinformatics software packages
The majority of these algorithms are concerned with sequence analysis (more on this, too...)

The Context of Bioalgorithms
It is important to keep in mind that a mathematically perfect solution to an ideally posed problem may not be the most biologically relevant one.
We need flexibility: a willingness to rephrase the question, to rethink the process, to adapt and re-adapt.

Course Structure
3 classroom sessions each week to introduce the biological perspective and computational approaches for each biological problem
1 laboratory session to give you hands-on experience in applying and refining computational methods in the context of biology

Readings
Textbook:
  – Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Dan Gusfield, Cambridge University Press, 1997, ISBN
Papers from the current literature, as assigned
Lecture notes and lab manuals as posted and linked to from the course home page
Note that, unless otherwise noted, net-based resources should be accessed using Netscape. Other browsers may not be able to correctly interpret the JavaScript code.

Expectations – Computing Background
There are skills you should possess in part already, but which will be significantly enhanced by being exercised in this course:
  – identifying and clearly phrasing a computational problem from a general biological query
  – rapidly developing, testing and analyzing tools for the solution of such problems if necessary
  – locating existing tools if not
  – understanding the capabilities and limitations of such tools

Computing Background – Specific skills
Programming in a language such as Lisp, Perl, Scheme, Java, C, Python, etc. (if in doubt, ask!)
Static and dynamic data structures: arrays, lists, trees, etc.
Control structures, especially recursion
Rapid prototyping, careful version control
Understanding of mathematics for:
  – analysis
  – proof
  – modeling

Biological Motivation
The fundamental building blocks of life are proteins
  – Enzymes, structural proteins, transport molecules, antibodies
100,000 or so different proteins in a human
Their properties and interactions are what make us what we are

Biological Motivation
What are proteins?
  – Polymers of amino acids (20 different)
  – The sequence of these amino acids (primary structure) determines the protein's shape (secondary and tertiary structures)
  – Protein shape and the chemical composition of its amino acids determine protein function

So... in theory, we can infer protein function if we know the protein sequence
(Figure from W. Gilbert, Ph.D., New Hampshire Biotech. Center)

Biological Motivation
How do we find out a protein's sequence?
  – We can sequence proteins directly, but this has been technically difficult
  – We can determine protein sequences from the DNA sequences that encode them

The Central Dogma
Hereditary information for a complete individual is stored in the DNA, which is self-replicating and is organized into units of expression (genes)
A gene is expressed in 2 steps:
  – DNA is transcribed into RNA
  – RNA is translated into protein
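
The two expression steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `transcribe` and `translate` are hypothetical, the input is assumed to be the coding strand (so transcription reduces to replacing T with U), and the standard genetic code is assumed. Real pathways involve splicing, regulation and other complications that this sketch ignores.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Step 1: DNA -> RNA (assumes `dna` is the coding strand)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Step 2: RNA -> protein, reading codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3].upper().replace("U", "T")  # look up in DNA alphabet
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # "*" marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


rna = transcribe("ATGGCCTAA")
print(rna)             # AUGGCCUAA
print(translate(rna))  # MA
```

The codon table is built from a single 64-character string rather than typed out by hand, which keeps the sketch short and makes transcription errors in the table less likely.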

Most Protein Sequences Are Determined From DNA Sequence
Why? Availability of DNA sequence information
  – Rapid development of DNA sequencing technology
  – Genomes of many different species have now been sequenced
Difficulties?
  – Data sets are large
  – Cellular pathway from DNA to RNA to protein can be complicated

Some Genomes
E. coli: 4.6 x 10^6 bases
  – Approx. 4,000 genes
Yeast: 15 x 10^6 bases
  – Approx. 6,000 genes
Smallest human chromosome: 50 x 10^6 bases
Human: 3 x 10^9 bases
  – Approx. 30,000 genes?

The Computational Approach
The nucleotide sequence of a genome contains all information necessary to produce a functional organism
Therefore, we should, in theory, be able to duplicate this decoding using computers

Why Use Computational Techniques?
The datasets are too large to analyze by hand
Efficient algorithms are the only way to perform the analyses that we need to answer the biological questions

Common Biological Questions Answered Through Sequence Analysis
Determine if an interesting DNA sequence has been seen by anyone else
Find all the protein coding regions in a genome
Infer the function of a new gene from a known one by matching two amino acid sequences
Measure the evolutionary distance between species
Predict local secondary structure of a peptide sequence, predict protein conformation, predict function
Study protein families
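
As one illustration of the questions listed above, finding candidate protein coding regions can be approximated by scanning for open reading frames: stretches that begin with ATG and run to the first in-frame stop codon. The sketch below is an assumption-laden simplification, not a method from the slides. `find_orfs` and `min_codons` are hypothetical names, only the forward strand is scanned, and real gene finders also handle the reverse strand, introns and statistical signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each ATG...stop stretch.

    Scans the three forward reading frames only. `end` is exclusive, and
    `min_codons` counts the start and stop codons as part of the length.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


print(find_orfs("CCATGAAATAGGG"))  # [(2, 11, 'ATGAAATAG')]
```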

Many Molecular Biology Problems on Sequences Can Be Formulated As String Matching Problems
Comparing two or more strings for similarities
Searching databases for related strings
Looking for new patterns occurring frequently in DNA
Reconstructing long strings of DNA from overlapping string fragments
And more...
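
To make the last formulation concrete, here is a small sketch of how two overlapping fragments might be joined: find the longest suffix of one fragment that is also a prefix of the other, then merge on it. The names `overlap`, `merge` and `min_length` are hypothetical, and the sketch assumes error-free fragments. Practical assemblers must cope with sequencing errors, repeats and many fragments at once.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_length` characters exists.
    """
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def merge(a, b, min_length=3):
    """Join `a` and `b` on their overlap, or return None if they do not overlap."""
    n = overlap(a, b, min_length)
    return a + b[n:] if n else None


print(overlap("ACGTTGCA", "TGCAGGT"))  # 4
print(merge("ACGTTGCA", "TGCAGGT"))    # ACGTTGCAGGT
```

Requiring a minimum overlap length is a common guard against joining fragments on a short match that occurs by chance.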

We Will Be Studying Algorithms For:
Exact string matching
Inexact string matching
Sequence alignment problems
Multiple alignment problems
And more...
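
For a first feel of the first two topics, the sketch below shows a naive exact matcher and a dynamic-programming edit distance, which is one common way to measure inexact matches. Both are baseline versions chosen for readability, and the names `find_exact` and `edit_distance` are hypothetical. The faster algorithms covered in the textbook are not shown here.

```python
def find_exact(pattern, text):
    """Naive exact matching: every start index where `pattern` occurs in `text`."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]


def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning `a` into `b`."""
    # prev[j] holds the distance between the processed prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                       # delete char_a
                curr[j - 1] + 1,                   # insert char_b
                prev[j - 1] + (char_a != char_b),  # match or substitute
            ))
        prev = curr
    return prev[-1]


print(find_exact("ACG", "ACGTACGACG"))  # [0, 4, 7]
print(edit_distance("ACGT", "AGT"))     # 1
```

The naive matcher takes time proportional to the product of the two lengths in the worst case, which is why more efficient algorithms matter for genome-sized inputs.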

Role of Evolutionary Theory
Central to computational biology
Evolution is descent with modification, driven by:
  – Diversity: different individuals carry different variants of the basic blueprint
  – Mutations: DNA sequence can be changed
  – Selection bias

Role of Evolutionary Theory
Related organisms have:
  – similar DNA
  – similar protein sequences
  – similar organization of genes
Similar structures tend to have similar functions
The bottom line:
  – evolution is the reason that we can assume similarity is meaningful in computational biology