CS 7010: Computational Methods in Bioinformatics (course review) Dong Xu Computer Science Department 271C Life Sciences Center 1201 East Rollins Road University.

CS 7010: Computational Methods in Bioinformatics (course review) Dong Xu Computer Science Department 271C Life Sciences Center 1201 East Rollins Road University of Missouri-Columbia Columbia, MO (O)

Technical Definitions (NIH)
- Bioinformatics: “research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, represent, describe, store, analyze, or visualize such data”.
- Computational Biology: “the development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems”.

Course Topics
- Data interpretation in analytical technologies
- Data management and computational infrastructure
- Discovery from data mining
- Modeling, prediction and design
- Theoretical in silico biology
Covers classical/mainstream bioinformatics problems from a computer science perspective.

Discovery from Data Mining (I)

Discovery from Data Mining (II)
- Data sources
  - Genomic / protein sequence
  - Microarray data
  - Protein interaction
- Complicated data
  - Large-scale, high-dimensional
  - Noisy (false positives and false negatives)

Discovery from Data Mining (III)
- Pattern/knowledge discovery from data
  - Many biological data are generated by biological processes that are not well understood
  - Interpreting such data requires discovering the convoluted relationships hidden in them
    - Which segment of a DNA sequence represents a gene or a regulatory region?
    - Which genes are possibly responsible for a particular disease?
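The question of which segment of a DNA sequence represents a gene can be illustrated with a deliberately naive open reading frame (ORF) scan. This is only a toy sketch, not a gene finder covered in the course: the function name `find_orfs`, the `min_codons` threshold, and the choice to scan only the forward strand are all assumptions made for illustration.

```python
# Toy ORF scan: an ORF here is ATG ... stop codon, read in one frame.
# Assumption: forward strand only; real gene finders use far richer models.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return sorted (start, end) pairs; end is exclusive and includes the stop.

    min_codons is the minimum number of codons before the stop codon,
    counting the ATG start codon itself.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next start codon in this frame
    return sorted(orfs)


if __name__ == "__main__":
    seq = "CCATGAAATAGGG"
    for start, end in find_orfs(seq):
        print(start, end, seq[start:end])
```

A scan like this produces many false positives on real genomes, which is one reason gene finding is usually treated as a statistical modeling problem rather than a pattern match.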

Modeling, Prediction and Design (I)
- Modeling and prediction of biological objects/processes
  - Sequence comparison
  - Secondary structure prediction
  - Gene finding
  - Regulatory sequence identification
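Sequence comparison is the classic use of dynamic programming in this field. The sketch below computes a Needleman-Wunsch global alignment score with a linear gap penalty. The function name and the scoring values (`match=1`, `mismatch=-1`, `gap=-2`) are assumptions chosen for illustration, not parameters taken from the course.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b (Needleman-Wunsch).

    Uses a linear gap penalty and keeps only two rows of the DP table,
    so memory is O(len(b)) while time stays O(len(a) * len(b)).
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Recovering the alignment itself, rather than only its score, would require keeping the full table or traceback pointers, which this sketch omits for brevity.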

Modeling, Prediction and Design (II)
- Prediction of outcomes of biological processes
  - Computing will become an integral part of modern biology through an iterative process of model formulation, computational prediction, and experimental validation
- From prediction to engineering design
  - Drug design
  - Protein structure prediction to protein engineering
  - Design of genetically modified species

Scope of Bioinformatics
- Activities: data management; data mining; modeling; prediction; theory formulation
- Aspects: engineering aspect; scientific aspect
- Bioinformatics: an indispensable part of biological science
- Objects of study: genes, proteins, protein complexes, pathways, cells, organisms, ecosystems
- Contributing disciplines: computer science, biology, statistics, mathematics, physics, chemistry, engineering, …

Bioinformatics Foundations
- Technology
- Biology/medicine
- Computer Science
- Statistics
- From interdisciplinary field to a distinct discipline

Course Coverage
- A general introduction to the field of bioinformatics
  - Problem definitions: from biological problem to computable problem
  - Key computational techniques
- A way of thinking: tackling a “biological problem” computationally
  - How to look at a biological problem from a computational point of view
  - How to formulate a computational problem to address a biological issue
  - How to collect statistics from biological data
  - How to build a computational model
  - How to design algorithms for the model
  - How to test and evaluate a computational algorithm
  - How to assess the confidence of a prediction result
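Testing and evaluating a computational algorithm usually starts with counting true positives, false positives, and false negatives against a trusted reference set. The sketch below is one minimal way to do that. The function name `evaluate_predictions` and the gene identifiers in the usage example are hypothetical.

```python
def evaluate_predictions(predicted, actual):
    """Compare a set of predicted items with a reference ("actual") set.

    Returns counts of true positives, false positives, and false negatives,
    plus sensitivity (recall) and precision. Ratios default to 0.0 when
    their denominator is zero.
    """
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)   # predicted and really present
    fp = len(predicted - actual)   # predicted but not present
    fn = len(actual - predicted)   # present but missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "sensitivity": sensitivity, "precision": precision}


if __name__ == "__main__":
    # Hypothetical gene identifiers, for illustration only.
    print(evaluate_predictions({"g1", "g2", "g3", "g4"}, {"g2", "g3", "g5"}))
```

These counts say how well a method did on one benchmark. Assessing the confidence of an individual prediction is a separate question that typically needs a statistical model or a resampling scheme.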

Dong’s top 10 list for computational methods in BI
1. Dynamic programming
2. Neural network
3. Hidden Markov Model
4. Hypothesis test
5. Bayesian statistics
6. Clustering
7. Information theory
8. Support Vector Machine
9. Maximum likelihood
10. Sampling search (Gibbs, Monte Carlo, etc.)
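As a small illustration of item 7, information theory, the sketch below computes the Shannon entropy of one column of a sequence alignment, a common way to quantify how conserved a position is. The function name `column_entropy` is an assumption made for this example, and gap characters are treated like any other symbol.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy, in bits, of the symbols in one alignment column.

    0 bits means the column is perfectly conserved; for DNA the maximum
    is 2 bits, reached when A, C, G, and T are equally frequent.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    for col in ("AAAA", "AACC", "ACGT"):
        print(col, column_entropy(col))
```

Low-entropy columns are candidates for functionally important positions, which is the idea behind sequence logos and many motif-finding scores.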

Research Areas
1. “Solved” problems
2. “Developed” areas with remaining challenges hard to solve
3. Developing areas
4. Emergent areas
5. Future directions

“Solved” Problems
- DNA sequence base calling and assembly
- Pairwise sequence comparison
- Protein secondary structure prediction
- Disordered regions in proteins
- Transmembrane segment prediction
- Subcellular localization
- Signal peptide prediction
- Protein geometry
- Homology modeling
- Physical/genetic mapping informatics

“Developed” areas with remaining challenges
- Gene finding
- Phylogenetic tree construction and evolution
- Protein docking
- Drug design
- Protein design
- Linkage analysis and quantitative trait loci (QTL)
- Microarray data collection
- Gene expression clustering

Developing Areas
- Multiple sequence comparison and remote homolog search
- Repetitive sequence analysis
- Protein structure comparison
- Protein tertiary structure prediction
- RNA secondary structure prediction
- Regulatory sequence analysis
- Computational proteomics
- Protein interaction networks
- Gene ontology and function prediction
- Computational neuroscience and applications in various species and systems (e.g., cancer)

Emergent Areas
- Pathway (regulatory network) prediction
- ChIP-chip analysis
- Tiling array analysis
- Haplotype/SNP analysis
- Computational comparative genomics
- Text (literature) mining
- Small RNA and anti-sense regulation
- Alternative splicing prediction
- Computational metabolomics

Possible future directions
- Genome semantics
- Membrane protein structure prediction
- RNA tertiary structure prediction
- Post-translational modification
- Dynamics of regulatory networks
- Virtual cell/organism modeling
- Phenotype-genotype relationship
- … (nobody knows)

Where is the science going? (1)
- Bioinformatics has been a “technology” for biological research: interpretation of data generated by bench biologists
- We are starting to see a trend in which computational predictions guide experimental design
- As more high-throughput technologies become available, discovery-driven science will play an increasingly important role in biology research
- As computational techniques continue to mature for biological applications, we will see more and more computational applications with powerful prediction capabilities

Where is the science going? (2)
- Like physics, where general rules and laws are taught at the start, biology will surely be presented to future generations of students as a set of basic systems duplicated and adapted to a very wide range of cellular and organismic functions, following basic evolutionary principles constrained by Earth’s geological history.
  --Temple Smith, Current Topics in Computational Molecular Biology

Major research centers (1)
- National Center for Biotechnology Information (NCBI) of NIH
  - The home of many important databases, including GenBank
  - The home of many important bioinformatics tools, including BLAST

Major research centers (2)
- European Molecular Biology Laboratory (EMBL)
  - Has some of the most powerful research groups in bioinformatics
  - Has numerous tools and databases

Major research centers (3)
- Sanger Institute
- The Institute for Genomic Research (TIGR)
- Swiss-Prot

Major Universities in the US
- University of California at Santa Cruz
- University of California at San Diego
- Washington University
- University of Southern California
- Stanford University
- Columbia University
- Boston University
- Harvard University
- MIT
- Virginia Tech

Major journals
- Bioinformatics
- Nucleic Acids Research
- Genome Research
- Journal of Computational Biology
- Journal of Bioinformatics and Computational Biology
- In Silico Biology
- Briefings in Bioinformatics
- Applied Bioinformatics
- IEEE/ACM Transactions on Computational Biology and Bioinformatics
- Proteins: Structure, Function, and Bioinformatics
- Journal of Computer Science and Technology
- Genomics, Proteomics and Bioinformatics
- …

Major conferences
- Intelligent Systems for Molecular Biology (ISMB)
- Annual International Conference on Research in Computational Molecular Biology (RECOMB)
- IEEE/Computational Systems Bioinformatics Conference (CSB)
- Pacific Symposium on Biocomputing (PSB)
- European Conference on Computational Biology (ECCB)
- IEEE International Conference on Bioinformatics and Bioengineering (BIBE)
- International Workshop on Genome Informatics (GIW)
- Asia-Pacific Bioinformatics Conference (APBC)
- …

Academicians
- Michael Waterman
- Phil Green
- Gene Myers
- Barry Honig
- No Nobel Prize winner yet…

Discussions
- Scope of the new biology (large-scale)
- Technology (tool development) vs. science (biological application)
- Knowledge vs. prediction
- Experimental vs. computational/theoretical
- First principle vs. empirical / statistical
- Automated vs. curated
“One machine can do the work of fifty ordinary men. No machine can do the work of one extraordinary man.”

Choosing Bioinformatics as a Career - 1
- Field outlook
- Must be a believer in bioinformatics (for its value to science)
- Must have strong motivation and be willing to go the extra mile (learn more disciplines)
- Technologist vs. technician

Choosing Bioinformatics as a Career - 2
- Molecular & cellular and evolutionary biology
  - Understanding the science
- Computational, mathematical, and statistical sciences
  - Mastering the techniques
- High-throughput measurement technologies
  - Knowing what biological data are obtainable