Bioinformatics. David Brodin, BEA core facility. Molecular Biology with Genetics – Bioinformatics, autumn term 2007 (HT-07)


1 Bioinformatics
David Brodin, David.Brodin@biosci.ki.se
BEA core facility, www.bea.ki.se
Molecular Biology with Genetics – Bioinformatics, autumn term 2007 (HT-07)
Course web page: www.bea.ki.se/biomedicin_v42/

2 Lecture Content (Monday, Tuesday, Wednesday)
Introduction to Bioinformatics
-History of Bioinformatics
-Need for computers
-Computational Biology
-Fields of Bioinformatics
-Bioinformatic tools
Homology, sequence analysis and phylogenetics
Introduction to Microarrays & Lab
Mass spectrometry
Web Databases, bioinformatic tools etc.
Genotyping Arrays
Tiling Arrays
Computer Lab

3 Need for Computers
-Major advances in the field of molecular biology and genomic technologies
-Explosive growth in the biological information generated by the scientific community
-Need for computerized databases to store, organize, and index the data, and for specialized tools to view and analyze the data
"What computer science is to molecular biology is like what mathematics has been to physics..." -- Larry Hunter, ISMB'94

4 History of Bioinformatics
History of DNA Sequencing. Adapted from Messing & Llaca, PNAS (1998)

5 History of Bioinformatics

6 History of Bioinformatics
-Early database: The Atlas of Protein Sequences was available on digital tape in 1978, and by modem in 1980.
-Early programs: restriction enzyme sites, pattern finding, promoters, etc., circa 1978.
-1982: DDBJ/EMBL/GenBank are created as a public repository of genetic sequence information.
-1983: NIH funds the PIR (Protein Information Resource) database.
-1988: Pearson and Lipman create FASTA.

Number of published base pairs:
Year  Sequence                      Base pairs
1971  First published DNA sequence  12
1977  PhiX174                       5,375
1982  Lambda                        48,502
1992  Yeast Chromosome III          316,613
1995  Haemophilus influenzae        1,830,138
1996  Saccharomyces                 12,068,000
1998  C. elegans                    97,000,000
2000  D. melanogaster               120,000,000
2001  H. sapiens (draft)            2,600,000,000
2003  H. sapiens                    2,850,000,000

7 Computational Biology
Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.
Computational Biology: The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.
Source: National Institutes of Health (NIH)
In the early days of bioinformatics, a major concern was the creation and maintenance of databases to store biological information, which involved design issues and the development of complex user interfaces. Today the most pressing task is the analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains, and protein structures. The actual process of analyzing and interpreting data is referred to as computational biology.
Biology in the 21st century is being transformed from a lab-based science into an information science as well.

8 Biological Databases
A biological database is a large, organized body of persistent data, usually associated with computerized software designed to update, query, and retrieve components of the data stored within the system.
For researchers to benefit from the data stored in a database, two additional requirements must be met:
-easy access to the information
-a method for extracting only that information needed to answer a specific biological question
Nucleotide sequence record:
-the input sequence with a description of the type of molecule
-ID of sequence
-the scientific name of the source organism
-contact name
-literature citations associated with the sequence
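The record fields and the "extract only what is needed" requirement listed on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the schema of any real database: the class name `NucleotideRecord`, the helper `find_by_organism`, the IDs, and all record contents are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NucleotideRecord:
    """Hypothetical record holding the fields listed on the slide."""
    seq_id: str          # ID of the sequence
    sequence: str        # the input sequence
    molecule_type: str   # description of the type of molecule, e.g. "DNA"
    organism: str        # scientific name of the source organism
    contact: str         # contact name
    citations: List[str] = field(default_factory=list)  # literature citations


def find_by_organism(records, organism):
    """Return only the records from the given organism (case-insensitive)."""
    wanted = organism.strip().lower()
    return [r for r in records if r.organism.lower() == wanted]


# Invented example data; none of these entries exist in a real database.
records = [
    NucleotideRecord("EX0001", "ATGGCCATTGTA", "DNA", "Homo sapiens",
                     "A. Example", ["Hypothetical citation 1"]),
    NucleotideRecord("EX0002", "AUGGCUUAA", "mRNA", "Mus musculus",
                     "B. Example"),
    NucleotideRecord("EX0003", "ATGTTTCCC", "DNA", "Homo sapiens",
                     "C. Example"),
]

human_ids = [r.seq_id for r in find_by_organism(records, "homo sapiens")]
print(human_ids)
```

A real database adds persistence, indexing, and update operations on top of this; the sketch only shows how a structured record makes a targeted query possible.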

9 Sub-disciplines, challenges & goals
Important sub-disciplines:
-Development of new algorithms and statistics with which to assess biological information
-Analysis and interpretation of various types of biological data
-Development and implementation of tools that enable efficient access and management of different types of information
Challenges of working with bioinformatics:
-Need to feel comfortable in an interdisciplinary area
-Depend on others for primary data
-Need to address important biological and computer science problems
Important goal of bioinformatics: understanding basic biological processes and, in turn, advances in the diagnosis, treatment, and prevention of many genetic diseases.

10 Fields of Bioinformatics
Genomes, Nucleotide Sequences, Protein Sequences, Macromolecular Structures, Small Molecules, Gene Expression, Molecular Interactions, Reactions & Pathways, Protein Families, Taxonomy, Ontologies, Sequence Similarity & Analysis, Structure Analysis
The "omics" Series:
-Genomics: Gene identification & characterization
-Transcriptomics: Expression profiles of mRNA
-Proteomics: Functions & interactions of proteins
-Structural Genomics: Large scale structure determination
-Cellinomics: Metabolic Pathways, Cell-cell interactions
-Pharmacogenomics: Genome-based drug design

11 Typical Questions
Biological problems that computers can help with:
-I cloned a gene. Is it a known gene?
-Does the sequence match? Is the sequence any good?
-Is the sequence similar to other known sequences?
-Which gene family does it belong to?
-The gene I'm interested in was found in another organism, but not in mine. How can I look for it?
-How is the gene expressed in different types of tissues?
-What is the biological function of the protein encoded by the gene?
-Is the gene associated with any disease?
Increasingly, biological studies begin with a scientist conducting vast numbers of database and web site searches to formulate specific hypotheses or to design large-scale experiments.
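The question "Is the sequence similar to other known sequences?" can be illustrated with a toy percent-identity comparison. This is only a sketch: it assumes the sequences are already aligned and of equal length, which real tools such as BLAST or FASTA do not require, and the function name `percent_identity`, the gene names, and all sequences are invented for the example.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Invented query and "known" sequences, for illustration only.
query = "ATGGCGTACGTT"
known = {
    "gene_x": "ATGGCGTACGTT",
    "gene_y": "ATGGCCTACGAT",
    "gene_z": "TTTTCCCCGGGG",
}

# Score the query against every known sequence and keep the best hit.
scores = {name: percent_identity(query, seq) for name, seq in known.items()}
best_name = max(scores, key=scores.get)
print(best_name, scores[best_name])
```

Position-by-position identity ignores insertions, deletions, and statistical significance, which is exactly what dedicated alignment tools handle.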

12 Bioinformatic tools
-Many different bioinformatic tools are available over the internet, free of charge, to whoever wishes to use them
-Many commercial software packages are also available
-Some bioinformaticians write their own tools for specialized tasks
-Many platforms are available for software development...

13 Open Source & Open Access
Open Source in the life sciences:
-Present in all areas of bioinformatics
-Some very well known examples of tools used in industry and academic circles include:
 –BLAST
 –EMBOSS
 –EnsEMBL
 –GenScan
 –Bioconductor
Open Access:
-Unrestricted access to data
-Allows all to work and make discoveries
-Discoveries are not necessarily open access
-Open access is applicable to any kind of data you want to apply it to:
 –Sequence data (DNA, RNA or protein)
 –Gene expression data
 –Protein-protein interaction data
 –Publication

14 Top 10 Future Challenges
-Precise, predictive model of transcription initiation and termination: ability to predict where and when transcription will occur in a genome
-Precise, predictive model of RNA splicing/alternative splicing: ability to predict the splicing pattern of any primary transcript in any tissue
-Precise, quantitative models of signal transduction pathways: ability to predict cellular responses to external stimuli
-Determining effective protein:DNA, protein:RNA and protein:protein recognition codes
-Accurate ab initio protein structure prediction
-Rational design of small molecule inhibitors of proteins
-Mechanistic understanding of protein evolution: understanding exactly how new protein functions evolve
-Mechanistic understanding of speciation: molecular details of how speciation occurs
-Continued development of effective gene ontologies: systematic ways to describe the functions of any gene or protein
-Education: development of appropriate bioinformatics curricula for secondary, undergraduate and graduate education
Chris Burge, Ewan Birney, Jim Fickett. Genome Technology, issue No. 17, January, 2002

