1
Bioinformatics: Data-driven molecular biology
Mikhail Gelfand
A.A. Kharkevich Institute for Information Transmission Problems, RAS, Moscow
II Spanish-Russian Forum on Information and Communication Technologies
Madrid, / IX / 2009
2
Exponential increase of data volume
red – papers (PubMed); blue – sequence fragments (GenBank); green – nucleotides (GenBank)
Of 18 million papers in PubMed, ~675 thousand have keywords “bioinformat* OR comput*”.
3
622 complete genomes (bacteria)
4
>45 thousand Google hits on “genome deciphered”
Top 10 hits:
bioremediation: bacterium Pseudomonas
agriculture and biotech: crop and biofuel plant Sorghum; rice
medicine: pathogenic bacterium Staphylococcus; SARS (atypical pneumonia) virus; Brugia worm (elephantiasis)
individual genome (medicine): James Watson
science / model organism: macaque
science / evolution: mammoth (mitochondrial); platypus
5
Sequencing is just the beginning
Bacterial genome: several million nucleotides; 600 to 9,000 genes (~90% of a genome codes for proteins). This slide: 0.1% of the Escherichia coli genome.
Human genome: 3 billion nucleotides, thousand genes; polymorphisms (individual differences): ~1 per 1,000 nucleotides; differences between human and chimpanzee: ~1 per 100 nucleotides.
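The rates quoted on this slide translate into absolute numbers with simple arithmetic. The sketch below is only a back-of-envelope illustration: it assumes the differences are spread uniformly along the genome, and the function and constant names are hypothetical, not part of the talk.

```python
# Back-of-envelope estimates from the figures quoted on the slide.
# Assumption: differences are spread uniformly along the genome.

HUMAN_GENOME_NT = 3_000_000_000   # ~3 billion nucleotides (from the slide)
NT_PER_POLYMORPHISM = 1_000       # ~1 individual difference per 1,000 nucleotides
NT_PER_HUMAN_CHIMP_DIFF = 100     # ~1 human/chimpanzee difference per 100 nucleotides


def expected_differences(genome_length: int, nt_per_difference: int) -> int:
    """Expected number of differing positions for a rate of one per nt_per_difference."""
    return genome_length // nt_per_difference


between_individuals = expected_differences(HUMAN_GENOME_NT, NT_PER_POLYMORPHISM)
human_vs_chimp = expected_differences(HUMAN_GENOME_NT, NT_PER_HUMAN_CHIMP_DIFF)

print(f"between two humans:  ~{between_individuals:,} positions")
print(f"human vs chimpanzee: ~{human_vs_chimp:,} positions")
```

Under these assumptions, two human genomes differ at roughly 3 million positions, and a human and a chimpanzee genome at roughly 30 million.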
6
Not just genomes
Other types of large-scale experiments / datasets:
State of the genome (gene expression)
  methylation
  nucleosome positioning
  histone modifications
Transcriptomics, protein abundance (gene expression)
Protein-protein interactions
  signaling etc.
  functional complexes
Protein-DNA interactions (regulation)
etc. etc.
7
Goals
Functional annotation of genes and proteins
  biological function
  regulation (in what conditions)
Functional annotation of genomes
  metabolic reconstruction and modeling
  regulatory networks and development
  prediction of organism properties from its genome
8
Applications: biotechnology
Improvement of production strains (chemistry, pharma, food industry)
  via modeling of metabolic pathways
New enzymes (new functions, stress tolerance)
  via sequencing and functional annotation
Biofuels
  fast-growing, stress-tolerant plants: identification of genes
  microbes as producers of ethanol or fatty acids: targeted genome design
9
Applications: medicine and pharma
Personalized medicine
  identification of predisposing alleles: lifestyle
  pharmacogenomics (metabolic alleles)
  diagnostics
Drug targets (chronic disease)
  analysis of signaling pathways
Anti-infectives
  identification of drug targets
Drug design; identification of drug candidates
  modeling of protein structure and interactions of proteins with small molecules
10
Methods. Integration of data
Systems biology: integration of diverse datasets for one organism
Comparative genomics: simultaneous analysis of genomic data for many organisms
Comparative systems biology: understanding the evolution of gene regulation and expression, signaling etc.
Comparative structural biology
11
Bioinformatics in Russia
Few high-throughput experiments
Open data
Collaborations
Theory (evolution), methods, algorithms
Highlights:
  Evolution (IITP RAS) and taxonomy (IPCB MSU)
  Regulation (FBB MSU, GosNIIGenetika, IITP RAS, ICaG SB RAS)
  Annotation (FBB MSU, IITP RAS)
  Protein Structure (IPR RAS, IMB RAS, IPCB MSU, BF MSU)
  Modeling
    Metabolism (IPCB MSU, ICaG SB RAS)
    Regulation (SpBSPU, ICaG SB RAS)
  Drug design (IBMC RAMS)
12
Research and Training Center “Bioinformatics”, Institute of Information Transmission Problems (5 years: )
Molecular evolution
  Alternative splicing as a driver of evolution in eukaryotes
  Positive selection
Comparative genomics of regulation in bacteria
  Evolution of regulatory pathways
  Protein-DNA interactions
Annotation
  Gene recognition
  Functional annotation
  Regulation
13
Comparative genomics in action: confirmed predictions
Regulatory mechanisms
  riboswitches (riboflavin – vitamin B2, thiamin – vitamin B1)
  antisense regulation of the methionine-cysteine pathway
  role of the ribosome in zinc homeostasis
Regulators: NrdR, MtaR/MetR, CmbR, NiaR
Enzymes: FadE, ThiN, TenA, CobZ, CobX/CbiZ, PduX, NagP, NagB-II
Microcins (capistruin, Burkholderia thailandensis)
Transporters
  ABC-transporters with universal energizing components: Co, Ni, biotin (vitamin H), thiamin (vitamin B1), riboflavin (vitamin B2)
  other: threonine, methionine, oligogalacturonides, N-acetylglucosamine, corrinoids, niacin, riboflavin, Co
Regulatory motifs: nitrogen fixation, fatty acid biosynthesis, iron homeostasis, catabolism of chitin and pectin
Regulatory sites: several dozen
14
Functional annotation of genomes
First Russian bacterial genome, Acholeplasma laidlawii (2008)
sequencing and proteomics: Institute of Physico-Chemical Medicine; annotation: IITP
~1.5 Mb; ~1400 genes
Function established for ~80% of genes; metabolic reconstruction
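As a rough illustration of what these figures imply, the sketch below derives the average genome span per gene and the approximate number of genes with and without an established function. It uses only the rounded numbers from the slide; the variable names are hypothetical and the results are estimates, not reported values.

```python
# Rough estimates from the rounded figures on the slide (not reported values).

GENOME_SIZE_BP = 1_500_000   # ~1.5 Mb
GENE_COUNT = 1400            # ~1400 genes
ANNOTATED_PERCENT = 80       # function established for ~80% of genes

# Average stretch of genome per gene, including any intergenic sequence.
bp_per_gene = GENOME_SIZE_BP / GENE_COUNT

# Integer arithmetic keeps the gene counts exact.
annotated_genes = GENE_COUNT * ANNOTATED_PERCENT // 100
unannotated_genes = GENE_COUNT - annotated_genes

print(f"~{bp_per_gene:.0f} bp of genome per gene")
print(f"~{annotated_genes} genes with an established function")
print(f"~{unannotated_genes} genes still without one")
```

On these numbers the genome averages about one gene per kilobase, with roughly 1,100 genes functionally annotated and a few hundred remaining open.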
15
Publications (refereed)
16
Collaborations
European Laboratory of Molecular Biology *
Germany: Humboldt University, Berlin; Munich Technical University
France: Lyon University
United Kingdom: University of East Anglia
Spain: Center for Genome Regulation (Barcelona)
USA: MIT; Burnham Institute *; Lawrence Berkeley National Laboratory *; Stowers Institute *; Rutgers University
China: China-Germany Partner Institute of Molecular Genetics (Shanghai)
Industry: Biomax (Germany); Integrated Genomics (USA)
Bold: on-going
* Former students