Introduction to Bioinformatics 236523/234525. Lecturer: Prof. Yael Mandel-Gutfreund. Teaching Assistants: Shai Ben-Elazar, Idit Kosti. Course web site :



2 What is Bioinformatics?

3 Course Objectives: To introduce the bioinformatics discipline. To familiarize students with the major biological questions that can be addressed with bioinformatics tools. To introduce the major tools used for sequence and structure analysis and explain in general terms how they work (including their limitations).

4 Course Structure and Requirements: 1. Class structure: a 2-hour lecture and a 1-hour tutorial. 2. Homework: assignments will be given every second week and will be done in pairs; 5/5 homework assignments will be submitted. 3. A final project will be conducted in pairs. The project will be presented as a poster (poster day: 14.3).

5 Grading: 20% homework assignments, 80% final project

6 Literature list: Gibas, C., Jambeck, P. Developing Bioinformatics Computer Skills. O'Reilly; Lesk, A. M. Introduction to Bioinformatics. Oxford University Press; Mount, D.W. Bioinformatics: Sequence and Genome Analysis. 2nd ed., Cold Spring Harbor Laboratory Press. Advanced reading: Jones, N.C. & Pevzner, P.A. An Introduction to Bioinformatics Algorithms. MIT Press, 2004

7 What is Bioinformatics?

8 What is Bioinformatics? “The field of science in which biology, computer science, and information technology merge to form a single discipline.” Ultimate goal: to enable the discovery of new biological insights as well as to create a global perspective from which unifying principles in biology can be discerned.

9 Central Paradigm in Molecular Biology: Gene (DNA) -> mRNA -> Protein. 21st century: Genome -> Transcriptome -> Proteome
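The flow from gene to mRNA to protein on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `transcribe` and `translate` are hypothetical, the codon table is the standard genetic code, and the input is the short open reading frame that appears on the annotation slides of this lecture.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Coding-strand DNA -> mRNA: every T becomes U."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA codon by codon and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA codons
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGCAATATGGACAATTGGTTTCTTCTCTGAATTGA"  # ORF from the annotation slides
mrna = transcribe(gene)
print(mrna)
print(translate(mrna))
```

The sketch ignores everything that makes real gene expression complicated (splicing, the template strand, alternative genetic codes); it only shows the two mappings named on the slide.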

10 From DNA to Genome Watson and Crick DNA model

11 First genome (Haemophilus influenzae); yeast genome; first human genome draft

12 Complete Genomes (chart): Total, Eukaryotes, Bacteria, Archaea

13 1,000 Genomes Project: Expanding the Map of Human Genetics. Researchers hope the effort will speed up the discovery of many diseases' genetic roots.

14 What's Next? The “post-genomics” era: genomes… Annotation, Comparative genomics, Functional genomics, Systems Biology. Main Goal: To understand the living cell

15 From … genomes to … understanding living cells

16 Annotation: CCTGACAAATTCGACGTGCGGCATTGCATGCAGACGTGCATG CGTGCAAATAATCAATGTGGACTTTTCTGCGATTATGGAAGAA CTTTGTTACGCGTTTTTGTCATGGCTTTGGTCCCGCTTTGTTC AGAATGCTTTTAATAAGCGGGGTTACCGGTTTGGTTAGCGAGA AGAGCCAGTAAAAGACGCAGTGACGGAGATGTCTGATG CAA TAT GGA CAA TTG GTT TCT TCT CTG AAT TGA AAAACGTA

17 Annotation: Identify the genes within a given sequence of DNA. Identify the sites which regulate the gene. Predict the function.

18 How do we identify a gene in a genome? A gene is characterized by several features (promoter, ORF, …); some are easier and some are harder to detect.

19 CCTGACAAATTCGACGTGCGGCATTGCATGCAGACGTGCATG CGTGCAAATAATCAATGTGGACTTTTCTGCGATTATGGAAGAA CTTTGTTACGCGTTTTTGTCATGGCTTTGGTCCCGCTTTGTTC AGAATGCTTTTAATAAGCGGGGTTACCGGTTTGGTTAGCGAGA AGAGCCAGTAAAAGACGCAGTGACGGAGATGTCTGATG CAA TAT GGA CAA TTG GTT TCT TCT CTG AAT TGA AAAACGTA Labeled features: TF binding site; promoter; Transcription Start Site; Ribosome binding site; ORF = Open Reading Frame; CDS = Coding Sequence
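Of the features labeled on this slide, the open reading frame is the easiest to detect computationally. The sketch below is a naive scan, not a real gene finder: it looks only at the three forward reading frames, treats the first ATG as the start, and ignores promoters, ribosome binding sites and the reverse strand. The function name `find_orfs` and the `min_codons` parameter are hypothetical; `fragment` is the tail of the sequence shown on the slide.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start_index, orf) pairs for ATG...stop stretches
    found in the three forward reading frames."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


# Last 53 bases of the slide's sequence, with the spaces removed.
fragment = "AGATGTCTGATGCAATATGGACAATTGGTTTCTTCTCTGAATTGAAAAACGTA"
for start, orf in find_orfs(fragment):
    print(start, orf)
```

On this fragment the scan reports the ORF highlighted on the slide and also a shorter ATG...TGA stretch in another frame, which illustrates why a start and a stop codon alone are not enough to call a gene.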

20 Using Bioinformatics approaches for Gene hunting: Relatively easy in simple organisms (e.g. bacteria). VERY HARD for higher organisms (e.g. humans).

21 Comparative genomics

22 How human are chimps? Comparison between the full drafts of the human and chimp genomes revealed that they differ by only 1.23%. Perhaps not surprising!

23 So where are we different?
Aligned sequences:
Human ATAGCGGGGGGATGCGGGCCCTATACCC
Chimp ATAGGGG--GGATGCGGGCCCTATACCC
Mouse ATAGCG---GGATGCGGCGC-TATACC-A
Unaligned sequences:
Human ATAGCGGGGGGATGCGGGCCCTATACCC
Chimp ATAGGGGGGATGCGGGCCCTATACCC
Mouse ATAGCGGGATGCGGCGCTATACCA
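Once sequences are aligned, the differences can be counted column by column. The following sketch does this for the human and chimp rows from the slide; the function name `compare_aligned` is hypothetical, and the mouse row is left out because, as transcribed here, it is one column longer than the other two.

```python
def compare_aligned(seq_a, seq_b):
    """Count matching columns, mismatching columns and gap columns
    in two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = mismatches = gaps = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            gaps += 1          # insertion or deletion in one species
        elif a == b:
            matches += 1
        else:
            mismatches += 1    # substitution
    return matches, mismatches, gaps


human = "ATAGCGGGGGGATGCGGGCCCTATACCC"
chimp = "ATAGGGG--GGATGCGGGCCCTATACCC"

matches, mismatches, gaps = compare_aligned(human, chimp)
identity = 100 * matches / len(human)
print(f"{matches} matches, {mismatches} mismatch(es), {gaps} gap column(s)")
print(f"identity over aligned columns: {identity:.1f}%")
```

How gap columns are counted is a design choice: here they lower the identity, whereas some tools report identity over ungapped columns only.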

24 And where are we similar? VERY SIMILAR: conserved between many organisms. VERY DIFFERENT

25 Functional genomics

26 TO BE IS NOT ENOUGH. At any time point a gene can be functional or not.

27 From the gene expression pattern we can learn: What does the gene do? When is it needed? What other genes or proteins interact with it? … What's wrong?
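One common way to approach the question of which genes act together is to compare expression profiles: genes whose expression rises and falls together across conditions are candidates for a shared function. The sketch below computes a Pearson correlation between profiles. The gene names and expression values are invented for illustration and do not come from the lecture.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression levels measured at four time points.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # rises together with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # falls while geneA rises
}

for name in ("geneB", "geneC"):
    r = pearson(expression["geneA"], expression[name])
    print(f"geneA vs {name}: r = {r:+.2f}")
```

A high correlation only suggests a relationship; it does not show that the gene products physically interact.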

28 Systems Biology

29 Biological networks. Jeong et al. Nature 411, (2001)

What can we learn from a network?

31 What can we learn from Biological Networks? Is the protein essential for the organism? Is it a good drug target? What can we learn about this protein?
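A simple network property that is often examined when asking whether a protein might be essential is its degree, the number of interaction partners it has; highly connected "hub" proteins are frequently treated as candidates. The sketch below counts degrees in a toy undirected network. The protein names and edges are invented, and degree alone is only a rough indicator.

```python
from collections import Counter


def degrees(edges):
    """Number of interaction partners of each protein in an undirected network."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


# Hypothetical protein-protein interactions.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]

degree = degrees(edges)
hub, partners = degree.most_common(1)[0]
print(dict(degree))
print(f"most connected protein: {hub} ({partners} partners)")
```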

32 What of all this will we learn in the course? The course will concentrate on the bioinformatics tools and databases which are used to: annotate genes; compare genes and genomes; infer the function of genes and proteins; analyze the interactions between genes and proteins; etc.

33 Biological Databases: The different types of data are collected in databases: sequence databases, structural databases, databases of experimental results. All databases are connected.

34 Sequence databases: gene databases, genome databases, disease-related mutation databases, …

35 Genome Browsers: an easy “walk” through the genome (UCSC Genome Browser)

36 Disease-related databases

37 Sickle Cell Anemia: Caused by a single-nucleotide substitution (an A swapped for a T), so that valine is incorporated instead of glutamic acid in the beta chain of hemoglobin. Image source:

38 Healthy Individual
>gi| |ref|NM_ | Homo sapiens hemoglobin, beta (HBB), mRNA
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGA
GG A GAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGC
AGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATG
CTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGC
TCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGAT
CCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCA
CCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCA
CTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACT
GGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGC
>gi| |ref|NP_ | beta globin [Homo sapiens]
MVHLTP E EKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG
AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVAN
ALAHKYH

39 Diseased Individual
>gi| |ref|NM_ | Homo sapiens hemoglobin, beta (HBB), mRNA
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGA
GG T GAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGC
AGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATG
CTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGC
TCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGAT
CCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCA
CCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCA
CTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACT
GGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGC
>gi| |ref|NP_ | beta globin [Homo sapiens]
MVHLTP V EKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG
AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVAN
ALAHKYH
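The difference between the healthy and diseased beta globin records can also be located programmatically. The sketch below compares the first 16 residues of the two protein sequences shown on these slides and reports every position where they differ. The function name `point_differences` is hypothetical, and positions are counted from 1 with the initial methionine included.

```python
def point_differences(reference, variant):
    """List (position, reference_residue, variant_residue) for every
    position at which two equally long sequences differ (1-based)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (position, ref, var)
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# First 16 residues of beta globin from the two slides.
healthy = "MVHLTPEEKSAVTALW"
diseased = "MVHLTPVEKSAVTALW"

for position, ref, var in point_differences(healthy, diseased):
    print(f"position {position}: {ref} -> {var}")
```

The same function works on the mRNA records, since a substitution leaves the sequence length unchanged; insertions or deletions would require an alignment first.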

40 Structure Databases: 3-dimensional structures of proteins, nucleic acids, molecular complexes, etc. 3-D data is available thanks to techniques such as NMR and X-ray crystallography.

41

42 Databases of Experimental Results: experimental microarray images (gene expression data); proteomic data (protein expression data); metabolic pathways, protein-protein interaction data, regulatory networks, etc.

43 Literature Databases: PubMed, a service of the National Library of Medicine

44 Putting it all Together: Each database contains specific information. Like other biological systems, these databases are interrelated.

45 Databases by data type:
GENOMIC DATA: GenBank, DDBJ, EMBL
ASSEMBLED GENOMES: GoldenPath, WormBase, TIGR
PROTEIN: PIR, SWISS-PROT
STRUCTURE: PDB, MMDB, SCOP
LITERATURE: PubMed
PATHWAY: KEGG, COG
DISEASE: LocusLink, OMIM, OMIA
GENES: RefSeq, AllGenes, GDB
SNPs: dbSNP
ESTs: dbEST, UniGene
MOTIFS: BLOCKS, Pfam, Prosite
GENE EXPRESSION: Stanford MGDB, NetAffx, ArrayExpress