Introduction to Bioinformatics 236523/234525. Lecturer: Prof. Yael Mandel-Gutfreund. Teaching Assistants: Shai Ben-Elazar, Idit Kosti.

1 Introduction to Bioinformatics 236523/234525. Lecturer: Prof. Yael Mandel-Gutfreund. Teaching Assistants: Shai Ben-Elazar, Idit Kosti. Course web site: http://webcourse.cs.technion.ac.il/236523

2 What is Bioinformatics?

3 Course Objectives: to introduce the bioinformatics discipline; to familiarize students with the major biological questions that can be addressed with bioinformatics tools; to introduce the major tools used for sequence and structure analysis and explain in general how they work (including their limitations).

4 Course Structure and Requirements
1. Class structure: a 2-hour lecture and a 1-hour tutorial.
2. Homework: assignments will be given every second week and done in pairs. All 5 of the 5 homework assignments must be submitted.
3. Final project: conducted in pairs and presented as a poster (poster day: 14.3).

5 Grading: 20% homework assignments, 80% final project

6 Literature list: Gibas, C., Jambeck, P. Developing Bioinformatics Computer Skills. O'Reilly, 2001. Lesk, A. M. Introduction to Bioinformatics. Oxford University Press, 2002. Mount, D. W. Bioinformatics: Sequence and Genome Analysis. 2nd ed., Cold Spring Harbor Laboratory Press, 2004. Advanced reading: Jones, N. C., Pevzner, P. A. An Introduction to Bioinformatics Algorithms. MIT Press, 2004.

7 What is Bioinformatics?

8 What is Bioinformatics? “The field of science in which biology, computer science, and information technology merge to form a single discipline.” Ultimate goal: to enable the discovery of new biological insights and to create a global perspective from which unifying principles in biology can be discerned.

9 Central Paradigm in Molecular Biology: Gene (DNA) → mRNA → Protein. 21st century: Genome → Transcriptome → Proteome

10 From DNA to Genome [Timeline, 1955 to 1985: Watson and Crick DNA model]

11 [Timeline, 1990 to 2000: first genome (Haemophilus influenzae), yeast genome, first human genome draft]

12 Complete Genomes
             2010   2005
Total        1379    294
Eukaryotes    133     39
Bacteria     1152    235
Archaea        94     23

13 1,000 Genomes Project: Expanding the Map of Human Genetics. Researchers hope the effort will speed up the discovery of many diseases' genetic roots.

14 25000 genomes… What’s next? The “post-genomics” era: Annotation, Comparative genomics, Functional genomics, Systems Biology. Main goal: to understand the living cell.

15 From… 25000 genomes. To… understanding living cells.

16 Annotation
CCTGACAAATTCGACGTGCGGCATTGCATGCAGACGTGCATG
CGTGCAAATAATCAATGTGGACTTTTCTGCGATTATGGAAGAA
CTTTGTTACGCGTTTTTGTCATGGCTTTGGTCCCGCTTTGTTC
AGAATGCTTTTAATAAGCGGGGTTACCGGTTTGGTTAGCGAGA
AGAGCCAGTAAAAGACGCAGTGACGGAGATGTCTGATG CAA TAT GGA CAA TTG GTT TCT TCT CTG AAT.................... TGAAAAACGTA

17 Annotation: identify the genes within a given DNA sequence; identify the sites that regulate each gene; predict the function.

18 How do we identify a gene in a genome? A gene is characterized by several features (promoter, ORF, …); some are easier and some are harder to detect.

19 Features of a gene, marked on the sequence:
CCTGACAAATTCGACGTGCGGCATTGCATGCAGACGTGCATG
CGTGCAAATAATCAATGTGGACTTTTCTGCGATTATGGAAGAA
CTTTGTTACGCGTTTTTGTCATGGCTTTGGTCCCGCTTTGTTC
AGAATGCTTTTAATAAGCGGGGTTACCGGTTTGGTTAGCGAGA
AGAGCCAGTAAAAGACGCAGTGACGGAGATGTCTGATG CAA TAT GGA CAA TTG GTT TCT TCT CTG AAT............................................... TGA AAAACGTA
Labels: TF binding site; promoter; Transcription Start Site; Ribosome Binding Site; ORF = Open Reading Frame; CDS = Coding Sequence
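As a rough illustration of the ORF feature labelled on this slide, the sketch below scans the forward strand for an ATG start codon followed, in the same reading frame, by a stop codon. It is a toy under stated assumptions: the function name `find_orfs` and the `min_codons` threshold are hypothetical, and the reverse strand, promoters, and splicing are all ignored. Real gene finders combine many more signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each forward-strand ORF.

    An ORF here means ATG ... stop codon, all in one reading frame.
    start and end are 0-based, end is exclusive, and the stop codon
    is included in the returned sequence.
    """
    dna = dna.upper().replace(" ", "")
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon until a stop codon or the end.
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        # Count codons from ATG up to (not including) the stop.
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, dna[i:j + 3]))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs


# Made-up toy sequence (not from the slide): one ORF starting at index 2.
toy = "CCATGAAATTTGGGTAACC"
print(find_orfs(toy))
```

Even on a toy input, several overlapping candidates in different frames are common, which hints at why gene hunting gets harder as genomes grow.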

20 Using bioinformatics approaches for gene hunting: relatively easy in simple organisms (e.g. bacteria), VERY HARD in higher organisms (e.g. humans)

21 Comparative genomics

22 How human are chimps? A comparison of the full drafts of the human and chimp genomes revealed that they differ by only 1.23%. Perhaps not surprising!

23 So where are we different?
Aligned:
Human ATAGCGGGGGGATGCGGGCCCTATACCC
Chimp ATAGGGG--GGATGCGGGCCCTATACCC
Mouse ATAGCG---GGATGCGGCGC-TATACC-A
Unaligned:
Human ATAGCGGGGGGATGCGGGCCCTATACCC
Chimp ATAGGGGGGATGCGGGCCCTATACCC
Mouse ATAGCGGGATGCGGCGCTATACCA
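Once sequences are aligned (the rows containing `-` gap characters on this slide), the differences can be counted column by column. The sketch below is one simple way to do that for the Human and Chimp rows. The function names are hypothetical, and counting gap columns in the denominator is an assumption: definitions of percent identity vary between tools.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """Count matching, mismatching, and gapped columns of two aligned rows."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = mismatches = gaps = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            gaps += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps


def percent_identity(seq_a, seq_b):
    """Matches as a percentage of all alignment columns (gaps included)."""
    matches, mismatches, gaps = compare_aligned(seq_a, seq_b)
    columns = matches + mismatches + gaps
    if columns == 0:
        raise ValueError("empty alignment")
    return 100.0 * matches / columns


# Aligned rows copied from the slide.
human = "ATAGCGGGGGGATGCGGGCCCTATACCC"
chimp = "ATAGGGG--GGATGCGGGCCCTATACCC"

print(compare_aligned(human, chimp))   # (matches, mismatches, gaps)
print(round(percent_identity(human, chimp), 1))
```

The Mouse row as transcribed is one column longer than the other two, so the function would reject it with a `ValueError` rather than compare misaligned columns.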

24 And where are we similar? VERY SIMILAR (conserved between many organisms) … VERY DIFFERENT

25 Functional genomics

26 TO BE IS NOT ENOUGH: at any time point, a gene may or may not be functional

27 From the gene expression pattern we can learn: What does the gene do? When is it needed? What other genes or proteins interact with it? … What's wrong?

28 Systems Biology

29 Biological networks (Jeong et al., Nature 411, 41-42 (2001))

30 What can we learn from a network?

31 What can we learn from biological networks? Is the protein essential for the organism? Is it a good drug target? What can we learn about this protein?
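One simple starting point for such questions is the number of interaction partners each protein has (its degree). The Jeong et al. paper cited earlier in the deck reported that highly connected proteins in the yeast interaction network were more likely to be essential. The sketch below only counts degrees in a made-up toy network. The protein names `P1` to `P5` and both function names are hypothetical, and degree alone is an assumption-laden proxy, not a test of essentiality.

```python
from collections import defaultdict


def node_degrees(edges):
    """Map each protein to its number of distinct interaction partners."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions in this sketch
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


def hubs(edges, top=1):
    """Return the `top` most connected proteins (ties broken by name)."""
    degrees = node_degrees(edges)
    return sorted(degrees, key=lambda n: (-degrees[n], n))[:top]


# Made-up toy interaction list, for illustration only.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
             ("P2", "P3"), ("P4", "P5")]

print(node_degrees(toy_edges))
print(hubs(toy_edges))
```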

32 What of all this will we learn in the course? The course will concentrate on the bioinformatics tools and databases that are used to: annotate genes; compare genes and genomes; infer the function of genes and proteins; analyze the interactions between genes and proteins; etc.

33 Biological Databases
The different types of data are collected in databases:
–Sequence databases
–Structural databases
–Databases of experimental results
All databases are connected.

34 Sequence databases: gene databases, genome databases, disease-related mutation databases, …

35 Genome Browsers: an easy “walk” through the genome. UCSC Genome Browser: http://genome.ucsc.edu/

36 Disease-related database

37 Sickle Cell Anemia: caused by a single-nucleotide substitution (A to T) in the beta-globin gene, so that valine is incorporated into hemoglobin instead of glutamic acid. Image source: http://www.cc.nih.gov/ccc/ccnews/nov99/

38 Healthy Individual
>gi|28302128|ref|NM_000518.4| Homo sapiens hemoglobin, beta (HBB), mRNA
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTG A
GGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGC
AGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATG
CTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGC
TCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGAT
CCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCA
CCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCA
CTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACT
GGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGC
>gi|4504349|ref|NP_000509.1| beta globin [Homo sapiens]
MVHLTP E EKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG
AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVAN
ALAHKYH

39 Diseased Individual
>gi|28302128|ref|NM_000518.4| Homo sapiens hemoglobin, beta (HBB), mRNA
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTG T
GGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGC
AGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATG
CTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGC
TCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGAT
CCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCA
CCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCA
CTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACT
GGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGC
>gi|4504349|ref|NP_000509.1| beta globin [Homo sapiens]
MVHLTP V EKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG
AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVAN
ALAHKYH
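The link between the single-base change and the single amino acid change can be checked by translating the start of the coding sequence. The sketch below takes the first ten codons of the HBB coding region (starting at the ATG in the mRNA on these slides), places the A-to-T substitution in the codon that produces the E-to-V change shown in the protein sequences, and translates both versions with the standard genetic code. The names `translate` and `first_difference` are hypothetical helpers, and only unambiguous A/C/G/T input is handled.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    cds = cds.upper().replace(" ", "")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def first_difference(a, b):
    """Return the 0-based index of the first differing position, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None


# First ten codons of the HBB coding sequence; the diseased copy
# differs by one base (GAG -> GTG in the seventh codon).
healthy = "ATGGTGCATCTGACTCCTGAGGAGAAGTCT"
diseased = "ATGGTGCATCTGACTCCTGTGGAGAAGTCT"

print(translate(healthy))    # MVHLTPEEKS
print(translate(diseased))   # MVHLTPVEKS
print(first_difference(healthy, diseased))
```

The two translations differ only at the seventh residue (counting the initial methionine), matching the E and V set apart in the protein sequences on the slides.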

40 Structure Databases: 3-dimensional structures of proteins, nucleic acids, molecular complexes, etc. 3D data is available thanks to techniques such as NMR and X-ray crystallography.

42 Databases of Experimental Results: experimental microarray images (gene expression data); proteomic data (protein expression data); metabolic pathways, protein-protein interaction data, regulatory networks, etc.

43 Literature Databases: PubMed, a service of the National Library of Medicine, http://www.ncbi.nlm.nih.gov/pubmed/

44 Putting it all Together: each database contains specific information. Like other biological systems, these databases are interrelated.

45 Databases by category
GENOMIC DATA: GenBank, DDBJ, EMBL
ASSEMBLED GENOMES: GoldenPath, WormBase, TIGR
PROTEIN: PIR, SWISS-PROT
STRUCTURE: PDB, MMDB, SCOP
LITERATURE: PubMed
PATHWAY: KEGG, COG
DISEASE: LocusLink, OMIM, OMIA
GENES: RefSeq, AllGenes, GDB
SNPs: dbSNP
ESTs: dbEST, UniGene
MOTIFS: BLOCKS, Pfam, Prosite
GENE EXPRESSION: Stanford MGDB, NetAffx, ArrayExpress
