1
Bioinformatics
Lecture 1: Introduction
Dr. Aladdin Hamwieh, Khalid Al-shamaa, Abdulqader Jighly
2010-2011
Aleppo University, Faculty of Technical Engineering, Department of Biotechnology
2
Main Lines
Definition
Bioinformatics areas
Bioinformatics data
– Data types
– Applications for these data
Next generation sequencing
Bioinformatics algorithms
Joint international programming initiatives
3
Definition
Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline.
Bioinformatics is the science of managing and analyzing biological data using advanced computing techniques.
Bioinformatics applies principles of information science to make the vast, diverse, and complex life sciences data more understandable and useful.
4
Definition
There are two extremes in bioinformatics work:
– Tool users (biologists): know how to press the buttons and know the biology, but have no clue what happens inside the program
– Tool shapers (informaticians): know the algorithms and how the tool works, but have no clue about the biology
5
Bioinformatics areas
Molecular sequence analysis
1. Sequence alignment
2. Sequence database searching
3. Motif discovery
4. Gene and promoter finding
5. Reconstruction of evolutionary relationships
6. Genome assembly and comparison
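Sequence alignment, the first item above, is commonly computed with dynamic programming. The following is a minimal sketch of global alignment scoring in the Needleman-Wunsch style. The function name and the scoring values (+1 match, -1 mismatch, -1 per gap) are assumptions chosen for illustration, not taken from the lecture.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of sequences a and b."""
    # prev is the previous row of the dynamic programming table:
    # prev[j] is the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap in b
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


score = global_alignment_score("ACGT", "AGT")  # one gap, three matches
```

Only two rows of the table are kept in memory, which is enough when just the score is needed; recovering the alignment itself would require the full table and a traceback.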
6
Bioinformatics areas
Molecular structural analysis
1. Protein structure analysis
2. Nucleic acid structure analysis
3. Comparison
4. Classification
5. Prediction
7
Bioinformatics areas
Molecular functional analysis
1. Gene expression profiling
2. Protein–protein interaction prediction
3. Protein sub-cellular localization prediction
4. Metabolic pathway reconstruction
5. Simulation
9
Bioinformatics data
Different data types are commonly used in bioinformatics.
The same data may be used in different areas.
10
Data types
DNA sequences
RNA sequences
Expression (microarray) profile
Proteome (X-ray, NMR) profile
Metabolome profile
Haplotype profile
Phenotype profile
11
1- DNA Sequences
Simple sequence analysis
– Database searching
– Pairwise and multiple analysis
Regulatory regions
Gene finding
Whole genome annotation
Comparative genomics
13
2- RNAs
Splice variants
Tissue specific expression
2D structure
3D structure
Single gene analysis
Microarray
14
2D and 3D structure of tRNA
15
2D and 3D structure of rRNA
16
Microarray
20,000 to 60,000 short DNA probes of specified sequences are tethered in an ordered array on a small slide. Each probe corresponds to a particular short section of a gene.
17
Microarray
DNA microarrays measure RNA abundance with either 1 channel (one color) or 2 channels (two colors).
Stanford microarrays (two channels) use competitive hybridization to measure the relative expression under a given condition (labeled with the red fluorescent dye Cy5) compared to its control (labeled with the green fluorescent dye Cy3).
Affymetrix GeneChip has 1 channel and uses either the red fluorescent dye Cy5 or the green fluorescent dye Cy3.
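For a two-channel array, relative expression is often summarized as the log2 ratio of the Cy5 (condition) intensity to the Cy3 (control) intensity. A minimal sketch, assuming background-corrected, positive intensities; the function name and the spot values are hypothetical:

```python
import math


def log2_ratio(cy5, cy3):
    """Log2 of the red (Cy5, condition) over green (Cy3, control) intensity."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


# Hypothetical spot intensities as (Cy5, Cy3) pairs.
spots = {
    "geneA": (2000.0, 500.0),
    "geneB": (300.0, 1200.0),
    "geneC": (800.0, 800.0),
}
ratios = {name: log2_ratio(red, green) for name, (red, green) in spots.items()}
```

A positive value means higher expression under the condition than in the control, a negative value means lower expression, and zero means no change.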
19
3- Proteins
Protein sequence analysis
– Database searching
– Pairwise and multiple analysis
2D structure
3D structure
Classification of protein families
Protein arrays
20
3D structure
21
Animation
22
4- Metabolome and molecular biology
Metabolic pathways
Regulatory networks
Helps to understand systems biology
24
5- Haplotype
Molecular markers
– RFLP
– RAPD
– SSR
– ISSR
– AFLP
– DArT
– SNP
– ….
25
SNP
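A SNP (single nucleotide polymorphism) is a single-base difference at the same position in two sequences. A minimal sketch that lists such positions, assuming the two sequences are already aligned to the same length; the function name and the example sequences are hypothetical:

```python
def find_snps(ref, alt):
    """Return (position, ref_base, alt_base) for each single-base difference.

    Positions are 1-based. Columns containing a gap character ('-') are
    skipped, since they indicate insertions or deletions, not substitutions.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for pos, (r, a) in enumerate(zip(ref.upper(), alt.upper()), start=1):
        if r != a and r != "-" and a != "-":
            snps.append((pos, r, a))
    return snps


snps = find_snps("ACGTTGCA", "ACCTTGAA")
# snps == [(3, 'G', 'C'), (7, 'C', 'A')]
```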
26
6- Phenotype
Morphological data
Physiological data
Stress tolerance
Pathogenic infections
Disease resistance
Cancer types
…..
27
Haplotype & Phenotype
28
Next Generation Sequencing
Sequencing machine | ABI 3730 | Roche GSFLX | Illumina Solexa | AB SOLiD | Helicos | SMRT
Launched | 2000 | 2004 | 2006 | 2007 | 2008 | Target release 2010
Read length | 800-1100 | 250-400 | 35-70 | 25-35 | 28 | 964
Reads/run | 96 | 400K | 120M | 170M | 85M | NA
Throughput per run | 0.1 MB | 100 MB | | 6 GB | 2 GB | NA
Cost/Mb | High cost | $84.39 | $5.97 k | $5.81 k | | NA
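The throughput figures on this slide are roughly the read length multiplied by the number of reads per run. A minimal arithmetic sketch using the Roche GSFLX and ABI 3730 values from the slide; taking the lower bound of each read-length range is an assumption made here for illustration:

```python
def throughput_bases(read_length, reads_per_run):
    """Approximate number of bases sequenced in one run."""
    return read_length * reads_per_run


roche = throughput_bases(250, 400_000)  # 100,000,000 bases, about 100 Mb
abi = throughput_bases(800, 96)         # 76,800 bases, about 0.1 Mb
```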
29
Short reads assembly problems
32
Algorithms in bioinformatics
String algorithms
Dynamic programming
Machine learning (NN, k-NN, SVM, GA,..)
Markov chain models
Hidden Markov models
Markov Chain Monte Carlo (MCMC) algorithms
Stochastic context free grammars
EM algorithms
Gibbs sampling
Clustering
Tree algorithms (suffix trees)
Graph algorithms
Text analysis
Hybrid/combinatorial techniques
….
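As one example from this list, a first-order Markov chain over nucleotides models the probability of each base given the base before it. A minimal sketch that estimates these transition probabilities from a single sequence by counting adjacent pairs; the function name and the example sequence are hypothetical, and no pseudocounts are applied:

```python
from collections import Counter, defaultdict


def transition_probabilities(seq):
    """Estimate P(next base | current base) from adjacent base pairs in seq."""
    counts = defaultdict(Counter)
    for current, nxt in zip(seq, seq[1:]):
        counts[current][nxt] += 1
    probs = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[current] = {base: n / total for base, n in next_counts.items()}
    return probs


probs = transition_probabilities("ACGCGT")
# Pairs seen: A->C, C->G, G->C, C->G, G->T
```

Bases that never appear as the first element of a pair (here the final T) have no entry, so a practical model would usually add pseudocounts to avoid zero or missing probabilities.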
33
Joint international programming initiatives
Bioperl: http://www.bioperl.org/wiki/Main_Page
Biopython: http://www.biopython.org/
BioTcl: http://wiki.tcl.tk/12367
BioJava: www.biojava.org/wiki/Main_Page
34
Thank You