Published by Kerry Gilbert. Modified over 8 years ago.
1
Introduction 1.0
Welcome to the Canadian Bioinformatics Workshops
Bioinformatics, 7th Ed
Vancouver BC, Feb 16 – 28, 2004
Fiona Brinkman
2
Introduction
Instructors, schedule, other things...
Why does bioinformatics exist?
What is bioinformatics?
What are the big challenges in bioinformatics?
–Research
–Discipline differences between Bio and CS
3
David Wishart, Francois Major, Boris Steipe, Fiona Brinkman, Francis Ouellette, Xuhua Xia, David Ng, Ian Donaldson
4
Joanne, Jenn, Will, Sohrab, Graeme, Stefanie, Mike, Stephanie, Erin, Stephanie
5
Today
09:00 Introduction
10:00 Break
10:15 Biology Review
11:15 UNIX Tutorial
12:30 Lunch (on your own)
13:45 Biological Databases
15:15 Break
15:45 Public Lecture – Rob Holt
16:30 Questions
17:15 Reception
6
Administrative stuff
Accounts on Linux machines
–login: guest
–password: cbw2004
Security, badges, fire exits, food
7
Assignments and Marking Scheme
Day assigned            Description of assignment   Due date                   Marks (%)
Day 4, Thurs. Feb. 19   Lab 4.1 - BLAST             9:00 a.m. Fri. Feb. 20     15
Day 4, Thurs. Feb. 19   Integrated Assignment       11:00 a.m. Sat. Feb. 28    35
Day 5, Fri. Feb. 20     Lab 5.4/6.2 - Phylogeny     1:00 p.m. Sat. Feb. 21     15
Day 7, Mon. Feb. 23     Lab 7.2 - Ensembl Web       9:00 a.m. Tues. Feb. 24    15
Day 8, Tues. Feb. 24    Lab 8.2 - Perl              9:00 a.m. Fri. Feb. 27     20
8
Canadian Bioinformatics Workshops
Bioinformatics, Genomics, Proteomics, Developing the Tools
You are here
www.bioinformatics.ca
9
CBW Sponsors
http://bioinformatics.ca/sponsors.php
UOttawa
10
Questions?
11
Introduction - Objectives
Why does bioinformatics exist?
What is bioinformatics?
What are the big challenges in bioinformatics?
–Research
–Discipline differences between Bio and CS
12
Why is there Bioinformatics?
Lots of new sequences being added
- Automated sequencers
- Genome Projects
- EST sequencing, microarray studies, proteomics
Patterns in datasets that can be analyzed using computers
Huge datasets
13
Need for informatics in biology: origins
Gramicidin S (Consden et al., 1947), partial insulin sequence (Sanger and Tuppy, 1951)
1961: tRNA fragments
Francis Crick, Sydney Brenner, and colleagues propose the existence of transfer RNA that uses a three-base code and mediates in the synthesis of proteins (Crick et al., 1961. General nature of the genetic code for proteins. Nature 192: 1227-1232. In Microbiology: A Centenary Perspective, edited by Wolfgang K. Joklik, ASM Press, 1999, p. 384)
First codon assignment: UUU/Phe (Nirenberg and Matthaei, 1961)
14
Need for informatics in biology: origins
The key to the whole field of nucleic acid-based identification of microorganisms…
…the introduction of molecular systematics using proteins and nucleic acids by the American Nobel laureate Linus Pauling.
Zuckerkandl, E., and L. Pauling. "Molecules as Documents of Evolutionary History." 1965. Journal of Theoretical Biology 8:357-366
Another landmark: Nucleic acid sequencing (Sanger and Coulson, 1975)
15
Need for informatics in biology: origins
First genomes sequenced:
–3.5 kb RNA bacteriophage MS2 (Fiers et al., 1976)
–5.4 kb bacteriophage φX174 (Sanger et al., 1977)
–First complete genome sequence of a free-living organism: 1.83 Mb Haemophilus influenzae KW20 (Fleischmann et al., 1995)
–First multicellular organism to be sequenced: C. elegans (C. elegans sequencing consortium, 1998)
Early databases: Dayhoff, 1972; Erdmann, 1978
Early programs: restriction enzyme sites, promoters, etc… circa 1978.
1978 – 1993: Nucleic Acids Research published supplemental information
16
GenBank doubles every 16 months (from the National Center for Biotechnology Information)
A shorter doubling time than Moore's law (computer power doubling every 20 months!)
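To see what the two doubling times on this slide imply, here is a minimal Python sketch. It assumes idealized exponential growth and takes the 16-month and 20-month figures from the slide at face value; the function and constant names are illustrative only.

```python
def growth_factor(months_elapsed: float, doubling_months: float) -> float:
    """Fold increase after months_elapsed for a quantity that doubles every doubling_months."""
    return 2.0 ** (months_elapsed / doubling_months)


GENBANK_DOUBLING = 16  # months, figure quoted on the slide
MOORE_DOUBLING = 20    # months, figure quoted on the slide

for years in (5, 10):
    months = 12 * years
    data = growth_factor(months, GENBANK_DOUBLING)
    compute = growth_factor(months, MOORE_DOUBLING)
    # The ratio shows how far data growth outpaces compute growth.
    print(f"{years} years: data x{data:.0f}, compute x{compute:.0f}, gap x{data / compute:.1f}")
```

Under these assumptions the data outgrows the hardware by a factor of 2^1.5 (about 2.8) every ten years, so the same analysis gets slower even on newer machines.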
17
Today: So many genomes…
As of Feb 6, 2004, how many…
published, complete genomes?
eukaryotic genome projects in progress?
prokaryote genome projects in progress?
Guess closest number without going over!
18
Today: The Human Genome Project
The genome sequence is complete - almost!
–approximately 3 billion base pairs.
19
The next step is obviously to locate all of the genes and regulatory regions, describe their functions, and identify how they differ between different groups (e.g. “disease” vs “healthy”)…
…bioinformatics plays a critical role
20
Implications for Biomedicine… and Bioinformatics
Physicians will use genetic information to diagnose and treat disease.
–Virtually all medical conditions (other than trauma) have a genetic component
–Individualize drugs – reduce side effects
–Single Nucleotide Polymorphisms (SNPs)
Faster drug development research
–More targets
–Faster clinical trials (selected trial populations)
Most biologists will analyze gene sequence information in their daily work
21
Bioinformatics will help with… DNA Sequencing
Automated sequencers: > 40,000 bp per day
500 bp reads must be assembled into complete sequences
-Detecting errors, especially insertions and deletions
Data flow management
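The assembly step above can be illustrated with a toy greedy overlap merger in Python. This is a sketch only, not how any particular assembler works: it assumes error-free reads from a single strand, ignores repeats and reverse complements, and all function names are made up for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest suffix/prefix overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, -1, -1)  # (overlap length, index of left read, index of right read)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Real reads contain insertions and deletions, which is why practical assemblers score approximate overlaps instead of requiring exact matches as this sketch does.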
22
Bioinformatics will help with… Similarity Searching
Sequence Databases
What is similar to my sequence?
Searching gets harder as the databases get bigger - and quality changes
Tools: BLAST and FASTA = time saving heuristics (approximate methods)
Statistics + informed judgement of the biologist
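To give a feel for why word-based heuristics save time, here is a small Python sketch of k-mer seeding: index every short word in the database once, then look up the words of the query instead of aligning against every sequence. It is a simplified illustration of the general idea, not the BLAST or FASTA algorithm; the sequence names and the choice of k are assumptions.

```python
from collections import defaultdict


def build_index(database: dict[str, str], k: int) -> dict[str, set[str]]:
    """Map every k-letter word to the names of the database sequences containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def seed_hits(query: str, index: dict[str, set[str]], k: int) -> dict[str, int]:
    """Count, per database sequence, how many query words it shares."""
    hits: dict[str, int] = defaultdict(int)
    for i in range(len(query) - k + 1):
        for name in index.get(query[i:i + k], ()):
            hits[name] += 1
    return dict(hits)


database = {"seqA": "ATGGCGTACCTGA", "seqB": "TTTTTTTTTTTT"}  # hypothetical entries
index = build_index(database, k=4)
print(seed_hits("GCGTACC", index, k=4))
```

Only sequences with seed hits would go on to a slower, more careful alignment, and it is at that stage that the statistics and the biologist's judgement come in.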
23
Bioinformatics will help with… Structure-Function Relationships
Can we predict the function of protein molecules from their sequence?
sequence > structure > function
Prediction of some simple 3-D structures (α-helix, β-sheet, membrane spanning, etc.)
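One classic, simple way to flag possible membrane-spanning regions is a sliding-window hydropathy average using the Kyte-Doolittle scale. The slide does not name a method, so this Python sketch is only one possible illustration; the 19-residue window and 1.6 cutoff are commonly quoted values treated here as assumptions, and the function names are made up.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Average hydropathy of each window of consecutive residues."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_windows(protein: str, window: int = 19, threshold: float = 1.6) -> list[int]:
    """Start positions (0-based) of windows whose average exceeds the threshold."""
    profile = hydropathy_profile(protein, window)
    return [i for i, avg in enumerate(profile) if avg > threshold]


toy = "D" * 10 + "L" * 19 + "K" * 10  # artificial sequence with a hydrophobic core
print(hydrophobic_windows(toy))
```

A hit from a rule like this is a candidate to examine, not a confirmed structure.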
24
Bioinformatics will help with… Phylogenetics
Can we define evolutionary relationships between organisms by comparing DNA sequences?
-What is the molecular clock?
-Lots of methods and software, what is the "correct" analysis?
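As a small worked example of comparing DNA sequences, the Python sketch below computes the proportion of differing sites between two aligned sequences and applies the Jukes-Cantor correction, d = -3/4 ln(1 - 4p/3). Jukes-Cantor is only one of many substitution models, which is the point of the "correct analysis" question. The sequences and the substitution rate in the clock calculation are hypothetical.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped sites to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance (expected substitutions per site)."""
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def divergence_time(distance: float, rate_per_site_per_year: float) -> float:
    """Strict molecular clock: both lineages accumulate changes, so t = d / (2 * rate)."""
    return distance / (2.0 * rate_per_site_per_year)


p = p_distance("ACGTACGTAC", "ACGTACGTTT")  # hypothetical aligned sequences
d = jukes_cantor(p)
print(p, d, divergence_time(d, rate_per_site_per_year=1e-9))  # rate is an assumed value
```

The corrected distance is always at least as large as the raw proportion because it accounts for multiple substitutions at the same site.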
25
Top 10 Future Challenges for Bioinformatics
Precise, predictive model of transcription initiation and termination: ability to predict where and when transcription will occur in a genome
Precise, predictive model of RNA splicing/alternative splicing: ability to predict the splicing pattern of any primary transcript in any tissue
Precise, quantitative models of signal transduction pathways: ability to predict cellular responses to external stimuli
Determining effective protein:DNA, protein:RNA and protein:protein recognition codes
Accurate ab initio protein structure prediction
Rational design of small molecule inhibitors of proteins
Mechanistic understanding of protein evolution: understanding exactly how new protein functions evolve
Mechanistic understanding of speciation: molecular details of how speciation occurs
Continued development of effective gene ontologies - systematic ways to describe the functions of any gene or protein
Education: development of appropriate bioinformatics curricula for secondary, undergraduate and graduate education
Chris Burge, Ewan Birney, Jim Fickett. Genome Technology, issue No. 17, January, 2002
26
What is Bioinformatics?
Think – Pair – Share!
27
The Biologist in the Age of Information
28
The job of the biologist is changing
As more biological information becomes available…
–The biologist will spend more time using computers
–The biologist will spend more time on data analysis
–Biology will become a more quantitative science (think how the periodic table and atomic theory affected chemistry)
29
The challenge: Putting it all together
The current state of the art requires the biologist to jump around from Web to mainframe to personal computer
The trend is for integration
Real Power: Being able to use and customize all resources
30
The Computer Scientist in the Age of Genomics
31
How much biology to understand?
Increasing sophistication required for computational biologists in terms of biological knowledge
What knowledge is important?
What about all those exceptions?
What problems are important?
32
What computational tools to understand?
Perl is still used extensively in bioinformatics
Open source is prevalent in bioinformatics (Linux, MySQL, bioperl)
Need to be knowledgeable about both the standard bioinformatics algorithms and common tools that are based on them
Appreciate the different databases and programs out there and what their benefits and limitations are – databases have widely varying quality
33
High quality bioinformatics research:
Excellent communication between biologists and computer scientists is key
34
The computer scientist and biologist compared
Computer scientist: Logic; Problem-solving; Process-oriented; Algorithmic; Optimizing
Biologist: Knowledge gathering; Experimentally-focused; Exceptions are as common as rules; Describe work as a story; Develop conclusions and models
35
Comp Sci vs Bio
The result…
–see the world differently
–ask different questions
–come to problems with different assumptions
–pick up on different details
–use different metaphors to organize knowledge
–have different sets of analytical tools at their disposal
–can even interact with people differently
Coming together
–Communicate constantly!
–Gain a better understanding of different ways of thinking
–Try communicating in different ways
–Remember there are others… Statisticians, mathematicians, engineers, physicists, chemists, physiologists…
36
Thoughts for the day
What is bioinformatics?
Why does bioinformatics exist?
How can I use bioinformatics more effectively in my career?
Questions?