1 Delivering Bioinformatics Training: Bridging the Gaps Between Computer Science and Biomedicine Christopher Dubay Ph.D. James M. Brundege Ph.D. William Hersh M.D. Kent Spackman M.D., Ph.D. Division of Medical Informatics & Outcomes Research, OHSU
2 Overview A View of Bioinformatics –Background & Significance –An Integrative Information Science The Gap: –Between Computer Science & Biomedicine OHSU Bioinformatics Education Questions & Next Steps
3 What is bioinformatics? Two perspectives –A set of tools & techniques to support biological science »Equivalent in scope to new assay methodology or new investigative techniques –A science that supports the systematic development and analysis of such tools »Investigation of the set of scientific principles forming the foundation for successful bioinformatics applications
4 Background of Students Computer & information science –Need to understand existing tools, scientific approach, and needs of biological research Biomedicine –Need to learn a set of tools and skills –May also need to understand the deeper scientific issues
5 The Gap I Biological scientists and investigators can’t build their own tools Computer scientists don’t know what tools to build
6 The Gap II Putting a biological investigator and a system implementer together in a room doesn’t solve the problem –Barriers include: »Language »Methodology »Conceptualization
7 The Gap III Computer science is a “science of the artificial” –Mainly concerned with human artifacts, i.e., creations limited mainly by conceptualization and imagination Biomedicine is a science of discovery –Mainly concerned with how organisms function; the limiting factors are often the limits of investigative methods and tools
8 Bioinformatics defined in terms of tools: General Tools –WP, Spreadsheets, Robotics, Instrumentation Communications – , Networks, Internet & World Wide Web Databases –Storage, Organization Analysis Tools –Examination & Discovery Informatics Has Changed How Science is Done
9 Bioinformatics Significance
10 Bioinformatics defined in terms of a skill set Know-how with Practical Tools Cross Cultural Exchange –Language of Biomedical Research –Language of Informatics Solving Scientific Problems using Computers –Database Interoperation –Process Modeling & Data Visualization Bioinformatics is an Information Science
11 Informatics Skills: System Design & Implementation System Algorithms Problem Determination User Requirements DESIGN System Configuration & Maintenance DEVELOP Usage & Outcomes EVALUATE User Interface Data Distribution Data Production & Data Gathering Results & Interpretation
12 Where Should We Put the Emphasis? From the perspective of students entering the field and wondering about their future careers: Will bioinformatics turn out to be mainly a means to an end (a tool set)? Or will it turn out to be a viable science in its own right?
13 Bioinformatics as science: Ontologies Ontology: a “formalization of a conceptualization” Mathematics of ontologies: description logics Getting useful “discovery” answers from large databases probably will depend on a concept model i.e. an ontology A potential goal for students: Understanding how to build and use an ontology for bioinformatics applications
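As an illustration of why a concept model helps with "discovery" queries, here is a minimal sketch of an is-a hierarchy used to answer a question that plain string matching would miss. The concepts, record names, and function names are hypothetical, and this toy transitive-closure lookup is far simpler than a description-logic reasoner.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology: each concept maps to its direct parents (is-a links).
IS_A: Dict[str, Set[str]] = {
    "kinase": {"enzyme"},
    "enzyme": {"protein"},
    "protein": {"biomolecule"},
    "mRNA": {"nucleic acid"},
    "nucleic acid": {"biomolecule"},
    "biomolecule": set(),
}

# Hypothetical database records, each annotated with one concept.
RECORDS: Dict[str, str] = {"rec1": "kinase", "rec2": "mRNA", "rec3": "enzyme"}


def ancestors(concept: str) -> Set[str]:
    """Return every concept that subsumes `concept` (transitive closure of is-a)."""
    seen: Set[str] = set()
    stack = list(IS_A.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(IS_A.get(parent, ()))
    return seen


def is_a(concept: str, candidate: str) -> bool:
    """True if `concept` is `candidate` or a descendant of it."""
    return concept == candidate or candidate in ancestors(concept)


def find(query_concept: str) -> List[str]:
    """Return the sorted ids of records whose annotation falls under `query_concept`."""
    return sorted(rid for rid, c in RECORDS.items() if is_a(c, query_concept))
```

Calling `find("protein")` returns the kinase and enzyme records even though neither is labeled "protein"; the hierarchy supplies the missing link.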
14 Bridging the Gap: Education
15 A little history Post-doctoral fellowship program at OHSU began in 1992 Both MD and PhD post-docs One PhD (genetics) informatics post-doc stayed on as faculty responsible for bioinformatics NLM funded bioinformatics curriculum development as a supplement
16 Current Educational Activities OHSU Three Term Curriculum in Bioinformatics –OHSU, PSU, OGI –Distance & Local Work with Biomedical Research Groups Research Information Systems Steering Committee –Bioinformatics Sub-Committee
17 OHSU Curriculum: Fall: –MINF 571: Computers in Bioscience –MINF 572: Bioinformatics Laboratory Winter: –MINF 573: Topics in Bioinformatics Spring: –MINF 575: Bioinformatics Systems Development Every student does a project every term
18 MINF 571: Computers in Bioscience. Course Objectives: Survey Course in Bioinformatics Understand basic computing and networking concepts. Introduce basic concepts of molecular biology and genetics. Focus on: biomolecular databases to retrieve and publish, genetic analysis, gene expression analysis, proteomics (structure / function), systems biology.
19 Course Description This course surveys the applications of informatics to biological problems, specifically those problems encountered in studies of genomes employing molecular biology and genetic techniques. The course follows a paradigm of how bioinformatics applications have been developed to aid in genome research in each step of biologic expression: from the DNA template, through transcription, translation, protein structure and function, as well as in analyzing meiotic events, and genetic epidemiology. The course is designed for both users and developers of bioinformatics applications, and thus addresses both the algorithms underlying the applications and their implementation. To bring biologists and computer scientists to a common starting point, introductory lectures are provided during the first two weeks of class.
20 MINF 572: Bioinformatics Laboratory. Course Objectives: Internet Navigation. Introduction to the UNIX Operating System. Learn to use the GCG Program Suite: –UNIX Interface –SeqWeb Interface Use tools to visualize datasets (e.g. expression analysis) and biomolecules.
21 MINF 573: Topics in Bioinformatics. Course Objectives: Drill down into topics of choice from MINF 571 Focus on Databases Topics are presented in terms of their historical development, current literature, and future directions Journal Club for bioinformatics Lectures from those using the tools
22 MINF 573: Topics in Bioinformatics. Course Objectives: Examples of topic areas include: –DNA micro-array technology –bio-sequence analysis –functional genomics –data warehousing/data mining –genetic linkage analysis –Web and Internet based software development, etc.
23 MINF 575: Bioinformatics Systems Dev. Course Objectives: Learn bioinformatics software development best practices and methodologies Emphasis on functionality prevalent in bioinformatics tools: –database interoperability, client/server and distributed computing designs, visual user interfaces, etc. Paradigm for the course is that of a software development project
24 Course Participant Survey To gauge the educational audience –Skills –Interests –Directions Tailor course emphasis to participants Create a database of Skills and Interests –Useful for Course Projects
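A database of skills and interests like the one described on this slide could be sketched as below. This is only an illustration using Python's built-in `sqlite3` module; the table layout, function names, and sample rows are made up and do not describe the actual OHSU survey.

```python
import sqlite3
from typing import Iterable, List, Tuple

Response = Tuple[str, str, str]  # (participant, kind, topic)

# Hypothetical sample responses, for illustration only.
SAMPLE: List[Response] = [
    ("student_a", "skill", "SQL"),
    ("student_a", "interest", "gene expression"),
    ("student_b", "skill", "molecular biology"),
    ("student_b", "interest", "gene expression"),
]


def build_survey_db(responses: Iterable[Response]) -> sqlite3.Connection:
    """Create an in-memory database holding one row per survey answer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE response ("
        " participant TEXT NOT NULL,"
        " kind TEXT NOT NULL CHECK (kind IN ('skill', 'interest')),"
        " topic TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO response VALUES (?, ?, ?)", responses)
    conn.commit()
    return conn


def participants_with(conn: sqlite3.Connection, kind: str, topic: str) -> List[str]:
    """List participants who reported the given skill or interest, sorted by name."""
    rows = conn.execute(
        "SELECT DISTINCT participant FROM response"
        " WHERE kind = ? AND topic = ? ORDER BY participant",
        (kind, topic),
    )
    return [row[0] for row in rows]
```

For a course project on gene expression, `participants_with(conn, "interest", "gene expression")` would list interested students, and a second query on `"skill"` could pair them with someone who has the complementary background.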
29 Bioinformatics Curriculum Elements Survey Four Schools: –OHSU (3 Courses) –UCSC (3 Courses) –UCLA (3 Courses) –Stanford (4 Courses) Created Matrix of Topics Created a Taxonomy for Topics Applications
30 Bioinformatics Curriculum Elements Data Storage and Retrieval Information Retrieval Molecular Biology Genomics Sequence Analysis Linkage Analysis Gene Expression Proteomics
31 Bioinformatics Curriculum Elements Software Engineering Laboratory Information Mgmt Sys Data acquisition software Data analysis software Statistical software Databases Internet
32 Bioinformatics Curriculum Elements Algorithms Applications Gene identification Sequence Alignment Molecular Models Techniques Dynamic programming Neural networks Hidden Markov Models Bayesian statistics
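To make the pairing of "Sequence Alignment" with "Dynamic programming" concrete, here is a minimal sketch of global alignment scoring in the style of Needleman-Wunsch. The scoring values (match +1, mismatch -1, gap -1) and the function name are illustrative assumptions, not taken from the curriculum.

```python
from typing import List


def global_alignment_score(
    a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1
) -> int:
    """Return the best global alignment score of sequences `a` and `b`.

    score[i][j] holds the best score for aligning a[:i] with b[:j].
    """
    rows, cols = len(a) + 1, len(b) + 1
    score: List[List[int]] = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Either align a[i-1] with b[j-1], or put a gap in one sequence.
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = score[i - 1][j] + gap
            gap_in_a = score[i][j - 1] + gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]
```

Each cell depends only on three neighbors that were computed earlier, which is what makes the problem a natural fit for dynamic programming; recovering the alignment itself would additionally require a traceback through the table.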
33 Bioinformatics Curriculum Elements Biological Models Molecular Models Structure-Function Biological Pathways Structural Models Anatomy Visible Human Project Human Brain Project Evolutionary Models Phylogeny
34 Bioinformatics Curriculum Elements Data Acquisition/Analysis Laboratory Information Mgmt Sys Automated data acquisition (high-throughput) DNA microarray Biostatistics Signal Processing Data Visualization
35 Bioinformatics Curriculum Elements Biostatistics Data analysis Statistical models Stochastic models Hidden Markov Models Bayesian statistics Biometry Clinical Applications Ethics in Bioinformatics
36 Taxonomy for Bioinformatics Setting –Academics, Private Enterprise Activity –Research, Training, Systems Development Focus –Tools, Methods, Theory Product –Formalization (Model), Design, Implementation Example node: Academic:Research:Tools:Implementation Required bioinformatics skill set: programming, databases, system analysis Example application: Implementing a data warehouse for the storage of DNA microarray data used for gene expression profiling of tumor subtypes.
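The taxonomy's colon-separated node notation lends itself to a small data-structure sketch. The four level names (Setting, Activity, Focus, Product) and the example node come from the slide; which values belong to which level is a reading of the flattened diagram and should be treated as an assumption, as are the function names.

```python
from itertools import product
from typing import Dict, List

# Level names are from the slide; the value-to-level assignment is an assumption.
TAXONOMY: Dict[str, List[str]] = {
    "Setting": ["Academic", "Private Enterprise"],
    "Activity": ["Research", "Training", "Systems Development"],
    "Focus": ["Tools", "Methods", "Theory"],
    "Product": ["Formalization (Model)", "Design", "Implementation"],
}


def all_nodes() -> List[str]:
    """Enumerate every Setting:Activity:Focus:Product combination."""
    return [":".join(combo) for combo in product(*TAXONOMY.values())]


def parse_node(node: str) -> Dict[str, str]:
    """Split a node string into its levels, rejecting unknown values."""
    parts = node.split(":")
    if len(parts) != len(TAXONOMY):
        raise ValueError(f"expected {len(TAXONOMY)} levels, got {len(parts)}")
    for level, value in zip(TAXONOMY, parts):
        if value not in TAXONOMY[level]:
            raise ValueError(f"{value!r} is not a known {level}")
    return dict(zip(TAXONOMY, parts))
```

Under this reading the taxonomy has 2 x 3 x 3 x 3 = 54 nodes, and `parse_node("Academic:Research:Tools:Implementation")` maps the slide's example node back to its four levels.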
37 To Bridge The Gap Give the broad picture of Bioinformatics to all disciplines Computer Scientists –‘Live in the Lab’ –Follow Biology Literature Biologists –Learn Software Development Principles –Exposure to new information technologies Do actual work: Course Projects
38 Next Steps Deploy Database of Bioinformatics Projects and Interests to: –Link projects with students Continued Development and Documentation of Bioinformatics Educational Elements & Paths Expanding Audience for Bioinformatics Education –Relevance Modules
39 Questions? Syllabi for Courses: