Introduction to Bioinformatics Richard H. Scheuermann, Ph.D. Director of Informatics JCVI.

Slides:



Advertisements
Similar presentations
Martin John Bishop UK HGMP Resource Centre Hinxton Cambridge CB10 1 SB
Advertisements

Prof. Carolina Ruiz Computer Science Department Bioinformatics and Computational Biology Program WPI WELCOME TO BCB4003/CS4803 BCB503/CS583 BIOLOGICAL.
Centers of Excellence for Influenza Research and Surveillance 6 th Annual Meeting Aug 1, 2012 Status of IRD Development.
Standardizing Metadata Associated with NIAID Genome Sequencing Center Projects Richard H. Scheuermann, Ph.D. Department of Pathology Division of Biomedical.
Systems Biology Data Dissemination Working Group 25FEB2015.
Host cell responses to viral infection can be monitored by a variety of different high throughput experimental methodologies in order to understand the.
Bioinformatics at WSU Matt Settles Bioinformatics Core Washington State University Wednesday, April 23, 2008 WSU Linux User Group (LUG)‏
Bioinformatics David Brodin BEA core facility MOLEKYLÄRBIOLOGI MED GENETIK – BIOINFORMATIK HT -07 Course web page:
1 MicroArray -- Data Analysis Cecilia Hansen & Dirk Repsilber Bioinformatics - 10p, October 2001.
Bioinformatics Dr. Aladdin HamwiehKhalid Al-shamaa Abdulqader Jighly Lecture 1 Introduction Aleppo University Faculty of technical engineering.
University of Louisville The Department of Bioinformatics and Biostatistics.
Computational Molecular Biology (Spring’03) Chitta Baral Professor of Computer Science & Engg.
Bioinformatics: a Multidisciplinary Challenge Ron Y. Pinter Dept. of Computer Science Technion March 12, 2003.
Introduction to Genomics, Bioinformatics & Proteomics Brian Rybarczyk, PhD PMABS Department of Biology University of North Carolina Chapel Hill.
Introduction to BioInformatics GCB/CIS535
Modeling Functional Genomics Datasets CVM Lesson 1 13 June 2007Bindu Nanduri.
What is “Biomedical Informatics”?. Biomedical Informatics Biomedical informatics (BMI) is the interdisciplinary field that studies and pursues.
Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.
Richard H. Scheuermann, Ph.D. Department of Pathology Division of Biomedical Informatics U.T. Southwestern Medical Center Standardizing Metadata Associated.
Bioinformatics Jan Taylor. A bit about me Biochemistry and Molecular Biology Computer Science, Computational Biology Multivariate statistics Machine learning.
Overview of Bioinformatics A/P Shoba Ranganathan Justin Choo National University of Singapore A Tutorial on Bioinformatics.
A number of slides taken/modified from:
Standardizing Metadata Associated with NIAID Genome Sequencing Center Projects and their Implementation in NIAID Bioinformatics Resource Centers Richard.
Databases and tools to study the genomes of hundreds of pathogens, plants, and mammals Richard H. Scheuermann, Ph.D. Director of Informatics J. Craig Venter.
9/30/2004TCSS588A Isabelle Bichindaritz1 Introduction to Bioinformatics.
Influenza Research Database (IRD): A Web-based Resource for Influenza Virus Data and Analysis Victoria Hunt 1 *, R. Burke Squires 1, Jyothi Noronha 1,
Computational Methods to study Sequencing data -Meenakshi Sharma.
Knowledgebase Creation & Systems Biology: A new prospect in discovery informatics S.Shriram, Siri Technologies (Cytogenomics), Bangalore S.Shriram, Siri.
1 Bio + Informatics AAACTGCTGACCGGTAACTGAGGCCTGCCTGCAATTGCTTAACTTGGC An Overview پرتال پرتال بيوانفورماتيك ايرانيان.
Bioinformatics and it’s methods Prepared by: Petro Rogutskyi
Chapter 14 Genomes and Genomics. Sequencing DNA dideoxy (Sanger) method ddGTP ddATP ddTTP ddCTP 5’TAATGTACG TAATGTAC TAATGTA TAATGT TAATG TAAT TAA TA.
Sequence Variation Identification and Functional/Structural Inference in the Influenza Research Database (IRD) and Virus Pathogen Resource (ViPR) Yun Zhang.
JM - 1 Introduction to Bioinformatics: Lecture I An Overview of the Course Jarek Meller Jarek Meller Division of Biomedical Informatics,
Statistical Tool for Identifying Sequence Variations That Correlate with Virus Phenotypic Characteristics in the Virus Pathogen Resource (ViPR) July 22,
Finish up array applications Move on to proteomics Protein microarrays.
BIOINFORMATICS IN BIOCHEMISTRY Bioinformatics– a field at the interface of molecular biology, computer science, and mathematics Bioinformatics focuses.
Yun Zhang J. Craig Venter Institute San Diego, CA, USA August 4, 2012 Integrated Bioinformatics Data and Analysis Tools for Herpesviridae.
Statistical Tool for Identifying Sequence Variations that Correlate with Virus Phenotypic Characteristics in the Virus Pathogen Resource (ViPR) Brett E.
Integrating the Bioinformatic Technology Group into your research programme Introduction People and Skills Examples Integrating the BTG Contacts BHRC Away.
Richard H. Scheuermann, Ph.D. November 5, 2012 Support for Systems Biology Data in IRD/ViPR - Proteomics.
BIG Data: Knowledge for Improving Vaccine Virus Selection Richard H. Scheuermann, Ph.D. Director of Informatics JCVI.
Harbin Institute of Technology Computer Science and Bioinformatics Wang Yadong Second US-China Computer Science Leadership Summit.
Biological Signal Detection for Protein Function Prediction Investigators: Yang Dai Prime Grant Support: NSF Problem Statement and Motivation Technical.
Introduction to Bioinformatics Dr. Rybarczyk, PhD University of North Carolina-Chapel Hill
Integration of Host Factor Data into the Virus Pathogen Database and Analysis Resource (ViPR) and the Influenza Research Database (IRD) Brett E. Pickett.
COMPUTATIONAL ANALYSIS OF MULTILEVEL OMICS DATA FOR THE ELUCIDATION OF MOLECULAR MECHANISMS OF CANCER Presented by Azeez Ayomide Fatai Supervisor: Junaid.
WMU CS 6260 Parallel Computations II Spring 2013 Presentation #1 about Semester Project Feb/18/2013 Professor: Dr. de Doncker Name: Sandino Vargas Xuanyu.
BRC 2011 Session #4 – “Omics” Data. Session #4 - Outline Challenges and Opportunities  pathogen datasets; host datasets; integrating pathogen-host datasets.
Valentina Di Francesco Senior Program Officer for Bioinformatics, Structural Genomics and Systems Biology Microbial Genomics.
Central dogma: the story of life RNA DNA Protein.
EB3233 Bioinformatics Introduction to Bioinformatics.
GeWorkbench John Watkinson Columbia University. geWorkbench The bioinformatics platform of the National Center for the Multi-scale Analysis of Genomic.
Richard H. Scheuermann, Ph.D. November 5, 2012 Support for Systems Biology Data in IRD/ViPR.
Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou
Bioinformatics and Computational Biology
Trends Biomedical In silico. “Omics” a variety of new technologies help explain both normal and abnormal cell pathways, networks, and processes simultaneous.
داده های عظیم در دوران پساژنوم Big Data in Post Genome Era مهدی صادقی پژوهشگاه ملی مهندسی ژنتیک و زیست فناوری پژوهشکده علوم زیستی، پژوهشگاه دانش های بنیادی.
GeWorkbench Overview Support Team Molecular Analysis Tools Knowledge Center Columbia University and The Broad Institute of MIT and Harvard.
High throughput biology data management and data intensive computing drivers George Michaels.
Milanesi Luciano Catania, Italy 13/03/2007 Bioinformatics challenges in European projects in Grid. Milanesi Luciano National Research Council Institute.
Presenter: Bradley Green.  What is Bioinformatics?  Brief History of Bioinformatics  Development  Computer Science and Bioinformatics  Current Applications.
基于 R/Bioconductor 进行生物芯片数据分析 曹宗富 博奥生物有限公司
BME435 BIOINFORMATICS.
Bioinformatics Overview
생물정보학 Bioinformatics.
“Proteomics is a science that focuses on the study of proteins: their roles, their structures, their localization, their interactions, and other factors.”
What is “Biomedical Informatics”?
LESSON 1 INTNRODUCTION HYE-JOO KWON, Ph.D /
Introduction to Bioinformatic
Session 1: WELCOME AND INTRODUCTIONS
Presentation transcript:

Introduction to Bioinformatics Richard H. Scheuermann, Ph.D. Director of Informatics JCVI

Outline What is Bioinformatics? – Some definitions – Data types and analysis objectives Big Data – The Big Data value proposition The Power of Bioinformatics – Extracting knowledge from data – DMID Systems Biology data in the Bioinformatics Resource Centers

What is Bioinformatics? And related terms – biomedical informatics, computational biology, systems biology Wikipedia – Bioinformatics: an interdisciplinary field that develops and improves on methods for storing, retrieving, organizing and analyzing biological data. A major activity in bioinformatics is to develop software tools to generate useful biological knowledge. NIH Biomedical Information Science and Technology Initiative Consortium (BISTIC) – Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data. – Computational Biology: The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.

What is Bioinformatics? And related terms – biomedical informatics, computational biology, systems biology Wikipedia – Bioinformatics: an interdisciplinary field that develops and improves on methods for storing, retrieving, organizing and analyzing biological data. A major activity in bioinformatics is to develop software tools to generate useful biological knowledge. NIH Biomedical Information Science and Technology Initiative Consortium (BISTIC) – Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data. – Computational Biology: The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.

Biological data types and analysis objectives Genomics – Nucleotide genome sequences, metagenomic sequences – Gene finding, functional annotation, sequence alignment, homology determination, comparative analysis, phylogenetic inferencing, association analysis, mutation functional prediction, species distribution analysis Transcriptomics – RNA expression levels, transcription factor binding, chromatin structure information – Differential expression, clustering, functional enrichment, transcriptional regulation/causal reasoning Proteomics – Proteins levels, protein structures, protein interactions – Protein identification, protein functional predictions, structural predictions, structural comparison, molecular dynamic simulation, mutation functional prediction, docking predictions, network analysis Metabolomics – Metabolite/small molecule levels – Pathway/network analysis Imaging – Microscopy images, MRI images, CT scans – Feature extraction, high content screening Cytometry – Cell levels, cell phenotypes – Cell population clustering, cell biomarker discovery Systems biology – All of the above – Network analysis, causal reasoning, reverse causal reasoning, drug target prediction, regulatory network analysis, information flow, population dynamics, modeling and simulation

Big Data BIG DATA

BD2K

Big Data Volumes

Big Data 3 V’s

Data Levels in Biological Research

Primary dataDerived data

Primary dataDerived data Interpreted data/ knowledge Experimental metadata Analytical metadata

Big Data in Biology

Variety

No Variety

Big Data Volume + Variety = Value Variety = Metadata

DMID Genomics Courtesy of Alison Yao, DMID

Bioinformatics Resource Centers (BRCs)

DMID Systems Biology Program

Systems Biology of Viral Infection Systems Virology (Michael Katze group, Univ. Washington) – Influenza H1N1 and H5N1 and SARS Coronavirus – Statistical models, algorithms and software, raw and processed gene expression data, and proteomics data Systems Influenza (Alan Aderem group, Institute for Systems Biology/Seattle Biomed) – Various influenza viruses – Microarray, mass spectrometry, and lipidomics data

Data Dissemination Working Group Representatives from SysBio programs and relevant BRCs Jeremy Zucker

“Omics” Data Management Biosamples Cells/organisms Treated Samples Primary Data Processed Data Matrix Biosets 1. Biosamples Cells/organisms Treated Samples Primary Data Processed Data Matrix Biosets 2. Pathogen treatment Assay 1 Data processing Data interpretation Project Metadata Assay 2

“Omics” data management (MIBBI) – Project metadata (1 template) Title, PI, abstract, publications – Experiment metadata (~6 templates) Biosamples, treatments, reagents, protocols, subjects – Primary results data Raw expression values – Processed data Data matrix of fold changes and p-values – Data processing metadata (1 template) Normalization and summarization methods – Interpreted results (Host factor biosets) Interesting gene, protein and metabolite lists – Data interpretation metadata (1 template) Fold change and p-value cutoffs used Visualize biosets in context of biological pathways and networks Statistical analysis of pathway/sub-network overrepresentation Strategy for Handling “Omics” Data

Data Submission Workflows Study metadata Experiment metadata Primary results Analysis metadata Processed data matrix Free text metadata GEO/PRIDE/PNNL/SRA/MetaboLights ViPR/IRD/PATRIC Host factor bioset pointer submission pointer Systems Biology sites

IRD Home Page

Live Demo

35 transcriptomic, 16 proteomic, 4 lipidomic experiments 2845 experiment samples 590 biosets 24 viral (flu, SARS, MERS) and 2 non-viral agents

Reactome section

Summary of “Omics” Data Support in IRD/ViPR Structured metadata about study, experiments, analysis methods Series of derived biosets Boolean analysis of biosets from different experiments Biosets based on expression patterns Search for expression patterns of specific genes Access to complete data matrix Data linkout to pathway knowledgebase

Big Data to Knowledge Volume + Variety = Value Variety = Metadata Data + Metadata + Interpretation = Knowledge

Acknowledgement Lynn Law, Richard Green - U. Washington Peter Askovich - Seattle Biomed Brian Aevermann, Brett Pickett, Doug Greer, Yun Zhang - JCVI Entire Systems Biology Data Dissemination Working Group, especially Jeremy Zucker NIAID (Alison Yao and Valentina DiFrancesco) Entire ViPR/IRD development team at JCVI and Northrop Grumman NIAID/NIH - N01AI40041