Optimizing Biological Data Integration


Bioinformatics depends not just on the numbers, but also on correct molecular identification (MI) and on mapping between high-throughput platforms. Yet the databases and algorithms that provide MI disagree wildly.

TCGA (The Cancer Genome Atlas) provides thousands of samples and over a dozen platforms for data integration. Integrating both samples and semantics gives us a way to measure the accuracy of MI for filtering and mapping. In this way we can evaluate and compare data preparation strategies:

ID mapping among genes, transcripts, and proteins.
Algorithms for predicting microRNA targets, for aligning NGS data with reference genomes, for calling copy number variations, etc.

Integrating multiple-platform data correctly will open up a new level of comprehensive systems biology modeling. We have built some Bioconductor R packages to support this work and have published our first application. We look to greatly expand the scope, to aid bioinformaticians and curators.

We pre-process raw data into processed data ripe for answering medical and biological questions.

[Figure: raw data pass through pre-processing choices (annotation, ID mapping, filtering, algorithms, …) to become data for analysis and modeling. Good choices lead to meaningful translational bioinformatics and systems biology; bad choices lead to results that are not so meaningful.]
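The idea of using matched samples to compare ID mapping strategies can be sketched in a few lines. This is only an illustration under stated assumptions, not the method implemented in the Bioconductor packages mentioned above. The slides do not say which accuracy measure is used; the sketch assumes the mean Pearson correlation between mapped features measured in the same samples on two platforms. The function names (`pearson`, `score_mapping`), the identifiers (T1, P1, ...), and the toy values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def score_mapping(mapping, platform_a, platform_b):
    """Mean correlation over all mapped ID pairs present on both platforms.

    mapping:    dict from platform A IDs to platform B IDs
    platform_a: dict from ID to per-sample values (same sample order as B)
    platform_b: dict from ID to per-sample values
    ASSUMPTION: higher mean correlation is taken to mean a better mapping.
    """
    scores = [
        pearson(platform_a[id_a], platform_b[id_b])
        for id_a, id_b in mapping.items()
        if id_a in platform_a and id_b in platform_b
    ]
    return sum(scores) / len(scores) if scores else float("nan")


# Hypothetical toy data: two transcripts and two proteins in five samples.
transcripts = {"T1": [1, 2, 3, 4, 5], "T2": [5, 3, 4, 1, 2]}
proteins = {"P1": [2, 4, 6, 8, 10], "P2": [10, 6, 8, 2, 4]}

# Two candidate ID mappings to compare (both invented for illustration).
mapping_a = {"T1": "P1", "T2": "P2"}
mapping_b = {"T1": "P2", "T2": "P1"}

score_a = score_mapping(mapping_a, transcripts, proteins)
score_b = score_mapping(mapping_b, transcripts, proteins)
print("mapping_a:", score_a)
print("mapping_b:", score_b)
```

In this toy case the first mapping pairs features whose profiles rise and fall together across samples, so it scores higher than the second. With real data, many mapped pairs will correlate weakly for biological reasons, so a score like this is more useful for ranking strategies against each other than as an absolute accuracy.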