Optimizing Biological Data Integration

Bioinformatics depends not just on the numbers, but also on correct molecular identification (MI) and on correct mapping between high-throughput platforms. Yet the databases and algorithms that provide MI disagree wildly. TCGA (The Cancer Genome Atlas) provides thousands of samples and over a dozen platforms for data integration. Integrating both samples and semantics gives us a way to measure the accuracy of MI for filtering and mapping. In this way we can evaluate and compare data preparation strategies: ID mapping among genes, transcripts, and proteins; algorithms for predicting microRNA targets; algorithms for aligning NGS data with reference genomes; algorithms for calling copy number variations; and so on. Integrating multiple-platform data correctly will open up a new level of comprehensive systems biology modeling. We have built some Bioconductor R packages to support this work, and have published our first application. We look to greatly expand the scope, to aid bioinformaticians and curators. We pre-process raw data into processed data ready for answering medical and biological questions.

[Figure: raw data pass through pre-processing choices (annotation, ID mapping, filtering, algorithms, …) to become data for analysis and modeling. Good choices lead to MEANINGFUL translational bioinformatics and systems biology; bad choices lead to results that are not so meaningful.]
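The idea of using integrated samples to measure MI accuracy can be illustrated with a toy sketch. It is not taken from the Bioconductor packages mentioned above. It assumes that a correct transcript-to-protein ID mapping should tend to pair features whose measurements correlate across the same samples, and it scores each candidate mapping by the mean Pearson correlation over its mapped pairs. All identifiers (`T1`, `P1`, `P2`), the data values, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def score_mapping(mapping, platform_a, platform_b):
    """Mean cross-platform correlation over the ID pairs in a candidate mapping.

    mapping    : list of (id_on_platform_a, id_on_platform_b) pairs
    platform_a : dict of feature ID -> per-sample values
    platform_b : dict of feature ID -> values for the same samples, same order
    """
    scores = []
    for id_a, id_b in mapping:
        # Only pairs measured on both platforms can be scored.
        if id_a in platform_a and id_b in platform_b:
            r = pearson(platform_a[id_a], platform_b[id_b])
            if r == r:  # NaN is the only value not equal to itself; skip it
                scores.append(r)
    return sum(scores) / len(scores) if scores else float("nan")


# Hypothetical toy data: four samples measured on two platforms.
transcripts = {"T1": [1.0, 2.0, 3.0, 4.0]}
proteins = {"P1": [2.0, 4.0, 6.0, 8.0], "P2": [8.0, 6.0, 4.0, 2.0]}

# Two hypothetical ID-mapping resources that disagree about T1.
mapping_x = [("T1", "P1")]
mapping_y = [("T1", "P2")]

score_x = score_mapping(mapping_x, transcripts, proteins)
score_y = score_mapping(mapping_y, transcripts, proteins)
print(score_x, score_y)
```

In this toy case `mapping_x` scores higher than `mapping_y`, so it would be preferred under the stated assumption. Real data would need far more pairs, and biological effects such as post-transcriptional regulation can weaken transcript-protein correlation even when the mapping is correct.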
