KnowEnG: A SCALABLE KNOWLEDGE ENGINE FOR LARGE SCALE GENOMIC DATA

Charles Blatti

KnowEnG Goals
Build and publicly deploy a Knowledge Engine for Genomics that enables users to analyze their experimental data in the context of Big Data community knowledge sources.
Leverage computational expertise in data mining, machine learning, and scalable/distributed learning technologies to develop data-driven cyberenvironments of the future for biomedical scientists and clinicians to generate and evaluate novel hypotheses and insights about their data.

About KnowEnG
KnowEnG is one of 11 Centers of Excellence in Big Data Computing funded by NIH in 2014 as part of their Big Data to Knowledge (BD2K) Initiative. It brings together researchers from the University of Illinois and the Mayo Clinic to design and test an E-science framework for genomics.

Biomedical Applications
Cancer Pharmacogenomics: Genomic profiling of high-risk breast cancer patients before and after drug therapy. Predict drug response from molecular and genetic profiling of patients and identify the most essential signatures of response.
Comparative Transcriptomics: Genomic profiling of regions in brains of social animals during specific behaviors. Identify gene modules and transcription factors that play a key role in social behavior and psychology.
Genotype to Phenotype: Metabolic profiling and genomic sequencing of 500 strains of Actinomycetes bacteria. Predict the strains that are likely to produce novel compounds with potential antibiotic activity.

I/UCRC Extensions

Components of the Knowledge Engine
AIM 1: Transform community data into a massive, heterogeneous Knowledge Network (KN). Automatically extract entities and relationships from relevant sources into the KN representation, including data from newly sequenced species and from high-throughput experimental assays.
AIM 2: Develop efficient algorithms to analyze user experimental data in the form of an Analysis Matrix (AM) in the context of the KN. Focus on core operations with general applicability to biological data sets, including classification of AM columns, AM- or KN-derived feature selection, simultaneous clustering of AM and KN, etc.
AIM 3: Efficient implementation of algorithms that scale on a rapidly growing KN. Ultimately deployed on a commercial cloud platform and accessible to users worldwide.
AIM 4: Develop an interface to guide user analysis methods and access to scalable computational resources. Incorporate tools to visualize results in the context of the KN and relate findings to existing literature.
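
To make the idea in AIM 2 of analyzing an Analysis Matrix "in the context of the KN" more concrete, here is a minimal sketch, not the KnowEnG implementation. It assumes one AM column is a mapping from gene to measured value and the KN is a plain list of gene-gene edges; the gene names, the blending weight `alpha`, and the helpers `build_neighbors` and `smooth_column` are all hypothetical. The technique shown is simple neighbor averaging, chosen only for illustration.

```python
from collections import defaultdict


def build_neighbors(edges):
    """Turn an undirected edge list into a gene -> set of neighbor genes map."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def smooth_column(column, neighbors, alpha=0.5):
    """Blend each gene's value with the mean value of its KN neighbors.

    column: dict of gene -> value for one sample (one AM column).
    alpha:  hypothetical weight given to the neighborhood mean.
    Genes with no measured neighbors keep their original value.
    """
    smoothed = {}
    for gene, value in column.items():
        measured = [column[n] for n in neighbors.get(gene, ()) if n in column]
        if measured:
            mean = sum(measured) / len(measured)
            smoothed[gene] = (1 - alpha) * value + alpha * mean
        else:
            smoothed[gene] = value
    return smoothed


# Toy data (made up): a three-gene chain in the KN plus one unconnected gene.
kn_edges = [("G1", "G2"), ("G2", "G3")]
sample = {"G1": 4.0, "G2": 0.0, "G3": 2.0, "G4": 1.0}
smoothed = smooth_column(sample, build_neighbors(kn_edges))
```

In this toy case the unmeasured-looking gene G2 picks up signal from its two neighbors, while G4, which has no edges in the network, is left unchanged. A downstream step such as clustering or feature selection could then run on the smoothed columns instead of the raw ones.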