Life Sciences Integrated Demo Senior Product Manager, Life Sciences

Presentation transcript:

Life Sciences Integrated Demo Joyce Peng Senior Product Manager, Life Sciences Oracle Corporation Yao-chun.Peng@oracle.com Thanks, Charlie, for the introduction. Today I am going to walk you through a life sciences integrated demo of in silico drug discovery to find a cure for lymphoma. Because I have a lot to cover, I’ll take questions at the end of the presentation.

Informatics Challenges: Manage vast quantities of data; Access heterogeneous data; Collaborate securely; Integrate a variety of data types; Find patterns and insights. Charlie mentioned the five informatics challenges in life sciences.

Oracle Life Sciences Platform. Transparent Gateways: Fast access using Oracle OCI; Distributed Queries: Perform searches across domains; Generic Gateways: Access any data using ODBC (diagram labels: e.g. MySQL, GenBank, e.g. PubMed); External Tables: Ability to index and query external files; UltraSearch: Search external sites & repositories; MySQL Toolkit: Easily move MySQL data into Oracle; Real Application Clusters: Linear scalability; Oracle Portal: Build personalized portals; Application Server: Provide scalability for the middle tier; XML DB: Flexibly manage data; interMedia: Store & manage images; Security: Enforce security; Auditing: Create audit trail to facilitate FDA compliance; Workflow: Automate laboratory & business processes; Collaboration Suite: Collaborate securely; iFS/Files: Share documents (diagram labels: e.g. SwissProt, SP-ML); Data Mining: Discover patterns & insights; BLAST: Sequence similarity search; Network Model: Pathways modeling; Statistics: Perform basic statistics; Table Functions: Implement complex algorithms; OLAP & Discoverer: Interactive query & drill-down; Extensibility Framework (data cartridges): Manage complex scientific data; LOBs: Manage unstructured data; Text: Index & query text, e.g. literature searches; SQL Loader: High performance data loader; Web Services: Standard communication between applications; Merge/Upsert: Enabling update and insert in one step; Transportable Tablespaces: Rapidly exchange tables; Oracle Streams: Rule-based subscription for information sharing. Charlie also showed you this slide of our platform features.

Platform Features Highlighted. Transparent Gateways: Fast access using Oracle OCI; Distributed Queries: Perform searches across domains; Generic Gateways: Access any data using ODBC (diagram labels: e.g. MySQL, GenBank, e.g. PubMed); External Tables: Ability to index and query external files; UltraSearch: Search external sites & repositories; MySQL Toolkit: Easily move MySQL data into Oracle; Real Application Clusters: Linear scalability; Oracle Portal: Build personalized portals; Application Server: Provide scalability for the middle tier; XML DB: Flexibly manage data; interMedia: Store & manage images; Security: Enforce security; Auditing: Create audit trail to facilitate FDA compliance; Workflow: Automate laboratory & business processes; Collaboration Suite: Collaborate securely; iFS/Files: Share documents (diagram labels: e.g. SwissProt, SP-ML); Data Mining: Discover patterns & insights; BLAST: Sequence similarity search; Network Model: Pathways modeling; Statistics: Perform basic statistics; Table Functions: Implement complex algorithms; OLAP & Discoverer: Interactive query & drill-down; Extensibility Framework (data cartridges): Manage complex scientific data; LOBs: Manage unstructured data; Text: Index & query text, e.g. literature searches; SQL Loader: High performance data loader; Web Services: Standard communication between applications; Merge/Upsert: Enabling update and insert in one step; Transportable Tablespaces: Rapidly exchange tables; Oracle Streams: Rule-based subscription for information sharing. In the integrated demo, I am going to highlight the features that are underlined on this slide. Now let’s get started.

BioOracle Project We are scientists at a life sciences company looking to find a cure for lymphoma. We are all scientists at a life sciences company called BioOracle. Our mission is to try to find a cure for lymphoma.

BioOracle Portal So the first place that we start is our Portal home page. It is possible to set up your home page to include whichever portlets you want. In this example we have a workflow notification, a calendar, a to-do list, a section for that day’s appointments, as well as access to my files and applications. Oracle Portal provides single sign-on, so you just need to log in once. Integrated data view and single sign-on to many applications.

Find a Cure for Lymphoma: Literature search on lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead. The first step in our research is to do a literature search on lymphoma. So we launch our text search application from the portal.

Literature Search Search document content. We are using the Oracle Text feature to do the text searching. Oracle Text is included with the database. Here you can see that we searched in the main body of the text for the word ‘lymphoma’. Oracle Text supports more than 150 file types, and you can search across them. In the left window we have retrieved a list of documents, and in the right window we can see the contents of the selected document from Scientific American Medicine.
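
The idea behind searching document bodies for a word can be illustrated with a tiny inverted index. This is only a sketch of the general technique in plain Python, not Oracle Text or its API; the document names and contents are invented for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(docs):
    """Map each word to the set of document names whose body contains it."""
    index = defaultdict(set)
    for name, body in docs.items():
        for word in tokenize(body):
            index[word].add(name)
    return index

def search(index, query):
    """Return the sorted names of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Hypothetical documents, invented for illustration only.
docs = {
    "review.txt": "Follicular lymphoma responds to several drugs.",
    "protocol.txt": "Staining protocol for cell histology images.",
    "notes.txt": "Lymphoma biopsy samples were collected for analysis.",
}
index = build_index(docs)
print(search(index, "lymphoma"))
```

A real text engine adds format filters, stemming, and relevance ranking on top of this basic lookup.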

Extract Document Themes Now let’s focus on the window on the right side. You can use Oracle Text to extract the themes of a document. For example, this document from Scientific American Medicine is about diseases, lungs and drugs.

Generate the Gist You can also generate the gist (which is a summary) of a document.

Categorize Documents Now let’s look at the left window. You can use Oracle Text to categorize your documents. Here we are using a medical thesaurus called MeSH to categorize the documents.

Text Mining You can also do text mining. Here we are using the k-means algorithm to cluster and classify the documents.
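
As a rough illustration of the clustering step, here is a minimal k-means sketch in plain Python. It is not the Oracle implementation: the term-count vectors are invented, and the deterministic choice of the first k points as starting centroids is an assumption made to keep the example reproducible.

```python
def kmeans(points, k, iterations=20):
    """Cluster equal-length numeric vectors; return a cluster label per point."""
    # Deterministic start (an assumption): first k points are the centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical term counts per document for the words
# ("lymphoma", "drug", "image", "stain"); invented for illustration.
vectors = [
    [5, 3, 0, 0],
    [0, 0, 4, 5],
    [4, 4, 1, 0],
    [0, 1, 5, 4],
]
print(kmeans(vectors, k=2))
```

Documents that share vocabulary end up with the same label, which is the grouping the slide shows.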

Find a Cure for Lymphoma: Literature search on lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead. Now we have identified several papers we are interested in. We want to set up a project workspace in Oracle Files where we can put all the information related to our research.

BioOracle Project in Oracle Files Oracle Files is a feature of Collaboration Suite. Oracle Files lets you store documents in a way that is very similar to what you would do with Microsoft Windows folders. However, you are actually putting the data into the database. Notice that you can put all kinds of files into Oracle Files: your Excel spreadsheets, your images, and so on. Lymphoma project workspace after adding documents.

BioOracle Project in Oracle Files Oracle Files also provides revision control. Support revision control

BioOracle Project in Oracle Files You can group a set of metadata attributes into a category and associate categories with a particular document. For example, here we associate the “records manage and discovery” category with the supplementary_information.doc document. Associate metadata (Categories) to a document.

BioOracle Project in Oracle Files Now you can search based on categories. Oracle Files uses Oracle Text to do searches. Therefore, you can also do content searches across all your files. You can see that Oracle Files supports many different languages. Advanced Search

Approval Workflow Oracle Files is also integrated with Oracle Workflow. Once you have done a whole day of experiments, you can compile your results in your electronic notebook and store them as a PDF file. You can then send the file to your supervisor for approval. If you click the workflow link here, you can launch Oracle Workflow.

Approval Workflow This is an example workflow. Workflow allows you to graphically define your business processes. For example, you can define a document routing and approval workflow.
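
A document routing and approval process of this kind can be thought of as a small state machine. The sketch below is plain Python and not Oracle Workflow; the state and action names are hypothetical.

```python
# Allowed transitions for a hypothetical document approval workflow.
# State and action names are invented for illustration.
TRANSITIONS = {
    ("draft", "submit"): "pending_approval",
    ("pending_approval", "approve"): "approved",
    ("pending_approval", "reject"): "draft",
}

def advance(state, action):
    """Return the next state, or raise ValueError if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} a document in state {state!r}") from None

# A notebook page is submitted, rejected once, resubmitted, then approved.
state = "draft"
for action in ["submit", "reject", "submit", "approve"]:
    state = advance(state, action)
print(state)
```

Keeping the transitions in one table makes it easy to see which steps a document can take from each state.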

BioOracle Project in Oracle Files Oracle Files also provides access control. Only these people can see the lymphoma research project and each person has different privileges. Access Control

BioOracle Project in Oracle Files You can also access files stored in the database with familiar interfaces such as Windows Explorer. Here we are looking at the same folder through a Web folder using the WebDAV protocol. Oracle Files supports a variety of protocols, including HTTP/WebDAV (Web), SMB (Windows), NFS (UNIX), AFP (Apple Mac), and FTP. You can also FTP files and folders into Oracle. With Oracle managing your files, you can leverage all the benefits of the database. For example, if you accidentally delete a file, you can restore it. Your files are also securely managed by Oracle.

Wireless Access Oracle Collaboration Suite provides you wireless access to the same project.

Highly Scalable, Worldwide Access We at Oracle Corporation use Oracle Files in our worldwide operations. You can see that on June 30 there were 17 million documents in the system. The total space used was 4 TB. There were 59 thousand users created, and 24 thousand distinct users connected to the system, with 866 concurrent connections. You can see that Oracle Files is a highly scalable system with worldwide access.

Find a Cure for Lymphoma: Literature search on lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead. Now that we have found some interesting literature and set up a project workspace, we want to set up a meeting with a collaborator to discuss next steps.

Calendar To do that we are using the Calendar tool in Collaboration Suite. Use calendar in Collaboration Suite to schedule meetings with collaborators.

Internet Meeting However, we discover that the person we want to collaborate with isn’t in the same location as we are, so we decide to have an iMeeting. This is also a feature of Collaboration Suite.

Protocol Sharing The collaborator mentioned that he has a protocol that works especially well for staining cells, so we ask him to send us a copy of the protocol. As the protocol is a hard copy, we decide to have it sent by fax. Using Collaboration Suite, it is possible to receive faxes by e-mail, along with our regular e-mail and our voice mail.

Find a Cure for Lymphoma: Literature search on lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead.

BioOracle Image Management We also learnt from our collaborator that there are some interesting cell histology images that have already been taken for lymphoma, so we decide to have a look at those. The feature that we are using for storing images is interMedia, a feature of the Oracle database. Here we are loading an image into Oracle, and we can store additional information about the image. Use interMedia to manage and query lymphoma histology data.

BioOracle Image Management Generate image thumbnails. Here we see a list of image thumbnails generated by Oracle interMedia. If we click one of the thumbnails,

BioOracle Image Management we can retrieve the image. Notice that the height and width here are automatically extracted by interMedia from the images. interMedia allows you to do integrated queries across your regular relational data, such as project name and scientist name, and the metadata associated with the images. You can also add your own annotations to your images. Integrated search across relational data and extracted image attributes.

Gene Expression Analysis for Lymphoma Biopsy Samples. Pipeline stages shown on the slide: Instruments (Affymetrix microarray); Filtering and Pre-Processing (SQL, XML, Java); Feature Selection (SQL, Oracle Data Mining); Molecular Pattern Recognition (Oracle Data Mining, Bayesian classifier); Interpretation of Results (Discoverer, Reports, Portals, Java Servlets); Prediction: DLBC or Follicular. Use the analytical pipeline to identify the patterns that differentiate DLBC from follicular lymphoma. Dataset from Golub et al., Science 286:531-537. From reading the literature we discovered that there are two main forms of lymphoma with very different treatment methods. They are DLBC and follicular lymphoma. If you look at the cell images, you will see that it is difficult to distinguish the two. So we would like to develop a diagnostic test to determine more easily which form of cancer a patient has. We decide to use an Affymetrix chip for a gene expression study. Once we have the data, we put it straight into the Oracle database. We then manipulate the data in the database to prepare it for data mining. By using the analytical capabilities of the database, we aren’t taking the data out of its secure environment, which means that there is less chance that we will lose the data or have regulatory problems moving forward. Data mining can help us to distinguish the two cancer subtypes. Data mining could also help us to identify genetic markers of lymphoma. These markers can be potential targets to develop a lead against.
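
The feature selection stage can be sketched as ranking genes by how well they separate the two subtypes. The example below is plain Python, not the demo's SQL or Oracle Data Mining code; the choice of a signal-to-noise score (difference of class means over the sum of class standard deviations) is an assumption, and the gene names and expression values are invented.

```python
from statistics import mean, stdev

def signal_to_noise(class_a, class_b):
    """Score one gene: difference of class means over the sum of class std devs."""
    spread = stdev(class_a) + stdev(class_b)
    return (mean(class_a) - mean(class_b)) / spread if spread else 0.0

def top_genes(expression, labels, n):
    """Rank genes by absolute signal-to-noise score and return the top n names."""
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == "DLBC"]
        b = [v for v, lab in zip(values, labels) if lab == "Follicular"]
        scores[gene] = abs(signal_to_noise(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical expression values (one list per gene, one value per sample).
labels = ["DLBC", "DLBC", "DLBC", "Follicular", "Follicular", "Follicular"]
expression = {
    "gene_A": [9.0, 10.0, 11.0, 2.0, 3.0, 4.0],  # higher in DLBC
    "gene_B": [5.0, 6.0, 7.0, 5.5, 6.5, 7.5],    # nearly identical in both
    "gene_C": [1.0, 2.0, 3.0, 6.0, 7.0, 8.0],    # lower in DLBC
}
print(top_genes(expression, labels, 2))
```

Genes with a high score in either direction are kept as candidate markers, and the uninformative ones are dropped before model building.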

Find a Cure for Lymphoma: Literature search on lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead. Now let’s look at how you can use Oracle Data Mining to analyze gene expression data.

Oracle Data Mining Classification of Cancer Subtypes (DLBC versus Follicular) We are going to use the wizards that are available with Oracle Data Mining to try to classify the samples into DLBC or follicular lymphoma. We have 77 patients and about 7000 genes. Oracle provides wizards to guide analysts through data mining model creation.

Oracle Data Mining Build a classification model The first step is to build a model, and then train it to recognise a certain set of attributes or features. So we have clicked on ODM, and have selected the ‘Classification Model Build’. Build a classification model

Oracle Data Mining Working through the GUI wizard. You now need to choose the target field, which is DLBC or Follicular. Select the target field, e.g. DLBC or Follicular Lymphoma

Oracle Data Mining Select the classification model You then need to select which algorithm you would like to use in the model. We decide to use Naïve Bayes because it is very fast. Select the classification model
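
For readers unfamiliar with Naïve Bayes, the following is a minimal Gaussian variant in plain Python. It is only a sketch of the general technique: Oracle Data Mining's own Naïve Bayes implementation is not shown in the talk, the Gaussian likelihood is an assumption, and the two-feature training data is invented.

```python
import math
from collections import defaultdict

def train(samples, labels):
    """Estimate a prior per class and a mean/variance per class and feature."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        stats = []
        for col in zip(*rows):
            mu = sum(col) / len(col)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mu) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mu, var))
        model[y] = (len(rows) / len(samples), stats)
    return model

def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(prior, stats):
        total = math.log(prior)
        for v, (mu, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        return total
    return max(model, key=lambda y: log_posterior(*model[y]))

# Hypothetical expression levels of two marker genes per sample.
samples = [[9, 1], [10, 2], [11, 3], [2, 6], [3, 7], [4, 8]]
labels = ["DLBC", "DLBC", "DLBC", "Follicular", "Follicular", "Follicular"]
model = train(samples, labels)
print(predict(model, [10, 2]), predict(model, [3, 7]))
```

Training needs only one pass over the data to collect per-class statistics, which is why the algorithm is fast.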

Oracle Data Mining Test the model on the data set of interest Now you have built and trained your model, you want to see how well it works. So you run the model on some data where you already know the outcome. Test the model on the data set of interest

Naïve Bayes has built a model that distinguishes DLBC from Follicular with 77% accuracy. This slide shows the outcome of the test. This confusion matrix shows how many times the model’s predictions were accurate. The result is that it is accurate 77% of the time. However, you want to have higher accuracy than that. The confusion matrix shows the number of times the model’s predictions are accurate.
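
A confusion matrix and the accuracy derived from it can be computed as follows. This is a plain-Python sketch with invented labels; the numbers are not the demo's results.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Fraction of predictions that match the known outcome."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical labels, invented for illustration; not the demo's results.
actual    = ["DLBC", "DLBC", "DLBC", "Follicular", "Follicular"]
predicted = ["DLBC", "DLBC", "Follicular", "Follicular", "DLBC"]

matrix = confusion_matrix(actual, predicted)
print(matrix[("DLBC", "DLBC")], matrix[("DLBC", "Follicular")])
print(accuracy(actual, predicted))
```

The diagonal entries (actual equals predicted) are the correct predictions; accuracy is their share of all test samples.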

Oracle Data Mining So you decide to try the Adaptive Bayes Network algorithm to see whether it can build a more accurate model. See if the Adaptive Bayes Network algorithm can build a better model.

Oracle Data Mining ABN allows you to select a series of parameters that best describe your data. Use wizards to define parameters for building a model.

Oracle Data Mining This time when you run the model you get a result that is 84% accurate, which is a much better score. Adaptive Bayes Network algorithm can predict lymphoma subtype with 84% accuracy.

Oracle Data Mining The ABN algorithm can also generate rules for you to interpret the model. Adaptive Bayes Network algorithm generates rules for model interpretation

Oracle Data Mining in JDeveloper A key advantage of Oracle Data Mining is that once you have done the above steps, JDeveloper will automatically generate code for you. You can reuse the Java code in your analytic pipeline. Automatically create the Java code needed to build analytical pipelines inside the database.