Published by Mary Preston. Modified over 6 years ago.
2
Life Sciences Integrated Demo
Joyce Peng, Senior Product Manager, Life Sciences, Oracle Corporation. Thanks to Charlie for the introduction. Today I am going to walk you through a life sciences integrated demo of in silico drug discovery aimed at finding a cure for lymphoma. Because I have a lot to cover, I’ll take questions at the end of the presentation.
3
Informatics Challenges
Manage vast quantities of data. Access heterogeneous data. Integrate a variety of data types. Find patterns and insights. Collaborate securely. Charlie mentioned these five informatics challenges in life sciences.
4
Oracle Life Sciences Platform
Transparent Gateways: Fast access using Oracle OCI
Distributed Queries: Perform searches across domains
Generic Gateways: Access any data using ODBC
Example data sources labeled on the slide: MySQL, GenBank, PubMed
External Tables: Ability to index and query external files
UltraSearch: Search external sites & repositories
MySQL Toolkit: Easily move MySQL data into Oracle
Real Application Clusters: Linear scalability
Oracle Portal: Build personalized portals
Application Server: Provide scalability for the middle tier
XML DB: Flexibly manage data
interMedia: Store & manage images
Security: Enforce security
Auditing: Create audit trail to facilitate FDA compliance
Workflow: Automate laboratory & business processes
Collaboration Suite: Collaborate securely
iFS/Files: Share documents
Example labeled on the slide: SwissProt SP-ML
Data Mining: Discover patterns & insights
BLAST: Sequence similarity search
Network Model: Pathways modeling
Statistics: Perform basic statistics
Table Functions: Implement complex algorithms
OLAP & Discoverer: Interactive query & drill-down
Extensibility Framework (Data cartridges): Manage complex scientific data
LOBs: Manage unstructured data
Text: Index & query text, e.g. literature searches
SQL Loader: High performance data loader
Web Services: Standard communication between applications
Merge/Upsert: Enabling update and insert in one step
Transportable Tablespaces: Rapidly exchange tables
Oracle Streams: Rule-based subscription for information sharing
Charlie also showed you this slide of our platform features.
5
Platform Features Highlighted
Same platform feature list as on the previous slide, with the features used in this demo underlined. In the integrated demo, I am going to highlight the underlined features. Now let’s get started.
6
BioOracle Project. We are scientists at a life sciences company looking to find a cure for lymphoma. We are all scientists at a life sciences company called BioOracle. Our mission is to find a cure for lymphoma.
7
BioOracle Portal. The first place we start is our Portal home page. You can set up your home page to include whichever portlets you want. In this example we have a workflow notification, a calendar, a to-do list, a section for that day’s appointments, as well as access to my files and applications. Oracle Portal provides single sign-on, so you only need to log in once. Integrated data view and single sign-on to many applications
8
Find a Cure for Lymphoma
Literature search on Lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead. The first step in our research is to do a literature search on lymphoma, so we launch our text search application from the portal.
9
Literature Search Search document content.
We are using the Oracle Text feature to do the text searching. Oracle Text is included with the database. Here you can see that we searched the main body of the text for the word ‘lymphoma’. Oracle Text supports more than 150 file types, and you can search across them. In the left window we have retrieved a list of documents, and in the right window we can see the contents of the selected document from Scientific American Medicine.
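To make the idea of a content search concrete, here is a minimal Python sketch of an inverted index, the general data structure behind word lookups across documents. It is only an illustration and says nothing about how Oracle Text is implemented; the file names and contents are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical documents standing in for indexed files.
DOCS = {
    "review.txt": "Follicular lymphoma and its treatment options.",
    "lungs.txt": "Diseases of the lungs and related drugs.",
    "notes.txt": "Staining protocol for lymphoma biopsy samples.",
}

def build_index(docs):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(name)
    return index

def search(index, word):
    """Return the sorted names of documents that contain the word."""
    return sorted(index.get(word.lower(), set()))

index = build_index(DOCS)
hits = search(index, "Lymphoma")
print(hits)
```

The lookup is case-insensitive because both the indexed text and the query are lowercased before matching.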
10
Extract Document Themes
Now let’s focus on the window on the right side. You can use Oracle Text to extract the themes of a document. For example, this document from Scientific American Medicine is about diseases, lungs and drugs.
11
Generate the Gist You can also generate the gist (which is a summary) of a document.
12
Categorize Documents. Now let’s look at the left window. You can use Oracle Text to categorize your documents. Here we are using a medical thesaurus called MeSH to categorize the documents.
13
Text Mining. You can also do text mining. Here we are using the k-means algorithm to cluster and classify the documents.
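As a rough illustration of k-means clustering, the sketch below groups four documents by two term counts. It is a plain textbook k-means in Python, not the Oracle implementation; the term counts are hypothetical, and using the first k points as starting centroids is a simplifying assumption to keep the run deterministic.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means: returns a cluster label for each point."""
    centroids = [list(p) for p in points[:k]]  # deterministic start (assumption)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical term counts per document: (count of "lymphoma", count of "lung").
doc_vectors = [(5, 0), (0, 4), (4, 1), (1, 5)]
labels = kmeans(doc_vectors, k=2)
print(labels)
```

Documents that mention mostly "lymphoma" end up in one cluster and those that mention mostly "lung" in the other.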
14
Find a Cure for Lymphoma
Literature search on Lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead. Now we have identified several papers we are interested in. We want to set up a project workspace in Oracle Files where we can put all the information related to our research.
15
BioOracle Project In Oracle Files
Oracle Files is a feature of Collaboration Suite. Oracle Files lets you store documents in a way that is very similar to what you would do with Microsoft Windows folders. However, you are actually putting the data into the database. Notice that you can put all kinds of files into Oracle Files: your Excel spreadsheets, your images, and so on. Lymphoma project workspace after adding documents
16
BioOracle Project in Oracle Files
Oracle Files also provides revision control. Support revision control
17
BioOracle Project in Oracle Files
You can group a set of metadata attributes into a category and associate categories with a particular document. For example, here we associate the “records manage and discovery” category with the supplementary_information.doc document. Associate metadata (Categories) to a document.
18
BioOracle Project in Oracle Files
Now you can search based on categories. Oracle Files uses Oracle Text to do searches. Therefore, you can also do content searches across all your files. You can see that Oracle Files supports many different languages. Advanced Search
19
Approval Workflow. Oracle Files is also integrated with Oracle Workflow. Once you have done a whole day of experiments, you can compile your results in your electronic notebook and store them as a PDF file. You can then send it to your supervisor for approval. If you click the workflow link here, you can launch Oracle Workflow.
20
Approval Workflow. This is an example workflow. Workflow allows you to graphically define your business processes. For example, you can define a document routing and approval workflow.
21
BioOracle Project in Oracle Files
Oracle Files also provides access control. Only these people can see the lymphoma research project and each person has different privileges. Access Control
22
BioOracle Project in Oracle Files
Supports HTTP/WebDAV (Web), SMB (Windows), NFS (UNIX), AFP (Apple Mac), and FTP protocols. You can also access files stored in the database with familiar interfaces such as Windows Explorer. Here we are looking at the same folder through a Web folder using the WebDAV protocol. Oracle Files supports a variety of protocols, including HTTP/WebDAV, SMB, NFS, AFP, and FTP, so you can also FTP files and folders into Oracle. With Oracle managing your files, you get all the benefits of the database. For example, if you accidentally delete a file, you can restore it. Your files are also securely managed by Oracle.
23
Wireless Access Oracle Collaboration Suite provides you wireless access to the same project.
24
Highly Scalable, Worldwide Access
We at Oracle Corporation use Oracle Files in our worldwide operations. You can see that on June 30 there were 17 million documents in the system. The total space used was 4 TB. There were 59 thousand users created, and 24 thousand distinct users connected to the system, with the concurrent connections being [figure missing]. You can see that Oracle Files is a highly scalable system with worldwide access.
25
Find a Cure for Lymphoma
Literature search on Lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead. Now that we have found some interesting literature and set up a project workspace, we want to set up a meeting with a collaborator to discuss next steps.
26
Calendar. To do that we are using the Calendar tool in Collaboration Suite. Use calendar in Collaboration Suite to schedule meetings with collaborators
27
Internet Meeting. However, we discover that the person we want to collaborate with isn’t in the same location as we are, so we decide to have an iMeeting. This is also a feature of Collaboration Suite.
28
Protocol Sharing. The collaborator mentioned that he has a protocol that works especially well for staining cells, so we ask him to send us a copy. As the protocol is a hard copy, we decide to have it sent by fax. Using Collaboration Suite it is possible to receive faxes by e-mail, along with our regular e-mail and our voice mail.
29
Find a Cure for Lymphoma
Literature search on Lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead.
30
BioOracle Image Management
We also learnt from our collaborator that some interesting cell histology images have already been taken for lymphoma, so we decide to have a look at those. The feature that we are using for storing images is interMedia, a feature of Oracle database. Here we are loading an image into Oracle, and we can store additional information about the image. Use interMedia to manage and query Lymphoma histology data
31
BioOracle Image Management
Here we see a list of image thumbnails generated by Oracle interMedia. If we click one of the thumbnails, we can retrieve the image. Generate image thumbnails
32
BioOracle Image Management
Notice that the height and width here are automatically extracted by interMedia from the image. interMedia allows you to run integrated queries across your regular relational data, such as project name and scientist name, and the metadata associated with the images. You can also add your own annotations to an image. Integrated search across relational data and image attributes extracted
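The idea of one query spanning relational columns and extracted image attributes can be sketched as follows. This uses SQLite through Python's standard sqlite3 module purely as a stand-in; it does not use interMedia, and the table name, column names, and rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE histology_images (
           id INTEGER PRIMARY KEY,
           project TEXT, scientist TEXT,   -- regular relational data
           width INTEGER, height INTEGER,  -- attributes extracted from the image
           annotation TEXT                 -- user-supplied note
       )"""
)

# Hypothetical rows; in the demo the width and height come from the image itself.
rows = [
    ("Lymphoma", "Scientist A", 1024, 768, "follicular sample"),
    ("Lymphoma", "Scientist B", 640, 480, "low magnification"),
    ("Other", "Scientist A", 1280, 960, "control"),
]
conn.executemany(
    "INSERT INTO histology_images (project, scientist, width, height, annotation) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# One query filtering on a relational column and an image attribute together.
results = conn.execute(
    "SELECT scientist, width, height FROM histology_images "
    "WHERE project = ? AND width >= ? ORDER BY id",
    ("Lymphoma", 800),
).fetchall()
print(results)
```

Keeping the image attributes in the same table as the project data is what lets a single WHERE clause combine both kinds of condition.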
33
Molecular Pattern Recognition Filtering and Pre-Processing
Gene Expression Analysis for Lymphoma Biopsy Samples. Pipeline shown on the slide: Instruments (Affymetrix Microarray); Filtering and Pre-Processing (SQL, XML, Java); Feature Selection (SQL, Oracle Data Mining); Molecular Pattern Recognition (Oracle Data Mining, Bayesian Classifier); Interpretation of Results (Discoverer Reports, Portals, Java Servlets); Prediction: DLBC or Follicular. Use analytical pipeline to identify the patterns that differentiate DLBC from Follicular Lymphoma. Dataset from Golub et al Science 286:
From reading the literature we discovered that there are two main forms of lymphoma with very different treatment methods: DLBC and follicular lymphoma. If you look at the cell images, you will see that it is difficult to distinguish the two. So we would like to develop a diagnostic test to determine more easily which form of cancer a patient has. We decide to use an Affymetrix chip for a gene expression study. Once we have the data we put it straight into the Oracle database. We then manipulate the data in the database to prepare it for data mining. By using the analytical capabilities of the database, we aren’t taking the data out of its secure environment, which means that there is less chance that we will lose the data or have regulatory problems moving forward. Data mining can help us to distinguish the two cancer subtypes. Data mining could also help us to identify genetic markers of lymphoma. These markers can be potential targets to develop a lead against.
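The feature selection stage can be illustrated with a small Python sketch that ranks genes by how well they separate the two subtypes. The scoring rule here, a signal-to-noise ratio (difference of class means divided by the sum of class standard deviations), is an assumption chosen for illustration; the slides do not say which selection method the demo uses. Gene names and expression values are hypothetical.

```python
from statistics import mean, stdev

def signal_to_noise(values_a, values_b):
    """(mean_a - mean_b) / (std_a + std_b); larger magnitude = better separation.

    Assumes at least two samples per class and nonzero total spread.
    """
    return (mean(values_a) - mean(values_b)) / (stdev(values_a) + stdev(values_b))

def rank_genes(expression, top_n):
    """Return the top_n gene names by absolute signal-to-noise score."""
    scores = {gene: signal_to_noise(a, b) for gene, (a, b) in expression.items()}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)[:top_n]

# Hypothetical expression levels: gene -> (DLBC samples, follicular samples).
expression = {
    "GENE_A": ([10.0, 11.0, 9.0], [2.0, 3.0, 1.0]),
    "GENE_B": ([5.0, 6.0, 4.0], [5.5, 4.5, 5.0]),
    "GENE_C": ([1.0, 2.0, 3.0], [7.0, 8.0, 9.0]),
}
top = rank_genes(expression, top_n=2)
print(top)
```

Ranking by absolute value keeps genes that are strongly over-expressed in either subtype, since both directions are useful as markers.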
34
Find a Cure for Lymphoma
Literature search on Lymphoma; Set up a project workspace; Set up a meeting; Check lab protocols; Store cell histology images; Analyze gene expression results; Study the markers; Find a lead. Now let’s look at how you can use Oracle Data Mining to analyze gene expression data.
35
Oracle Data Mining Classification of Cancer Subtypes (DLBC versus Follicular)
We are going to use the wizards that are available with Oracle Data Mining to try to classify the samples as DLBC or follicular lymphoma. We have 77 patients and about 7,000 genes. Oracle provides wizards to guide analysts through data mining model creation
36
Oracle Data Mining Build a classification model
The first step is to build a model, and then train it to recognise a certain set of attributes or features. So we have clicked on ODM, and have selected the ‘Classification Model Build’. Build a classification model
37
Oracle Data Mining Working through the GUI wizard. You now need to choose the target field, which is DLBC or Follicular. Select the target field, e.g. DLBC or Follicular Lymphoma
38
Oracle Data Mining Select the classification model
You then need to select which algorithm you would like to use in the model. We decide to use Naïve Bayes because it is very fast. Select the classification model
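For readers unfamiliar with Naïve Bayes, the following Python sketch shows the basic idea on discretized data: count how often each feature value occurs in each class, then pick the class with the highest smoothed posterior. It is a textbook version with Laplace smoothing, not Oracle Data Mining's implementation, and the patient data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(samples, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    # value_counts[label][feature_index][value] -> count
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for features, label in zip(samples, labels):
        for i, value in enumerate(features):
            value_counts[label][i][value] += 1
    return class_counts, value_counts

def predict(model, features, n_values=2):
    """Pick the class with the highest log posterior (Laplace-smoothed).

    n_values is the number of distinct values a feature can take.
    """
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(features):
            seen = value_counts[label][i][value]
            score += math.log((seen + 1) / (count + n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical discretized expression of two genes per patient.
samples = [("high", "low"), ("high", "low"), ("high", "high"),
           ("low", "high"), ("low", "high"), ("low", "low")]
labels = ["DLBC", "DLBC", "DLBC", "Follicular", "Follicular", "Follicular"]
model = train(samples, labels)
prediction = predict(model, ("high", "low"))
print(prediction)
```

Training is a single counting pass over the data, which is one reason the algorithm is fast.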
39
Oracle Data Mining Test the model on the data set of interest
Now you have built and trained your model, you want to see how well it works. So you run the model on some data where you already know the outcome. Test the model on the data set of interest
40
Naïve Bayes has built a model that distinguishes DLBC from Follicular with 77% accuracy
This slide shows the outcome of the test. This confusion matrix shows how many times the model’s predictions were accurate. The result is that it is accurate 77% of the time. However, you want to have higher accuracy than that. The confusion matrix shows the number of times the model’s predictions are accurate
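How accuracy is read off a confusion matrix can be sketched in a few lines of Python. The labels below are hypothetical toy data chosen for illustration and are not the demo's results.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(actual, predicted))

def accuracy(matrix):
    """Fraction of cases where the prediction equals the actual label."""
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / sum(matrix.values())

# Hypothetical test labels, not the demo's results.
actual = ["DLBC", "DLBC", "DLBC", "Follicular", "Follicular"]
predicted = ["DLBC", "DLBC", "Follicular", "Follicular", "DLBC"]
matrix = confusion_matrix(actual, predicted)
acc = accuracy(matrix)
print(matrix, acc)
```

The off-diagonal cells also show which kind of mistake the model makes, which a single accuracy figure hides.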
41
Oracle Data Mining. So you decide to try the Adaptive Bayes Network (ABN) algorithm to see whether it gives better accuracy. See if the Adaptive Bayes Network algorithm can build a better model
42
Oracle Data Mining. ABN allows you to select a series of parameters that best describe your data. Use wizards to define parameters for building a model
43
Oracle Data Mining. This time when you run the model you get a result that is 84% accurate, which is a much better score than the 77% from Naïve Bayes. Adaptive Bayes Network algorithm can predict Lymphoma subtype with 84% accuracy
44
Oracle Data Mining The ABN algorithm can also generate rules for you to interpret the model. Adaptive Bayes Network algorithm generates rules for model interpretation
45
Oracle Data Mining in JDeveloper
A key advantage of Oracle Data Mining is that once you have completed the steps above, JDeveloper will automatically generate the code for you. You can reuse the Java code in your analytic pipeline. Automatically create the Java code needed to build analytical pipelines inside the database