CICC Chemical Compound Mining Workflows

CICC Chemical Compound Mining Workflows Jungkee (Jake) Kim Community Grids Laboratory

A Workflow for Big Red Demo I

Mining chemical compounds found in research paper texts and showing them in 3D graphics.
“Big Red” is one of the fastest supercomputers.

Workflow steps (data passed between steps in parentheses):
  PubMed Abstracts
  -> (text files) -> OSCAR3
  -> (XML files) -> SMILES Extraction
  -> (SMILES) -> Converting the format
  -> (SDF files) -> Molecular & Quantum Mechanics
  -> (SDF files) -> Converting to pictures
  -> (POV, JPG files) -> Generating HTML script

10/06/2006 CICC Project Meeting

A Workflow for Big Red Demo II

Final HTML pages

A Workflow for Big Red Demo III

PubMed abstracts (R. Guha)
  555,007 PubMed abstracts from 2005-2006 (part)
  1,000 abstracts distributed to each node (simple parallelism)
  511 nodes x 1,000 input abstracts used for the demo

OSCAR3
  A Cambridge tool that extracts chemical information from text and produces an XML instance highlighting the chemical information
  Used a revised version for convenient batch processing (some incompatibility with the Big Red architecture)

SMILES extraction
  Extracting SMILES elements from OSCAR's XML output files
  Unique SMILES list within a batch
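The batching and SMILES extraction steps on this slide can be sketched in Python. This is a minimal illustration, not the code used in the demo: the function names are hypothetical, and the sketch assumes that OSCAR3's XML output carries each structure in a SMILES attribute on the annotated element.

```python
from itertools import islice
import xml.etree.ElementTree as ET


def batches(items, size=1000):
    """Yield consecutive lists of at most `size` items (one list per node)."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def unique_smiles(xml_text, attr="SMILES"):
    """Return the distinct SMILES strings in one XML document, in first-seen order.

    Assumption: annotated elements expose the structure in an attribute
    named `attr`; adjust it to match the actual OSCAR3 output.
    """
    root = ET.fromstring(xml_text)
    seen = {}  # a dict keeps insertion order, so duplicates are dropped stably
    for elem in root.iter():
        smiles = elem.get(attr)
        if smiles:
            seen.setdefault(smiles, None)
    return list(seen)
```

With 1,000 abstracts per batch, the 511 batches used in the demo cover 511,000 of the 555,007 abstracts. Collecting `unique_smiles` results across the files of one batch into a single ordered dict would give the per-batch unique list the slide mentions.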

A Workflow for Big Red Demo IV

Generating 3D formats (K. Gilbert)
  Converting from SMILES to SDF format
  Molecular Mechanics program: “mengine” (MM engine)
  No Quantum Mechanics (QM) in the demo

Converting 3D formats to pictures (J. N. Huffman)
  Persistence of Vision Raytracer (POV-Ray): converting SDF to POV
  Another program which converts the POV files to JPEG format

Generating HTML script
  Showing those graphic files in an HTML page
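The last step, showing the rendered pictures in an HTML page, could look roughly like the following Python sketch. The demo's actual script is not shown in the slides, so the function name, the page layout, and the file names here are all hypothetical.

```python
from html import escape


def gallery_html(title, entries):
    """Build a simple HTML page from (smiles, image_path) pairs.

    Hypothetical helper: each rendered JPEG is shown with its SMILES
    string as the caption.
    """
    figures = []
    for smiles, image in entries:
        figures.append(
            f'<figure><img src="{escape(image, quote=True)}" '
            f'alt="{escape(smiles, quote=True)}">'
            f"<figcaption>{escape(smiles)}</figcaption></figure>"
        )
    body = "\n".join(figures)
    return (
        "<!DOCTYPE html>\n<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body>\n<h1>{escape(title)}</h1>\n{body}\n</body>\n</html>\n"
    )


if __name__ == "__main__":
    # Example file names are placeholders, not outputs of the real demo.
    print(gallery_html("Big Red Demo", [("CCO", "batch001/mol0001.jpg")]))
```

Escaping matters here because SMILES strings can contain characters such as `<`, `>`, or `&` in general text contexts, and file paths should not be trusted to be HTML-safe.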

Bigger Picture for the Workflow

Diagram components:
  NIH PubMed Database
  OSCAR Text Analysis
  Cluster Grouping
  Toxicity Filtering
  Docking
  Initial 3D Structure Calculation
  High Throughput Screening (HTS) Data Organization and Flagging
  Molecular Mechanics Calculations
  Quantum Mechanics Calculations
  NIH PubChem Database
  Big Red Demo
  IU’s Varuna Database
  POV-Ray Parallel Rendering