Published by Faith Thomas. Modified over 11 years ago.
1 Overview of Chemical Informatics and Cyberinfrastructure Collaboratory
Aug 16 2006
Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University, Bloomington IN 47401
gcf@indiana.edu
http://www.infomall.org
http://www.chembiogrid.org
2 Capabilities
Local teams, successful prototypes, and international collaboration set up in 3 initial major focus areas:
- Chemical Informatics Cyberinfrastructure/Grids with services, workflows, and demonstration uses, building on success in other applications (LEAD) and showing distributed integration of academic and commercial tools
- Computational Chemistry Cyberinfrastructure/Grids with simulation, databases, and TeraGrid use
- Education with courses and degrees
Review of activities suggests we also formalize work in two further areas:
- Chemical Informatics Research: model applicability
- Interfacing with the User: bench chemist-friendly portal
3 Current Status
- Web site: http://www.chembiogrid.org
- Wiki chosen to support the project as a shared editable web space
- Building a Collaboratory involving PubChem
  - Global Information System accessible anywhere and at any time
  - Enhance PubChem with distributed tools (clustering, simulation, annotation, etc.) and data
- Adopted Taverna as the workflow system because it is popular in bioinformatics, but we will evaluate other systems such as GPEL from LEAD
- Preparing a large set of runs on the local Big Red 23 Teraflop supercomputer (OSCAR3, CDK, Mopac)
- Initial results discussed at conferences and workshops and in papers: Gordon Conferences, ACS, SDSC tutorial
- First new Cheminformatics courses offered
- Advisory board set up and met
- Videoconferencing-based meetings with Peter Murray-Rust and group at Cambridge roughly every 2-3 weeks
- Good or potentially good interactions with NIH DTP, Scripps, Lilly, and Michigan ECCR
4 CICC Senior Personnel
Geoffrey C. Fox, Mu-Hyun (Mookie) Baik, Dennis B. Gannon, Marlon Pierce, Beth A. Plale, Gary D. Wiggins, David J. Wild, Yuqing (Melanie) Wu, Peter T. Cherbas, Mehmet M. Dalkilic, Charles H. Davis, A. Keith Dunker, Kelsey M. Forsythe, Kevin E. Gilbert, John C. Huffman, Malika Mahoui, Daniel J. Mindiola, Santiago D. Schnell, William Scott, Craig A. Stewart, David R. Williams
From Biology, Chemistry, Computer Science, and Informatics at IU Bloomington and IUPUI (Indianapolis)
5 CICC Advisory Board
- Alan D. Palkowitz (Eli Lilly)
- Chris Peterson (Kalypsys)
- David Spellmeyer (IBM)
- Dimitris K. Agrafiotis (Johnson & Johnson)
- Horst Hemmerle (Eli Lilly)
- James M. Caruthers (Purdue University)
- Jeremy G. Frey (University of Southampton)
- Joel Saltz (Ohio State University/University of Maryland/Johns Hopkins University)
- John M. Barnard (Digital Chemistry)
- John Reynders (Eli Lilly)
- Peter Murray-Rust (University of Cambridge)
- Peter Willett (University of Sheffield)
- Thompson Doman (Eli Lilly)
- Val Gillet (University of Sheffield)
Members drawn from industry and academia. The board met in October 2005 and will meet again this fall.
6 CICC Combines Grid Computing with Chemical Informatics
CICC: Chemical Informatics and Cyberinfrastructure Collaboratory
Funded by the National Institutes of Health
www.chembiogrid.org
Indiana University Department of Chemistry, School of Informatics, and Pervasive Technology Laboratories

Science and Cyberinfrastructure. Large Scale Computing Challenges
Chemical informatics is a non-traditional area of high performance computing, but many new, challenging problems may be investigated.
CICC is an NIH-funded project to support the chemical informatics needs of High Throughput Cancer Screening Centers. The NIH is creating a deluge of publicly available data on potential new drugs.
CICC supports the NIH mission by combining state-of-the-art chemical informatics techniques with:
- World-class high performance computing
- National-scale computing resources (TeraGrid)
- Internet-standard web services
- International activities for service orchestration
- Open distributed computing infrastructure for scientists worldwide

Diagram labels: NIH PubMed database; OSCAR text analysis; POVRay parallel rendering; initial 3D structure calculation; toxicity filtering; cluster grouping; docking; molecular mechanics calculations; quantum mechanics calculations; IU's Varuna database; NIH PubChem database

Chemical informatics text analysis programs can process hundreds of thousands of abstracts of online journal articles to extract chemical signatures of potential drugs.
OSCAR-mined molecular signatures can be clustered, filtered for toxicity, and docked onto larger proteins. These are classic pleasingly parallel tasks. Top-ranking docked molecules can be further examined for drug potential.
Big Red (and the TeraGrid) will also enable us to perform time-consuming, multi-step quantum chemistry calculations on all of PubMed.
Results go back to public databases that are freely accessible by the scientific community.
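The "pleasingly parallel" point can be illustrated with a minimal Python sketch. It assumes each molecule can be filtered and docked independently of every other molecule, which is what makes the work easy to spread across many processors. The names `process_molecule` and `run_pipeline`, the `is_toxic` and `dock` callables, and the lower-is-better scoring convention are all hypothetical stand-ins, not CICC code; a local thread pool stands in for Big Red or the TeraGrid, and the clustering stage is left out because it is not a per-molecule step.

```python
from concurrent.futures import ThreadPoolExecutor


def process_molecule(record, is_toxic, dock):
    """Run the independent per-molecule stages.

    Returns None when the molecule is filtered out, otherwise a
    (record, score) pair. Both callables are caller-supplied stand-ins.
    """
    if is_toxic(record):
        return None
    return (record, dock(record))


def run_pipeline(records, is_toxic, dock, top_n=10, max_workers=4):
    """Filter and score every record in parallel, then rank the survivors.

    No molecule depends on any other, so the map step needs no
    communication between workers. Assumption: lower score is better.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(
            pool.map(lambda r: process_molecule(r, is_toxic, dock), records)
        )
    kept = [pair for pair in results if pair is not None]
    kept.sort(key=lambda pair: pair[1])
    return kept[:top_n]


if __name__ == "__main__":
    # Toy stand-ins only: these are NOT real toxicity or docking models.
    toy_records = ["CCO", "CCX", "C", "CCCC"]
    top = run_pipeline(
        toy_records,
        is_toxic=lambda s: "X" in s,
        dock=len,
        top_n=2,
    )
    print(top)
```

Because the per-molecule function takes no shared state, the same structure could in principle be handed to a batch scheduler or a process pool instead of threads; only the executor would change.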
7 CICC Prototype Web Services
Basic cheminformatics:
- Molecular weights
- Molecular formulae
- Tanimoto similarity
- 2D structure diagrams
- Molecular descriptors
- 3D structures
- InChI generation/search
- CMLRSS
Application based services:
- Compare (NIH)
- Toxicity predictions (ToxTree)
- Literature extraction (OSCAR3)
- Clustering (BCI Toolkit)
- Docking, filtering,... (OpenEye)
- Varuna simulation
Next steps?
- Define WSDL interfaces to enable global production of compatible Web services; refine CML
- Look at Pipeline Pilot
- Extend Computational Chemistry (Varuna) Services
- Routine TeraGrid Big Red use
- Ready to try Prototype Production on OSCAR3, CDK, Mopac
- Develop more training material
- Link to screening center via Scripps
Key Ideas:
- Add value to PubChem with additional distributed services and databases
- Wrapping existing code in web services is not difficult
- Provide core (CDK) services and exemplars of typical tools
- Provide access to key databases via a web service interface
- Provide access to major Compute Grids
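As an illustration of two items on this slide, the Python sketch below computes a Tanimoto similarity and wraps it behind a small request handler. It assumes fingerprints are given as collections of on-bit indices, and it uses a JSON-in, JSON-out function in place of the WSDL/SOAP interfaces the slide mentions. The names `tanimoto` and `handle_request` and the `fp1`/`fp2` field names are hypothetical; this is not the CICC or CDK service API.

```python
import json


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two fingerprints.

    Each fingerprint is a collection of on-bit indices. The value is
    |A intersect B| / |A union B|, ranging from 0.0 to 1.0.
    """
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints match.
        return 1.0
    return len(a & b) / union


def handle_request(body):
    """Hypothetical service entry point: JSON text in, JSON text out.

    The existing function is left untouched; the wrapper only decodes
    the request, calls it, and encodes the result or an error.
    """
    try:
        request = json.loads(body)
        score = tanimoto(request["fp1"], request["fp2"])
    except (ValueError, KeyError, TypeError) as exc:
        return json.dumps({"error": str(exc)})
    return json.dumps({"tanimoto": score})


if __name__ == "__main__":
    print(handle_request('{"fp1": [1, 2, 3, 4], "fp2": [3, 4, 5]}'))
```

The wrapper adds only decoding, a call, and encoding around an unchanged function, which is the sense in which wrapping existing code is "not difficult". A production service would still need a transport layer and a published interface description.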
8 Varuna environment for molecular modeling (Baik, IU)
Diagram labels: QM Database; Researcher; Simulation Service; FORTRAN Code, Scripts; Chemical Concepts; Experiments; QM/MM Database; PubChem, PDB, NCI, etc.; ChemBioGrid; Reaction DB; DB Service; Queries, Clustering, Curation, etc.; Papers etc.; Condor; TeraGrid; Supercomputers; Flocks