Biosciences Working Group Update
Wilfred W. Li, Ph.D., UCSD, USA
Habibah Wahab, Ph.D., USM, Malaysia
Hosted by AIST
Sapporo, Japan, Oct 17-20, 2011
Transparent access to applications on Avian Flu Grid through middleware
– CNIC Duckling Portal
– Konkuk/Kukmin Glyco-M*Grid
– NBCR CADD
Opal WS: Transparent Access Layer for Applications
[Diagram labels]
Opal App, MGLTools, Kepler, CADD, Vistrails, Taverna
Opal Application Services
Globus, Condor, CSF4
Grid/Cloud Resources: Condor pool, TeraGrid/PRAGMA Grid, PBS/SGE Clusters
CADD: Opal Web Services for Biomedical Applications
Ren et al, NAR 2010, Web Server Issue
– Modules supporting MD simulation and analysis, Virtual Screening, Docking, Visualization
– Project management under development
Opal Plugins for Popular Workflow Software
Virtualization for Bioscience Applications
MURPA 2010, Lin Wei
Integration: CNIC Duckling Portal and Opal 2 Client
PRIME 2010, Brian Zhang
OPAL (NBCR) and DUCKLING (CNIC)
[Diagram labels] User Management, Opal Web Service Client, Application Metadata, Submit Job (Service URL), Output URL, Metadata Cache, Job History, Application UIs, Opal Service List, Job Result, Web Service
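The labels above suggest a flow in which the portal caches application metadata, submits a job to a service URL, receives an output URL, and keeps a job history. The following is a minimal sketch of that flow under stated assumptions. It does not use the real Opal client API or the Duckling code: `PortalClient`, `JobRecord`, the two stub functions, and the `example.org` URLs are all hypothetical names introduced for illustration.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class JobRecord:
    """One entry in the portal's job history (hypothetical structure)."""
    service_url: str
    arguments: str
    output_url: str


@dataclass
class PortalClient:
    """Hypothetical portal-side client: metadata cache plus job history."""
    fetch_metadata: Callable[[str], dict]   # stands in for a metadata call
    submit: Callable[[str, str], str]       # stands in for a job submission
    metadata_cache: Dict[str, dict] = field(default_factory=dict)
    job_history: List[JobRecord] = field(default_factory=list)

    def get_metadata(self, service_url: str) -> dict:
        # Contact the service only on a cache miss.
        if service_url not in self.metadata_cache:
            self.metadata_cache[service_url] = self.fetch_metadata(service_url)
        return self.metadata_cache[service_url]

    def submit_job(self, service_url: str, arguments: str) -> JobRecord:
        self.get_metadata(service_url)      # the UI would be built from this
        output_url = self.submit(service_url, arguments)
        record = JobRecord(service_url, arguments, output_url)
        self.job_history.append(record)
        return record


# Stubs standing in for a remote service; nothing here touches the network.
fetch_calls: List[str] = []
job_ids = itertools.count(1)


def fake_fetch_metadata(service_url: str) -> dict:
    fetch_calls.append(service_url)
    return {"name": "demo-app", "service_url": service_url}


def fake_submit(service_url: str, arguments: str) -> str:
    return f"https://portal.example.org/results/job{next(job_ids)}"


client = PortalClient(fake_fetch_metadata, fake_submit)
service = "https://services.example.org/demo-app"
client.submit_job(service, "-in ligand1.pdbqt")
client.submit_job(service, "-in ligand2.pdbqt")
```

After the two submissions the metadata stub has been called once, and the job history holds two records with their output URLs.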
Wendy Fong, PRIME 2010, CNIC
Social Networks and Collaborative Environment
Social Network Site | Number of Users | Features | API Examples
Google | 170 million (Gmail) | Google Integrated Suite of Tools | Google App Engine
LinkedIn | 65 million | Professional | Huddle/Zoho Office Online
Twitter | 100 million | Short MMS/SMS | TwitPic
Google Wave | 100,000 X 7? | Upload any file | Google Wave Robot
Facebook | 500 million+ | Social network | Facebook Apps
Are these too big to fail? Utility Computing finally?
TEXT MINING SYSTEM
InSilicoCell System architecture
[Diagram labels] Sentence selector, Relation extractor, Information element recognizer, Data handler, MetaMap Client Tool, NCBI data downloader, Network Generator, Visualizer, Information handler
KISTI, Seok Jong Yu
BioKnowledge Viewer GUI
University of Indonesia Working Group
Database
– Prototype of Medicinal Plants Database and Three Dimensional Structure of the Chemical Compounds from Medicinal Plants in Indonesia
– Medicinal Plants Database and Three Dimensional Structure of the Chemical Compounds from Medicinal Plants in Indonesia, Int J Comp Sci Issue, 2011, 8(5):
Members:
– Prof. Heru Suhartanto, Ph.D. (High Performance and Numerical Computing)
– Dr. Arry Yanuar (Pharmaceutical Chemistry)
– Alhadi Bustamam, Ph.D. (GPU Computing)
– Dr. Abdul Mun'im (Phytochemistry)
PRAGMA Institute and Related Training Activities
2 – 4 March 2010, PRAGMA 18, San Diego
Nornisah Mohamed, USM
Hierarchical Map Reduce (HMR)
Yuan Luo, IU
Application: AutoDock Virtual Screening
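The general idea of a hierarchical map-reduce can be sketched as follows: each cluster runs an ordinary map-reduce over its own share of the input, and a global step then reduces the per-cluster results. This is a minimal single-process sketch under stated assumptions, not the HMR implementation presented on the slide. The function names, receptor and ligand labels, and binding energies are hypothetical, and no docking program is invoked; the energies are made-up inputs standing in for docking output.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Tuple


def local_map_reduce(records: Iterable[Any],
                     map_fn: Callable[[Any], Iterable[Tuple[Any, Any]]],
                     reduce_fn: Callable[[Any, List[Any]], Any]) -> Dict[Any, Any]:
    """Ordinary map-reduce over one cluster's share of the input."""
    groups: Dict[Any, List[Any]] = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def hierarchical_map_reduce(partitions: List[List[Any]],
                            map_fn, reduce_fn, global_reduce_fn) -> Dict[Any, Any]:
    """Run a local map-reduce per partition, then reduce across partitions."""
    local_results = [local_map_reduce(p, map_fn, reduce_fn) for p in partitions]
    merged: Dict[Any, List[Any]] = defaultdict(list)
    for result in local_results:
        for key, value in result.items():
            merged[key].append(value)
    return {key: global_reduce_fn(key, values) for key, values in merged.items()}


def dock_map(record):
    """Emit (receptor, (energy, ligand)); a real run would dock here."""
    receptor, ligand, energy = record
    yield receptor, (energy, ligand)


def best_pose(key, values):
    """Keep the lowest (most favorable) energy for each receptor."""
    return min(values)


# One list per cluster; all values are illustrative, not real results.
partitions = [
    [("receptorA", "lig1", -7.2), ("receptorA", "lig2", -8.1)],
    [("receptorA", "lig3", -6.5), ("receptorB", "lig4", -9.0)],
]
best = hierarchical_map_reduce(partitions, dock_map, best_pose, best_pose)
```

Because taking a minimum is associative, the same function can serve as both the local and the global reducer; a reducer such as a mean would need to carry counts through the local step.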
Meeting the New Challenges
Virtualization
– What does it mean to us?
– Fault Tolerance, Redundancy, Location-based Access to Services
Production environment
– Where is it? What form should it take?
– The good old clusters, Services, EC2, VM replication
– Changing infrastructure and the rise of social cloud networks for routine file sharing: Google Docs, Dropbox, etc.
Collaboration
– How to stay in touch better?
– PRIME, MURPA, PRAGMA Institute, NCHC, CADD Workshop, USM, NBCR Summer Institute
– Shared Environment for Data, Services, and Interaction
Looking Ahead
– NBCR/SDSC and KISTI: co-development of Opal Plugin for Bioworks; scholastic exchange
– UCSD and HKU: Use CADD pipeline in Alzheimer’s Disease Research
– KISTI and HKU: Use Bioworks in Alzheimer’s Disease Research
– UCSD and University of Indonesia: Security requirement for proprietary compound library and Opal services
Looking Ahead
PRAGMA Resources:
– Redundant VM-based Application Services Distributed Geographically for Location-based Access
– Data resource, interoperability of cloud resources
– Opal App for Biomedical Services
– Leveraging Google App Engine and Social Network Infrastructure
– Duckling Portal with Opal 2.4 support
PRAGMA 21 Activities
Day 1
– WG Breakout Session 1: 13:30 – 15:00
  Improved 3D structure modeling workflow, Jason Haga, UCSD
  CADD pipeline, Wilfred Li, NBCR/UCSD
  InSilicoCell, Seok Jong Yu, KISTI
– Demo: Kevin Dong (CNIC), 15:20, today on Opal Duckling Portal
Day 2
– WG Breakout Session 2: 14:45 – 15:45
– WG update, 16:15 – 16:45
Day 1 Breakout Session Summary
Kevin Dong, CNIC
– Opal Duckling Portal
  User notification of job completion, and job data deletion warning
  Data cloud access: how to reduce the data management and sharing overhead?
Wilfred Li, NBCR/UCSD
– CADD pipeline
  Service maintenance, versioning, and virtualization
  Redundancy in application service providers; hoping the Resource WG makes good progress with VM provisioning
Day 1 Summary
Jason Haga, UCSD
– Opal-OP and Modeller for homology modeling
  Student deployment versus stable service provider via PRAGMA
  Data management needs long-term storage until the data are no longer necessary
  Different virtual cluster deployment methods: NCHC, Rocks, Osaka U, JLU, …
Hsin-Yen Chen, ASGC
– Web-based portal for virtual screening and analysis based upon gLite
– Expanded resource usage through BOINC
– Virtualized computing environment under consideration
Day 1 Summary
Seok Jong Yu, KISTI
– InSilicoCell, text mining tool for interaction pathway
  Worked with HKU on Alzheimer’s Disease
  Experimental validation through case studies, with Korean Ginseng Corp.
  Explore web service APIs as cloud service providers
  Backend is KISTI cluster system
Tony Cheung, University of Hong Kong (HKU)
– HKU Computer Center working with SDSC/UCSD to deploy Opal services
– Gaussian application, MPI BLAST
Application Services
Explore VM-based service replication and dynamic resource expansion
– Protein Electrostatic Calculations
  PDB2PQR, APBS
– Virtual Screening and Computer Aided Drug Discovery
  AutoDock, Vina
– MEME and other Bioinformatics applications
  Homology modeling with Modeller
– Cheminformatics applications
Data Services
Data service that is compatible with VM-based services
– Without data storage that matches the data sizes VM-based services are expected to create, VM services are not useful
– Without good global network connections, most services would be location-based to maximize performance
– Data sharing is transient and often requires ad hoc rather than persistent high-bandwidth network infrastructure. Next-generation sequencing actually creates more persistent needs for sharing large amounts of data, and for data security
Service Scalability
GPU cluster deployment for speedup of specific types of applications
– Porting applications requires domain knowledge
Workflow systems that can select application services wisely based upon location and other quality of service information
– Vision, Bioworks
Ease of sharing and a positive user experience are a must
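Selecting an application service by location and other quality of service information can be sketched as a simple cost ranking over replicas. This is a minimal sketch under stated assumptions, not the selection logic of Vision, Bioworks, or any other workflow system. The `ServiceReplica` fields, the weights, the region names, and the `example.org` URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServiceReplica:
    """Hypothetical description of one deployed copy of a service."""
    url: str
    region: str
    latency_ms: float      # measured round-trip time from the client
    queue_length: int      # jobs already waiting at this replica
    available: bool = True


def cost(replica: ServiceReplica, user_region: str,
         region_bonus_ms: float = 50.0,
         queue_penalty_ms: float = 20.0) -> float:
    """Lower is better: latency plus queue penalty, minus a same-region bonus."""
    value = replica.latency_ms + queue_penalty_ms * replica.queue_length
    if replica.region == user_region:
        value -= region_bonus_ms
    return value


def select_replica(replicas: List[ServiceReplica],
                   user_region: str) -> Optional[ServiceReplica]:
    """Return the cheapest available replica, or None if all are down."""
    candidates = [r for r in replicas if r.available]
    if not candidates:
        return None
    return min(candidates, key=lambda r: cost(r, user_region))


# Illustrative replicas; all numbers are made up.
replicas = [
    ServiceReplica("https://sd.example.org/services/autodock", "us", 180.0, 0),
    ServiceReplica("https://hk.example.org/services/autodock", "asia", 40.0, 5),
    ServiceReplica("https://tw.example.org/services/autodock", "asia", 60.0, 1),
    ServiceReplica("https://jp.example.org/services/autodock", "asia", 20.0, 0,
                   available=False),
]
chosen = select_replica(replicas, user_region="asia")
```

With these numbers the replica with the lowest latency is skipped because it is unavailable, and the nearest available replica loses to a slightly slower one with a shorter queue. The weights are a design choice and would need tuning against real measurements.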
Collaboration, Education, and Training
Engage local researchers for collaboration
– HKU and PRAGMA 20: great interaction between HKU researchers and Biosciences WG. Thanks to Dr. Kwan and his dedicated team
PRAGMA Institute, NCHC, aka SEAIP
– Fang Pang Lin, Center of Excellence of Pacific Rim in Cyber Education and Research Collaboration
CADD Workshop, USM
– Habibah Wahab
Others
NBCR Summer Institute, UCSD
– Computer Aided Drug Discovery
– Scalable Computing
PRIME, UCSD
– UCSD to Pacific Rim countries
MURPA, Monash University
– MU students to US
Benchmarks for Success
Joint Publications
– Co-authorship
– Use cases of service, software and infrastructure, aka acknowledgment
Co-location of Workshops
– Infectious Disease Research, KISTI, PRAGMA 16, 3/09
  Attract target audience to specialized workshop as opposed to more IT-oriented PRAGMA workshop
– GEO Science Workshop, PRAGMA 17, 20, 21
Benchmarks
– PRAGMA Institute on Virtualization and Implementation, PRAGMA 18
  Unfortunately, older websites no longer exist; the Duckling portal, starting with PRAGMA 18, is a really good thing.
Benchmarks
Technology adopted and improved, Biosciences WG
– Duckling portal
– Gfarm
– CSF4
– Network?
– TDW, SAGE
– Opal 2, Opal OP
– Rocks and virtualization
– gLite, globus?
– Data turbine
– Social networks?
– Workflow systems