KISTI’s Activities on the NA4 Biomed Cluster Soonwook Hwang, Sunil Ahn, Jincheol Kim, Namgyu Kim and Sehoon Lee KISTI e-Science Division.

Slides:



Advertisements
Similar presentations
Current methods for negotiating firewalls for the Condor ® system Bruce Beckles (University of Cambridge Computing Service) Se-Chang Son (University of.
Advertisements

University of Southampton Electronics and Computer Science M-grid: Using Ubiquitous Web Technologies to create a Computational Grid Robert John Walters.
1 DB2 Access Recording Services Auditing DB2 on z/OS with “DBARS” A product developed by Software Product Research.
A Computation Management Agent for Multi-Institutional Grids
Grid and CDB Janusz Martyniak, Imperial College London MICE CM37 Analysis, Software and Reconstruction.
Task Scheduling and Distribution System Saeed Mahameed, Hani Ayoub Electrical Engineering Department, Technion – Israel Institute of Technology
Multi-criteria infrastructure for location-based applications Shortly known as: Localization Platform Ronen Abraham Ido Cohen Yuval Efrati Tomer Sole'
Performed by:Gidi Getter Svetlana Klinovsky Supervised by:Viktor Kulikov 08/03/2009.
16: Distributed Systems1 DISTRIBUTED SYSTEM STRUCTURES NETWORK OPERATING SYSTEMS The users are aware of the physical structure of the network. Each site.
Microsoft ® Application Virtualization 4.5 Infrastructure Planning and Design Series.
A tool to enable CMS Distributed Analysis
SERVICE BROKER. SQL Server Service Broker SQL Server Service Broker provides the SQL Server Database Engine native support for messaging and queuing applications.
Capacity Planning in SharePoint Capacity Planning Process of evaluating a technology … Deciding … Hardware … Variety of Ways Different Services.
Overview SAP Basis Functions. SAP Technical Overview Learning Objectives What the Basis system is How does SAP handle a transaction request Differentiating.
Makrand Siddhabhatti Tata Institute of Fundamental Research Mumbai 17 Aug
WP6: Grid Authorization Service Review meeting in Berlin, March 8 th 2004 Marcin Adamski Michał Chmielewski Sergiusz Fonrobert Jarek Nabrzyski Tomasz Nowocień.
Scalability By Alex Huang. Current Status 10k resources managed per management server node Scales out horizontally (must disable stats collector) Real.
The ATLAS Production System. The Architecture ATLAS Production Database Eowyn Lexor Lexor-CondorG Oracle SQL queries Dulcinea NorduGrid Panda OSGLCG The.
1 port BOSS on Wenjing Wu (IHEP-CC)
FKPPL workshop May 2012 BUI The Quang Prof. Vincent Breton Prof. Doman Kim Prof. NGUYEN Hong Quang Prof. PHAM Quoc Long Grid enabled in silico drug discovery.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Building Grid-enabled Virtual Screening Service.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGEE Application Case Study: Distributed.
Scalable Systems Software Center Resource Management and Accounting Working Group Face-to-Face Meeting October 10-11, 2002.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Julia Andreeva CERN (IT/GS) CHEP 2009, March 2009, Prague New job monitoring strategy.
Grid Technologies  Slide text. What is Grid?  The World Wide Web provides seamless access to information that is stored in many millions of different.
Wenjing Wu Andrej Filipčič David Cameron Eric Lancon Claire Adam Bourdarios & others.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Information System on gLite middleware Vincent.
INFSO-RI Enabling Grids for E-sciencE Distributed Metadata with the AMGA Metadata Catalog Nuno Santos, Birger Koblitz 20 June 2006.
Grid infrastructure analysis with a simple flow model Andrey Demichev, Alexander Kryukov, Lev Shamardin, Grigory Shpiz Scobeltsyn Institute of Nuclear.
Stuart Wakefield Imperial College London Evolution of BOSS, a tool for job submission and tracking W. Bacchi, G. Codispoti, C. Grandi, INFN Bologna D.
1 Chapter Overview Performing Configuration Tasks Setting Up Additional Features Performing Maintenance Tasks.
November SC06 Tampa F.Fanzago CRAB a user-friendly tool for CMS distributed analysis Federica Fanzago INFN-PADOVA for CRAB team.
Bulk Data Movement: Components and Architectural Diagram Alex Sim Arie Shoshani LBNL April 2009.
Enabling Grids for E-sciencE EGEE-III INFSO-RI I. AMGA Overview What is AMGA Metadata Catalogue of EGEE’s gLite 3.1 Middleware Main Feature of.
Ihr Logo Operating Systems Internals & Design Principles Fifth Edition William Stallings Chapter 2 (Part II) Operating System Overview.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks WMSMonitor: a tool to monitor gLite WMS/LB.
Server to Server Communication Redis as an enabler Orion Free
George Tsouloupas University of Cyprus Task 2.3 GridBench ● 1 st Year Targets ● Background ● Prototype ● Problems and Issues ● What's Next.
INTRODUCTION TO DBS Database: a collection of data describing the activities of one or more related organizations DBMS: software designed to assist in.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks INFSO-RI Enabling Grids for E-sciencE.
INFSO-RI Enabling Grids for E-sciencE EGEE Review WISDOM demonstration Vincent Bloch, Vincent Breton, Matteo Diarena, Jean Salzemann.
 CMS data challenges. The nature of the problem.  What is GMA ?  And what is R-GMA ?  Performance test description  Performance test results  Conclusions.
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
Timeshared Parallel Machines Need resource management Need resource management Shrink and expand individual jobs to available sets of processors Shrink.
Tier3 monitoring. Initial issues. Danila Oleynik. Artem Petrosyan. JINR.
INFSO-RI Enabling Grids for E-sciencE Use Case of gLite Services Utilization. Multiple Ligand Trajectory Docking Study Jan Kmuníček.
Overview Background: the user’s skills and knowledge Purpose: what the user wanted to do Work: what the user did Impression: what the user think of Ganga.
Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.
1 A Scalable Distributed Data Management System for ATLAS David Cameron CERN CHEP 2006 Mumbai, India.
Enabling Grids for E-sciencE CMS/ARDA activity within the CMS distributed system Julia Andreeva, CERN On behalf of ARDA group CHEP06.
Gennaro Tortone, Sergio Fantinel – Bologna, LCG-EDT Monitoring Service DataTAG WP4 Monitoring Group DataTAG WP4 meeting Bologna –
UCI Large-Scale Collection of Application Usage Data to Inform Software Development David M. Hilbert David F. Redmiles Information and Computer Science.
D.Spiga, L.Servoli, L.Faina INFN & University of Perugia CRAB WorkFlow : CRAB: CMS Remote Analysis Builder A CMS specific tool written in python and developed.
The GridPP DIRAC project DIRAC for non-LHC communities.
Tutorial on Science Gateways, Roma, Catania Science Gateway Framework Motivations, architecture, features Riccardo Rotondo.
Solving inter grid interoperability issues: example of WISDOM and molecular dynamics Jean Salzemann, LPC Clermont-Ferrand, CNRS/IN2P3 Morris Riedel, Jülich.
DGAS Distributed Grid Accounting System INFN Workshop /05/1009, Palau Giuseppe Patania Andrea Guarise 6/18/20161.
Enabling Grids for E-sciencE INFN Workshop – May 7-11 Rimini 1 Grid Accounting Status at INFN Riccardo Brunetti INFN-TORINO.
Spark on Entropy : A Reliable & Efficient Scheduler for Low-latency Parallel Jobs in Heterogeneous Cloud Huankai Chen PhD Student at University of Kent.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
Enabling Grids for E-sciencE Agreement-based Workload and Resource Management Tiziana Ferrari, Elisabetta Ronchieri Mar 30-31, 2006.
BDII Performance Tests
Ruslan Fomkin and Tore Risch Uppsala DataBase Laboratory
Nicolas Jacq LPC, IN2P3/CNRS, France
University of Technology
DGAS Today and tomorrow
Building Windows Store Apps with Windows Azure Mobile Services
Information Services Claudio Cherubino INFN Catania Bologna
Presentation transcript:

KISTI’s Activities on the NA4 Biomed Cluster Soonwook Hwang, Sunil Ahn, Jincheol Kim, Namgyu Kim and Sehoon Lee KISTI e-Science Division

Outline  DrugScreener-G:  Toward Integrated Environment for Grid- enabled Large-scale Virtual Screening  Data challenge on Diabetes using the newly developed performance- enhanced WISDOM Production environment  New Developments in AMGA

DrugScreener-G: Toward Integrated environment for Grid-enabled large-scale Virtual Screening

DrugScreener-G  Objective  Providing a user-friendly integrated environment for Grid-based large-scale virtual screening for users without much knowledge of Grid computing to exploit full power of Grid computing infrastructure for drug discovery  Target users: Bioinformaticians, Biologists, Drug Chemists

Virtual Screening in DrugScreener-G

Plug-in Architecture in DrugScreener-G Plug-ins on client’s side Plug-ins on server’s side

Support for multiple platforms

Data Challenge on Diabetes

Data Challenge against diabetes type 2  Data challenge to Human pancreatic amylase inhibitor (1u2y), a target protein of diabetes type 2, with chemical compounds  Achieved significant improvement in efficiency and throughput WISDOM-IIDIANEKISTI Total number of dockings156,407,400308,585308,310 Estimated duration on 1 CPU413 years16.7 years39.0 years Duration of the experiment76 days30 days2.4 days Cumulative number of Grid jobs77,5042,580103,583 Maximum number of concurrent CPUs ,370 Number of used Computing Elements Crunching Factor Distribution Efficiency39%84%81%

Performance Enhancement in the WISDOM Production Environment

Evolution of WISDOM Production Environment  Data/Control Flows in one slide WN SE CERB UI Production Environment Retrieve Target Submit Jobs Monitor Job Status Update Job Status Get Task Retrieve Software, Compound Send Heartbeat signal Update Job Status Store Result AMGA

The exploitation of AMGA for storing ligands and docking results  Why is AMGA used for the input/output data storing purpose?  AMGA performs way much better than SE in terms of throughput and latency Docking Throughput Time taken to retrieve a ligandTime taken to store a docking result

The use of AMGA for task distribution purpose in the pull model of WISDOM PE  A new AMGA operation is designed to allow jobs to fetch a task from the AMGA server in a single operation, rather than calling  the AMGA source code modified and a new operation “updateAttr_single” added  The throughput of task retrieval using the new operation is 70 times higher than that of using the existing AMGA operations

Enhanced Monitoring & Bookkeeping Workload status and Job throughput monitoring Status summery and simulation throughput monitoring Job distribution and execution status with CE info

Additional Features relevant to Performance Improvement of WISDOM PE  Flexible fault tolerance  Can recover a previous instance using data on AMGA  Can increase/decrease the number of agents of a running previous instance  Improved Job distribution efficiency  Dynamic management of the list of available RB and CE  While the initial job submission keeps going, we can check the status of submitted jobs and resubmit failed jobs  Direct collecting the statistical data of sites from WN  Its own ranking based on previous statistical data  Timeout feature (submission timeout, heartbeat timeout)

New Developments in AMGA

AMGA 2.0  Currently preparing AMGA 2.0 release  To be released later this year  Stability of WS-DAIR server needs to be improved before release  Feature-complete 1.9 technology preview available  Support for the import of existing relational tables  WS-DAIR frontend  Native SQL support  Multi-threaded DB backend with connection pooling

WS-DAIR Support  WS-DAIR  OGF standard for access to relational DB's on the Grid  Background  Enable interoperability to facilitates sharing data  AMGA uses its own protocol and message patterns  Integration of WS-DAIR in AMGA will make AMGA a relational data source in a WS-based environment!  AMGA-DAIR Integration  Implementation of direct access and indirect access done.  Data returned in Sun's WebRowSet specification.  All queries use SQL.  Clients written in C++ (gSOAP)

SQL Queries  AMGA allows queries now in native SQL  Support for entry level SQL 92 done  Some 92 intermediate level supported  SELECT, UPDATE, INSERT, DELETE  All queries are subject to AMGA access restrictions  All queries are parsed by AMGA, AMGA “understands” security implications  Posix ACLs for tables/entries  SQL 92 query is translated into backend DB dialect

Multithreaded & DB Connection Pool Backend Support  AMGA 1.3 used one process per client connection  Processes allow control of misbehaving clients  But processes cannot share DB connections  Limits # concurrent clients Max # Requests / sec # AMGA Servers AMGA 1.9: Multiple threads sharing a DB connection encapsulated in multiple processes Threads allow better resource usage on DB WISDOM measured 1500 Queries/s, more than the required 5million Queries / h peak needed for data challanges

Thank you