GridChem A Computational Chemistry Cyber-infrastructure Using Web services Sanibel Symposium 23 Feb 07 Sudhakar Pamidighantam NCSA, University of Illinois.


1 GridChem A Computational Chemistry Cyber-infrastructure Using Web services Sanibel Symposium 23 Feb 07 Sudhakar Pamidighantam NCSA, University of Illinois at Urbana-Champaign sudhakar@ncsa.edu

2 Acknowledgements

3 Outline Historical Background Grid Chemistry Current Status Web Services Usage Brief Demo Future

4 Motivation Software: reasonably mature and easy to use for addressing chemists' questions of interest. Community of users: need the software and are capable of using it; some are non-traditional computational chemists. Resources: various in capacity and capability.

5 Background Quantum Chemistry Remote Job Monitor (Quantum Chemistry Workbench), 1998, NCSA. Chemviz, 1999-2001, NSF. Technologies: web-based client-server models, visual interfaces, distributed computing.

6 GridChem NCSA Alliance was commissioned in 1998. Diverse HPC systems were deployed both at NCSA and at Alliance partner sites. Batch schedulers differed between sites. Policies favored different classes and modes of use at different sites/HPC systems.

7 Extended TeraGrid Facility www.teragrid.org

8 Grid and Gridlock Alliance led to Physical Grid. Grid led to TeraGrid. A homogeneous grid was planned, but it was difficult to keep it homogeneous. Things got more complicated and we have heterogeneous grids now! Interoperability, standards, and openness are critical.

9 Current Grid Status Grid Hardware Middleware Scientific Applications

10 User Community
Chemistry and Computational Biology User Base, Sep 03 – Oct 04

        NRAC        AAB         Small Allocations
#PIs    26          23          64
#SUs    5,953,100   1,374,100   640,000

11

12 User Issues New systems meant learning new commands, porting codes, learning new job submission and monitoring protocols, and writing new proposals for time. Computational modeling became more popular and the number of users increased. Batch queues are longer and waiting times increased. Finding resources on which to compute, probably at multiple distributed sites. Multiple proposals/allocations/logins. Authentication and data security. Data management.

13 Computational Chemistry Grid Integrated Cyber Infrastructure for Computational Chemistry Integrates Applications, Middleware, HPC resources, Scheduling and Data management Allocations, User Services and Training

14 Resources

System (Site)                      Procs Avail   Total CPU Hours/Year   Status
Intel Cluster (OSC)                36            315,000                SMP and Cluster nodes
HP Integrity Superdome (UKy)       33            290,000                TB Replaced with an SMP/Cluster nodes
IA32 Linux Cluster (NCSA)          64            560,000
Intel Cluster (LSU)                1024          1,000,000
IBM Power4 (TACC)                  16            140,000
TeraGrid (Multiple Institutions)   -             250,000                New Allocation Expected

15 Other Resources Extant HPC resources at various Supercomputer Centers (Interoperable) Optionally Other Grids and Hubs/local/personal resources These may require existing allocations/Authorization

16

17 GridChem System [Diagram: user, Portal Client, Grid Middleware Proxy Server, Grid Services, Grid, applications, Mass Storage] http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0438312

18 Applications
GridChem supports some apps already
–Gaussian 98/03, GAMESS, NWChem, Molpro, QMCPack, Amber
Schedule of integration of additional software
–ACES-2
–Crystal
–Q-Chem
–WIEN2k
–MCCCS Towhee
–More …..

19 Gridchem Middleware Web Services Oriented

20 WS
XML is used to tag the data, SOAP to transfer the data, WSDL to describe the services available, and UDDI (Universal Description, Discovery, and Integration) to list what services are available.
Web Services differ from web page systems or web servers:
–There is no GUI.
–Web Services share business logic, data, and processes with each other through an API (not with the user).
–Web Services describe a standard way of interacting with "web based" applications.
A client program connecting to a web service can read the WSDL to determine what functions are available on the server. Any special datatypes used are embedded in the WSDL file in the form of XML Schema.
WSRF standards compliant.
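The point that a client can read the WSDL to find out which functions a server offers can be sketched as follows. This is a minimal illustration using only the Python standard library; the sample WSDL document, its service name, and its operation names (`login`, `submitJob`) are hypothetical and are not taken from the GridChem services.

```python
import xml.etree.ElementTree as ET

# Namespace of WSDL 1.1 documents.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Hypothetical, minimal WSDL 1.1 document used only for illustration.
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions name="ExampleService"
    targetNamespace="http://example.org/ExampleService"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <portType name="ExamplePortType">
    <operation name="login"/>
    <operation name="submitJob"/>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return {portType name: [operation names]} for a WSDL 1.1 document."""
    root = ET.fromstring(wsdl_text)
    result = {}
    for port_type in root.findall(f"{{{WSDL_NS}}}portType"):
        operations = port_type.findall(f"{{{WSDL_NS}}}operation")
        result[port_type.get("name")] = [op.get("name") for op in operations]
    return result


if __name__ == "__main__":
    print(list_operations(SAMPLE_WSDL))
```

A real client would normally hand the WSDL to a SOAP toolkit instead of parsing it by hand; the sketch only shows that the list of available operations is plain XML that any program can inspect.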

21 Client → Objects → Database Interaction
Diagram: WS Resources, DTO, Client Objects, Business Model, DAO, Hibernate (hb.xml), Database
–DTO (Data Transfer Object): serialized for transfer through XML
–DAO (Data Access Object): how to get the DB objects
–hb.xml (Hibernate Data Map): describes object/column data mapping
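The split between DTO and DAO described on this slide can be sketched roughly as below. This is an illustration only: it uses Python's `sqlite3` in place of Hibernate, and the `JobDTO` and `JobDAO` names, fields, and table layout are assumptions made for the example, not the GMS_WS classes.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class JobDTO:
    """Data Transfer Object: plain fields, serialized to XML for transfer."""
    job_id: int
    job_name: str
    cost: float

    def to_xml(self):
        elem = ET.Element("job")
        for key, value in asdict(self).items():
            ET.SubElement(elem, key).text = str(value)
        return ET.tostring(elem, encoding="unicode")


class JobDAO:
    """Data Access Object: the only place that knows how jobs are stored."""

    def __init__(self, conn):
        self.conn = conn
        # Hypothetical table layout; a Hibernate map would describe this instead.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs "
            "(job_id INTEGER PRIMARY KEY, job_name TEXT, cost REAL)"
        )

    def save(self, dto):
        self.conn.execute(
            "INSERT INTO jobs VALUES (?, ?, ?)",
            (dto.job_id, dto.job_name, dto.cost),
        )

    def find(self, job_id):
        row = self.conn.execute(
            "SELECT job_id, job_name, cost FROM jobs WHERE job_id = ?",
            (job_id,),
        ).fetchone()
        return JobDTO(*row) if row else None


if __name__ == "__main__":
    dao = JobDAO(sqlite3.connect(":memory:"))
    dao.save(JobDTO(1, "water_opt", 2.5))
    print(dao.find(1).to_xml())
```

Keeping storage details inside the DAO means the object sent to the client never needs to know which database or mapping layer sits behind the service.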

22 Database Table Relationships
Tables: Users, Projects, Resources, UserProjectResource, UsersResources, Jobs
Resource types: SoftwareResources, ComputeResources, NetworkResources, StorageResources
Resources columns: resourceID, Type, hostName, IPAddress, siteID
User/project/resource link columns: userID, projectID, resourceID, loginName, SUsLocalUserUsed
Jobs columns: jobID, jobName, userID, projID, softID, cost
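As a rough sketch of how two of these tables might be declared and queried, the example below uses the column names listed on the slide with SQLite. The column types, the choice of primary keys, and the `total_cost_by_project` helper are assumptions for illustration; the slide does not specify them.

```python
import sqlite3

# Column names follow the slide; the types and keys are assumed.
SCHEMA = """
CREATE TABLE Resources (
    resourceID INTEGER PRIMARY KEY,
    Type       TEXT,
    hostName   TEXT,
    IPAddress  TEXT,
    siteID     INTEGER
);
CREATE TABLE Jobs (
    jobID   INTEGER PRIMARY KEY,
    jobName TEXT,
    userID  INTEGER,
    projID  INTEGER,
    softID  INTEGER,
    cost    REAL
);
"""


def total_cost_by_project(conn, user_id):
    """Sum the recorded job cost per project for one user (hypothetical helper)."""
    rows = conn.execute(
        "SELECT projID, SUM(cost) FROM Jobs "
        "WHERE userID = ? GROUP BY projID ORDER BY projID",
        (user_id,),
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Jobs VALUES (1, 'water_opt', 10, 100, 1, 2.0)")
    print(total_cost_by_project(conn, 10))
```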

23 Computational Chemistry Resource

24 GMS_WS Use Cases Authentication Job Submission Resource Monitoring File Retrieval http://www.gridchem.org:8668/space/GMS/usecase

25 GMS_WS Authentication
WSDL (Web Services Description Language) is a language for describing how to interface with XML-based services. It describes network services as a set of endpoints operating on messages with either document-oriented or procedure-oriented information. The service interface is called the port type.
WSDL FILE: <definitions name="MathService" targetNamespace="http://www.globus.org/namespaces/examples/core/MathService_instance" xmlns="http://schemas.xmlsoap.org/wsdl/" …
http://www.gridchem.org:8668/space/GMS/usecase
Participants: GC Client, GMS
Steps shown:
–Contact GMS
–Creates Session, Session RP and EPR
–Sends EPR
–Login Request (username:passwd)
–Validates, Loads UserProjects
–Sends acknowledgement
–Retrieve UserProjects (GetResourceProperty port type, PT)

26 GMS_WS Authentication
http://www.gridchem.org:8668/space/GMS/usecase
Participants: GC Client, GMS
Steps shown:
–Validates, Loads UserProjects
–Sends acknowledgement
–Selects project
–LoadVO port type (with MAC address)
–Verifies user/project/MAC address
–Load UserResources RP
–Retrieve UserResources [as userVO/Profile] (GetResourceProperty port type, PT)
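The authentication exchange on the two slides above can be read as three calls followed by property lookups. The sketch below simulates that reading in memory. It is not the GMS implementation: the class, its method names, the use of a random key in place of an EPR, and the plain-text password comparison are all simplifications assumed for illustration.

```python
import uuid


class GmsSessionSketch:
    """In-memory stand-in for the GMS side of the exchange (not the real service)."""

    def __init__(self, accounts, resources):
        # accounts: {username: {"password": str, "projects": {project: set of MAC addresses}}}
        # resources: {project: [resource names]}
        self.accounts = accounts
        self.resources = resources
        self.sessions = {}

    def contact(self):
        """Create a session and return its key (standing in for the EPR)."""
        epr = str(uuid.uuid4())
        self.sessions[epr] = {"user": None, "properties": {}}
        return epr

    def login(self, epr, username, password):
        """Validate credentials and load the UserProjects property."""
        account = self.accounts.get(username)
        if account is None or account["password"] != password:
            return False
        session = self.sessions[epr]
        session["user"] = username
        session["properties"]["UserProjects"] = sorted(account["projects"])
        return True

    def load_vo(self, epr, project, mac_address):
        """Verify user/project/MAC address, then load the UserResources property."""
        session = self.sessions[epr]
        account = self.accounts.get(session["user"], {})
        allowed = account.get("projects", {}).get(project, set())
        if mac_address not in allowed:
            return False
        session["properties"]["UserResources"] = list(self.resources.get(project, []))
        return True

    def get_resource_property(self, epr, name):
        """Stand-in for the GetResourceProperty port type."""
        return self.sessions[epr]["properties"].get(name)
```

A client would call `contact()`, then `login()`, read `UserProjects`, call `load_vo()` for the chosen project, and finally read `UserResources`.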

27 GMS_WS Job Submission
PT = portType; RP = Resource Properties; DTO = Data Transfer Object
Participants: GC Client, GMS
Steps shown:
–Create Job object
–PredictJobStartTime PT + JobDTO
–JobStart Prediction RP
–If decision OK, SubmitJob PT + JobDTO
–Create Job object
–API: Submit
–Store Job Object
–Send Acknowledgement
Notes:
–Need to check to make sure allocation time is available.
–Submission: CoGKit, GAT, "gsi-ssh"
–Completion: email from batch system to GMS server; cron@GMS → DB
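The note about checking that allocation time is available before submitting could look something like the sketch below. The `JobDTO` fields, the `submit_job` function, and the rule that one service unit (SU) equals one processor-hour are assumptions made for illustration; the slide does not define how cost is computed.

```python
from dataclasses import dataclass


@dataclass
class JobDTO:
    """Hypothetical job description sent by the client."""
    job_name: str
    resource: str
    processors: int
    wall_hours: float

    def requested_sus(self):
        # Assumption: one service unit (SU) equals one processor-hour.
        return self.processors * self.wall_hours


def submit_job(job, balances, store):
    """Accept the job only if the resource's remaining allocation covers it.

    balances maps resource name to remaining SUs; store is the list that
    stands in for the job table.
    """
    remaining = balances.get(job.resource, 0.0)
    if job.requested_sus() > remaining:
        return {"accepted": False, "reason": "insufficient allocation"}
    store.append(job)
    return {"accepted": True, "job_index": len(store) - 1}


if __name__ == "__main__":
    jobs = []
    print(submit_job(JobDTO("water_opt", "cluster-a", 8, 10.0), {"cluster-a": 100.0}, jobs))
```

The sketch does not debit the balance at submission time, on the reading that the actual cost is recorded later when the completion notice arrives.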

28 GMS_WS Monitoring
PT = portType; RP = Resource Properties; DTO = Data Transfer Object; DB = Data Base
Participants: GC Client, GMS, Resources/Kits/DB
Steps shown:
–Request for Job, Resource Status, Alloc. Balance
–UserResource RP updated from DB
–Send info
–Parse XML, Display
Notifications:
–cron@HPC Servers, Job Launcher, VO Admin email
–cron@GMS server parses email → DB (status + cost)
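The step in which a cron job on the GMS server parses notification email and writes status and cost to the database might be sketched as follows. The message body format (`job_id=… status=… cost=…`), the `jobs` table layout, and the `apply_notification` function are hypothetical; real batch systems send their own formats.

```python
import re
import sqlite3
from email import message_from_string

# Hypothetical notification body format, assumed only for this example.
PATTERN = re.compile(r"job_id=(\d+)\s+status=(\w+)\s+cost=([\d.]+)")


def apply_notification(conn, raw_email):
    """Parse one notification email and update the matching job row.

    Returns True when exactly one row was updated, False otherwise.
    """
    body = message_from_string(raw_email).get_payload()
    match = PATTERN.search(body)
    if match is None:
        return False
    job_id = int(match.group(1))
    status = match.group(2)
    cost = float(match.group(3))
    cursor = conn.execute(
        "UPDATE jobs SET status = ?, cost = ? WHERE job_id = ?",
        (status, cost, job_id),
    )
    return cursor.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, status TEXT, cost REAL)")
    conn.execute("INSERT INTO jobs VALUES (7, 'RUNNING', 0.0)")
    print(apply_notification(conn, "Subject: done\n\njob_id=7 status=COMPLETED cost=12.5\n"))
```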

29 GMS_WS File Retrieval
PT = portType; RP = Resource Properties; DTO = Data Transfer Object; MSS = Mass Storage System
Participants: GC Client, GMS, Resources/Kits/DB
Steps shown:
–GetResourceProperty PT, FileDTO(?)
–LoadFile PT (project folder + job)
–Validates project folder owned by user.
–Send new listing
–Job completion: send output to MSS
–LoadFile PT
–MSS query
–UserFiles RP + FileDTO object
–Retrieve root dir. listing on MSS with CoGKit or GAT or "gsi-ssh"
–API file request
–Store locally
–Create FileDTO
–Load into UserData RP
–RetrieveFiles PT (+ file rel. path)
–Retrieve file: CoGKit or GAT or "gsi-ssh"
–GetResourceProperty PT
Open question: Should the whole directory be evaluated (it may be large)? Why not just the files owned by the user?

30 GMS_WS File Retrieval
PT = portType; RP = Resource Properties; DTO = Data Transfer Object; MSS = Mass Storage System
Participants: GC Client, GMS, Resources/Kits/DB
Steps shown:
–RetrieveJobOutput PT (+ JobDTO)
–Job record from DB. Running: from resource. Complete: from MSS.
–Retrieve file: CoGKit or GAT or "gsiftp"
–Create FileDTO (?)
–Load into UserData RP
–GetResourceProperty PT
Open question: Should the whole directory be evaluated (it may be large)? Why not just the files owned by the user?
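Two of the checks raised on the file retrieval slides, keeping a requested relative path inside the user's project folder and listing only entries owned by the user, can be sketched as below. The function names, the POSIX-style paths, and the `{"path", "owner"}` entry layout are assumptions for illustration and do not describe how GMS_WS or the mass storage system actually represent files.

```python
import posixpath


def resolve_user_path(user_root, relative_path):
    """Return the normalized absolute path if it stays inside user_root, else None."""
    root = posixpath.normpath(user_root)
    candidate = posixpath.normpath(posixpath.join(root, relative_path))
    if candidate == root or candidate.startswith(root + "/"):
        return candidate
    return None


def list_owned(entries, user):
    """Keep only the paths of listing entries whose owner is the given user.

    entries is a list of dicts with "path" and "owner" keys (hypothetical layout).
    """
    return [entry["path"] for entry in entries if entry["owner"] == user]


if __name__ == "__main__":
    print(resolve_user_path("/mss/alice/proj1", "job42/output.log"))
    print(resolve_user_path("/mss/alice/proj1", "../../bob/secret"))
```

Filtering by owner before building the listing is one way to answer the open question on the slide, since it avoids evaluating entries that the user could not retrieve anyway.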

31 Web Services WSRF (Web Services Resource Framework) Compliant
WSRF Specifications: WS-ResourceProperties (WSRF-RP), WS-ResourceLifetime (WSRF-RL), WS-ServiceGroup (WSRF-SG), WS-BaseFaults (WSRF-BF)

% ps -aux | grep ws
/usr/java/jdk1.5.0_05/bin/java \
    -Dlog4j.configuration=container-log4j.properties \
    -DGLOBUS_LOCATION=/usr/local/globus \
    -Djava.endorsed.dirs=/usr/local/globus/endorsed \
    -DGLOBUS_HOSTNAME=derrick.tacc.utexas.edu \
    -DGLOBUS_TCP_PORT_RANGE=62500,64500 \
    -Djava.security.egd=/dev/urandom \
    -classpath /usr/local/globus/lib/bootstrap.jar:/usr/local/globus/lib/cog-url.jar:/usr/local/globus/lib/axis-url.jar \
    org.globus.bootstrap.Bootstrap org.globus.wsrf.container.ServiceContainer -nosec

Annotations:
–Logging configuration (-Dlog4j.configuration)
–Where to find Globus (-DGLOBUS_LOCATION)
–Where to get the random seed for encryption key generation (-Djava.security.egd)
–Classpath (required jars)

32 Software Organization CVS for GridChem

33 Package: org.gridchem.service.gms GMS_WS

34 + Should these each be a separate package?

35 GMS_WS
Packages: model, dto, credential, job, notification, file, file.task, job.task, user, exceptions, resource, persistence, synch, query, test, util, dao, gpir, crypt, enumerators, gat, proxy, client, audit, gms
Descriptions (in the order shown on the slide):
–Classes for WSRF service implementation (PT)
–Command-line tests to mimic client requests
–Data Access Object: queries the DB via persistent classes (Hibernate)
–Data Transfer Object: (Job, File, Hardware, Software, User) XML
–How to handle errors (exceptions)
–CCG service business model (how to interact)
–Contains the user's credentials for job submission, file browsing, …
–"Oversees correct" handling of user data (get/put file)
–Defines Job, utilities, and enumerations (SubmitTask, KillTask, …)
–CCGResource and utilities, synched by GPIR; abstract classes NetworkRes., ComputeRes., SoftwareRes., StorageRes., VisualizationRes.
–User (has attributes: Preference/Address)
–DB operations (CRUD), OR maps, pool management, DB session
–Classes that communicate with other web services
–Periodically updates the DB with GPIR info (GPIR calls)
–JUnit service tests (gms.properties): authentication, VO retrieval, resource query, synch, job management, file management, notification
–Contains utility and singleton classes for the service
–Encryption of login password
–Mapping from GMS_WS enumeration classes → DB
–GAT utility classes: GATContext and GAT Preferences generation
–Classes that deal with CoGKit configuration
–Autonomous notification via email, IM, text message

36 GMS_WS external jars Testing For XML Parsing “Java” Document Object Model –Lightweight –Reading/Writing XML Docs –Complements SAX (parser) & DOM –Uses Collections**

37 Authentication

38 Resource Status

39 Job Editor

40 Job Submission

41 Job Monitoring

42 Gradient Monitoring

43 Energy Monitoring

44 Post Processing

45 Visualization Molecular Visualization Electronic Properties Spectra Vibrational Modes

46 Molecular Visualization Better molecule representations (Ball and Stick/VDW/MS) in Nanocad Molecular Editor. Third-party visualizer integration: Chime/VMD. Export possibilities to other interfaces. Deliver standard file formats (XML, SDF, MSF, SMILES, etc.).

47 Eigen Function Visualization Molecular Orbital/Fragment Orbital MO Density Visualization MO Density Properties Other functions Radial distribution functions

48 Some example Visuals Arginine Gamess/6-31G* Total electronic density 2D - Slices

49 Electron Density in 3D Interactive (VRML)

50 Orbital 2D Displays N2 6-31g* Gamess

51 Orbital 3D VRML

52 Spectra IR/Raman Vibrational Spectra UV Visible Spectra Spectra to Normal Modes Spectra to Orbitals

53 GridChem Use Allocation Community and External Registration Consulting/User Services Ticket tracking, Allocation Management Documentation Training and Outreach FAQ Extraction, Tutorials, Dissemination

54 Users and Usage 170 users, including academic PIs, two graduate classes, and about 15 training users. NCSA: 57,000 SUs plus a 7-node dedicated system. UKy: around 106,766 SUs. OSC: 13,820 SUs plus a 14-node dedicated system. Usage at LSU and TACC as well. More than 335,000 CPU wall hours since Jan 06.

55 Science Enabled Chemical Reactivity of the Biradicaloid (HO...ONO) Singlet States of Peroxynitrous Acid. The Oxidation of Hydrocarbons, Sulfides, and Selenides. Bach, R. D.; Dmitrenko, O.; Estévez, C. M. J. Am. Chem. Soc. 2005, 127, 3140-3155. The "Somersault" Mechanism for the P-450 Hydroxylation of Hydrocarbons. The Intervention of Transient Inverted Metastable Hydroperoxides. Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006, 128(5), 1474-1488. The Effect of Carbonyl Substitution on the Strain Energy of Small Ring Compounds and their Six-member Ring Reference Compounds Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006,128(14), 4598.

56 Science Enabled Azide Reactions for Controlling Clean Silicon Surface Chemistry: Benzylazide on Si(100)-2×1. Semyon Bocharov, Olga Dmitrenko, Lucila P. Mendez De Leo, and Andrew V. Teplyakov*, Department of Chemistry and Biochemistry, University of Delaware, Newark, Delaware 19716. Received April 13, 2006; E-mail: andrewt@udel.edu http://pubs.acs.org.proxy2.library.uiuc.edu/cgi-bin/asap.cgi/jacsat/asap/pdf/ja0623663.pdf [May require ACS access]

57 Third Year Plans Post Processing New Application Support Expansion of Resources Extension Plan

58 Acknowledgments Rion Dooley, TACC, Middleware Infrastructure; Stelios Kyriacou, OSC, Middleware Scripts; Chona Guiang, TACC, Databases and Applications; Kent Milfeld, TACC, Database Integration; Kailash Kotwani, NCSA, Applications and Middleware; Scott Brozell, OSC, Applications and Testing; Michael Sheetz, UKy, Application Interfaces; Vikram Gazula, UKy, Server Administration; Tom Roney, NCSA, Server and Database Maintenance

