
Production Cyberenvironment for a Computational Chemistry Grid: PRAGMA13, NCSA, 26 Sep 07

Production Cyberenvironment for a Computational Chemistry Grid. PRAGMA13, NCSA, 26 Sep 07. Sudhakar Pamidighantam, NCSA, University of Illinois at Urbana-Champaign, sudhakar@ncsa.edu.


Presentation Transcript


  1. Production Cyberenvironment for a Computational Chemistry Grid. PRAGMA13, NCSA, 26 Sep 07. Sudhakar Pamidighantam, NCSA, University of Illinois at Urbana-Champaign, sudhakar@ncsa.edu. National Center for Supercomputing Applications

  2. Acknowledgements

  3. Outline • Historical Background Grid Computational Chemistry • Production Environments • Current Status Web Services • Usage (Grid and Science Achievements) • Brief Demo • Future

  4. Motivation • Software: reasonably mature and easy to use for the questions chemists want to address • Community of users: need the software and are capable of using it; some are non-traditional computational chemists • Resources: varied in capacity and capability

  5. Background • Quantum Chemistry Remote Job Monitor (Quantum Chemistry Workbench), 1998, NCSA • Chemviz, 1999-2001, NSF (USA), http://chemviz.ncsa.uiuc.edu • Technologies: web-based client-server models, visual interfaces, distributed computing (Condor)

  6. GridChem • The NCSA Alliance was commissioned in 1998 • Diverse HPC systems were deployed both at NCSA and at Alliance partner sites • Batch schedulers differed between sites • Policies favored different classes and modes of use at different sites and HPC systems

  7. Extended TeraGrid Facility www.teragrid.org

  8. NSF Petascale Road Map • Track 1: a multi-petaflop, single-site system to be deployed by 2010; several consortia are competing (now under review) • Track 2: sub-petaflop systems; several are to be deployed until Track 1 is online • The first will be at TACC (450 TFlops, 50,000 processors/cores), available Fall 2007 • NCSA is deploying a 110 TFlops system in April 2007 (10,000 processors/cores) • The second sub-petaflop system is being reviewed

  9. Grid and Gridlock • The Alliance led to a physical Grid • The Grid led to TeraGrid • A homogeneous Grid with a predefined, fixed software and system stack was planned (TeraGrid), but it was difficult to keep it homogeneous • Local preferences and diversity now lead to heterogeneous grids (operating systems, schedulers, policies, software and services) • Openness and standards that lead to interoperability are critical for successful services

  10. Current Grid Status [Diagram labels: Interfaces, Grid Hardware, Scientific Applications, Middleware]

  11. User Community: Chemistry and Computational Biology User Base, Sep 03 – Oct 04
                NRAC         AAB          Small Allocations
        #PIs    26           23           64
        #SUs    5,953,100    1,374,100    640,000

  13. Some User Issues Addressed by the New Services • New systems meant learning new commands • Porting codes • Learning new job submission and monitoring protocols • New proposals for time (time for new proposals) • Computational modeling became more popular and the number of users increased (user management) • Batch queues are longer and waiting times increased • Finding where to compute became complicated, probably across multiple distributed sites • Multiple proposals/allocations/logins • Authentication and data security • Data management

  14. Computational Chemistry Grid • This is a Virtual Organization • Integrated cyberinfrastructure for computational chemistry • Integrates applications, middleware, HPC resources, scheduling and data management • Allocations, user services and training

  15. Resources The initial Access Grid Testbed nodes (38) and Condor SGI resources (NCSA, 512 nodes) have been retired this year.

  16. Other Resources • Extant HPC resources at various supercomputer centers (interoperable) • Optionally, other grids and hubs, local or personal resources • These may require existing allocations/authorization

  18. GridChem System [Architecture diagram labels: users, Portal, Client, applications, Grid Middleware, Proxy Server, Grid Services, Mass Storage, Grid] http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0438312

  19. Applications • GridChem supports some apps already • Gaussian, GAMESS, NWChem, Molpro, ADF, QMCPack, Amber • Schedule of integration of additional software • ACES-3 • Crystal • Q-Chem • Wien2K • MCCCS Towhee • Others...

  20. GridChem User Services: Allocation Request https://www.gridchem.org/allocations/comm_form.php

  21. GridChem User Services: Consulting Ticketing System, User View

  22. GridChem User Services: Consulting Ticketing System, Consultants View https://www.gridchem.org/consult/

  23. GridChem Middleware Service (GMS)

  24. GridChem Web Services: Quick Primer • Web services are different from web page systems or web servers: there is no GUI • Web services share business logic, data and processes through APIs with each other (not with the user) • Web services describe a standard way of interacting with "web based" applications • XML is used to tag the data, SOAP is used to transfer the data, WSDL is used for describing the services available, and UDDI (Universal Description, Discovery, and Integration) is used for listing what services are available • A client program connecting to a web service can read the WSDL to determine what functions are available on the server • Any special datatypes used are embedded in the WSDL file in the form of XML Schema • WSRF standards compliant
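To make the "XML tags the data, SOAP transfers it" point concrete, here is a minimal sketch of wrapping a request in a SOAP 1.1 envelope using only the Python standard library. The operation name GetJobStatus appears in the operations list on a later slide, but the message layout and the `jobID` parameter element are assumptions made for illustration and are not taken from the actual GMS WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
GMS_NS = "http://www.gridchem.org/gms"  # namespace from the WSDL excerpt; message layout is assumed


def build_request(operation: str, params: dict) -> bytes:
    """Wrap one operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{GMS_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{GMS_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(xml_bytes: bytes):
    """Recover the operation name and parameters from an envelope."""
    envelope = ET.fromstring(xml_bytes)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    op = body[0]
    operation = op.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in op}
    return operation, params


message = build_request("GetJobStatus", {"jobID": 42})
print(parse_request(message))
```

In a real deployment a SOAP toolkit would generate these messages from the WSDL; the sketch only shows what travels on the wire.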

  25. GridChem Web Services: Client Objects and Database Interaction • DTO (Data Transfer Object): serializes transfer through XML • DAO (Data Access Object): how to get the DB objects • hb.xml (Hibernate data map): describes the object/column data mapping [Diagram labels: Client, Business Model, DTO, DAO, WS Resources, Objects, Hibernate, hb.xml, Database]
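The DTO/DAO split can be illustrated with a small sketch. The service itself is Java with Hibernate; the version below is a Python stand-in that uses `sqlite3` in place of Hibernate, and the class names `JobDTO` and `JobDAO` are hypothetical. The field names are borrowed from the Jobs entry on the data-model slide.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import asdict, astuple, dataclass


@dataclass
class JobDTO:
    """Data Transfer Object: carries job data and serializes it through XML."""
    jobID: int
    jobName: str
    userID: int
    projID: int
    softID: int
    cost: float

    def to_xml(self) -> str:
        root = ET.Element("job")
        for key, value in asdict(self).items():
            ET.SubElement(root, key).text = str(value)
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "JobDTO":
        get = ET.fromstring(text).findtext
        return cls(int(get("jobID")), get("jobName"), int(get("userID")),
                   int(get("projID")), int(get("softID")), float(get("cost")))


class JobDAO:
    """Data Access Object: the only place that knows how jobs map to columns."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs (jobID INTEGER PRIMARY KEY, "
            "jobName TEXT, userID INTEGER, projID INTEGER, softID INTEGER, cost REAL)"
        )

    def save(self, job: JobDTO) -> None:
        self.conn.execute("INSERT OR REPLACE INTO jobs VALUES (?, ?, ?, ?, ?, ?)",
                          astuple(job))

    def find(self, job_id: int):
        row = self.conn.execute(
            "SELECT jobID, jobName, userID, projID, softID, cost "
            "FROM jobs WHERE jobID = ?", (job_id,)).fetchone()
        return JobDTO(*row) if row else None
```

Keeping the column mapping inside the DAO plays the role that hb.xml plays for Hibernate: client code only ever sees DTOs.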

  26. GridChem Data Models [Entity diagram] • Entities: Users, Projects, Resources, UserProjectResource, Jobs • Resource types: SoftwareResources, NetworkResources, ComputeResources, StorageResources • Resources: resourceID, Type, hostName, IPAddress, siteID • UserProjectResource: userID, projectID, resourceID, loginName, SUsLocalUserUsed • Jobs: jobID, jobName, userID, projID, softID, cost

  27. Computational Chemistry Resource

  28. GMS_WS Use Cases http://www.gridchem.org:8668/space/GMS/usecase • Authentication • Job Submission • Resource Monitoring • Job Monitoring • File Retrieval • …

  29. GridChem Web Services Operations • GetResourceProperty • SetTerminationTime • Destroy • Create • Login • LoadVO • RetrieveFiles • LoadFiles • DeleteFiles • LoadParentFiles • RefreshFiles • MakeDirectory • SubmitJob • SubmitMultipleJobs • PredictJobStartTime • KillJob • HideJob • UnhideJob • UnhideJobs • DeleteJob • FindJobs • GetJobStatus • RetrieveJobOutput • RetrieveNextDataBlock • StopFileAction • GetUserPreferences • PutUserPreferences

  30. GMS_WS Authentication http://www.gridchem.org:8668/space/GMS/usecase [Sequence diagram: GridChem Client and GMS] • Client contacts GMS • GMS creates a Session, Session RP and EPR, and sends the EPR (like a cookie, but more than that) • Client sends a login request (username:passwd) • GMS validates, loads UserProjects and sends an acknowledgement • Client retrieves UserProjects (GetResourceProperty port type [PT]) • WSDL (Web Services Description Language) is a language for describing how to interface with XML-based services. It describes network services as a set of endpoints operating on messages with either document-oriented or procedure-oriented information. • The service interface is called the port type • WSDL file: <?xml version="1.0" encoding="UTF-8"?> <definitions name="GMS" targetNamespace="http://www.gridchem.org/gms" xmlns="http://schemas.xmlsoap.org/wsdl/" …

  31. GMS_WS Authentication http://www.gridchem.org:8668/space/GMS/usecase [Sequence diagram: GridChem Client and GMS] • Client selects a project (LoadVO port type, with MAC address) • GMS verifies user/project/MAC address, loads the UserResources RP and sends an acknowledgement • GMS validates, loads UserProjects and sends an acknowledgement • Client retrieves UserResources as userVO/Profile (GetResourceProperty port type [PT])
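The two authentication slides describe a stateful exchange: contact, receive an EPR, log in, select a project, then read resource properties. A minimal in-memory sketch of that sequence follows. The class `GMSSketch`, its method names and the plain-text password table are all hypothetical stand-ins for illustration; the real service is a WSRF container and would not store passwords this way.

```python
import secrets


class GMSSketch:
    """In-memory stand-in for the GMS session handshake (illustration only)."""

    def __init__(self, users: dict, resources: dict):
        # users: {username: {"password": str, "projects": {project: mac_address}}}
        # resources: {project: [host names]}
        self._users = users
        self._resources = resources
        self._sessions = {}

    def contact(self) -> str:
        """Create a session and hand back an opaque reference (the EPR's role)."""
        epr = secrets.token_hex(16)
        self._sessions[epr] = {}
        return epr

    def login(self, epr: str, username: str, password: str) -> str:
        user = self._users.get(username)
        if user is None or user["password"] != password:
            raise PermissionError("login rejected")
        session = self._sessions[epr]
        session["user"] = username
        session["UserProjects"] = sorted(user["projects"])
        return "ack"

    def load_vo(self, epr: str, project: str, mac_address: str) -> str:
        """Verify user/project/MAC address, then load the project's resources."""
        session = self._sessions[epr]
        if "user" not in session:
            raise PermissionError("not logged in")
        user = self._users[session["user"]]
        if user["projects"].get(project) != mac_address:
            raise PermissionError("user/project/MAC address mismatch")
        session["UserResources"] = list(self._resources.get(project, []))
        return "ack"

    def get_resource_property(self, epr: str, name: str):
        return self._sessions[epr][name]
```

A client would call `contact()`, then `login()`, read `UserProjects`, call `load_vo()` for the chosen project, and finally read `UserResources`, mirroring the order on the slides.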

  32. GMS_WS Job Submission [Sequence diagram: GC Client, GMS, DB] • Client creates a Job object and calls the PredictJobStartTime PT with a JobDTO • GMS returns the JobStart Prediction RP • If the decision is OK, the client calls the SubmitJob PT with the JobDTO • GMS creates the Job object (API submit), stores the Job object and submits through CoGKit, GAT or "gsi-ssh" • GMS sends an acknowledgement • Completion: email from the batch system to the GMS server (cron@GMS) • Need to check that allocation time is available • PT = portType, RP = Resource Properties, DTO = Data Transfer Object
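The submission flow (predict the start time, decide, submit, check the allocation) can be sketched as below. Everything here is an illustrative assumption: the `requested_sus` field, the fixed queue-wait estimate and the class names are not taken from the GMS code, and a real submission would go through CoGKit, GAT or gsi-ssh rather than an in-memory list.

```python
from dataclasses import dataclass


class AllocationError(Exception):
    """Raised when a project has too few service units left for the job."""


@dataclass
class JobDTO:
    jobName: str
    projID: str
    requested_sus: float  # hypothetical field: service units the job may consume
    status: str = "NEW"


class SubmissionSketch:
    def __init__(self, balances: dict, queue_wait_hours: float):
        self.balances = dict(balances)            # {projID: remaining SUs}
        self.queue_wait_hours = queue_wait_hours  # stand-in for a real predictor
        self.stored = []                          # stand-in for the job table

    def predict_job_start_time(self, job: JobDTO) -> float:
        return self.queue_wait_hours

    def submit_job(self, job: JobDTO) -> str:
        # Check that allocation time is available before touching the resource.
        if self.balances.get(job.projID, 0.0) < job.requested_sus:
            raise AllocationError(f"insufficient allocation for {job.projID}")
        job.status = "SUBMITTED"
        self.stored.append(job)
        return "ack"


def submit_if_acceptable(gms: SubmissionSketch, job: JobDTO, max_wait_hours: float):
    """Client side: submit only if the predicted wait is acceptable."""
    if gms.predict_job_start_time(job) > max_wait_hours:
        return None
    return gms.submit_job(job)
```

Putting the allocation check inside `submit_job` rather than in the client keeps the rule enforced even when a client skips the prediction step.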

  33. GMS_WS Monitoring [Sequence diagram: GC Client, GMS, Resources/Kits/DB] • Client requests job status, resource status and allocation balance • UserResource RP is updated from the DB and the info is sent to the client • Client parses the XML and displays it • cron@GMS server and cron@HPC servers: discover applications (software resources), monitor systems, monitor queues • Job launcher notifications: the VO admin email account parses email into the DB (status + cost) • PT = portType, RP = Resource Properties, DTO = Data Transfer Object, DB = database
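The step where notification emails are parsed into the database (status + cost) might look roughly like the sketch below. The email layout, with a subject of the form "Job <id> <STATUS>" and a "cost:" line in the body, is entirely hypothetical; real batch systems each use their own format, so the regular expressions would need to be adapted per scheduler.

```python
import re
from email import message_from_string

# Hypothetical notification layout; real batch-system emails will differ.
SUBJECT_RE = re.compile(r"Job (?P<job_id>\d+) (?P<status>[A-Z]+)")
COST_RE = re.compile(r"^cost:\s*(?P<cost>[0-9.]+)\s*$", re.MULTILINE)


def parse_notification(raw: str) -> dict:
    """Extract job id, status and (optional) cost from one notification email."""
    msg = message_from_string(raw)
    subject = SUBJECT_RE.search(msg["Subject"] or "")
    if subject is None:
        raise ValueError("unrecognized notification subject")
    cost = COST_RE.search(msg.get_payload())
    return {
        "jobID": int(subject["job_id"]),
        "status": subject["status"],
        "cost": float(cost["cost"]) if cost else None,
    }


def apply_update(db: dict, update: dict) -> None:
    """Record status and cost; a dict stands in for the job table."""
    record = db.setdefault(update["jobID"], {})
    record["status"] = update["status"]
    if update["cost"] is not None:
        record["cost"] = update["cost"]
```

A cron job on the GMS server could run this over each new message in the mailbox and write the result to the job table.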

  34. GMS_WS Job Status [Sequence diagram: GC Client, GMS, Resources/Kits/DB] • Job status: jobDTO.status • Job launcher status update • Estimated start time • Scheduler emails/notifications • Notifications: client, email, IM

  35. GMS_WS File Retrieval (MSS) [Sequence diagram: GC Client, GMS, Resources/Kits/DB] • Job completion: output is sent to MSS • LoadFile PT: GMS retrieves the root directory listing on MSS with CoGKit, GAT or "gsi-ssh" (MSS query) into the UserFiles RP + FileDTO object • GetResourceProperty PT: FileDTO(?) • LoadFile PT (project folder + job): GMS validates that the project folder is owned by the user and sends the new listing • RetrieveFiles PT (+ file relative path): GMS retrieves the file with CoGKit, GAT or "gsi-ssh" (API file request), stores it locally, creates a FileDTO and loads it into the UserData RP • GetResourceProperty PT • PT = portType, RP = Resource Properties, DTO = Data Transfer Object, MSS = Mass Storage System
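The ownership validation and the relative-path argument of RetrieveFiles suggest a check along the following lines before any file is fetched. This is a sketch under assumptions: the `/mss/<user>/<project>` directory layout, the `owners` mapping and the function name are hypothetical, and the actual transfer would then be done through CoGKit, GAT or gsi-ssh.

```python
import posixpath


def resolve_user_file(owners: dict, user: str, project: str, rel_path: str,
                      mss_root: str = "/mss") -> str:
    """Return the absolute MSS path for rel_path, or raise if not permitted.

    owners maps project folder -> owning user (hypothetical bookkeeping).
    """
    # Validate that the project folder is owned by the requesting user.
    if owners.get(project) != user:
        raise PermissionError(f"{user} does not own project folder {project}")
    base = posixpath.join(mss_root, user, project)
    full = posixpath.normpath(posixpath.join(base, rel_path))
    # Reject absolute paths and "../" sequences that leave the project folder.
    if full != base and not full.startswith(base + "/"):
        raise ValueError("path escapes the project folder")
    return full
```

Normalizing the path before comparing it against the project folder is what stops a request such as `../../other_user/file` from reaching another user's data.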

  36. GMS_WS File Retrieval [Sequence diagram: GC Client, GMS, Resources/Kits/DB] • RetrieveJobOutput PT (+ JobDTO): GMS looks up the job record from the DB • Running job: output is retrieved from the resource; completed job: output is retrieved from MSS • Retrieve file: CoGKit, GAT or "gsiftp" • Create FileDTO (?) and load it into the UserData RP • GetResourceProperty PT • PT = portType, RP = Resource Properties, DTO = Data Transfer Object, MSS = Mass Storage System

  37. GridChem Web Services • WSRF (Web Services Resource Framework) compliant • WSRF specifications: WS-ResourceProperties (WSRF-RP), WS-ResourceLifetime (WSRF-RL), WS-ServiceGroup (WSRF-SG), WS-BaseFaults (WSRF-BF) • Running container process:
        % ps -aux | grep ws
        /usr/java/jdk1.5.0_05/bin/java \
          -Dlog4j.configuration=container-log4j.properties \
          -DGLOBUS_LOCATION=/usr/local/globus \
          -Djava.endorsed.dirs=/usr/local/globus/endorsed \
          -DGLOBUS_HOSTNAME=derrick.tacc.utexas.edu \
          -DGLOBUS_TCP_PORT_RANGE=62500,64500 \
          -Djava.security.egd=/dev/urandom \
          -classpath /usr/local/globus/lib/bootstrap.jar:/usr/local/globus/lib/cog-url.jar:/usr/local/globus/lib/axis-url.jar \
          org.globus.bootstrap.Bootstrap org.globus.wsrf.container.ServiceContainer -nosec
      Slide annotations: -Dlog4j.configuration is the logging configuration; -DGLOBUS_LOCATION is where to find Globus; -Djava.security.egd is where to get the random seed for encryption key generation; -classpath lists the required jars.

  38. GridChem Software Organization: Open Source Distribution • CVS for GridChem

  39. GMS_WS • Package: org.gridchem.service.gms

  40. GMS_WS • Should these each be a separate package?

  41. GMS_WS packages • gms: classes for the WSRF service implementation (PT) • client: command-line tests to mimic client requests • dao: Data Access Objects; query the DB via persistent classes (Hibernate) • dto: Data Transfer Objects (Job, File, Hardware, Software, User), XML • exceptions: how to handle errors • model: CCG service business model (how to interact) • credential: contains the user's credentials for job submission, file browsing, … • file, file.task: "oversees correct" handling of user data (get/put file) • job, job.task: define Job, utilities and enumerations (SubmitTask, KillTask, …) • notification: autonomous notification via email, IM, text message • resource: CCGResource and utilities, synched by GPIR; abstract classes NetworkRes., ComputeRes., SoftwareRes., StorageRes., VisualizationRes. • user: User (has attributes: Preference/Address) • persistence: DB operations (CRUD), OR maps, pool management, DB session • audit • query • synch: classes that communicate with other web services • gpir: periodically update the DB with GPIR info (GPIR calls) • test: JUnit service tests (gms.properties): authentication, VO retrieval, resource query, synch, job management, file management, notification • util: contains utility and singleton classes for the service • crypt: encryption of the login password • enumerators: mapping from GMS_WS enumeration classes to the DB • gat: GAT utility classes: GATContext and GAT Preferences generation • proxy: classes that deal with CoGKit configuration

  42. Testing GMS_WS external jars • For XML Parsing • “Java” Document Object Model • Lightweight • Reading/Writing XML Docs • Complements SAX (parser) & DOM • Uses Collections**

  43. GridChem Resources Monitoring http://portal.gridchem.org:8080/gridsphere/gridsphere?cid=home

  44. GridChem Resources: New Computing Systems

  45. Application Software Resources: Currently Supported

  46. GridChem Software Resources: New Applications, Integration Underway • ADF: Amsterdam Density Functional Theory • Wien2K: Linearized Augmented Plane Wave (DFT) • CPMD: Car-Parrinello Molecular Dynamics • QChem: Molecular Energetics (Quantum Chemistry) • Aces3: Parallel Coupled Cluster Quantum Chemistry • Gromacs: Nano/Bio Simulations (Molecular Dynamics) • NAMD: Molecular Dynamics • DMol3: Periodic Molecular Systems (Quantum Chemistry) • Castep: Quantum Chemistry • MCCCS-Towhee: Molecular Conformation Sampling (Monte Carlo) • Crystal98/06: Crystal Optimizations (Quantum Chemistry) • …

  47. GridChem User Services • Allocation https://www.gridchem.org/allocations/index.shtml: community and external registration, reviews, PI registration and access creation, community user norms established • Consulting/User Services https://www.gridchem.org/consult: ticket tracking, allocation management • Documentation, Training and Outreach https://www.gridchem.org/doc_train/index.shtml: FAQ extraction, tutorials, dissemination; help is integrated into the GridChem client

  48. Users and Usage • 242 users under 128 projects, including academic PIs, two graduate classes and about 15 training users • More than 442,000 CPU wall hours since Jan 06 • More than 10,000 jobs processed

  49. Science Enabled • Azide Reactions for Controlling Clean Silicon Surface Chemistry: Benzylazide on Si(100)-2 × 1. Semyon Bocharov et al., J. Am. Chem. Soc., 128 (29), 9300-9301, 2006. • Chemistry of Diffusion Barrier Film Formation: Adsorption and Dissociation of Tetrakis(dimethylamino)titanium on Si(100)-2 × 1. Rodriguez-Reyes, J. C. F.; Teplyakov, A. V., J. Phys. Chem. C, 2007, 111 (12), 4800-4808. • Computational Studies of [2+2] and [4+2] Pericyclic Reactions between Phosphinoboranes and Alkenes. Steric and Electronic Effects in Identifying a Reactive Phosphinoborane that Should Avoid Dimerization. Thomas M. Gilbert and Steven M. Bachrach, Organometallics, 26 (10), 2672-2678, 2007.

  50. Science Enabled • Chemical Reactivity of the Biradicaloid (HO...ONO) Singlet States of Peroxynitrous Acid. The Oxidation of Hydrocarbons, Sulfides, and Selenides. Bach, R. D. et al., J. Am. Chem. Soc. 2005, 127, 3140-3155. • The "Somersault" Mechanism for the P-450 Hydroxylation of Hydrocarbons. The Intervention of Transient Inverted Metastable Hydroperoxides. Bach, R. D.; Dmitrenko, O., J. Am. Chem. Soc. 2006, 128 (5), 1474-1488. • The Effect of Carbonyl Substitution on the Strain Energy of Small Ring Compounds and their Six-member Ring Reference Compounds. Bach, R. D.; Dmitrenko, O., J. Am. Chem. Soc. 2006, 128 (14), 4598.
