
SA1 Status Report

Status and Progress of the ETICS Services. ETICS2 First Review. Alberto AIMAR, CERN. Brussels, 3 April 2009.


Presentation Transcript


  1. SA1 Status Report
     Status and Progress of the ETICS Services
     ETICS2 First Review
     Alberto AIMAR, CERN
     Brussels, 3 April 2009

  2. Outline
     • Tasks and Deliverables
     • Features and Achievements
     • Outlook on Year 2
     • Conclusions

  3. Tasks and Deliverables

  4. SA1 Tasks
     • SA1.1 – Work Package Coordination
        • Regular coordination of the Work Package, reporting and review of milestones and deliverables
     • SA1.2 – Core Service Maintenance and Extensions
        • Maintenance of ETICS core services, fixing reported bugs and implementing new requirements for core services
        • Design and implementation of integrated release management tools, federated repository and APIs
        • Integration into core services of contributions from SA2 and JRA activities
     • SA1.3 – Core Service Documentation
        • Maintenance and updating of the core service documentation
     • SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
        • Extending current deployment strategies of ETICS services and infrastructure management, so that the service maintains a high level of quality and availability
        • Load-balancing deployment for solicited and/or critical services
        • Improving coverage of the self-monitoring system of deployed services and underlying infrastructures
     • SA1.5 – Core Service Certification
        • Applying the proposed ETICS Certification Process to the ETICS software itself
        • Demonstrating its applicability and that the ETICS service provides the information required to determine the level of certification of the software projects

  5. SA1 Deliverables
     • DSA1.1 – Execution plan for first 12 months of infrastructure operation (M03)
        • This deliverable describes the execution plan for the first half of the ETICS 2 project, including the core service roadmap and the infrastructure deployment plan
     • DSA1.2 – ETICS Core Services Design Specification (M06)
        • This deliverable describes the overall architecture of the ETICS 2 core services
     • DSA1.3 – ETICS Site Service Level Agreement (M09)
        • This deliverable describes the Service Level Agreements upon which the ETICS service will be provided. The SLAs will define the service level the users can expect from the service in terms of availability, accessibility and support
     • DSA1.4 – Execution plan for second 12 months of infrastructure operation (M12)
        • This deliverable describes the execution plan for the second half of the ETICS 2 project, including the core service roadmap and the infrastructure deployment plan
     • DSA1.5 – Infrastructure and core services certification and usage report (M21)
        • This deliverable reports on the release management cycles and certification of the ETICS 2 infrastructure and core services, including lessons learned and corrective actions to apply

  6. Features and Achievements

  7. ETICS SA1 Services

  8. Release and Development Infrastructure
     SA1.2 – Core Service Maintenance and Extensions
     • Production Installation (prod)
        • The officially released and supported ETICS
     • Release Candidate Installation (rc)
        • The "next" production, available for final certification and testing by selected users
     • Integration Testing Installation (test)
        • All the release candidates of the packages are tagged at project level and installed for integration tests
     • Development Installation (dev)
        • A shared installation where developers can test their packages with the release candidates of other packages
     • Individual Development Installations (dev-...)
        • Developers or teams can instantiate their own infrastructure, often at reduced scale, for individual development and testing

  9. Infrastructure Monitoring
     SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
     • Monitoring and Alarms System
        • Integrated in the CERN Monitoring System (web, SMS, messaging, etc.)
        • More in Year 2

  10. Functional Regression Testing
      SA1.2 – Core Service Maintenance and Extensions
      • Integration and Release Candidate
         • Defined jobs, testing different functionalities and platforms

  11. ETICS Resource Pool(s)
      SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
      • Installation and maintenance of the resource pools
      • Worker nodes for build and test jobs on all supported platforms (>80 WNs)

  12. ETICS Production Platforms
      SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
      • Installation, maintenance and security upgrades of production platforms
      • Install worker nodes for build and test jobs in production
      • Make all external software packages available on each platform
      • SL5 platforms for gLite
      • Several other platforms for porting and testing

  13. Usage of the Resources
      SA1.4 – Infrastructure Deployment, Maintenance and Upgrades

  14. ETICS Client Performance
      SA1.2 – Core Service Maintenance and Extensions
      • Client 1.4 released
         • Performance improvements of 200% to 900%, depending on the task to be executed and the available hardware
         • Very important for developers, but also for remote execution
         • The original XML-based implementation did not scale; the new implementation is based on SQLite, the de facto standard among multi-platform embedded database engines
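The move from an XML-based store to SQLite can be illustrated with a minimal sketch. This is not the ETICS client code: the table layout, the module name and the platform strings below are hypothetical and chosen only to show why an indexed embedded database answers lookups without re-parsing a whole document.

```python
import sqlite3


def build_store(modules):
    """Create an in-memory SQLite store from (name, version, platform) tuples.

    The schema is an assumption made for this sketch only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE module ("
        "name TEXT, version TEXT, platform TEXT, "
        "PRIMARY KEY (name, platform))"
    )
    conn.executemany("INSERT INTO module VALUES (?, ?, ?)", modules)
    conn.commit()
    return conn


def lookup_version(conn, name, platform):
    """Return the version recorded for a module on a platform, or None."""
    row = conn.execute(
        "SELECT version FROM module WHERE name = ? AND platform = ?",
        (name, platform),
    ).fetchone()
    return row[0] if row else None


# Hypothetical sample data.
conn = build_store([
    ("example-module", "1.0.0", "platform-a"),
    ("example-module", "1.0.1", "platform-b"),
])
print(lookup_version(conn, "example-module", "platform-b"))
```

The primary key gives SQLite an index on (name, platform), so each lookup is a keyed search rather than a scan of the full configuration.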

  15. Worker Nodes Virtualization
      SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
      • All ETICS Worker Nodes are Virtual Machines
         • CERN moved to dual 4-core nodes (8 cores per machine) in summer 2008
         • ETICS had to move to virtual images because 8-core WNs are not useful for builds and tests
         • WNs in the ETICS pool are 2-core VMs
      • Static creation of virtual machines
         • Prepared a library of virtual images
         • Provide all maintenance, security updates, etc.
      • Very Flexible Infrastructure
         • Instantiate new machines or change platforms with a few commands
      • Further Improvement
         • The ETICS bootstrapper will download and start a virtual machine directly on the WN (for using other hardware infrastructures). Possibly in Year 2.

  16. ETICS Repository Improvements
      SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
      • The ETICS Repository has been reorganized
      • Major improvements
         • Scalable and faster statistics
         • New versions of the tools used (Java, etc.)
         • New browser interface and addressing based on REST
         • Presented to the user as a more intuitive tree of directories and files with icons
         • Reports and packages are now stored on a high-availability (HA) file system (AFS)

  17. Integration with External Repositories
      SA1.2 – Core Service Maintenance and Extensions
      • Generation of RPM and tar packages already available
         • The Debian users and gLite needed other distribution formats
      • Dynamic YUM repositories were requested
         • gLite uses YUM repositories as its distribution mechanism
         • Permanent YUM repository for registered repository
         • Using the standard YUM client, all binaries can be deployed
      • Further improvements in Y2 of the project
         • Feasibility and prototypes of integration
         • Driven by the need for a sustainable future for ETICS

  18. ETICS Web Client
      SA1.2 – Core Service Maintenance and Extensions
      • ETICS Build and Test Portal (restarted Sept 2008)
         • Improved the External Requests and Submission web interface
         • Y2: Streamline the interface for repetitive non-expert tasks (re-run build, test, etc.) vs. expert tasks (new package, configuration, etc.)
      • Web Build and Test Application (restarted Oct 2008)
         • Porting to Firefox 3 was the major improvement
         • Fixing bugs in the Web Apps
         • Changes required by others (multi-packaging, etc.)
      • Disseminator (restarted Oct 2008)
         • Deployed on an internal INFN machine to be tried and tested
         • Needs to be completed, as the metrics are a cornerstone of many ETICS activities (Plug-ins, QA, A-QCM, gLite)
         • Not many resources for this fundamental component until Oct 2008

  19. Service Level Agreement (I)
      SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
      • The ETICS SLAs describe the terms of Quality and Availability of Services
      • Quality of Services
         • Installation: Installation procedures are present and used regularly to instantiate development services. Complete installation and testing takes less than 24 hours, as it includes OS installation on servers, restore from backups, security certificates, firewall settings, etc.
         • Backup: Backup sets are generated using the standard CERN procedures. Server backups are performed every night. All permanent files are stored on AFS, a mirrored and archived central file system at CERN.
         • Restore: Restoration of the full ETICS Services should take up to one day, provided that hardware and the services mentioned above (AFS, TSM, networking, certificates, etc.) that are used by the ETICS Services are available
         • Redundancy: The Service is not redundant, but the servers can be restored in a few hours. All the hardware is standard commodity material or is virtualized, so we can easily find hardware and prepare new sets of ETICS Services
         • Virtualization: All Worker Nodes are VM-based and are fully redundant for the platforms needed by the current ETICS users (SLC4/5, RH, and Debian on 32- and 64-bit platforms)
         • Supply: All ETICS hardware is CERN standard, but if needed it can be purchased at any IT store, provided that the hardware supports SL4, which is the standard Linux OS used at CERN
         • No foreseeable supply problems in case of urgently needed hardware
         • Software Dependencies: The software used by the Services is all widely used open source, and therefore there is no danger of lacking supply or licence availability

  20. Service Level Agreement (II)
      SA1.4 – Infrastructure Deployment, Maintenance and Upgrades
      • Availability and Reliability Targets
         • For accessing different artefacts of the Build and Test processes
      • In Year 2 we will complete 2 SLA documents (gLite and D4Science)

  21. Outlook on Year 2

  22. Improvements of ETICS Infrastructure
      DSA1.4: Execution Plan for the Second 12 Months
      • Milestones for Y2 of the ETICS2 project are oriented towards a sustainable ETICS
         • Disentangle the ETICS Services from the current solutions based on the current partners
         • Services that can be based on, or interfaced to, external resources and organizations
         • Adding features that are typically needed by commercial User Projects
      • M14 – Definition of Monitoring Parameters for the ETICS Infrastructure and Resources
         • Parameters to be collected and monitored in the infrastructure and resources of the ETICS Services. These metrics will be used for monitoring and reporting about the ETICS Services.
      • M18 – Extended Monitoring for Other Infrastructures
         • A general monitoring interface should be implemented and support some of the common monitoring and messaging systems used at the Grid sites
      • M22 – Reports on the Usage of the ETICS Infrastructure
         • Automated reports with detailed usage of the ETICS Infrastructure
         • Load on the servers and WNs by User Project and platform should be available
         • Initial step to define the costs associated with User Projects and with the support of different platforms
      • M15 – Security Assessment of the ETICS Services
         • Aspects regarding the security of the ETICS Services should be assessed and certified following the current standards in place at major Sites
         • Describe the security status of the ETICS Services and, if necessary, the changes to undertake
         • This report may introduce an additional milestone to implement the required changes

  23. New Features of the ETICS Services
      DSA1.4: Execution Plan for the Second 12 Months
      • M16 – Metrics Disseminator for Trend Analysis (delayed from Y1)
         • The ETICS Services will be able to collect the metrics and display their results for any User Project. The disseminator should provide a web interface to select metrics data and customize its visualization for trend analysis plots.
      • M18 – Support of Distributed Multi-Node Testing (delayed from Y1)
         • The ETICS Services will be extended to map test definitions generated by the distributed testing design tools implemented by the ETICS 2 JRA2 work package (Test Management Tools)
      • M17 – Feasibility Study of ETICS Integration with External Resources (delayed from Y1)
         • The feasibility study will investigate the possibility of connecting the ETICS Services to established external code repositories (e.g. Sourceforge, Google Code, etc.) and to computing resources to use as submission engines (e.g. Amazon EC2, Google App Engine, etc.)
      • M20 – Test of Integration of ETICS Services with External Resources
         • For the sustainability of the ETICS Services beyond the ETICS 2 project, it is important that external computing and storage resources can be used for all components of the Services. An advanced prototype should validate whether it is possible to run the ETICS Services on completely external commercial resources.
      • M14 – SLA document defined with 2 major User Projects (delayed from Y1)
         • The SLA framework that was delayed at M12 should be implemented for 2 major ETICS User Projects and clearly specify the level of support and quality of the services provided
         • The two User Projects are most likely going to be gLite and D4Science

  24. New Features of the ETICS Services
      DSA1.4: Execution Plan for the Second 12 Months
      • M17 – Interfaces to Existing Build Systems
         • Currently the structure of a User Project and its entire configuration must be described in the ETICS Services in order to profit from all the functionalities of the ETICS System. Existing large projects find it difficult to integrate ETICS into their existing processes, as they already have their own configuration and build tools. They need to be able to use ETICS with as little effort and as few changes as possible.
         • Therefore interfaces and import tools will facilitate the usage of the ETICS System for established projects. For example, interfaces should be provided for popular existing configuration tools (e.g. Maven for the Java community, CMT for the Physics community, etc.). The structure and the tags of the modules of a project may be imported from the source code structure (e.g. in CVS, SVN, etc.).
      • M18 – Implementation of New Privacy, Authentication and Authorization Policies
         • The current ETICS Services have been developed focusing on the needs of open source and research projects, where authentication and authorization were considered necessary but privacy was not
         • Stricter policies concerning the privacy of source code, reports and binaries must be implemented by the ETICS Services. In addition, authentication, currently performed via certificates, must be possible via other commonly used methods such as "username/password" identification.
      • M21 – Web Task-Oriented Interfaces
         • Currently ETICS has two user interfaces: a command line interface and a Web-based interface. Both interfaces provide access to all functionalities and all options of the ETICS operations in a very complete, but also quite complex, manner.
         • A simpler task-oriented interface must be provided for the main common operations of the ETICS users (e.g. register, add, build, test, etc.) where, for instance, the most typical options are already set or selected by the project administrators

  25. Integration with Other Activities
      DSA1.4: Execution Plan for the Second 12 Months
      • M17 – A-QCM Certification of the ETICS Software (delayed from Y1)
         • The ETICS Services should be certified at level 2 for each of the four quality aspects in the A-QCM quality standard, in view of reaching level 3 at the end of the ETICS 2 project as defined in the ETICS 2 Description of Work
      • M18 – Integration of ETICS Services with gLite and UNICORE (part 2)
         • Currently the ETICS Services submit the build and test jobs to Worker Nodes managed by the Condor Metronome submission system. It is necessary to extend this to resources other than the ones currently available, and to provide a general interface for plugging in new submission engines.
         • In the previous year a working, but limited, implementation was developed. In the second year, a more complete integration of the job submission system into the ETICS Services will be implemented using the EGEE/gLite and DEISA/UNICORE middleware. This will allow the submission of ETICS build and test jobs on these grid infrastructures. This milestone depends on the availability of the submission plugins that will be provided by the SA2 activity.
      • M16 – Integration of Testing and Metrics Collectors Plugins
         • Several testing and metrics plugins are under development at the end of the first year of the ETICS 2 project. These plugins should be integrated into the ETICS System so that User Projects can select which tests to execute and which metrics to collect during the build and test processes of the modules that are part of the given User Project.
      • M18 – Integration of Reporting Plugins
         • Reporting facilities are under development at the end of the first year of the ETICS 2 project. They should be integrated into the ETICS System so that User Projects can select which reports should be generated for the modules that are part of the given User Project. An export facility (e.g. CSV, XML formats) will allow the metrics collected with ETICS to be presented in external applications (e.g. Excel, etc.).
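The export facility described under the reporting milestone can be pictured with a small sketch. It only illustrates the idea of writing collected metrics as CSV for a spreadsheet; the column names (module, metric, value) and the sample record are assumptions, not the ETICS export format.

```python
import csv
import io


def metrics_to_csv(rows):
    """Serialize metric records (dicts) to CSV text with a header row.

    The three column names are hypothetical and used for illustration only.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["module", "metric", "value"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample record.
sample = [{"module": "example-module", "metric": "lines_of_code", "value": 1200}]
print(metrics_to_csv(sample))
```

Writing to a string buffer keeps the sketch independent of any file layout; the same writer could target a file opened with `newline=""`.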

  26. Risks
      DSA1.4: Execution Plan for the Second 12 Months
      • Risk 1 – Lack of human or material resources needed at the sites
         • This risk can be mitigated by looking for new sources of resources, for example contributions of the User Projects to the HR or HW resources of ETICS. If unavoidable, one could define SLAs that control the usage of the ETICS Services by every given project (e.g. number of builds per day).
         • Effort must be put into ensuring the commitment of the partners to provide hardware resources through specific contractual Consortium Agreements
      • Risk 2 – Difficulty in Integrating the Work of Other ETICS 2 Activities
         • Issues incurred by SA2, JRA1 and JRA2 could affect their ability to deliver their results to SA1, thus limiting the availability of these features in production on the ETICS Services. The risk can be mitigated by constant training, common work and frequent checkpoints of the SA1 work and that of the other ETICS 2 activities.
      • Risk 3 – Still gLite-centric ETICS Services and Infrastructure after ETICS 2
         • The needs of a major project such as gLite can sometimes conflict or compete with the features needed to provide a set of ETICS Services that is sustainable after the end of the ETICS 2 project. The input from other user communities highlights the need for a higher degree of privacy, data protection and extensibility than currently required by the gLite project.
         • If in the second year of the ETICS 2 project the external needs are not taken more into account, it is unlikely that the ETICS Services will have a sustainable future
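The usage control mentioned under Risk 1 (e.g. number of builds per day) could be enforced along the following lines. This is a sketch under stated assumptions: the class name, the project name and the limit are hypothetical, and nothing here reflects how ETICS actually schedules jobs.

```python
from collections import defaultdict
from datetime import date


class DailyBuildQuota:
    """Track builds per project per day against an agreed limit (hypothetical)."""

    def __init__(self, limits):
        # limits: mapping of project name -> maximum builds allowed per day
        self.limits = limits
        self.used = defaultdict(int)

    def try_submit(self, project, day):
        """Return True and count the build if the project is under its limit."""
        key = (project, day)
        if self.used[key] >= self.limits.get(project, 0):
            return False
        self.used[key] += 1
        return True


# Hypothetical project with a limit of two builds per day.
quota = DailyBuildQuota({"project-a": 2})
today = date(2009, 4, 3)
print([quota.try_submit("project-a", today) for _ in range(3)])
```

Projects without an agreed limit are refused by default in this sketch; a real policy could equally default to an unlimited or shared allowance.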

  27. Conclusions and Summary
      • Main Objectives (and Additional Achievements) of the First Year
         • Automation, Performance, Virtualization, High Availability
         • Lack of resources for the first 6 months caused delays on some tasks, but these will be recovered in Year 2
      • Maintenance and Upgrade of the Services
         • Platforms, Updates, Virtual Images, External Software
         • Urgent Requests for Main User Projects
      • Year 2: Focus on Sustainability
         • Usage of External Resources for CPU (WN) and Storage
         • Add the features needed: Privacy, Authorization, etc.
         • Interface to/Import from popular configuration systems
         • Integrate SA2 Submission Engines
         • Testing and Metrics Plugins from JRA2
         • Cross-Site submission with JRA1
         • A-QCM Certification and Reports with NA2
