

CSEP Collaboratory IT System Organization

Philip Maechling, June 6, 2006



Overview

  • Hardware

  • Software Stack

  • Predictive Programs

  • Algorithm Evaluation

  • Data Access

  • People

  • Processes



Overview of CSEP IT System



CSEP Solutions to IT Challenges: Computers

  • Integration System: A flexible, accessible collection of computers, networks, and storage devices:

    • Scientists with predictive algorithms will install their software in the collaboratory and validate results.

  • Operational System: A stable, secure, highly available, restricted-access system:

    • Once validated, algorithms are transferred to the operational system and run for long periods of time.

  • Distribution System: A remotely accessible, highly available system for distribution of status and results.

    • The CSEP operational system will be transparent in operation to allow external review.



Hardware

Hardware Issues:

  • Build system based on modular, inexpensive hardware and storage that can be replaced if there are hardware problems.

  • Automated fault detection and testing needed.

  • Do we need separate computers for each algorithm?

  • The shared external processing cluster needed for evaluation processing will likely be the USC High Performance Computing Center cluster.





Software Stack

Standardized Software Stack on collaborator computers:

  • Linux

  • gcc (C, C++, Fortran)

  • Java 1.5

  • MATLAB (licensing issues?)

  • Relational database and database access tools (PostgreSQL or MySQL)



Software Stack

Standardized Software Stack on collaborator computers:

  • Assuming we need a processing cluster, the software stack must include additional tools, possibly including cluster tools (Rocks), a job scheduler (PBS), and possibly grid tools.

  • Data management tools (file replicas, database-oriented metadata management, support for multiple storage systems, file naming, and persistent file IDs) will be utilized.

  • Grid software could allow other testing centers to make use of the USC cluster. SCEC can provide support with installation of these tools.
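
As an illustration of the data management bullet above, the sketch below keeps a persistent file ID fixed while physical copies (replicas) are recorded on different storage systems. The class name `ReplicaCatalog`, its methods, and the example locations are hypothetical; the slides do not name a specific tool.

```python
import uuid
from typing import Dict, List


class ReplicaCatalog:
    """Maps a persistent file ID to a logical name and its replica locations."""

    def __init__(self) -> None:
        self._names: Dict[str, str] = {}
        self._replicas: Dict[str, List[str]] = {}

    def register(self, logical_name: str) -> str:
        """Create a new persistent ID for a logical file name."""
        file_id = uuid.uuid4().hex  # hypothetical ID scheme, assumed for illustration
        self._names[file_id] = logical_name
        self._replicas[file_id] = []
        return file_id

    def add_replica(self, file_id: str, location: str) -> None:
        """Record one more physical copy; duplicate locations are ignored."""
        if file_id not in self._replicas:
            raise KeyError(f"unknown file ID: {file_id}")
        if location not in self._replicas[file_id]:
            self._replicas[file_id].append(location)

    def locations(self, file_id: str) -> List[str]:
        """Return a copy of the known replica locations for a file ID."""
        return list(self._replicas[file_id])
```

Because callers hold only the ID, a file can be moved or copied between storage systems without changing any stored reference to it.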





CSEP Solutions to IT Challenges: Programs

Predictive Programs are integrated into the CSEP system through the collaborative efforts of scientists and CSEP personnel.

  • Programs and data will be moved into CSEP Integration System.

  • Correct operation of Predictive Program will be established on Integration System. Predictive Program will be run until all are satisfied it is operating correctly.

  • Predictive Program will be transferred to Operational System. Access to operational system is restricted to CSEP personnel only.
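
The integration-then-operation flow above can be sketched as a small gate: a program starts on the Integration System and moves to the Operational System only after every required reviewer has signed off. The names `Stage`, `PredictiveProgram`, `sign_off`, and `promote`, and the idea of an explicit reviewer list, are assumptions made for illustration; the slides say only that the program runs "until all are satisfied".

```python
from enum import Enum
from typing import Iterable, Set


class Stage(Enum):
    INTEGRATION = "integration"
    OPERATIONAL = "operational"


class PredictiveProgram:
    """Tracks which CSEP system a contributed program currently runs on."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.stage = Stage.INTEGRATION  # every program starts on the Integration System
        self.signoffs: Set[str] = set()

    def sign_off(self, reviewer: str) -> None:
        """Record that one reviewer is satisfied with correct operation."""
        self.signoffs.add(reviewer)

    def promote(self, required_reviewers: Iterable[str]) -> None:
        """Move to the Operational System only if all reviewers signed off."""
        missing = set(required_reviewers) - self.signoffs
        if missing:
            raise ValueError(f"missing sign-off from: {sorted(missing)}")
        self.stage = Stage.OPERATIONAL
```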



Predictive Programs

Categories of Algorithms will have different run-time characteristics:

  • Grid-based

    • e.g. RELM and ETAS

  • Alarm-based

    • e.g. Keilis-Borok

  • Neither alarm nor grid-based

    • e.g. Accelerating Moment Release, Enhanced PI Method



Predictive Programs

Categories of Algorithms will have different run-time characteristics:

  • Quasi-Stationary Five Year and One Year Models

    • Evaluated yearly; limited processing needed (e.g. RELM)

  • Short-Term Time-Dependent Models (ETAS, STEP)

    • Evaluated daily; frequent execution of algorithm and evaluation of results (e.g. STEP)

  • Fault-based Models

    • USGS Hazard Maps

    • Accelerating moment release



Predictive Programs

Predictive Algorithms

  • Developed and contributed by researchers.

  • Programs in each category should produce standardized output data and format.

  • Possibly data intensive, requiring substantial internal data storage or data communications.



Predictive Programs

Minimum requirements for contributed programs will include:

  • Identification of all software dependencies

  • No use of expensive external software

  • Open source

  • Configuration management and versioning

  • Known good inputs and expected outputs

  • Written description of how to run program





Algorithm Evaluation

Algorithm Evaluation Processing

  • Testing framework (aka coordinator program), to be developed by CSEP staff.

  • Run after predictive programs have produced results

  • May be processing intensive (esp. daily time dependent forecasts) requiring high performance computing tools.



Algorithm Evaluation

Algorithm Evaluation Processing

  • Goal is a “component-based” testing framework that would allow us to automatically run tests and easily see test results.

  • Component model would allow us to add new tests (plug-in) easily without modifying framework.

    • Example (JUnit testing framework)
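
A minimal sketch of the plug-in idea, assuming each evaluation test is a function that takes a forecast and an observation and returns a score. The slide cites JUnit as the model; this uses a Python decorator registry instead, and the names `evaluation_test`, `run_all`, and `total_count_difference` are hypothetical.

```python
from typing import Callable, Dict

# A test takes per-cell forecast rates and observed counts and returns a score.
TestFn = Callable[[Dict[str, float], Dict[str, int]], float]

_REGISTRY: Dict[str, TestFn] = {}


def evaluation_test(name: str) -> Callable[[TestFn], TestFn]:
    """Decorator that registers a test under a name; the framework stays unchanged."""
    def decorator(fn: TestFn) -> TestFn:
        _REGISTRY[name] = fn
        return fn
    return decorator


@evaluation_test("total_count_difference")
def total_count_difference(forecast: Dict[str, float], observed: Dict[str, int]) -> float:
    """Example plug-in: forecast total minus observed total number of events."""
    return sum(forecast.values()) - sum(observed.values())


def run_all(forecast: Dict[str, float], observed: Dict[str, int]) -> Dict[str, float]:
    """Run every registered test and collect the results by test name."""
    return {name: fn(forecast, observed) for name, fn in sorted(_REGISTRY.items())}
```

Adding a new test means writing one decorated function; `run_all` picks it up without modification, which is the property the component model is after.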



Algorithm Evaluation

Algorithm Evaluation Processing minimum requirements

  • Minimum software dependencies

  • No use of expensive external software

  • Open source

  • Configuration management and versioning

  • Available for distribution

  • User's guide



Algorithm Evaluation

Algorithm Evaluation Processing

  • Likelihood tests

    • Could be based on RELM tools

  • ROC/Molchan Tests

    • Could be based on RELM tools

  • CT Test

    • Needs to be developed
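
For the likelihood tests listed above, a common building block is the joint log-likelihood of observed event counts under a gridded rate forecast. The sketch below assumes each grid cell is an independent Poisson variable whose mean is the forecast rate; that assumption, and the function name `poisson_log_likelihood`, are illustrative and are not taken from the slides or from the RELM tools themselves.

```python
import math
from typing import Sequence


def poisson_log_likelihood(forecast_rates: Sequence[float],
                           observed_counts: Sequence[int]) -> float:
    """Sum over cells of log P(count | rate) for independent Poisson cells."""
    if len(forecast_rates) != len(observed_counts):
        raise ValueError("forecast and observation must cover the same cells")
    total = 0.0
    for rate, count in zip(forecast_rates, observed_counts):
        if rate < 0 or count < 0:
            raise ValueError("rates and counts must be non-negative")
        if rate == 0:
            if count > 0:
                return float("-inf")  # an event occurred where none was possible
            continue  # zero rate and zero events contributes log(1) = 0
        # log of exp(-rate) * rate**count / count!
        total += -rate + count * math.log(rate) - math.lgamma(count + 1)
    return total
```

A higher (less negative) value means the observed catalog is more probable under the forecast, so two forecasts on the same grid can be compared directly.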



Algorithm Evaluation

Algorithm Evaluation Processing

  • Expect to deliver diagnostics and algorithm results

    • Some health and status information available routinely to build confidence that the system is working properly

  • Delivering results of processing to users

    • Web pages

    • OpenSHA Interfaces?

    • Read only access to results database?





Data Access

Distinguish between types of input data sets:

  • Authorized Data Sets:

    • Common to all algorithms and accessible in near-real-time, continuously, or time lagged (e.g. ANSS catalog).

  • Non-authorized Data Sets:

    • Must be pre-loaded at start of testing period and not changed during testing period.



CSEP IT Capabilities

Issues of reproducibility of results:

  • Authorized data streams may change over time.

    • Some data sources (e.g. fault database), given a date, can provide identical data even after a change.

  • Non-repeatable data access may require archiving of data in order to reproduce the results.

    • e.g. ANSS catalog archived daily!
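
One way to make a changing data stream reproducible is to archive a dated snapshot together with a checksum, so a later run can confirm it is reading exactly the bytes that were used originally. This is a sketch under assumptions: the function names, the file naming scheme, and the use of SHA-256 are illustrative choices, not part of the CSEP design described in the slides.

```python
import hashlib
from pathlib import Path


def archive_snapshot(data: bytes, archive_dir: Path, name: str, date: str) -> Path:
    """Store one dated snapshot; the file name embeds a short content checksum."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(data).hexdigest()
    path = archive_dir / f"{name}_{date}_{digest[:12]}.dat"
    if not path.exists():  # identical content on the same date is stored once
        path.write_bytes(data)
    return path


def load_snapshot(path: Path) -> bytes:
    """Read a snapshot and verify it still matches the checksum in its name."""
    data = path.read_bytes()
    expected = path.stem.rsplit("_", 1)[1]
    if hashlib.sha256(data).hexdigest()[:12] != expected:
        raise ValueError(f"snapshot {path.name} does not match its checksum")
    return data
```

A daily job could call `archive_snapshot` with that day's catalog download, and a reproduction run would call `load_snapshot` on the archived file instead of querying the live source.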



CSEP IT Capabilities

Issues of reproducibility of results:

  • Proposed Authorized Data Streams

    • Raw ANSS Catalog

    • Declustered ANSS Catalog

    • DP Tagged catalog

    • Fault Database (National Seismic Hazard Map Fault DB)

  • Any data filtering done by system should be provided as a service to external groups for use in development





CSEP Solutions to IT Challenges: People

A role-based organization structure will be used so that significant Project responsibilities are always clearly identified with a Project participant.

Types of roles in CSEP organization:

  • Scientific Leadership

  • Algorithm Scientific Lead

  • Algorithm Technical Contact

  • Project Manager

  • Chief System Engineer

  • Scientific Programmers

  • Visualization Programmers

  • Software Configuration Lead

  • Validation and Testing Engineers

  • Operational System Administrator

  • User Interface Development

  • Web Developer

  • Technical Writers

  • Public Relations and Outreach Development

  • System Administration/Network Admin/Grid Admin



CSEP Solutions to IT Challenges: People

CSEP Programmer Job Posted at USC:

Senior Software Engineer

The Southern California Earthquake Center is seeking an experienced software engineer for a key role in the development of a system for the study of the predictability of earthquakes. This software engineer will work closely with geoscientists to develop both the software tools and the software development environment required for this innovative new earthquake research system.

This individual will have a number of key responsibilities including:

  • working with geoscientists to determine their data processing requirements.

  • defining and establishing a repeatable software integration, deployment, and source code management process.

  • writing computer programs to meet system design specifications.

  • integrating scientific software programs into a standardized computing environment utilizing a standardized integration process.

  • developing data management techniques that ensure reproducibility of algorithmic results.

  • developing validation testing tools and techniques.

  • writing descriptions of system and algorithm performance.

This position provides an outstanding opportunity to make significant contributions to leading edge earthquake system science research. This position requires both strong computer science and professional skills. This software engineer must be collaborative, service-oriented, accessible, and responsive to community needs through state-of-the-art and ever-evolving technology.

Minimum Qualifications: Bachelor’s Degree with five years experience or Master’s Degree with two years experience. Combined experience/education as substitute for minimum education/experience.





CSEP Solutions to IT Challenges: Processes

Processes: Well defined ways of performing tasks

Community Standards: Community-based agreements on some aspect of CSEP activity.

  • Not all processes need to be defined through a community standard, but creating standards should be done through a well-defined process.

  • Community Standards are defined on a need and willingness-to-work basis.



CSEP Solutions to IT Challenges: Processes

Standardized, repeatable processes may include:

  • Evaluation and qualification of predictive algorithms

  • Integration and validation of algorithms on CSEP systems

  • Transfer of Algorithms from integration system to operational system

  • Software configuration management

  • Data management

  • System administration

  • Software and system maintenance



CSEP Solutions to IT Challenges: Processes

Evaluation and Qualification Process:

  • Collect standardized information about program:

    • Computer Hardware

    • Software packages used

    • Input Data

    • Output format

  • Review program computational and data requirements with geoscientists and computer scientists.

  • Identify Diagnostic information available
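
The standardized information collected during qualification could be captured as a simple record with a completeness check. The field names below follow the bullets above; the class name `ProgramDescription` and the `missing_items` helper are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class ProgramDescription:
    """Standardized information collected about a contributed program."""
    name: str
    computer_hardware: str = ""
    software_packages: List[str] = field(default_factory=list)
    input_data: List[str] = field(default_factory=list)
    output_format: str = ""
    diagnostics: List[str] = field(default_factory=list)

    def missing_items(self) -> List[str]:
        """Names of the items that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

A review meeting with geoscientists and computer scientists could then start from the list returned by `missing_items()` rather than from a blank form.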



CSEP Solutions to IT Challenges: Processes

Integration and validation of algorithms Process:

  • Build software on CSEP computers.

  • Check source code into Configuration Management system and version the software.

  • Define and save known input data set.

  • Generate and save known good output data set.

  • Create automated testing procedure.
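
The last three steps above amount to a regression test: run the program on the saved input and compare against the saved known good output. The sketch below assumes the program can be called as a function that returns a list of numbers and that a small floating-point tolerance is acceptable; both are assumptions, and the names `outputs_match` and `regression_test` are hypothetical.

```python
import math
from typing import Callable, Sequence


def outputs_match(actual: Sequence[float], expected: Sequence[float],
                  rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """True if both outputs have the same length and agree within tolerance."""
    if len(actual) != len(expected):
        return False
    return all(math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
               for a, e in zip(actual, expected))


def regression_test(program: Callable[[Sequence[float]], Sequence[float]],
                    known_input: Sequence[float],
                    known_good_output: Sequence[float]) -> bool:
    """Run the program on the saved input and compare with the saved output."""
    return outputs_match(program(known_input), known_good_output)
```

Running this check after every rebuild or version change gives an automated signal that the installed program still reproduces its reference result.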



CSEP Solutions to IT Challenges: Processes

Standards processes are successful within communities (e.g. the World Wide Web Consortium (W3C)). However, they require a significant investment of time by many people and introduce significant time delays.

Suggest that a CSEP standards process be defined, but invoked only as needed.



CSEP Solutions to IT Challenges: Standards

Where do we need to define standards?

Data

  • Declustering algorithm

  • DP-tagging algorithm

Model

  • Standard format for time-dependent results

  • ROC and Molchan diagram reporting format

  • Alarm-based prediction result reports

Testing

  • Metadata attributes saved with forecast results

Gray area (avoid standards process if possible): database selection, software version control system, operating system support



CSEP Solutions to IT Challenges: Standards

The SCEC Technical Consortium Process (a candidate standards process definition) is based loosely on the W3C process and was defined on the CME project.

Two main concepts:

  • Standard Groups

  • Recommendation Process



CSEP Solutions to IT Challenges: Standards

Standard Groups:

  • Executive Board (3 people)

    • Defines the process and decides appeals

  • Technical Architecture Group (5 people) (Cyberinfrastructure Committee)

    • Reviews and approves (or rejects, or asks for modifications to) technical proposals

  • Working Groups (CSEP Standards Committees)

    • Develop technical proposals and send them to the TAG

  • Interest Groups

    • Review and comment on proposals



CSEP Solutions to IT Challenges: Standards

Proposal Process:

A Standard Proposal is submitted to the TAG and can be accepted as a first-stage proposal, called an “activity”. Accepted activities are in one of three states:

  • Activity

  • Working Draft

  • Recommendation

An activity advances through development, working group and interest group review, and then back to the TAG. Once it is approved as a recommendation, it is considered a standard for the Project.
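
The three states can be read as a one-way state machine. The sketch below assumes that each TAG approval moves a proposal exactly one state forward and that a rejection leaves it where it is; the slides do not spell out these transition rules, and the names `ProposalState` and `advance` are hypothetical.

```python
from enum import Enum


class ProposalState(Enum):
    ACTIVITY = "activity"
    WORKING_DRAFT = "working draft"
    RECOMMENDATION = "recommendation"


# Assumed forward-only transitions; a recommendation is the final state.
_NEXT = {
    ProposalState.ACTIVITY: ProposalState.WORKING_DRAFT,
    ProposalState.WORKING_DRAFT: ProposalState.RECOMMENDATION,
}


def advance(state: ProposalState, tag_approved: bool) -> ProposalState:
    """Return the next state after a TAG review of the proposal."""
    if not tag_approved:
        return state  # stays in place until the TAG approves
    if state not in _NEXT:
        raise ValueError("a recommendation is already a Project standard")
    return _NEXT[state]
```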



End



Software Stack

Standard Metadata Approach:

Attribute_Name (Controlled vocabulary) - required

Attribute_Value (May include Range constraints) - required

Attribute_Unit (enumeration of units) - optional

Attribute_Type (float,string,integer,enumeration) - required

Example: Attribute_Name = Event_Magnitude, Attribute_Value = 4.7, Attribute_Unit = Ml, Attribute_Type = float
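
A minimal sketch of how a metadata attribute of this shape might be represented and checked. The four fields follow the slide; the class name `MetadataAttribute`, the `validate` function, and the example vocabulary and unit sets are assumptions, and range constraints and the enumeration type are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Python types accepted for each Attribute_Type handled in this sketch.
_TYPES = {"float": float, "string": str, "integer": int}


@dataclass(frozen=True)
class MetadataAttribute:
    name: str                   # Attribute_Name, from a controlled vocabulary
    value: object               # Attribute_Value
    type: str                   # Attribute_Type
    unit: Optional[str] = None  # Attribute_Unit, optional


def validate(attr: MetadataAttribute, vocabulary: Set[str], units: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the attribute is valid."""
    errors: List[str] = []
    if attr.name not in vocabulary:
        errors.append(f"name {attr.name!r} is not in the controlled vocabulary")
    expected = _TYPES.get(attr.type)
    if expected is None:
        errors.append(f"type {attr.type!r} is not supported")
    elif not isinstance(attr.value, expected):
        errors.append(f"value {attr.value!r} is not of type {attr.type}")
    if attr.unit is not None and attr.unit not in units:
        errors.append(f"unit {attr.unit!r} is not an allowed unit")
    return errors
```

With a vocabulary containing `Event_Magnitude` and a unit set containing `Ml`, the slide's example attribute passes with no errors.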

