
Grids Are Real: Avaki in the Life Sciences Sector September, 2002


tyler

Presentation Transcript


  1. Grids Are Real: Avaki in the Life Sciences Sector September, 2002 Avaki Corporation One Memorial Drive Cambridge, MA 02142 617.374.2500 www.avaki.com

  2. Agenda • Avaki Background • Commercial Requirements • Grids in Commercial Environments • AVAKI Grid Software • Avaki and Standards • Avaki in Life Sciences

  3. AVAKI Company Background • Comprehensive grid software • Development began in 1994 • Formerly Legion • First deployed in 1997 as NPACI-Net • Founded in 1998 • Funded by Polaris, General Catalyst, and Sofinnova • Strong customers, leading partners, industry consortia [Logo groups: Customers; Standards Organizations; Partners.]

  4. Agenda • Avaki Background • Commercial Requirements • Grids in Commercial Environments • AVAKI Grid Software • Avaki and Standards • Avaki in Life Sciences

  5. Situation • Competitive advantage depends on bringing new products to market faster • Increased automation in drug discovery and new product development has caused an explosion in data and in computation • Mergers, acquisitions, joint ventures, and partnerships are creating distributed and virtual organizations • Competitive advantage depends on dynamically matching information technology resources – data, processing, and applications – with the individuals and organizations that depend on them.

  6. Challenges • Data Chaos • Numerous or large data sources • Data updated frequently and on different schedules • Accessed by users at multiple locations in different organizations • Increased Computational Load • Compute intensive and high-throughput applications • Spikes in demand for computing power • Complex staging requirements • Supporting Distributed and Virtual Organizations • Heterogeneous information technology resources • Multiple locations and administrative domains • Frequently changing requirements and system failures

  7. Requirements • End Users • Easy access to current, consistent data • Convenient access to applications and processing • Easy collaboration with colleagues and partners • Regardless of location, administrative domain, or platform • Applications often cannot, or will not, be modified • MUST WORK WITH LEGACY APPLICATIONS • Increasingly Java, J2EE as execution environment • IT • Support requests for better access and more resources • Streamline data management • Enable more flexibility in the use of resources • Protect corporate assets and intellectual property • IT managers are overworked, and represent 40% of IT costs • Grids must simplify life, not make it more complicated!

  8. What you won’t see • MPI is very rare • MPI cross platform, cross site - no interest at all • Multi-site applications - basically no interest • Remote visualization - no interest • Fancy parallel schedulers - little interest • Desire to write new applications - almost nil • Loads of bandwidth

  9. Agenda • Avaki Background • Commercial Requirements • Grids in Commercial Environments • AVAKI Grid Software • Avaki and Standards • Avaki in Life Sciences

  10. What is a Grid System? A Grid system is a collection of distributed resources connected by a network. Examples of Distributed Resources: • Desktop • Handheld hosts • Devices with embedded processing resources such as digital cameras and phones • Tera-scale supercomputers

  11. What is a Grid? A grid is all about gathering together resources and making them accessible to users and applications. A grid enables users to collaborate securely by sharing processing, applications, and data across heterogeneous systems and administrative domains for collaboration, faster application execution and easier access to data. • Compute Grids • Data Grids

  12. Solution: The Grid-Enabled Enterprise Organizations are adopting grid computing to simplify how information technology resources are accessed, to more flexibly and fully utilize resources, and to reduce manual system administration. By grid-enabling, enterprises can: • Establish more productive IT environments for end users, environments that address the complexities of today’s work • Streamline key processes and facilitate essential collaboration within and between companies • Simplify administration and management of IT resources • Fully utilize existing resources and avoid capital expenditures • Support wider range of infrastructure choices and more easily migrate to new technologies • Provide secure access to resources while supporting the protection of intellectual property

  13. Grid Computing Scenarios AVAKI Grid Software – Compute and Data Grid Partner Grids • Multiple owners, sites, domains • Multiple file systems • Internet connectivity Campus/Enterprise Grids • Multiple owners, domains • Multiple file systems • WAN connection Cluster Grids • Single owner, department, project • Single domain, file system • LAN connection Desktop Cycle Aggregation • Limited acceptance in commercial enterprises

  14. Agenda • Avaki Background • Commercial Requirements • Grids in Commercial Environments • AVAKI Grid Software • Avaki and Standards • Avaki in Life Sciences

  15. AVAKI Grid Software – Compute and Data Grid Capabilities • Unifies compute, data and application resources • Single, global namespace • Secure access • Simplified administration • Failure detection and restart [Diagram labels: Compute (HQ-1, PM - 1, R D - 2); Data (Genbank, Swisprot - SRS, Results_01); Enterprise Users; Partner Users; Queuing System (two); Desktops; Server (three); Cluster; Shared Data; Shared Output; Shared Data Sources; Partner; Enterprise; IT Departments; User Departments.]

  16. AVAKI 2.5 Data Grid • Federates multiple data sources • Provides access to data in local and virtual file systems (DAS, NAS, SAN) • Provides access to shared data through standard interfaces • Caches data locally [Diagram labels: Data (Genbank, Swisprot - SRS, Results_01); Enterprise Users; Partner Users; Queuing System (two); Desktops; Server (three); Cluster; Shared Data; Shared Output; Shared Data Sources; Enterprise; Partner; IT Departments; User Departments.]

  17. Avaki Data Grid – Data Mapped to the Global Namespace • Links directories and files from source location to data grid directory and user-specified name • Generates location independent grid name • Presents unified view of the data across platforms, locations, firewalls, administrative domains, and data owners [Diagram labels: Windows 2000; Solaris; Linux; Partner; Enterprise; IT Departments; User Departments.]
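The mapping described on this slide can be illustrated with a minimal sketch. It is not Avaki's implementation or API: the class names, host names, and paths below are all hypothetical, and the sketch only shows the general idea of linking a source directory to a user-specified, location-independent grid name and resolving grid paths back to their source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceLocation:
    """Where a shared directory physically lives (all fields hypothetical)."""
    host: str
    platform: str
    path: str


class GridNamespace:
    """Toy global namespace: grid name -> physical source location."""

    def __init__(self):
        self._links = {}

    def link(self, grid_name, source):
        # Grid names are location independent: they never embed host or platform.
        if not grid_name.startswith("/"):
            raise ValueError("grid names are absolute, e.g. /data/genbank")
        self._links[grid_name.rstrip("/")] = source

    def resolve(self, grid_path):
        """Return (source, remainder) for the longest linked prefix of grid_path."""
        parts = grid_path.rstrip("/").split("/")
        # Try the longest prefix first, then progressively shorter ones.
        for end in range(len(parts), 1, -1):
            prefix = "/".join(parts[:end])
            if prefix in self._links:
                return self._links[prefix], "/".join(parts[end:])
        raise KeyError(grid_path)


ns = GridNamespace()
ns.link("/data/genbank", SourceLocation("sol-01", "Solaris", "/export/genbank"))
ns.link("/data/results_01", SourceLocation("win-07", "Windows 2000", r"D:\results"))

source, rest = ns.resolve("/data/genbank/release/gbpri1.seq")
print(source.host, source.path, rest)
```

Users and applications would only ever see names such as `/data/genbank`; the platform and host stay hidden behind the lookup, which is what makes the name stable if the data moves.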

  18. Avaki Data Grid – Access • Access using standard NFS protocol or Avaki commands • Access using user specified name • Access based on specified privileges • Single log-on for shared data access • Aggressively caches data locally [Diagram labels: Data (Genbank, Swisprot - SRS, Results_01), shown three times; Enterprise Users; Partner Users; AVAKI Data Access Server (two); Cached Copy (two); Queuing System (two); Desktops; Server (three); Cluster; Shared Data; Shared Output; Shared Data Sources; Partner; Enterprise; IT Departments; User Departments.]

  19. AVAKI 2.5 Data Grid Benefits • Requires no changes to applications or the way users typically access the data • Easy, convenient, wide-area access to data – regardless of location, administrative domain or platform • Provides consistent access to the most recent data available • Eliminates the need to create and maintain multiple copies • Caches remote data locally for high performance • Protects data with fine-grained security • Eases data administration and management

  20. AVAKI 2.5 Compute Grid • Federates heterogeneous compute resources • Easy integration to third party queuing systems • Identifies “appropriate” resources • Automatically stages data and applications [Diagram labels: Compute (HQ-1, PM - 1, R D - 2); Enterprise Users; Partner Users; Resources; Queuing System (two); Desktops; Server (three); Cluster; Shared Data; Shared Output; Shared Data Sources; Partner; Enterprise; IT Departments; User Departments.]

  21. Avaki 2.5 Compute Grid Benefits • Easy, convenient, wide-area access to processing resources – regardless of location, administrative domain or platform • Eliminates time-consuming searching for available processing cycles • Executes jobs more efficiently • Better utilizes existing resources helping avoid capital expenditures • Supports flexible usage of resources as required for changing requirements and system failures • Requires no changes to legacy or commercial applications • Protects resources with fine-grained access control • Eases system administration and management • Improves capacity management and planning

  22. Agenda • Avaki Background • Commercial Requirements • Grids in Commercial Environments • AVAKI Grid Software • Avaki and Standards • Avaki in Life Sciences

  23. Commitment to Standards: GGF Commitment: • Respect for the standards: AVAKI will deliver the first and best commercial implementations of the OGSA/I standards • Respect for customer investments: AVAKI will interoperate with other OGSA-compliant apps (following ratification), including Globus Background: • AVAKI taking a visible, active role at the Global Grid Forum (GGF) • Andrew Grimshaw on GGF Steering Committee • AVAKI engineers active in OGSI, OGSA, and numerous other Working Groups • Contributed Secure Grid Naming Protocol (SGNP) to OGSI WG • Spec for scalable naming of grid entities, ability for such entities to communicate securely and reliably in spite of migration, replication, failure, etc. • Public expressions of support from IBM, H-P/Compaq, Sun, Platform

  24. Commitment to Standards: I3C The I3C facilitates and enables data exchange, data management, and knowledge management across the entire life science community by promoting common protocols that ensure interoperability in an open, consistent and robust manner • AVAKI is a member of the I3C, alongside IBM, and is active on the Technical Architecture Committee • AVAKI co-authored LSID: a naming standard for distributed data • Distributed, biologically significant data items • Files, database records, and data objects managed by N-tier applications • Accessible over public and/or private networks • Owned, managed, and/or curated by different organizations • Joint demo with IBM, others, at Bio 2002 (July ‘02) • Integrates LSID to uniquely identify objects & data elements in a distributed, federated fashion
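As a rough illustration of what a naming standard for distributed data involves, the sketch below parses identifiers in the URN form that LSID later settled on: `urn:lsid:<authority>:<namespace>:<object>` with an optional trailing `:<revision>`. The slide itself does not give the syntax, so treat the format, the `LSID` tuple, and the `parse_lsid` helper as assumptions made for illustration; the sample identifiers are illustrative only.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str            # who issued the name, normally a DNS name
    namespace: str            # a collection within that authority
    object_id: str            # the item within the namespace
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID-style URN into its parts (hypothetical helper)."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated fields: {text!r}")
    # The 'urn' and 'lsid' prefixes are matched case-insensitively.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID URN: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty field in: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    # Assumption: the authority is a DNS name, so lowercasing it is harmless.
    return LSID(authority.lower(), namespace, object_id, revision)


record = parse_lsid("urn:lsid:ncbi.nlm.nih.gov:pubmed:12571434")
versioned = parse_lsid("URN:LSID:ebi.ac.uk:SWISS-PROT.accession:P34355:3")
print(record)
print(versioned.authority, versioned.revision)
```

Because the name carries an authority and namespace rather than a host and path, the same identifier can refer to a file, a database record, or an application-managed object regardless of which organization serves it.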

  25. Agenda • Avaki Background • Commercial Requirements • Grids in Commercial Environments • AVAKI Grid Software • Avaki and Standards • Avaki in Life Sciences

  26. Industry Problem: Increasing Cost and Complexity of Life Science Data-sharing • Over 400 public Life Sciences databases • 8x growth in genomics data, last 18 months. And this is just the beginning … • Proteomics data: 1,000x multiplier [2] • Glycomics, new small molecule efforts • Increasing scope of data diversity • Annotations (interactions) • Organism-specific (mouse, human) • Molecule-specific (protein, sugar) • Data-type-specific (gene expression) • Increasingly complex data interrelationships [Chart: Genbank growth has continued to trend sharply upwards, as have many other classes of biological data.] [Figure: Increasingly complex interrelationships between biological research databases today (LION graphic).] Notes: [1] Source: TimeLogic estimates. [2] Frank Gleeson, CEO, MDS Proteomics.

  27. Enterprise IT Problem: Life Sciences Data Management • Multiple research groups, domains • Each dept or site acquires & manages its own data • Coherence issues among researchers • Bandwidth costs • Multiple FTEs allocated to data management efforts [Diagram labels: Public DB (three); Varying Media; Tape (two); CD; FTP; Web Portal; Location 2; Location 3; SEQ_1; SEQ_2; SEQ_3; APP 1; APP 2; Bioinformatics Research; External Partner (two); Pharmaceutical Company.]

  28. Data Management Solution – Using Avaki Data Grid • Multiple data sources • One authoritative copy • Consistent data across sites • Automated process eliminates manual and duplicated effort • Write-through cache supports sharing user-created data • “Data Access Problem” solved [Diagram labels: Internal Data; Partner Data; Public Data; Central IT; Avaki Data Cache (five); Enterprise/Partner Sites.]
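The write-through behavior mentioned on this slide can be sketched in a few lines. This is a generic write-through cache under simplifying assumptions, not Avaki's cache: `AuthoritativeStore` and `WriteThroughCache` are hypothetical names, and the sketch deliberately leaves out invalidation of stale entries at other sites.

```python
class AuthoritativeStore:
    """Stand-in for the one authoritative copy (hypothetical)."""

    def __init__(self):
        self._files = {}
        self.reads = 0  # counts how often a site had to go to the source

    def read(self, name):
        self.reads += 1
        return self._files[name]

    def write(self, name, data):
        self._files[name] = data


class WriteThroughCache:
    """Per-site cache: reads are served locally when possible,
    writes always reach the authoritative store before returning."""

    def __init__(self, store):
        self._store = store
        self._local = {}

    def read(self, name):
        if name not in self._local:          # miss: fetch once, then keep locally
            self._local[name] = self._store.read(name)
        return self._local[name]

    def write(self, name, data):
        self._store.write(name, data)        # write through to the single copy
        self._local[name] = data             # keep the local copy in step


central = AuthoritativeStore()
site_a = WriteThroughCache(central)
site_b = WriteThroughCache(central)

site_a.write("results_01/run42.txt", b"hits: 17")   # user-created data at site A
first = site_b.read("results_01/run42.txt")          # site B fetches from the source
second = site_b.read("results_01/run42.txt")         # served from site B's cache
print(first, central.reads)
```

Because every write lands in the authoritative store immediately, data created at one site is visible to a first read at any other site, while repeated reads stay local.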

  29. Resource Availability and Access Problem • Some resources are at maximum capacity while other resources are underutilized • Different user interfaces • Multiple queuing systems • Multiple log-ons and complex UID management • Policy and security needs make sharing difficult [Diagram labels: job queues numbered 1, 2, n (two); Queuing System (three); Cluster; Server (two); Servers; Solaris Workstations; WIN2000 Workstations; Linux Cluster (two); Partner; Enterprise.]

  30. Resource Availability and Access Solution – Using Avaki Compute Grid • Load balanced across resources for improved utilization • Single log-on to run jobs • Single user interface • Single set of commands to access all resources • Usage policies make sharing easy and secure [Diagram labels: Avaki Log-on/Commands; Local Usage Policies (three); Queuing System (three); WIN2000 Workstations; Solaris Workstations; Linux Cluster (two); Cluster; Server (two); Servers.]
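One way to picture load balancing under local usage policies is the sketch below. It is an assumption-laden toy, not Avaki's scheduler: the `Resource` class, the group-based policy, the slot counts, and the least-utilized selection rule are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    slots: int                              # total job slots (hypothetical unit)
    running: int = 0                        # jobs currently running
    allowed_groups: frozenset = frozenset() # local usage policy set by the owner

    def permits(self, group):
        return group in self.allowed_groups

    def utilization(self):
        return self.running / self.slots


def submit(resources, group):
    """Place one job on the least-utilized resource whose policy admits the group."""
    candidates = [r for r in resources if r.permits(group) and r.running < r.slots]
    if not candidates:
        raise RuntimeError(f"no resource available for group {group!r}")
    target = min(candidates, key=lambda r: r.utilization())
    target.running += 1
    return target.name


resources = [
    Resource("linux-cluster", slots=8, running=6,
             allowed_groups=frozenset({"research", "partner"})),
    Resource("solaris-ws", slots=4, running=1,
             allowed_groups=frozenset({"research"})),
    Resource("win2000-ws", slots=4, running=0,
             allowed_groups=frozenset({"it"})),
]

research_target = submit(resources, "research")
partner_target = submit(resources, "partner")
print(research_target, partner_target)
```

The submitter uses one call regardless of which queuing system sits behind each resource, while each owner's policy still decides who may run there.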

  31. Summary

  32. The AVAKI Difference • AVAKI 2.5 combines data and compute grid capabilities • Provides wide-area access to data, processing and application resources, while protecting corporate assets and intellectual property • Supports complexities of enterprise and global grids • Simplifies system administration and management • Packaged to be deployed quickly • Integrated architecture that requires no additional development • Requires no changes to legacy or commercial applications • Comprehensive support • Design, installation and configuration • Performance tuning • Customer support
