

Information Representation on the Grid: a Synthetic Database Benchmark/Workload for Grid Information Servers

Beth Plale, Craig Jacobs, Ying Liu, Charlie Moad, Rupali Parab, Prajakta Vaidya, Nithya Vijayakumar
Indiana University, Bloomington, USA






Motivation

The Grid is a new paradigm for wide-area distributed computing. Management of grid resource information is complicated by the fact that grid resources such as hosts, clusters, virtual organizations, people, mathematical libraries, software packages, and services have aspects of their description that change at millisecond rates. This defining characteristic makes traditional directory-service solutions inappropriate.

Contribution

• An understanding of resource information representation and retrieval in grid computing, gained through the development of a grid-specific synthetic database benchmark/workload for a grid resource information repository.
• Application of the database benchmark to three vastly heterogeneous "database" platforms: mySQL 4.0 (RDBMS), Xindice 1.1 (XML database), and MDS2 (LDAP database).
• A performance metric that captures both tangible and intangible aspects of information retrieval: response time for queries and scenarios, and Ease of Use.

Query Results Analysis

• Scoped queries limit a search to a particular subtree. Scoping appears to achieve its effect when the collection size is large.
• Indexed queries run over an indexed attribute. Comparing indexed and non-indexed queries on the relational and XML platforms confirms the obvious conclusion that indexes greatly improve performance.
• Selectivity is the number of objects satisfying a query. The figures show that MDS2 is sensitive to the size of the returned set.
• Joins: the "manyRelations" and "jobSubmit" join queries measure a repository's ability to assemble a result from numerous collections. Comparing the join and selectivity results for Xindice, joins are much faster than selectivity queries; this is because the selectivity queries run over a very large collection.
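The effect of indexing reported above can be reproduced in miniature. The sketch below is not part of the benchmark; it uses an in-memory SQLite table (with an invented schema and size) as a stand-in for a resource collection, timing the same point lookup before and after an index is added.

```python
import sqlite3
import time

# Illustrative only: a hypothetical "host" collection standing in for a
# grid resource repository. Schema, names, and row count are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE host (name TEXT, os TEXT, cpu_load REAL)")
conn.executemany(
    "INSERT INTO host VALUES (?, ?, ?)",
    [(f"node{i}", "linux", (i % 100) / 100.0) for i in range(50_000)],
)

def timed(query, args=()):
    """Run a query and return (row count, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(query, args).fetchall()
    return len(rows), time.perf_counter() - start

# Without an index, the lookup must scan all 50,000 rows.
n_scan, t_scan = timed("SELECT * FROM host WHERE name = ?", ("node42",))

# After indexing the attribute, the same query becomes a B-tree lookup.
conn.execute("CREATE INDEX idx_host_name ON host(name)")
n_idx, t_idx = timed("SELECT * FROM host WHERE name = ?", ("node42",))

print(n_scan, n_idx)          # both queries return the single matching row
print(t_idx < t_scan)         # the indexed lookup is faster
```

On any reasonably sized collection the indexed lookup wins by orders of magnitude, which matches the benchmark's conclusion for the relational and XML platforms.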
From this we conclude that Xindice is quite sensitive to collection size.

• Update is a simple, one-attribute update. mySQL can perform 200 updates per second, whereas Xindice can perform only 1.3 updates per second.

Grid Resource Benchmark/Workload Description

Queries/updates are grouped into five categories:

• Scoped – restrict query evaluation to a particular subtree;
• Indexed – request indexed attributes;
• Selectivity – vary the number of objects returned;
• Joins – ask a realistic grid-related question;
• Update/Connection – update a single attribute; issue a simple connection request.

Scenarios are a scripted synthetic workload, issued over a controlled time duration, consisting of concurrent query and update/insert requests. Scenarios consist of three phases:

• Phase I (6 minutes) – execution of several concurrent query streams.
• Phase II (5 minutes) – update threads started alongside the query threads started in Phase I.
• Phase III (9 minutes) – update threads stopped; the query streams of Phase I still running.

Scenario Results

Ease of Use

Ease of Use captures intangible aspects of the performance of a grid service, in particular the amount of work a client must undertake to obtain the desired information: the number of queries the user must issue and the amount of processing on the returned data that falls upon the user. The metrics used to quantify ease of use are the number of bytes returned and the number of queries needed to retrieve the requested data. Database platforms requiring more queries to obtain data correspondingly return a larger number of bytes.

Release v1.0 of the IURGRbench benchmark is available at: http://www.cs.indiana.edu/~plale/RGR/
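The three-phase scenario structure can be sketched as a small workload driver. This is not the benchmark's harness: the phase durations are shortened from 6/5/9 minutes to fractions of a second so the sketch runs quickly, and the request bodies are hypothetical stand-ins for queries and updates against the repository under test.

```python
import threading
import time

# Shared tally of simulated requests issued per stream kind.
counts = {"query": 0, "update": 0}
lock = threading.Lock()

def run_stream(kind, stop_event):
    """Issue requests in a loop until told to stop.

    A real driver would send a query or update to the information
    server here; this sketch only counts iterations.
    """
    while not stop_event.is_set():
        with lock:
            counts[kind] += 1
        time.sleep(0.001)

query_stop = threading.Event()
update_stop = threading.Event()
query_threads = [threading.Thread(target=run_stream, args=("query", query_stop))
                 for _ in range(4)]
update_threads = [threading.Thread(target=run_stream, args=("update", update_stop))
                  for _ in range(2)]

# Phase I: several concurrent query streams only.
for t in query_threads:
    t.start()
time.sleep(0.2)

# Phase II: update threads join the still-running query streams.
for t in update_threads:
    t.start()
time.sleep(0.2)

# Phase III: update threads stopped; query streams keep running, then wind down.
update_stop.set()
for t in update_threads:
    t.join()
time.sleep(0.2)
query_stop.set()
for t in query_threads:
    t.join()

print(counts["query"] > 0, counts["update"] > 0)
```

Measuring per-request response time inside `run_stream` during each phase would give the scenario response-time curves the poster reports.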
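The Ease of Use metric described above reduces to two numbers per retrieval task. The following sketch computes them for invented payloads: a platform that answers in one targeted query versus one that forces the client to fetch a whole subtree and filter locally. All figures and payload formats are hypothetical.

```python
def ease_of_use(responses):
    """Quantify ease of use as the poster does: the number of queries a
    client must issue and the total bytes returned before the desired
    datum is in hand. `responses` is the list of raw payloads (bytes)
    returned for each query the client had to send."""
    return {
        "queries": len(responses),
        "bytes": sum(len(r) for r in responses),
    }

# One targeted query returning exactly the requested attribute.
targeted = ease_of_use([b'{"cpu_load": 0.42}'])

# Two queries: first a large subtree dump the client must filter,
# then a follow-up for the record of interest.
verbose = ease_of_use([
    b"<hosts>" + b"<host/>" * 500 + b"</hosts>",
    b"<host os='linux' cpu_load='0.42'/>",
])

print(targeted["queries"] < verbose["queries"])  # fewer queries, and
print(targeted["bytes"] < verbose["bytes"])      # fewer bytes returned
```

This mirrors the poster's observation that platforms requiring more queries to obtain data correspondingly return a larger number of bytes.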
