Grid Computing – Issues in Data grids and Solutions


Presentation Transcript


  1. Grid Computing – Issues in Data grids and Solutions Sudhindra Rao

  2. Outline • Grid Computing – introduction • Computational Grids • Data Grids • Data Management • Related Work • Technologies – JavaSpaces, OceanStore • Our research plan • Discussion OSCAR Lab

  3. What is grid computing? • Use a network of PCs • Faster networks, cheaper PCs, lots of idle time • Easy to build, maintain, scale • Generic solution for scientific and business problems alike • Some form of grid computing: SETI@Home, Argonne National Lab, Google, etc.

  4. Why today? • Grid Computing drivers: Market Dynamics, New Opportunities, Maturing Technology, World Events • Goals: Efficiency, Profitability, Capabilities • Further diagram labels: Security, Manageability, Agility, Control, Uncertainty, Complexity, Distribution

  5. Applications – data grids • Geographic distribution of data • Computations on large scale data • Compute-intensive analytics: value at risk, credit risk, real-time risk management, automated trade programs • OLAP data analysis: anti-money laundering, credit card (risk and customer data mining), billing • Data center operations: in-process system migration, high fault tolerance, geographic data center independence for failover and business applications • Compute utility services: data center compute farms, corporate compute utility services creating a low-cost infrastructure similar to the electric grid

  6. Evolution of distributed computing • Diagram labels: Middleware, Data queues, Publish/Subscribe, Smart routing, File sharing, CORBA, Data translation, Pipes/sockets, Clusters, Data grids, Utility service, Grid Computing, Client/Server

  7. Compute grid • Distributed pool of resources • Completing a task for a user • User requests and reserves resources • Some kind of middleware manages resources and tasks • Resilient and fault tolerant
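
The request-and-reserve cycle on this slide can be illustrated with a minimal sketch. Everything named here (`ResourcePool`, `reserve`, `release`, `fail`) is hypothetical and is not taken from the presentation or from any real grid middleware; it only assumes that middleware tracks which nodes are free and which are reserved per user, and that a failed node is replaced from the free pool when possible.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ResourcePool:
    """Hypothetical middleware view of a distributed pool of worker nodes."""
    free: Set[str]
    reserved: Dict[str, Set[str]] = field(default_factory=dict)  # user -> nodes

    def reserve(self, user: str, count: int) -> Set[str]:
        """Reserve `count` free nodes for `user`; raise if too few are free."""
        if count > len(self.free):
            raise RuntimeError("not enough free nodes")
        nodes = {self.free.pop() for _ in range(count)}
        self.reserved.setdefault(user, set()).update(nodes)
        return nodes

    def release(self, user: str) -> None:
        """Return all of a user's nodes to the free pool."""
        self.free |= self.reserved.pop(user, set())

    def fail(self, node: str) -> None:
        """Drop a failed node; give its owner a free replacement if one exists."""
        self.free.discard(node)
        for nodes in self.reserved.values():
            if node in nodes:
                nodes.remove(node)
                if self.free:
                    nodes.add(self.free.pop())
```

A user would call `reserve("alice", 2)`, run tasks on the returned nodes, and call `release("alice")` afterwards; `fail` is one simple way to model the "resilient and fault tolerant" bullet.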

  8. Compute grid – coordinating set of tasks • Clients reach servers over network pipes with 1-1 connectivity • Multiple applications/worker threads access a single datastore • Diagram labels: Client, Business AppServer, Server, Data Storage, Data grid

  9. Compute grid – coordinating set of tasks • Data grid – eliminates data access bottlenecks • Data grid – manages data • Diagram label: Data Storage

  10. Data grid architecture • Mechanism neutrality • Policy neutrality • Compatibility with compute grid • Uniformity with information infrastructure • Services: Storage Service, Grid storage API, Metadata service

  11. Data grid architecture • Expectations: coordination between compute and data grid; data delivery to facilitate task and resource management; sharing data distribution and location information; leveraging data locality • Guarantees: dependability, consistency, pervasiveness, security, inexpensive

  12. Data delivery - QoS requirements • Example applications: Monte Carlo Simulation, OLAP, Real-time datamart • Data Grid QoS levels: Level 0, Level 1 • Axis: Application Complexity • Requirement combinations (Work, Time, Data, Transactional): Batch, Synchronous, Static data, Nontransactional; Atomic, Synchronous, Static data, Nontransactional; Atomic, Asynchronous, Static data, Nontransactional; Atomic, Asynchronous, Dynamic data, Nontransactional; Atomic, Synchronous, Static data, Transactional; Atomic, Asynchronous, Dynamic data, Transactional; Atomic, Asynchronous, Static data, Transactional; Batch, Synchronous, Static data, Transactional

  13. Related Work • Grid File System - provides primitives like a file system – Level 0 QoS • NFSv4 – High performance, extensible, secure – in the works • Secure File System – self certifying paths, unique identifiers, global namespace, key based certification

  14. Technologies related to data grids - JavaSpaces • “Make Room for JavaSpaces, Part I: Ease the Development of Distributed Apps with JavaSpaces” - Eric Freeman and Susanne Hupfer
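
JavaSpaces is built around a shared space of entries that processes write, read, and take by template matching, where an unset template field acts as a wildcard. The sketch below illustrates only that idea in Python; `TupleSpace` and its methods are hypothetical names, this is not the JavaSpaces API, and leases, transactions, notifications, and blocking reads are deliberately left out.

```python
from typing import List, Optional, Tuple

Entry = Tuple[Optional[object], ...]


class TupleSpace:
    """Hypothetical in-memory space; None in a template matches any value."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def write(self, entry: Entry) -> None:
        """Place an entry into the space."""
        self._entries.append(entry)

    @staticmethod
    def _matches(template: Entry, entry: Entry) -> bool:
        return len(template) == len(entry) and all(
            t is None or t == e for t, e in zip(template, entry)
        )

    def read(self, template: Entry) -> Optional[Entry]:
        """Return the first matching entry without removing it, or None."""
        for entry in self._entries:
            if self._matches(template, entry):
                return entry
        return None

    def take(self, template: Entry) -> Optional[Entry]:
        """Return the first matching entry and remove it, or None."""
        entry = self.read(template)
        if entry is not None:
            self._entries.remove(entry)
        return entry
```

A producer might write `("task", 1, "price-option")` and a worker might call `take(("task", None, None))` to claim any pending task, which is how a space decouples the two sides.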

  15. OceanStore • Global replication of data • Promiscuously caches data • Version based archival storage • Applications can control their consistency requirements to manage performance • Internal event monitors analyze access patterns to move data and provide redundancy
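
The "version based archival storage" bullet can be pictured as an object whose updates never overwrite earlier state. The sketch below is only an illustration of that idea; `VersionedObject` is a hypothetical name, and nothing here reflects OceanStore's actual interfaces, replication, or encoding.

```python
from typing import List, Optional


class VersionedObject:
    """Hypothetical append-only object: every update creates a new version."""

    def __init__(self) -> None:
        self._versions: List[bytes] = []

    def update(self, data: bytes) -> int:
        """Archive `data` as a new version and return its version number."""
        self._versions.append(data)
        return len(self._versions) - 1

    def read(self, version: Optional[int] = None) -> bytes:
        """Read a specific version, or the latest one when none is given."""
        if not self._versions:
            raise KeyError("object has no versions yet")
        if version is None:
            return self._versions[-1]
        if not 0 <= version < len(self._versions):
            raise KeyError(f"no such version: {version}")
        return self._versions[version]
```

Because old versions stay readable, a reader that tolerates stale data can keep using an earlier version while writers continue to append.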

  16. Grid Fabric - Integrasoft • Business solution provided for financial institutions, share traders • Designed to complement compute grid • Works closely with compute grid to schedule tasks based on data availability • Moves data closer to computation
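
Scheduling tasks based on data availability can be sketched as picking the node that already holds most of the data a task needs, then transferring only what is missing. The function names and the node/dataset names below are hypothetical and are not taken from Grid Fabric; this is one simple greedy policy under that assumption, not the product's scheduler.

```python
from typing import Dict, Set


def pick_node(needed: Set[str], holdings: Dict[str, Set[str]]) -> str:
    """Return the node that already holds the most of the needed datasets."""
    # Iterating over sorted names makes ties resolve to the first name.
    return max(sorted(holdings), key=lambda node: len(needed & holdings[node]))


def to_transfer(needed: Set[str], holdings: Dict[str, Set[str]], node: str) -> Set[str]:
    """Datasets that still have to be moved to `node` before the task runs."""
    return needed - holdings[node]
```

For a task needing `{"trades", "rates"}`, a node that already holds `trades` would be chosen and only `rates` would have to move, which is the "moves data closer to computation" bullet in miniature.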

  17. SOA and Data grids • Moore’s law and Metcalfe’s law • Network based computation and grid computing with SOA • Intelligent infrastructure – SONA • Diagram labels: Business process, Delivers, has, WebServices, State, Requires, Data Grid

  18. Web 2.0

  19. Our research – Motivation • Issues in data management: • Data tightly coupled to computation • Data cached locally • Distribution is haphazard and reuse is minimal • Data pulled by computation – not delivered • Mechanisms still improvise based on experience on smaller systems

  20. Data Grid and DBMS • Grid DBMS: Security, Transparency, Robustness, Efficiency, Intelligence, Fragmentation, Heterogeneity

  21. Data grids as extended DBMS • Data grid – eliminates data access bottlenecks • Persistence Mechanism – with data regions, indicates Replicas, relations • Diagram label: Data Storage

  22. Datacentric grids • Automated space management and garbage collection • Space and data objects lifetime mechanism • I/O allocation on storage system • Estimating access from Magnetic storage • Co-scheduling of compute and storage resources • Space reservation dilemma • Thin clients • Code mobility towards data
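
The lifetime and garbage-collection bullets can be sketched as a storage space in which every object carries an expiry time and expired objects are reclaimed before new space is granted. `StorageSpace`, its methods, and the use of plain numbers for time are all assumptions made for illustration; the slides do not specify a mechanism.

```python
from typing import Dict, Tuple


class StorageSpace:
    """Hypothetical fixed-capacity space whose objects expire after a lifetime."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._objects: Dict[str, Tuple[int, float]] = {}  # name -> (size, expires_at)

    def used(self) -> int:
        """Total size of all objects currently stored."""
        return sum(size for size, _ in self._objects.values())

    def collect(self, now: float) -> int:
        """Remove objects whose lifetime has ended; return the space freed."""
        expired = [n for n, (_, exp) in self._objects.items() if exp <= now]
        return sum(self._objects.pop(n)[0] for n in expired)

    def put(self, name: str, size: int, now: float, lifetime: float) -> bool:
        """Store an object if it fits after garbage collection."""
        self.collect(now)
        previous = self._objects.get(name, (0, 0.0))[0]  # size being replaced
        if self.used() - previous + size > self.capacity:
            return False
        self._objects[name] = (size, now + lifetime)
        return True
```

A request that does not fit now may succeed later once earlier objects expire, which hints at the "space reservation dilemma": long lifetimes protect data but block other users.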

  23. Expected Results • Can we move computation closer to data? • Data grid – with features of persistence? • Performance improvement using tags? • Loosely coupled data grid and compute grid? • Scalability of unique naming in file systems?

  24. Thank you!
