
Economics of Computations and Job-Specific Service Level Agreements

Economics of Computations and Job-Specific Service Level Agreements. Bin Li. and Dr. Lee Gillam. Department of Computing, FEPS. Outline. Part I Service Level Agreement SLA definition SLA type Non-Negotiable : AWS, GAE Examples Negotiable: job-specific, task oriented


Presentation Transcript


  1. Economics of Computations and Job-Specific Service Level Agreements Bin Li. and Dr. Lee Gillam. Department of Computing, FEPS

  2. Outline • Part I • Service Level Agreement • SLA definition • SLA type • Non-Negotiable: AWS, GAE Examples • Negotiable: job-specific, task oriented • Simple service brokerage use case • Aim: to build a job-specific comparison service for the computational market • SLA frameworks • Proposed SLA structure: based on the WS-Agreement standard • Service level characteristics • Availability, performance, autonomic, security • Potentials in computational market • Motivation and literature

  3. Service Level Agreements (SLAs) • Level of service is formally defined between service provider and service consumer • Legal service contract: rights and liabilities. • Provider: reputation • Consumer: trust basis • What will the services deliver? • How are the services used? • Which provider to choose? • Legal agreement document • Services description • Requirements • Charges • Legal issues (rights and liabilities) • Penalty / Compensation

  4. SLAs cont. • SLA Type • Non-Negotiable: • Pre-defined, abstract and obscure, • In favor of the provider: general provider liabilities are documented to satisfy the most common consumer requirements, • Fewest rights for the consumer, • Common in Cloud services • Penalties involved: usage credits, or stop using the service • Negotiable: • Legal document, long and boring with lots of legal terms, difficult to understand

  5. Google Apps (SLAs) (standard edition agreement) … ANY USE (of APP service) THEREOF SHALL BE AT CUSTOMER'S OWN RISK. GOOGLE AND ITS LICENSORS MAKE NO WARRANTY OF ANY KIND … NON-INFRINGEMENT. GOOGLE ASSUMES NO RESPONSIBILITY FOR THE PROPER USE OF THE SERVICE. … GOOGLE MAKES NO REPRESENTATION THAT GOOGLE (OR ANY THIRD PARTY) WILL ISSUE UPDATES OR ENHANCEMENTS TO THE SERVICE. GOOGLE DOES NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE SERVICE WILL BE UNINTERRUPTED OR ERROR FREE. (SLA) During the Term of the applicable Google Apps Agreement, the Google Apps Covered Services web interface will be operational and available to Customer at least 99.9% of the time in any calendar month (the "Google Apps SLA"). If Google does not meet the Google Apps SLA, and if Customer meets its obligations under this Google Apps SLA, Customer will be eligible to receive the Service Credits (not money back but 3 to 15 days longer service) ... Customer must notify Google within thirty days from the time Customer becomes eligible to receive a Service Credit. Failure to comply with this requirement will forfeit Customer’s right to receive a Service Credit.

  6. Amazon Web Service (SLAs) (EC2) AWS will use commercially reasonable efforts to make Amazon EC2 available with an Annual Uptime Percentage (defined below) of at least 99.95% during the Service Year. In the event Amazon EC2 does not meet the Annual Uptime Percentage commitment, you will be eligible to receive a Service Credit. (S3) AWS will use commercially reasonable efforts to make Amazon S3 available with a Monthly Uptime Percentage (defined below) of at least 99.9% during any monthly billing cycle (the “Service Commitment”). In the event Amazon S3 does not meet the Service Commitment, you will be eligible to receive a Service Credit (10% to 25% of your monthly billing). “The test for commercially reasonable efforts is less stringent than that imposed by the ‘best efforts’ clauses contained in some agreements.” -- http://definitions.uslegal.com/c/commercially-reasonable-efforts/ To receive a Service Credit, you must submit a request (i) include your account number … (ii) include … the dates and times of each incident of Region Unavailable that you claim to have experienced including instance ids of the instances that were running and affected during the time of each incident; (iii) include your server request logs … (iv) … within thirty (30) business days … 99.95% availability ≈ 0.18 days/year down ≈ 4.4 hours/year down
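
The availability percentages quoted in these SLAs translate directly into the downtime a provider may still incur without breaching the commitment. A minimal sketch of that arithmetic follows; the function name and the period lengths (a 365-day service year, a 30-day billing month) are assumptions made for illustration, not definitions taken from the AWS agreements.

```python
# Convert an availability percentage into the downtime it still permits.
HOURS_PER_DAY = 24


def allowed_downtime_hours(availability_pct: float, period_days: float) -> float:
    """Hours of downtime permitted over `period_days` at the given availability."""
    unavailable_fraction = 1.0 - availability_pct / 100.0
    return unavailable_fraction * period_days * HOURS_PER_DAY


# EC2: 99.95% over an assumed 365-day service year.
ec2_hours = allowed_downtime_hours(99.95, 365)
# S3: 99.9% over an assumed 30-day billing month, expressed in minutes.
s3_minutes = allowed_downtime_hours(99.9, 30) * 60

print(f"EC2: {ec2_hours:.2f} hours/year, S3: {s3_minutes:.1f} minutes/month")
```

Under these assumptions the EC2 commitment allows roughly 4.4 hours of downtime per year and the S3 commitment roughly 43 minutes per month.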

  7. SLAs cont. ---- Job-specific SLA • SLA Type • Negotiable: • End-users with critical data or application requirements, • Represents more flexible user requirements • Job/Application-specific, task-oriented • Handled manually: inefficient • Dynamic SLAs • Job-/Application-Specific SLA • Can be applied to both types of SLAs • Describes the services for a particular submitted task • Server management: automatically and dynamically (autonomic) create SLAs as user demand changes; per-job SLA, different from ITIL (a general continual SLA) • The concept and practice of SLAs bring the notion of risk management into the computational market • System performance monitoring: system availability, forecasting • Ensure QoS: act as a contract between providers and users, negotiate with brokers. • Clarifies the business nature and parties’ obligations

  8. Simple Service Brokerage Use Case

  9. Use case cont.

  10. Use case cont.

  11. Objective • Aim: provide the same kind of comparison service for compute resources (Cloud service). • Goods: compute service which is job-specific. • Retailers: resource providers (Amazon, Rackspace, Microsoft, Google). • Invoice: SLA. • Other factors: • Availability (risk or availability confidence), • Insurance, • Price • Penalty • etc. • Key: machine readable (automation, autonomic and efficiency)

  12. SLA Frameworks & WS-Agreement Structure • TWO FRAMEWORKS: • Web Services Agreement (WS-Agreement, OGF): GRAAP, part of Service-Oriented Architecture (SOA), XML syntax, machine readable. • Web Service Level Agreement (WSLA): IBM • Cloud computing use case group • WS-Agreement: • SDTs (Service Description Terms): identify the work to be done • the required platform; • the software involved; • the set of expected arguments; • input/output resources; • etc. • GTs (Guarantee Terms): provide assurance between provider and consumer on quality of service (QoS) • price of the service; • insurance price; • the probability of failure; • the penalty for failure; • the starting time; • the probability of completion; • etc. • [Figures: WS-Agreement structure; XML example]
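
To make the "machine readable" requirement concrete, the split into service description terms and guarantee terms can be sketched as a small data structure. This is an illustrative sketch only: the class names, field names and the JSON serialisation are assumptions chosen for brevity, and it does not reproduce the WS-Agreement XML schema.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class ServiceDescriptionTerms:
    """SDTs: identify the work to be done (field names are hypothetical)."""
    platform: str
    software: str
    arguments: List[str] = field(default_factory=list)
    input_resources: List[str] = field(default_factory=list)
    output_resources: List[str] = field(default_factory=list)


@dataclass
class GuaranteeTerms:
    """GTs: assurances on quality of service (field names are hypothetical)."""
    price: float
    insurance_price: float
    probability_of_failure: float
    penalty_for_failure: float
    start_time: str  # ISO 8601 timestamp, an assumed representation
    probability_of_completion: float

    def __post_init__(self) -> None:
        # Reject probabilities outside [0, 1] so a broker can rely on them.
        for name in ("probability_of_failure", "probability_of_completion"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")


@dataclass
class JobSpecificSLA:
    provider: str
    sdt: ServiceDescriptionTerms
    gt: GuaranteeTerms

    def to_json(self) -> str:
        """Serialise the whole agreement so that software can compare offers."""
        return json.dumps(asdict(self), indent=2)


# All values below are made up for illustration.
sla = JobSpecificSLA(
    provider="example-provider",
    sdt=ServiceDescriptionTerms(
        platform="linux-x86_64",
        software="var-simulation",
        arguments=["--paths", "10000"],
        input_resources=["portfolio.csv"],
        output_resources=["var.txt"],
    ),
    gt=GuaranteeTerms(
        price=1.20,
        insurance_price=0.10,
        probability_of_failure=0.02,
        penalty_for_failure=2.00,
        start_time="2010-01-01T09:00:00Z",
        probability_of_completion=0.97,
    ),
)
print(sla.to_json())
```

Keeping the guarantee terms numeric is what lets a comparison service rank offers from several providers without a human reading each agreement.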

  13. Cloud Service Levels (Characteristics) • Availability: (how often the service can be accessed over a time horizon) • Number of “nines”: • S3: monthly: 99.9% availability = outage of 43.2 minutes per 30-day month • Who should define unavailability? • EC2: “Unavailable” means that all of your running instances have no external connectivity during a five minute period and you are unable to launch replacement instances. • Job-specific: future resource availability • Reliability: how much the consumer trusts the provider • Related to availability, but slightly different; consumer opinion • Combining cloud offerings: great power and flexibility but less reliability • Confidence level: how confident is the provider itself in its availability “nines”? • Job-specific: probability of (job) completion • Performance: • Throughput: how quickly the service responds; • Load balancing: how overload is avoided; • Elasticity: ability to grow indefinitely, within limitations; • Linearity: the system performance as workload increases; • Agility: how quickly the service responds to scaling up or down; • Data durability: the likelihood of data loss; • etc. • Autonomic: monitoring, automation and dynamic, machine readable. • Security: privacy, data encryption, legal issues

  14. Grid, Utility, Cloud…… Computing Potential computational market

  15. Grid, Utility, Cloud…… Computing • Potential computational market • Biggest structural change in IT since the 1960s. • TechMarketView: by 2012, 15% of the UK software market will be delivered by Cloud (22% are applications). • Computational Market

  16. Grid, Utility, Cloud…… Computing • Potential computational market • Computational Market • Economics issues absent: pricing, liability, etc. • Service Level Agreements • Risk Assessment

  17. Grid, Utility, Cloud…… Computing • Potential computational market • Computational Market • Economics issues absent: pricing, liability, etc. • [Diagram: analogy between the computational market and the financial market. Labels: Service Level Agreements; firms’ PoD; risk assessment; resource monitoring; time series analysis; derivatives; risk analysis; resource PoF; financial derivatives; financial risk management measures; financial market]

  18. Background and literature: • Financial Grids: • Macleod G., Donachy P., Harmer T.J., Perrot R. H., Conlon B., Press J., Lungu F., “Implied Volatility Grid: Grid Based Integration to Provide On Demand Financial Risk Analysis”, Belfast e-Science Centre, Queen’s University of Belfast, 2005. • Donachy P., Stødle D., “Risk Grid - Grid Based Integration of Real-Time Value-at-Risk (VaR) Services”, EPSRC UK e-Science All Hands Meeting, 2003. • Germano G., Engel M., “City@home: Monte Carlo derivative pricing distributed on networked computers”, Grid Technology for Financial Modelling and Simulation, 2006. • Schumacher J., Jaekel U., and Zimmermann F., “Grid Services for Derivatives Pricing”, Grid Technology for Financial Modelling and Simulation, 2006. • Computational economics: • Gray, J. (2003): Distributed Computing Economics. Microsoft Research Technical Report MSR-TR-2003-24 (also presented at Microsoft VC Summit 2004, Silicon Valley, April 2004) • Chetty, M. and Buyya, R. (2002). Weaving electrical and computational grids: How analogous are they? Computing in Science and Engineering, to appear, May/June 2002. • Kenyon, C. and Cheliotis, G. (2002). Architecture requirements for commercializing grid resources. In 11th IEEE International Symposium on High Performance Distributed Computing (HPDC'02). • Kenyon, C. and Cheliotis, G. (2003), Grid Resource Commercialization: Economic Engineering and Delivery Scenarios. Grid Resource Management: State of the Art and Research Issues. • Kerstin, V., Karim, D., Iain, G. and James, P. (2007), AssessGrid, Economic Issues Underlying Risk Awareness in Grids, LNCS, Springer Berlin / Heidelberg • Birkenheuer, G., Hovestadt, M., Voss, K., Kao, O., Djemame, K., Gourlay, I., Padgett, J.: Introducing Risk Management into the Grid. Proc. 2nd IEEE Intl. Conf. on e-Science and Grid Computing, Amsterdam, The Netherlands (2006)

  19. Comparison

  20. Summary • Part I • Service Level Agreement • SLA definition • SLA type • Non-Negotiable: AWS, GAE Examples • Negotiable: job-specific, task oriented • Simple service brokerage use case • Aim: to build a job-specific comparison service for the computational market • SLA frameworks • Proposed SLA structure: based on the WS-Agreement standard • Service level characteristics • Availability, performance, autonomic, security • Potentials in computational market • Motivation and literature

  21. Outline • Part II • Analogy: Financial market • Financial market vs. computational market • Financial risk management, portfolio theory • Value-at-Risk (option free portfolio) • Credit Risk • CDS, CDO • Default probability (Moody’s KMV) • Asset market value and volatility • Distance to Default • Probability of Default • Constructing Job-specific SLA • Building probability of failure • Building probability of completion • Building job-specific charges • Managing multiple Job-specific SLAs (providers) • Conclusion and Future Work

  22. Thank you for your attention Questions

  23. Computational Economics and Job-Specific Service Level Agreements Bin Li. and Dr. Lee Gillam. Department of Computing, FEPS

  24. Objective • Aim: provide the same kind of comparison service for compute resources (Cloud service). • Goods: compute service which is job-specific. • Retailers: resource providers (Amazon, Rackspace, Microsoft, Google). • Invoice: SLA. • Other factors: • Availability (risk or availability confidence), • Insurance, • Price • Penalty • etc. • Key: machine readable (automation, autonomic and efficiency)

  25. SLA Frameworks & WS-Agreement Structure • TWO FRAMEWORKS: • Web Services Agreement (WS-Agreement, OGF): GRAAP, part of Service-Oriented Architecture (SOA), XML syntax, machine readable. • Web Service Level Agreement (WSLA): IBM • Cloud computing use case group • SDTs (Service Description Terms): identify the work to be done • the required platform; • the software involved; • the set of expected arguments; • input/output resources; • etc. • GTs (Guarantee Terms): provide assurance between provider and requester on quality of service (QoS) • price of the service; • insurance price; • the probability of failure; • the penalty for failure; • the starting time; • the probability of completion; • etc. • [Figures: WS-Agreement structure; XML example]

  26. Cloud Service Levels (Characteristics) • Availability: (how often the service can be accessed over a time horizon) • Number of “nines”: • 99.95% availability = outage of about 4.4 hours/year • Who should define unavailability? • EC2: “Unavailable” means that all of your running instances have no external connectivity during a five minute period and you are unable to launch replacement instances. • Job-specific: future resource availability • Reliability: how much the consumer trusts the provider • Related to availability, but slightly different; consumer opinion • Combining cloud offerings: great power and flexibility but less reliability • Confidence level: how confident is the provider itself in its availability “nines”? • Job-specific: probability of (job) completion • Performance: • Throughput: how quickly the service responds; • Load balancing: how overload is avoided; • Elasticity: ability to grow indefinitely, within limitations; • Linearity: the system performance as workload increases; • Agility: how quickly the service responds to scaling up or down; • Data durability: the likelihood of data loss; • etc. • Autonomic: monitoring, automation and dynamic, machine readable. • Security: privacy, data encryption, legal issues

  27. Outline • Part II • Analogy: Financial market • Financial market vs. computational market • Financial risk management, portfolio diversification • Value-at-Risk (option free portfolio) • Credit Risk • CDS, CDO • Credit rating: default probability (Moody’s KMV) • Asset market value and volatility • Distance to Default • Probability of Default • Constructing Job-specific SLA • Building probability of failure • Building probability of completion • Building job-specific pricing • Managing multiple Job-specific SLAs (providers) • Conclusion

  28. Grid, Utility, Cloud…… Computing • Potential computational market • Computational Market • Economics issues absent: pricing, liability, etc. • [Diagram: analogy between the computational market and the financial market. Labels: Service Level Agreements; firms’ PoD; risk assessment; resource monitoring; time series analysis; derivatives; risk analysis; resource PoF; financial derivatives; financial risk management measures; financial market]

  29. Grid for Financial Risk Analysis • Risk Fact: • Risk is an integral part of the real world in general, and the financial world in particular. • Market • Grid infrastructures in Bank of America and HSBC: 3000 to 6000 processors • Computational services market: customers willing to pay for use of computer systems instead of purchasing and maintaining hardware and software. • Grid / Cloud: HP, Amazon, Sun, IBM etc. • Financial Risk Management: • Monetary based: losses or profits. • Risk can only be reduced (mitigated) but never eliminated. • Fundamental risk management theory: portfolio (diversification). • To ensure that a market event has reduced impact on the whole portfolio • Depends on the correlation or covariance of the returns of the assets. • Diversified portfolio: standard deviation of each asset; correlation among assets • Useful analysis measurements (models): Mean-Variance; Correlation; the sensitivities (the Greeks); Value-at-Risk
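
The dependence of portfolio risk on both the standard deviation of each asset and the correlation among assets can be illustrated with the standard two-asset formula, sigma_p = sqrt(w1² sigma1² + w2² sigma2² + 2 w1 w2 rho sigma1 sigma2). The sketch below is a textbook illustration with made-up numbers; the function name is hypothetical and nothing here is taken from the presentation's own analysis.

```python
import math


def two_asset_portfolio_std(weight_1: float, std_1: float, std_2: float,
                            correlation: float) -> float:
    """Standard deviation of a two-asset portfolio; weights sum to one."""
    weight_2 = 1.0 - weight_1
    variance = (
        (weight_1 * std_1) ** 2
        + (weight_2 * std_2) ** 2
        + 2.0 * weight_1 * weight_2 * correlation * std_1 * std_2
    )
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


# Two assets with 20% volatility each, held in equal weights (made-up figures).
for rho in (1.0, 0.5, 0.0, -1.0):
    std = two_asset_portfolio_std(0.5, 0.20, 0.20, rho)
    print(f"correlation {rho:+.1f}: portfolio std = {std:.4f}")
```

With perfect positive correlation the portfolio is as risky as either asset alone; as the correlation falls, the portfolio standard deviation falls with it, which is the sense in which diversification reduces the impact of a single market event.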

  30. Value-at-Risk (VaR) • As defined by Philippe Jorion, Value-at-Risk “summarizes the worst maximum potential loss in value of a portfolio of financial instruments over a certain target horizon with a given level of confidence”. • 3 Components: • Confidence level (quantiles), • Holding period (time horizon), • Monetary base.
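
The three components (confidence level, holding period and monetary base) map directly onto the inputs of a parametric VaR calculation. The following is a minimal sketch of the variance-covariance method under the assumption of normally distributed, independent daily returns; it is one common way to compute VaR, not necessarily the method used in this work, and the function name and figures are illustrative.

```python
import math
from statistics import NormalDist


def parametric_var(portfolio_value: float, daily_volatility: float,
                   confidence: float, holding_days: int,
                   daily_mean_return: float = 0.0) -> float:
    """Loss that is not exceeded with probability `confidence`.

    Assumes normal returns and scales volatility with the square root
    of the holding period.
    """
    z = NormalDist().inv_cdf(confidence)  # quantile for the confidence level
    horizon_std = daily_volatility * math.sqrt(holding_days)
    horizon_mean = daily_mean_return * holding_days
    return portfolio_value * (z * horizon_std - horizon_mean)


# Monetary base 1,000,000; 1% daily volatility; 99% confidence; 10-day horizon.
var_99_10d = parametric_var(1_000_000.0, 0.01, 0.99, 10)
print(f"10-day 99% VaR: {var_99_10d:,.0f}")
```

Raising the confidence level or lengthening the holding period both increase the reported VaR, which is why the three components must always be stated together.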

  31. Value-at-Risk (VaR)

  32. Value-at-Risk (VaR) • Monte Carlo Simulation using Condor DAG • Methods Comparison
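
A Monte Carlo VaR estimate splits naturally into independent simulation batches followed by a single merge step, which is the fan-out/fan-in shape that a workflow DAG expresses. The sketch below runs the batches sequentially in one process purely for illustration; the normal-return model, the function names, the seeds and the batch sizes are all assumptions, and no Condor-specific interface is used or implied.

```python
import random
from typing import List


def simulate_losses(seed: int, n_paths: int, portfolio_value: float,
                    mean_return: float, volatility: float) -> List[float]:
    """One independent batch: simulated one-period losses (positive = loss)."""
    rng = random.Random(seed)
    return [-portfolio_value * rng.gauss(mean_return, volatility)
            for _ in range(n_paths)]


def value_at_risk(losses: List[float], confidence: float) -> float:
    """Empirical quantile of the merged loss sample."""
    ordered = sorted(losses)
    index = min(int(confidence * len(ordered)), len(ordered) - 1)
    return ordered[index]


# Fan-out: five batches that could each run as a separate job.
batches = [simulate_losses(seed, 20_000, 1_000_000.0, 0.0, 0.01)
           for seed in range(5)]
# Fan-in: merge all batches and take the 99% loss quantile.
all_losses = [loss for batch in batches for loss in batch]
mc_var = value_at_risk(all_losses, 0.99)
print(f"1-day 99% Monte Carlo VaR: {mc_var:,.0f}")
```

Because each batch depends only on its own seed, the batches can be distributed across compute nodes without coordination; only the final quantile needs the merged sample.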

  33. VaR Monte Carlo Simulation Evaluation • Single Financial Instrument: MSC Speedup • Option-free Financial Portfolio: MSC Speedup

  34. Credit Risk • The risk that a reference entity or an obligor fails to meet its repayments in due time. • Repayment: principal and debts • The credit risk = firm default risk • There are two main determinants of credit risk: • Loss Given Default (LGD). • (Distance to Default) or Probability of Default (PD), that is, the probability that the debtor does not pay. • accounting-based models • market-based models (Moody’s KMV)

  35. Moody’s KMV-Merton

  36. Moody’s KMV-Merton Cont.

  37. Moody’s KMV-Merton Cont.

  38. Moody’s KMV • [Figure: distribution of asset value at horizon. Labels: possible asset value path; asset volatility; asset value; Distance-to-Default; default point; EDF. Time axis: today to 1 year; vertical axis: value]
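
The quantities labelled in this figure (asset value, asset volatility, default point, distance-to-default and a one-year horizon) are related by the textbook Merton formulation, sketched below. This is an illustration under stated assumptions: asset value follows a lognormal process with a hypothetical drift parameter, and the distance-to-default is mapped to a probability with the normal CDF. The commercial KMV model is understood to map distance-to-default to an EDF using empirical default data instead, which is not reproduced here.

```python
import math
from statistics import NormalDist


def distance_to_default(asset_value: float, default_point: float,
                        asset_drift: float, asset_volatility: float,
                        horizon_years: float = 1.0) -> float:
    """Standard deviations between expected asset value and the default point."""
    numerator = (math.log(asset_value / default_point)
                 + (asset_drift - 0.5 * asset_volatility ** 2) * horizon_years)
    return numerator / (asset_volatility * math.sqrt(horizon_years))


def merton_default_probability(dd: float) -> float:
    """Probability that asset value ends below the default point (normal model)."""
    return NormalDist().cdf(-dd)


# Made-up firm: assets 100, default point 60, 5% drift, 25% asset volatility.
dd = distance_to_default(100.0, 60.0, 0.05, 0.25)
pd_one_year = merton_default_probability(dd)
print(f"distance to default = {dd:.2f}, one-year PD = {pd_one_year:.4f}")
```

A higher asset volatility or a default point closer to the current asset value both shrink the distance-to-default and raise the probability of default.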

  39. Some results of company PD

  40. The Bridge • [Diagram labels: service-based Financial Grids; complex financial products and markets; compute resources; risk-balanced portfolio; Computational Economics; risk analysis; provide; construct; develop possible formulation]

  41. The Bridge • Grid based financial risk analysis applications (Financial Grids): • Great demands on available resources; • Assume availability at any given time. • Aim: • Ability to predict (risks of resource availability for) the predictability (risks on historical use portfolio). • Major impetus for the work: • Uncertainty: availability of computation resources • Predicting future resource availability: computation resource monitoring

  42. Building probability of failure • Closest work: Kerstin et al.: risk-aware Grid architecture. • Kerstin, V., Karim, D., Iain, G. and James, P., “AssessGrid, Economic Issues Underlying Risk Awareness in Grids”, LNCS, Springer Berlin / Heidelberg, 2007 • Specific financial analysis for creating a computation economy over queuing-based systems. • Computation economy as a commodity market; • Due considerations: • 1. For trading and hedging of risk: options, futures and structured products. • 2. Collecting data: historical computation resource use -> predict future resource use for such a class of applications. • 3. Construction of portfolios of computer resources (extension of financial models (CDOs) offers potential for a future market in computation economics). • Diversify the risk (resource probability of failure) within the overall portfolio.
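
One simple way to see how a portfolio of resources diversifies the probability of failure is to consider a job replicated across several resources, where the job is lost only if every replica fails. The sketch below assumes that resource failures are statistically independent, which is a strong simplification (correlated outages would weaken the effect) and is not a formulation taken from the presentation; the function names are hypothetical.

```python
import math
from typing import Sequence


def probability_all_fail(failure_probabilities: Sequence[float]) -> float:
    """Probability that every resource fails, assuming independent failures."""
    for p in failure_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return math.prod(failure_probabilities)


def probability_job_completes(failure_probabilities: Sequence[float]) -> float:
    """Probability that at least one replica of the job survives."""
    return 1.0 - probability_all_fail(failure_probabilities)


# Made-up probabilities of failure for three resources.
portfolio = [0.10, 0.20, 0.05]
print(f"single resource PoF: {portfolio[0]:.3f}")
print(f"portfolio PoF:       {probability_all_fail(portfolio):.4f}")
print(f"completion prob.:    {probability_job_completes(portfolio):.4f}")
```

Under the independence assumption, adding a resource can only lower the portfolio probability of failure; the price is the cost of running the extra replicas.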

  43. Predict Future Resource Availability • Grid resource historical usage analysis: • Data source: UK’s National Grid Service (NGS) • Monitoring system: Ganglia • Grid middleware: Globus • Data dimensions: 37 system metrics in XML, including use of network bandwidth, temperature and CPU use • Minimum capture interval: 15 seconds • Measurements: • Distribution analysis • Skewness, Kurtosis analysis • Prediction: • Simulation under Normal distribution assumption • Simulation under Laplace distribution assumption • [Figures: CPU usage (real time, year data); CPU usage (changes, year data); CPU usage (changes, MC simulated, normal)]
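
The measurement and prediction steps listed above can be sketched with standard-library Python: sample skewness and excess kurtosis describe the shape of the observed changes, and Monte Carlo draws under a Normal or a Laplace assumption produce simulated changes with the same standard deviation. The sketch uses synthetic numbers rather than NGS data, and the function names, seeds and sample sizes are assumptions for illustration.

```python
import math
import random
from typing import List


def _central_moment(xs: List[float], order: int) -> float:
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** order for x in xs) / len(xs)


def skewness(xs: List[float]) -> float:
    """Sample skewness: third central moment over variance to the power 1.5."""
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs: List[float]) -> float:
    """Sample excess kurtosis: zero for a normal distribution."""
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3.0


def simulate_changes(n: int, std: float, distribution: str, seed: int) -> List[float]:
    """Draw n zero-mean changes with the given standard deviation."""
    rng = random.Random(seed)
    if distribution == "normal":
        return [rng.gauss(0.0, std) for _ in range(n)]
    if distribution == "laplace":
        # Laplace(0, b) has variance 2*b**2; it is the difference of two
        # independent exponential draws with mean b.
        scale = std / math.sqrt(2.0)
        return [rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
                for _ in range(n)]
    raise ValueError(f"unknown distribution: {distribution}")


normal_changes = simulate_changes(50_000, 5.0, "normal", seed=1)
laplace_changes = simulate_changes(50_000, 5.0, "laplace", seed=2)
for name, sample in (("normal", normal_changes), ("laplace", laplace_changes)):
    print(f"{name:8s} skewness={skewness(sample):+.3f} "
          f"excess kurtosis={excess_kurtosis(sample):+.3f}")
```

The Laplace sample shows markedly higher excess kurtosis than the normal one, so comparing the kurtosis of observed changes against both is one way to judge which assumption fits the heavier tails better.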

  44. Building job-specific charges • Price Comparison Service: Ami: computation resource price benchmark. • Amazon Web Service: successful Cloud business model; computation resource cost in a real market.

  45. Price benchmark • Some REAL Reliability: Of 64 instances in 10 experiments, only 7 completed (1 failing node in other 3)

  46. Building probability of completion • Foster’s Hypothesis

  47. 234s 106s 76s

  48. Performance • Is a Cloud better than a Supercomputer? • Grid/HPC: shorter application runtime and less distributions • Cloud: longer application runtime and larger distributions • ready and relatively easy to use.

  49. Managing multiple SLAs • Future commercialized computational market: multiple providers (SLAs) • Collateralized Debt Obligations (CDOs) • Structured transaction • Generic CDO: • Special Purpose Vehicle (SPV) • Underlying assets • Collateral Management • Tranche Management • Risk-identified chunks: tranches (in the order in which they are secured to be paid, e.g. AAA; AA; BBB; BB and equity) • Premium: basis points for each tranche • [Figures: Financial CDO; CDO components]
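
The ordering of tranches determines who absorbs losses first: in a generic CDO, losses on the underlying assets are charged to the most junior tranche until it is exhausted, and only then to the next one up. The sketch below shows that waterfall with made-up tranche sizes; the function name is hypothetical, and how such tranches would map onto pooled SLAs in a computational market is left open here.

```python
from typing import Dict, List, Tuple


def allocate_losses(tranches: List[Tuple[str, float]], loss: float) -> Dict[str, float]:
    """Allocate a pool loss to tranches listed from most junior to most senior."""
    remaining = max(loss, 0.0)
    allocation: Dict[str, float] = {}
    for name, size in tranches:
        absorbed = min(size, remaining)  # a tranche cannot lose more than its size
        allocation[name] = absorbed
        remaining -= absorbed
    return allocation


# Tranche sizes in arbitrary units (made up), most junior first.
tranches = [("equity", 10.0), ("BB", 10.0), ("BBB", 20.0), ("AA", 20.0), ("AAA", 40.0)]
losses = allocate_losses(tranches, 25.0)
for name, absorbed in losses.items():
    print(f"{name:6s} absorbs {absorbed:5.1f}")
```

In this example a pool loss of 25 wipes out the equity and BB tranches and takes 5 from BBB, while AA and AAA are untouched, which is why senior tranches carry lower premiums.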
