
Forecasting with Cyber-physical Interactions in Data Centers

Forecasting with Cyber-physical Interactions in Data Centers. Lei Li (leili@cs.cmu.edu). Outline: overview of time series mining; time series examples; what problems do we solve; motivation; experimental setup; ThermoCast, the forecasting model; results; other time series models and algorithms.


Presentation Transcript


  1. Forecasting with Cyber-physical Interactions in Data Centers • Lei Li (leili@cs.cmu.edu) • PDL Seminar

  2. Outline • Overview of time series mining • Time series examples • What problems do we solve • Motivation • Experimental setup • ThermoCast: the forecasting model • Results • Other time series models and algorithms (c) Lei Li 2012

  3. What are co-evolving time series? Correlated multidimensional time sequences with joint temporal dynamics

  4. Motion Capture [Li et al 2008a] • Goal: generate natural human motion • Game ($57B) • Movie industry • Challenges: missing values; “naturalness” [Figure: walking motion, right hand and left hand]

  5. Environmental Monitoring • Problem: early detection of leakage & pollution • Challenge: noise & large data [Figure: chlorine level in drinking water systems] [Li et al 2009]

  6. Network Security • Challenge: anomaly detection in computer networks & online activity [Figures: number of BGP updates on a backbone, from http://datapository.net/; web clicks for TV and web clicks for news, from NTT]

  7. Time Series Mining Problems • Forecasting • Imputation (missing values) • Compression • Segmentation, change/anomaly detection • Clustering • Similarity queries • Scalable/Parallel/Distributed algorithms. See my thesis for algorithms covering these problems.

  8. Outline • Overview of time series mining • Time series examples • What problems do we solve • Motivation • Experimental setup • ThermoCast: the forecasting model • Results • Other time series models and algorithms

  9. Datacenter Monitoring & Management • Goal: save energy in data centers • US alone: $7.4B power consumption (2011) • Challenges: huge data (1TB per day); complex cyber-physical systems [Figure: temperature in a data center]

  10. Typical Data Center Energy Consumption • Google data center [Barroso 09] • LBL data center [LBNL/PUB-945]

  11. Towards Thermal Aware DC Management • Data centers are often over-provisioned, with ≈40% of energy spent on cooling (total = $7.4B) • How can we improve energy efficiency in modern multi-megawatt data centers? [Figure: JHU data center with Genomote]

  12. Air cycle in DC

  13. Possible Ways for Saving Cooling and Computing Cost • Challenges: airflow interaction, spatial placement, SLA, … • Possible direction: shut down unused machines according to workload [Figure: example MSN workload]

  14. Towards Data Driven AC control and server management • Reactive energy saving: slow down the cooling fan in the CRAC; raise AC temperature set points • Proactive data center management: predict the temperature distribution and place workload in a thermal-aware way • Constraints: supply air temperature < threshold; max(active inlet air temperature) < threshold

  15. Big Picture: Predictive AC Control and Server Management [Diagram components: server/workload management; computing energy model; sensor measuring; temperature prediction; cooling energy model; CRAC control]

  16. Outline • Overview of time series mining • Time series examples • What problems do we solve • Motivation • Experimental setup • ThermoCast: the forecasting model • Results • Other time series models and algorithms

  17. Experimental setup • Tested in the JHU data center with 171 1U servers, instrumented with a network of 80 sensors

  18. Sample measurements

  19. Observations • The temperature difference cycle (max vs. min temp. on the same rack) is in anti-phase with the air velocity cycle • The middle and bottom sections are coldest; the top is hottest • Shutting down under-utilized servers could reduce energy consumption

  20. What happens when shutting down servers? [Figure annotation: shut down]

  21. Outline • Overview of time series mining • Time series examples • What problems do we solve • Motivation • Experimental setup • ThermoCast: the forecasting model • Results • Other time series models and algorithms

  22. ThermoCast [Li et al, KDD 2011] • Given: intake temperatures, outtake temperatures, workload for each server, and floor air speed • Goal: forecast the temperature distribution and place workload in a thermal-aware way • Approach: a zonal forecasting model that divides the machine room into zones, and each rack into sections

  23. Assumptions • A0: incompressible air • A1: environmental temperature is constant • A2: supply air temperature is constant within a period • A3: constant server fan speed • A4: vertical air flow at the outtake is negligible • A5: vertical air flow at the intake is linear in height

  24. Sensor measurements & Air interactions

  25. ThermoCast

  26. ThermoCast Model [Equation labels: outlet temp, inlet temp, floor air speed] Derived from fluid dynamics and thermodynamics together with the assumptions [Li et al, KDD 2011]

  27. Parameter Learning [Equation: constrained optimization ("s.t."); formula not preserved in the transcript]

  28. Outline • Overview of time series mining • Time series examples • What problems do we solve • Motivation • Experimental setup • ThermoCast: the forecasting model • Results • Other time series models and algorithms

  29. ThermoCast Results • Q1: How accurately can a server learn its local thermal dynamics for prediction? • 2x better, using 90 minutes of data for training and predicting 5 minutes ahead [Figure: AR vs. ThermoCast; labels: 75%, 100%, shutdown]
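
The AR method that appears next to ThermoCast in the Q1 figure can be illustrated with a minimal sketch. This is not the configuration used in the talk: it assumes a first-order autoregressive model with an intercept, fit by least squares, one reading per minute (so 90 training points and a 5-step forecast), and a synthetic inlet temperature series. The names fit_ar1 and forecast_ar1 are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = intercept + slope * x[t-1]."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    slope = cov / var
    intercept = mean_curr - slope * mean_prev
    return intercept, slope


def forecast_ar1(last_value, intercept, slope, steps):
    """Iterate the fitted recurrence `steps` readings into the future."""
    forecasts = []
    x = last_value
    for _ in range(steps):
        x = intercept + slope * x
        forecasts.append(x)
    return forecasts


# Synthetic data (assumption): inlet temperature relaxing toward 20 C,
# sampled once per minute for 90 minutes.
train = [20.0 + 5.0 * 0.9 ** t for t in range(90)]
intercept, slope = fit_ar1(train)
pred = forecast_ar1(train[-1], intercept, slope, steps=5)  # 5 minutes ahead
```

A purely autoregressive model like this sees only the past of one temperature series; ThermoCast, as described in the talk, additionally uses workload and floor air speed together with a physical model.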

  30. ThermoCast Results • Q2: How long ahead can ThermoCast forecast thermal alarms? • 2x faster • FAR = false alarm rate; MAT = mean look-ahead time
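
The slide names the two metrics without defining them, so the following is only a sketch under assumed definitions: FAR is taken as the fraction of raised alarms that were not followed by an actual threshold crossing, and MAT as the average time between raising a correct alarm and the crossing. The function name alarm_metrics and the input layout are hypothetical.

```python
def alarm_metrics(alarms):
    """Compute (FAR, MAT) from a list of (raised_at, crossed_at) pairs.

    Times are in minutes. crossed_at is None when no threshold crossing
    followed the alarm (assumed to count as a false alarm).
    """
    if not alarms:
        raise ValueError("no alarms to evaluate")
    lead_times = [crossed - raised
                  for raised, crossed in alarms
                  if crossed is not None]
    far = (len(alarms) - len(lead_times)) / len(alarms)
    mat = sum(lead_times) / len(lead_times) if lead_times else None
    return far, mat


# Four alarms: three were followed by a crossing, one was not.
far, mat = alarm_metrics([(0, 5), (10, 13), (20, None), (30, 34)])
```

Under these assumed definitions a forecaster is better when FAR is lower and MAT is higher at the same time.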

  31. Implication on Capacity Gain • Preliminary results comparing workload placement strategies: • 5 minutes forecast length • With the same cooling: • Inlet temp with ThermoCast: 13.75 °C • Inlet temp with static profiling: 16.5 °C • Assuming the servers consume 200W on average (Dell PowerEdge 1950), we gain an extra 26% computing power with the same cooling

  32. Contributions and Impact • Predictability: a hybrid approach that integrates thermodynamics and sensor data • Scalable learning/training thanks to the zonal thermal model • Real data and instrumentation in a data center with practical workload • Projected impact: can handle an extra 26% workload (e.g. PUE 1.5 → PUE 1.4)
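
The PUE figures can be checked with a short calculation. PUE is total facility power divided by IT power. The sketch assumes that the non-IT overhead (cooling and the rest) stays fixed while the IT load grows by 26%, which is one reading of "with the same cooling"; the function name pue_after_gain is hypothetical.

```python
def pue_after_gain(pue_before, it_gain):
    """PUE after IT load grows by `it_gain` with non-IT overhead unchanged."""
    overhead = pue_before - 1.0   # non-IT power per unit of original IT power
    it_after = 1.0 + it_gain      # IT power after the gain, same units
    return (it_after + overhead) / it_after


new_pue = pue_after_gain(1.5, 0.26)   # (1.26 + 0.5) / 1.26
print(round(new_pue, 2))
```

Under that assumption, starting from PUE 1.5 the result rounds to 1.4, which matches the example on the slide.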

  33. Outline • Overview of time series mining • Time series examples • What problems do we solve • Motivation • Experimental setup • ThermoCast: the forecasting model • Results • Other time series models and algorithms

  34. DynaMMo: imputation/forecasting • Goal: recover the missing values • Details in [Li et al, KDD 2009] [Figure: sensor 1, sensor 2, …, sensor m over time, with a blackout interval]

  35. DynaMMo result • Dataset: CMU Mocap #16 (mocap.cs.cmu.edu) • More results in [Li et al, KDD 2009] [Figure: reconstruction error vs. average missing length; methods: Spline, MSVD [Srebro’03], Linear Interpolation, our DynaMMo, Ideal; annotations: better, harder]
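
Linear interpolation, one of the baselines in this comparison, can be sketched for a single sequence as follows. This is the baseline only, not DynaMMo. Missing readings are represented as None, and gaps at either end are filled by copying the nearest observation; both choices, and the function name linear_interpolate, are assumptions made for illustration.

```python
def linear_interpolate(values):
    """Fill None entries by linear interpolation between observed neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("need at least one observed value")
    # Gaps before the first or after the last observation: copy the
    # nearest observed value (assumption; nothing to interpolate between).
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    # Interior gaps: weight the two surrounding observations by distance.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            w = (i - left) / span
            filled[i] = (1 - w) * filled[left] + w * filled[right]
    return filled


recovered = linear_interpolate([1.0, None, None, 4.0, None])
```

Because each sequence is filled independently, this baseline ignores the correlation between co-evolving sequences.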

  36. PLiF and CLDS for clustering • BGP data: hierarchical clustering + PLiF features • Details in [Li et al, VLDB 2010] and [Li & Prakash, ICML 2011]

  37. CLDS Clustering Mocap Data • CLDS two features: accuracy = 93.9% • PCA top 2 components: accuracy = 51.0% [Figure legend: walking motion, running motion]

  38. WindMine • Goal: find patterns and anomalies from user-click streams

  39. Discoveries by WindMine [Figure panels: job website, weather, kids, health]

  40. Conclusion • Time series mining has many applications • Data centers consume a large amount of energy, and cooling accounts for much of the cost • Sensor networks find use in data center monitoring • ThermoCast: the forecasting model • Other time series models and algorithms: • DynaMMo for imputation • PLiF & CLDS for clustering • WindMine for web clicks

  41. References • Lei Li, et al. ThermoCast: A Cyber-Physical Forecasting Model for Data Centers. KDD 2011. • Lei Li, et al. Time Series Clustering: Complex is Simpler. ICML 2011. • Yasushi Sakurai, Lei Li, et al. WindMine: Fast and Effective Mining of Web-click Sequences. SDM 2011. • Lei Li, et al. Parsimonious Linear Fingerprinting for Time Series. VLDB 2010. • Lei Li, et al. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. ACM KDD 2009.

  42. Thanks! Contact: Lei Li (leili@cs.cmu.edu). Papers, software, and datasets: http://www.cs.cmu.edu/~leili
