Welcome to Winter 2010 RAD Lab Retreat

Armando Fox


Welcome

  • Introductions

  • Progress in last 6 months

  • Preview of project-end demo (Jan. 2011)

  • Preview of retreat demos

  • Breakout topics (at dinner)

  • Retreat logistics


RAD Lab 5-year Mission (unchanged since 2006, except blue text)

Enable 1 person to develop, deploy, operate next-generation Internet application

  • Key enabling technology: Statistical machine learning

    • debugging, monitoring, power management, auto-configuration, performance prediction, ...

  • Highly interdisciplinary faculty & students

    • PIs: Patterson/Fox/Katz (systems/networks), Jordan (machine learning), Stoica (networks & P2P), Joseph (security), Shenker (networks), Franklin (DB)

    • 2 postdocs, ~30 PhD students, ~6 undergrads

  • Teaching integrated with research

    • Grad project courses: cloud computing; SaaS

    • Lower division ugrad course: intro to Web 2.0 app development

    • Upper division ugrad course: SaaS development & operations


RAD Lab Support

?


RAD Lab Prototype v2.0

[Architecture diagram. Labels: Web 2.0 apps; Ruby on Rails environment; web svc APIs; WebApp; Drivers; PIQL (“I” in PIQL); SCADS; Dir; Director; NS1, NS2, NS3; NEXUS; SPARK, SEJITS; Hadoop + HDFS; MPI; KCCA-based M/R scheduling; Log Mining; Automatic Workload Evaluation (AWE); performance & cost models; training data; Chukwa & XTrace (monitoring); Chukwa trace coll.; VM monitor; local OS functions; SLAs, policies; offered load, resource utilization, etc.; new apps, equipment, global policies (e.g., SLA).]


RAD Lab Prototype: System Architecture

[System architecture diagram. Labels: WebApp; PIQL (“I” in PIQL); Xtrace + Chukwa (monitoring); Dir; SCADS; NS1, NS2, NS3; SLA, policies; Batch/Analytics; NEXUS; SPARK, SEJITS; log mining; Hadoop + HDFS; MPI; KCCA-based M/R scheduling.]


Impact of Above the Clouds

  • > 30K 54K downloads; ~6K IBM, MS, Cisco

    • “Circulated to CxOs” of major IT firms

    • IBM: “profound effect” on datacenter strategy

    • Short version to appear in March 2010 CACM

  • Cited by >70 papers, including MS, NIST

  • 20K+ visits to blog (50% USA, ~5% each Japan/UK/India), 700+ RSS followers

    • Ongoing dialogue with readers (~1 post/mo.)

    • Linked from >60 blogs/feeds

  • Many requests for permission to reprint, translate, include in books


Invited appearances/talks

  • Internal conferences: Fujitsu, SAP, Google, Univ. of California CC Task Force

  • Conference appearances

    • CC panels at ISCA 2009, VMware Acad. Summit @ SOSP 2009

    • Invited talks: LISA 2009, IEEE SASO (Self-adaptive, Self-organizing Systems)

    • High Performance Computing & Infrastructure

    • World Economic Forum CC panel

    • “Energy efficiency & CC” @ MS Faculty Summit

  • Nodalities podcast series


What’s New


What’s New: SCADS

  • PIQL: Performance Introspective Query Language for SCADS

    • Enforce performance safety

    • Generate query plans from primitives (get, get_range, put) and indices

  • Characterizing & synthesizing workload spikes and data hotspots

  • Director controlling data movement, replication during changing workload
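
The idea of building a query from the three primitives plus an index, and rejecting it up front if its worst-case cost is too high, can be sketched roughly as below. This is only an illustration under stated assumptions, not PIQL or SCADS code: the `KVStore` class, the `post_thought` and `thoughts_by_user` functions, the tuple key layout, and the `MAX_OPS` budget are all hypothetical names invented for the example.

```python
import bisect


class KVStore:
    """Toy in-memory ordered key/value store (hypothetical stand-in for SCADS)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order for range scans
        self._data = {}   # key -> value
        self.ops = 0      # count of primitive operations issued

    def put(self, key, value):
        self.ops += 1
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        self.ops += 1
        return self._data.get(key)

    def get_range(self, start, end, limit):
        """Return up to `limit` (key, value) pairs with start <= key < end."""
        self.ops += 1
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._data[k]) for k in self._keys[lo:hi][:limit]]


MAX_OPS = 12  # assumed per-query budget of primitive operations


def post_thought(store, user, ts, text):
    """Write a record and maintain a per-user index entry pointing at it."""
    record_key = ("thought", user, ts)
    store.put(record_key, text)
    store.put(("idx_user", user, ts), record_key)


def thoughts_by_user(store, user, limit):
    """Plan: one get_range over the index, then at most `limit` gets."""
    worst_case_ops = 1 + limit
    if worst_case_ops > MAX_OPS:
        # Reject before touching the store: the bound is checked on the plan.
        raise ValueError("plan needs up to %d ops, budget is %d"
                         % (worst_case_ops, MAX_OPS))
    start = ("idx_user", user)
    end = ("idx_user", user, float("inf"))
    index_entries = store.get_range(start, end, limit)
    return [store.get(record_key) for _, record_key in index_entries]


if __name__ == "__main__":
    store = KVStore()
    post_thought(store, "alice", 1, "hello")
    post_thought(store, "alice", 2, "world")
    print(thoughts_by_user(store, "alice", 2))
```

The point of the sketch is that the cost check depends only on the shape of the plan and the requested `limit`, not on how much data is stored, which is one plausible reading of "performance safety" on the slide.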


What’s New: Workload analysis & generation

  • Chukwa (log collection) integrated with all pieces

  • Online console log mining to find operational problems

    • How can we improve console logs?

  • KCCA-driven scheduling for MapReduce analytics

  • Introspecting the performance of SCADS (PIQL) queries


What’s New: Infrastructure

  • Nexus, a substrate for cloud computing

    • Simultaneously share/schedule resources across interactive and batch

    • Spark, a Scala-based library for machine learning on cloud computing

  • Datacenter-in-a-box using RAMP

    • Very close to emulating 10,000-server/1,000-switch system running real SW

    • ½ rack FPGA boards (1 board ~ 1 container)

    • slowdown ~2 orders of magnitude


What’s New: Labs/Projects

  • Tonight: AMP Lab: Algorithms, Machines, People (Mike Franklin)

    • current analytics scale poorly despite cloud computing & advances in SML

    • new ways to gather info: crowds/social networks, simulators, etc.

    • combine RAD Lab expertise in DB, SML, Cloud to create “cyberspace exploratorium” for large scale analytics

  • Tomorrow: Berkeley Wireless Research Center: large-scale-app driven wireless & chips (Jan Rabaey)

  • Wednesday: LoCal: computer scientists look at energy (Randy Katz)


What’s new: Publications

  • 23 new publications

  • 15 with Affiliate co-authors

  • ~10 in first-tier conferences/journals in systems & machine learning

  • All available from RAD Lab website

  • Details in Backup Slides


What’s new: Students

  • Dr. Archana Ganapathi (PhinisheD, on market)

  • Dr. Arsalan Tavakoli (at McKinsey & Co.)

  • Dr. Dilip Anthony Joseph (at Conviva)

  • New undergrads to help write apps!

    • Allen Chen

    • Amber Feng

    • Karl He

    • Sunil Pedapudi

    • Marcelo Velloso


Engagement with Affiliates


Presentations & outreach

  • RAD Lab PIs at affiliate outreach events

    • Google Faculty Summit (Fox, Katz, Stoica, Jordan, Franklin)

    • Microsoft Faculty Summit (Patterson)

    • Microsoft ALT-TAB (Patterson)

    • Sun TAB (Patterson)

    • VMware GoVirtual spotlight (Fox)

  • RAD Lab faculty invited presentations

    • Fujitsu America: Cloud computing (Fox)

    • OpenCirrus summit: Cloud futures (Fox)

    • Sun Labs: LoCal (Katz)

    • VMware Academic Summit panel at SOSP 2009: Cloud computing & virtualization (Fox)

    • LISA 2009: Cloud computing (Fox)


Students working with industry collaborators

  • Student research visits to affiliates

    • Nexus: Andy Konwinski, Ben Hindman, Matei Zaharia, Ion Stoica, Scott Shenker (Cloudera, Yahoo!, Facebook)

    • Console log mining: Wei Xu (HP Labs, Google)

    • Microsoft site visit (many students & PIs, ~20 MSR)

    • Google onsite review (~10 Googlers)

  • Students interning/collaborating with affiliates

    • Gunho Lee (HP Labs, with Partha Ranganathan)

    • Wei Xu (Google; applying console log mining techniques to in-house data)

    • John Duchi (Google; 4 publications with Yoram Singer et al. on online learning & large scale optimization)

    • Ganesh Ananthanarayanan (MSR): improving performance of MapReduce/Dryad jobs


Affiliates engagement

  • Research visits to RAD Lab by Affiliates

    • Alice Zheng, Dushyanth Narayanan (MSR)

    • Jesus Molina (Fujitsu America)

    • Devendra Jaisinghani (eBay)

    • Yoram Singer (Google)

    • Frank Steinhans (SAP)

    • Greg Papadopoulos & Dave Douglas (Sun)

  • Research visits to Affiliates by RAD Lab PIs

    • Amazon: James Hamilton (Shenker)

    • HP Labs: Prith Banerjee (Katz, Fox)

    • MSR (Stoica, Shenker)

    • Yahoo!: Eric14, Owen O’Malley, Surendra Reddy (Shenker)


Demo Plans


Final demo (January 2011)

Enable 1 person to develop, deploy, operate next-generation Internet application at scale

  • 3 SCADS-backed Web apps written by undergrads

  • 3 analytics jobs using Spark/SEJITS

  • Running on >=1000 cloud computing nodes

  • Managed by Nexus

  • Director scales SCADS storage up/down & replicates

    • One or more workload spikes/data hotspots

    • while underlying hardware fails and software crashes

  • App-driven decisions: relaxed consistency, log mining


Demoable now

Enable 1 person to develop, deploy, operate next-generation Internet application at scale

  • 1 SCADS-backed Web app (SCADr) written by grad

  • 3 analytics jobs using Spark/SEJITS

  • Running on >=1000 cloud computing nodes

  • Not Managed by Nexus

  • Director scales SCADS storage up/down & replicates

    • One or more workload spikes/data hotspots

    • while underlying hardware fails and software crashes

  • App-driven decisions: relaxed consistency, log mining


Breakout Topics


Breakout topics & leaders

  • AMP Lab plans (Algorithms, Machines & People) [Mike Franklin, Mike Jordan]

  • Datacenter storage: RAM, flash, disk? [Dave Patterson]

  • SCADS—what’s next? [Michael Armbrust, Beth Trushkowsky]

  • Cloud Programming Beyond MapReduce [Matei Zaharia, Armando Fox]

  • What are logs good for? [Wei Xu]

  • Workload spike modeling [Peter Bodik]

  • What’s (technically) new about cloud security? [Anthony Joseph, Yanpei Chen]

  • Datacenter energy efficiency: “race to sleep”? [Randy Katz]


Logistics


Logistics

  • Wifi: local access only during sessions

  • Check-in; what’s covered

  • Next break: get keys from Kattt or Sean (NOT check-in desk)

  • Skiing tomorrow

    • Transportation, lift tickets on us

    • Rentals, lessons at your own expense

    • Show up to morning sessions in ski wear

    • Bag lunches will be available as you leave


BACKUP SLIDES including publication details


Progress: Publications

  • Efficient Online and Batch Learning with Forward Backward Splitting. John Duchi and Yoram Singer (Google). Journal of Machine Learning Research (JMLR), vol. 11, 2010.

  • Online and Batch Learning with Forward Backward Splitting. John Duchi and Yoram Singer. Neural Information Processing Systems (NIPS) 2009. Oral presentation.

  • Boosting with Structural Sparsity. John Duchi and Yoram Singer. International Conference on Machine Learning (ICML) 2009.

  • Understanding TCP Incast Throughput Collapse in Datacenter Networks. Yanpei Chen, Rean Griffith et al. Proceedings of the 1st ACM Workshop on Research on Enterprise Networking (WREN 2009). August 2009.


Progress: publications (2)

  • Statistics-Driven Workload Modeling for the Cloud. Archana Ganapathi, Yanpei Chen et al. Accepted to Workshop on Self-Managing Database Systems (SMDB) 2010.

  • The nested Chinese restaurant process and Bayesian inference of topic hierarchies. Blei, D., Griffiths, T., and Jordan, M. I. Journal of the ACM. (to appear).

  • Estimating divergence functionals and the likelihood ratio by convex risk minimization. Nguyen, X., Wainwright, M., and Jordan, M. I. IEEE Transactions on Information Theory. (to appear).

  • Joint covariate selection and joint subspace selection for multiple classification problems. Obozinski, G., Taskar, B. and Jordan, M. I. Statistics and Computing. (to appear).

  • Support union recovery in high-dimensional multivariate regression. Obozinski, G., Wainwright, M. and Jordan, M. I. Annals of Statistics. (to appear).

  • Nonparametric latent feature models for link prediction. Miller, K., Griffiths, T., and Jordan, M. I. Advances in Neural Information Processing Systems (NIPS) 22. (2010).


Publications (3)

  • Fast approximate spectral clustering. Yan, D., Huang, L. (Intel), and Jordan, M. I. 15th ACM Conference on Knowledge Discovery and Data Mining (SIGKDD), Paris, France. (2009).

  • On surrogate loss functions and f-divergences. Nguyen, X., Wainwright, M., and Jordan, M. I. Annals of Statistics, 37, 876-904. (2009).

  • Kernel dimension reduction in regression. Fukumizu, K., Bach, F. R., and Jordan, M. I. Annals of Statistics, 37, 1871-1905. (2009).

  • Hierarchical Bayesian nonparametric models with applications. Teh, Y. W. and Jordan, M. I. In Bayesian Nonparametrics: Principles and Practice, Cambridge, UK: Cambridge University Press. (2009).

  • Large-Scale System Problem Detection by Mining Console Logs. W. Xu, L. Huang, A. Fox, D. Patterson, M. Jordan. Proc. SOSP 2009.

  • Output-Deterministic Replay for Multiprocessor Programs. G. Altekar, I. Stoica. Proc. SOSP 2009.

  • Online Problem Detection by Mining Console Logs. W. Xu, L. Huang, A. Fox, D. Patterson, M. Jordan. Proc. ICDM 2009.


Publications (4)

  • Fingerprinting the Datacenter: Automated Classification of Performance Crises. P. Bodik, M. Goldszmidt (MSR), A. Fox, H. Andersen (Microsoft), Dawn Woodard. Proc. EuroSys 2010 (to appear).

  • A Common Substrate for Cluster Computing. B. Hindman, A. Konwinski, M. Zaharia and I. Stoica. HotCloud 2009, June 2009.

  • Macroscope: End-Point Approach to Networked Application Dependency Discovery. Lucian Popa, Byung-Gon Chun (Intel), Ion Stoica, Jaideep Chandrashekar (Intel), Nina Taft (Intel). Proc. 5th ACM International Conference on emerging Networking EXperiments and Technologies (CoNEXT 2009), December 2009.

  • Rule-based Forwarding (RBF): improving the Internet’s flexibility and security. Lucian Popa, Ion Stoica, Sylvia Ratnasamy (Intel). Proc. Eighth ACM Workshop on Hot Topics in Networks (HotNets 2009), October 2009.

  • DryadInc: Reusing work in large-scale computations. Lucian Popa, Mihai Budiu (MSR), Yuan Yu (MSR), Michael Isard (MSR). Proc. first USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 2009), June 2009.

  • An Energy Case for Hybrid Datacenters. Byung-Gon Chun (Intel), Gianluca Iannaccone (Intel), Giuseppe Iannaccone, Randy Katz, Gunho Lee, Luca Niccolini. HotPower09, Oct 2009


[Software stack diagram. Labels: Web 2.0 apps; Analytics; Other cloud apps; SCADS; RoR + PIQL; Hadoop; SEJITS; Spark; Chukwa / Nexus; Cloud APIs (Eucalyptus).]

