Who Will Write Those Industry Production Papers?

Who Will Write Those Industry Production Papers? Indranil (Indy) Gupta Professor, CS @ UIUC http://dprg.cs.uiuc.edu/ :: http://indy.cs.illinois.edu Cloud Research, June 2019 Sandhamn island, Sweden

An unconventional talk (for me, and for this workshop) • Based on experiences and many conversations • Aimed at an academic audience (but industry colleagues might identify)

Production Systems • Not “Research” Systems • Core underlying systems used for managing clusters and clouds • Systems that actually run inside a company’s clusters, public clouds, etc., on a daily basis, and have real users, indirect or direct. • E.g.: You use Gmail, Google Cloud, Facebook. Think of the entire cloud/cluster systems stack running underneath these. • E.g.: Engineers inside these companies use various cloud/cluster systems internally.

Production System Papers Published by Companies • Google [https://ai.google/research/pubs/] • Early Days (early 2000s): MapReduce, GFS, Chubby, … • Mid-Days (late 2000s): Spanner, … • Recently: TensorFlow, Studies, … • Facebook [https://research.fb.com/publications/] • Early Days (early 2010s): Haystack, Hive, … • Mid-Days (2010s): TAO, Wormhole, … • Recently: Presto, PyTorch, Studies, … • Similar examples exist from other companies…

So What’s the Problem? • Published systems • Unpublished systems: Running / Ran / Could’ve Run / Tried To Run

Why Publish These at All? • Reduces differential between industry research and academic research • Keeps academic research alive and relevant • In the 2000s, many review comments: “Cool idea! But it’s been done (sshh… can’t tell you more details).” • Google has run the Borg cluster scheduler since the 2000s. It was published in EuroSys 2015. • Publication incites and excites new open-source systems • MapReduce (Google paper) → Hadoop (Yahoo!, HortonWorks) → Hive • Chubby (Google paper) → Zookeeper (Yahoo!, HortonWorks) • GFS (Google paper) → HDFS (Yahoo!, HortonWorks) • TensorFlow (Google paper and open-source) → Large Community • Benefits inventing company • Allows academic and open-source community to contribute to company, e.g., TensorFlow, MongoDB, … • Carves out “niche” for company’s ideas to dominate industry • Helps in recruiting employees by showcasing exciting internal culture

Why Is This Not Being Done Already? • The people who know the system best are those who built it and those who run it day to day • (Based on multiple conversations with Googlers, Facebookers, etc.) For such a person, which of the following is the least exciting task of the day? (their typical priorities) • Write code to earn a million $ for their company • Look through logs of a recent outage to understand interesting causes • Fix said outage/troubleshoot • Have lunch • Take a walk • Spend time with family and friends • Write

Nevertheless… • Companies are publishing • Some incentivize employees to publish ($$$) • Can Academia (and the broader research community) help?

Why Should the Research Community Help? • Aren’t Production Systems Boring? I.e., “only engineering, no research”? • No, many of them have inherent principles that are new, interesting, exciting, and sometimes undiscovered. • Our work with LinkedIn and Microsoft has shown this (next few slides).

1. Microsoft Service Fabric (Kakivaya, … Ahsan, … Gupta, et al., EuroSys 2018) • Paper at: http://dprg.cs.uiuc.edu/ • Open-source: https://github.com/Microsoft/service-fabric • A distributed platform that enables building and management of scalable and reliable microservice-based applications • Culmination of over 15 years of design and development (P2P days, 2000s to now) • Used by: TalkTalk TV, Microsoft Intune, Azure Cosmos DB, Skype, Cortana, Microsoft IoT Suite, BMW, and more… • Microsoft Azure SQL DB: hosts ~2 million DBs | contains 3.5 PB of data | spans over 100K machines • Azure Cosmos DB: utilizes 2 million cores | spans over 100K machines • Cloud Telemetry Engine: processes 3 trillion events/week

Microsoft Service Fabric – New Principles 1. Consistency from the Ground Up. Makes Building Consistent Applications Easier. 2. Distributed Membership Protocol with Time-Bounds. New “Arbitrators” to separate failure decision from failure detection. 3. Upper layers use consistent blocks from lower layers. 4. Data structures (for programmers) that are reliable, scalable, and distributed.

2. Ambry (LinkedIn) (Noghabi, … Gupta, et al., ACM SIGMOD 2016) • Paper at: http://dprg.cs.uiuc.edu/ • Open-source: https://github.com/linkedin/ambry • A geographically distributed system that stores and retrieves read-heavy immutable objects in an efficient and scalable manner • Stores all media objects on LinkedIn.com and apps • In production since 2016 and >500 M users.

Ambry’s Design Elements • Challenges: • Huge diversity: 10s of KBs to few GBs • Fast, durable, and highly available processing • Geo-replication with low latency • Ever-growing data and requests: data rarely gets deleted; > 800 M req/day (~120 TB); rate doubled in 12 months • Load imbalance • Design elements: • Low latency and high throughput: logical grouping of objects (partition); segmented indexing; exploiting OS caches; zero-cost failure detection; chunking and parallel read/write • Geo-distributed: asynchronous writes; proxy request; active-active design; 2-phase background replication; journaling • Load balance: chunking and random placement; seamless rebalancing mechanism with minimal data movement • Scalable: decentralized design; independent components with little interaction; separation of logical and physical placement
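To make “chunking and random placement” concrete, here is a minimal Python sketch. It is an illustration only, not Ambry's code: the function names, the in-memory partition labels, and the chunk size are all assumptions made for this example. The idea shown is that a large object is split into fixed-size chunks and each chunk is assigned to a randomly chosen partition, which spreads both storage and read/write load.

```python
import random

def chunk_blob(blob: bytes, chunk_size: int) -> list[bytes]:
    """Split a blob into chunks of at most chunk_size bytes (hypothetical helper)."""
    return [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]

def place_chunks(chunks: list[bytes], partitions: list[str],
                 rng: random.Random) -> list[tuple[str, bytes]]:
    """Assign each chunk to a uniformly random partition, preserving chunk order."""
    return [(rng.choice(partitions), chunk) for chunk in chunks]

def reassemble(placed: list[tuple[str, bytes]]) -> bytes:
    """Rebuild the original blob from the ordered (partition, chunk) list."""
    return b"".join(chunk for _, chunk in placed)

if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    placed = place_chunks(chunk_blob(b"x" * 10_000, 4_096), ["p0", "p1", "p2"], rng)
    for partition, chunk in placed:
        print(partition, len(chunk))
```

In a real system the ordered list of chunk locations would itself have to be stored durably so that a reader can fetch the chunks in parallel; the sketch keeps that list in memory.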

3. Apache Samza (Noghabi, … Gupta, et al., VLDB 2017) • Paper at: http://dprg.cs.uiuc.edu/ • Open-source: https://github.com/apache/samza • A battle-tested and scalable stream/data processing framework • Top-level Apache project since 2014 • In use at LinkedIn, Uber, Metamarkets, Netflix, Intuit, TripAdvisor, VMware, Optimizely, Redfin, etc. • Powers hundreds of apps in LinkedIn’s production

Apache Samza: Key Ideas • Scalability • Input partitioning • Parallel and independent tasks • Fast Recovery & Restart • Parallel recovery • Host Affinity • Efficient Stateful Processing • Local state • Incremental checkpointing vs. full checkpointing tradeoff • Unified Data Processing API • For stream and batch • For stream processing as a library and stream processing as a service (SPaaS)
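The “incremental checkpointing vs. full checkpointing” tradeoff can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Samza's API: the class `StatefulTask`, its methods, and the in-memory changelog list are hypothetical names invented for this example. A full checkpoint copies the whole local state every time, while an incremental checkpoint records only the keys changed since the previous checkpoint, and recovery replays those deltas in order.

```python
class StatefulTask:
    """Toy per-partition task that counts occurrences of each key in local state."""

    def __init__(self) -> None:
        self.state: dict[str, int] = {}
        self.dirty: set[str] = set()  # keys changed since the last incremental checkpoint

    def process(self, key: str) -> None:
        self.state[key] = self.state.get(key, 0) + 1
        self.dirty.add(key)

    def full_checkpoint(self) -> dict[str, int]:
        """Snapshot the entire state; cost grows with total state size."""
        return dict(self.state)

    def incremental_checkpoint(self) -> dict[str, int]:
        """Emit only changed keys; cost grows with the number of recent changes."""
        delta = {k: self.state[k] for k in self.dirty}
        self.dirty.clear()
        return delta

def restore(changelog: list[dict[str, int]]) -> dict[str, int]:
    """Rebuild state by replaying incremental checkpoints in order."""
    state: dict[str, int] = {}
    for delta in changelog:
        state.update(delta)
    return state

if __name__ == "__main__":
    task, changelog = StatefulTask(), []
    for key in ["a", "b", "a"]:
        task.process(key)
    changelog.append(task.incremental_checkpoint())
    task.process("a")
    changelog.append(task.incremental_checkpoint())
    print(restore(changelog) == task.state)
```

The tradeoff is visible in the sizes: each delta is small when few keys change, but recovery must replay every delta, whereas a full snapshot is expensive to write and cheap to restore from.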

“Our” Contributions • Contributing to source code of (a few of these) systems • Understanding these systems • Bringing together people • Evaluation framework, Traces, etc. • Evaluation • Writing

A few Rules of Thumb We Used • Internships by mid to senior PhD students, targeted to a) contributing to the system, b) understanding, and c) evaluating it. • Funding for students to continue work at university after they return from internship • Faculty visits, and Sabbaticals • Close faculty involvement throughout project and especially in writing of paper • Clear IP rules (and NDAs if needed)

Why Should the Research Community Help? Part 2 • Isn’t this “mercenary”? • No, Computer Science is as much an inventor’s playground as a natural science. • Natural science = understanding natural phenomena, e.g., Biology, Physics, Chemistry • If nature is “what surrounds us, and what we have to deal with on a daily basis”, … • The entire internet, mobile, and IoT ecosystem around us is now a part of nature (“artificial nature” if you will) • Understanding their principles is key in bettering them, and inciting borrowing of ideas into other domains • Poetry surrounds us everywhere, but putting it on paper is, alas, not so easy as looking at it. -- Vincent Van Gogh

One Way for Academics to Accomplish This • There has to be genuine interest from both sides • Talk to your friends and collaborators in industry • Figure out if they have a desire to “open up” their production systems. They may not have thought about their systems that way! • Evaluate whether there is sufficient innovation and/or new philosophy • Think of not just Running systems, but also systems that Ran/Could’ve Run/Tried To Run • If there is agreement, plan how to accomplish your goal • Be flexible in “scope” of paper • Be ready for last-minute surprises from your industry partners • If the system is not ready, perhaps there is some way you can contribute to improve it? • If your industry partner is not ready, you have planted the seed. Perhaps they will be ready in a few years • Don’t stop talking to them after the paper is published. This can be the start of a relationship • Consider sabbaticals/leaves to spend time at the company.

Students Involved • Shadi Noghabi [Ambry, Samza: LinkedIn Systems] (PhD 2018, now at Microsoft Research, Redmond) • Shegufta Ahsan [Service Fabric: Microsoft System] (PhD student, graduating next year)

Poetry surrounds us everywhere, but putting it on paper is, alas, not so easy as looking at it. -- Vincent Van Gogh
What is a scientist after all? It is a curious man looking through a keyhole, the keyhole of nature, trying to know what's going on. -- Jacques Yves Cousteau