
Benchmarking Interactive Social Networking Actions


Presentation Transcript


  1. Benchmarking Interactive Social Networking Actions
     Shahram Ghandeharizadeh, Director of Database Lab, Computer Science Department, University of Southern California

  2. Outline
     • Motivation
     • Research questions
     • Survey use cases
     • BG Benchmark
     • FORSEE
     • Future research

  3. Motivation
     • Data Stores
     • Cloud Services
     • Person-to-person cloud services

  4. Research Questions
     • What is the tradeoff between alternative data models? E.g., is JSON superior to the relational data model?
     • How do alternative architectures compare with one another? E.g., is cache augmented SQL as good as a document/extensible store?
     • Do NewSQL data stores scale as well as NoSQL data stores?

  5. Survey Use Case
     S. Barahmand and S. Ghandeharizadeh. BG: A Benchmark to Evaluate Interactive Social Networking Actions. CIDR '13, Asilomar, CA.

  6. Data Model
     Diagram labels: Members, Accounts, Resources, Pages, News Feed; relationships: Own, Friend, Follow, Share, Displays

  7. BG Architecture
     Diagram labels: Visualization Tool Emulates User Behavior Quick and Efficient Rating Service Level Agreement Scalable
     S. Barahmand and S. Ghandeharizadeh. Expedited Benchmarking of Social Networking Actions with Agile Data Loading Techniques. CIKM '13, SF, CA.
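The "Rating" and "Service Level Agreement" labels on the architecture slide can be illustrated with a minimal Python sketch. It assumes the rating is the highest observed throughput among runs that meet an SLA of the form "at least a given fraction of actions finish within a latency bound, with at most a given fraction observing unpredictable data". The `Run` type, the function names, and the default thresholds are hypothetical and are not BG's API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    """One benchmark run at a fixed offered load (hypothetical structure)."""
    throughput: float          # actions per second observed in this run
    latencies_ms: List[float]  # response time of each action, in milliseconds
    unpredictable: int         # number of actions that observed unpredictable data


def satisfies_sla(run: Run,
                  max_latency_ms: float = 100.0,
                  min_fraction: float = 0.95,
                  max_unpredictable: float = 0.01) -> bool:
    """Check the assumed SLA: enough fast actions and few unpredictable reads."""
    n = len(run.latencies_ms)
    if n == 0:
        return False
    fast = sum(1 for t in run.latencies_ms if t <= max_latency_ms)
    return fast / n >= min_fraction and run.unpredictable / n <= max_unpredictable


def rate(runs: List[Run], **sla) -> Optional[float]:
    """Return the highest throughput among runs that satisfy the SLA, or None."""
    ok = [r.throughput for r in runs if satisfies_sla(r, **sla)]
    return max(ok) if ok else None


# Example: three runs at increasing load; the third violates the latency bound.
runs = [
    Run(throughput=100.0, latencies_ms=[10.0] * 100, unpredictable=0),
    Run(throughput=500.0, latencies_ms=[10.0] * 96 + [200.0] * 4, unpredictable=0),
    Run(throughput=900.0, latencies_ms=[10.0] * 50 + [200.0] * 50, unpredictable=0),
]
rating = rate(runs)
```

Collapsing each data store to a single rating in this way is what lets alternative designs be compared with one number.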

  8. BG web site: http://bgbenchmark.org

  9. Good Benchmark = FORSEE
     • Focus on an important debate & provide relevant metrics to facilitate progress.
     • One number to describe alternative designs/solutions.
     • Runs in a reasonable amount of time.
     • Scalable.
     • Effective abstraction with meaningful requests.
     • Extendible.

  10. Good Benchmark = FORSEE
     • F
     • One number to describe alternative designs/solutions.
     • Runs in a reasonable amount of time.
     • Scalable.
     • Effective abstraction with meaningful requests.
     • Extendible.
     • + Unpredictable data

  11. Good Benchmark = FORSEE
     • F
     • O
     • Runs in a reasonable amount of time.
     • Scalable.
     • Effective abstraction with meaningful requests.
     • Extendible.
     • + Unpredictable data
     • SoAR

  12. Good Benchmark = FORSEE
     • F
     • O
     • R
     • Scalable.
     • Effective abstraction with meaningful requests.
     • Extendible.
     • + Unpredictable data
     • SoAR
     • 1 Week to rate =
     • 4 months to rate =

  13. Good Benchmark = FORSEE
     • F
     • O
     • R
     • S
     • Effective abstraction with meaningful requests.
     • Extendible.
     • + Unpredictable data
     • SoAR
     • 1 Week to rate =
     • 4 months to rate =

  14. Good Benchmark = FORSEE
     • F
     • O
     • R
     • S
     • E
     • Extendible.
     • + Unpredictable data
     • SoAR
     • 1 Week to rate =
     • 4 months to rate =
     • Only when two members are NOT friends!

  15. Good Benchmark = FORSEE
     • F
     • O
     • R
     • S
     • E
     • E
     • + Unpredictable data
     • SoAR
     • 1 Week to rate =
     • 4 months to rate =
     • Only when two members are NOT friends!
     • FORSEE = PREDICT

  16. Good Benchmark = FORSEE
     • F
     • O
     • R
     • S
     • E
     • E
     • + Unpredictable data
     • SoAR
     • 1 Week to rate =
     • 4 months to rate =
     • Only when two members are NOT friends!
     • A good benchmark helps settle debates quickly to enable its discipline to make rapid progress.
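The note "Only when two members are NOT friends!" on the effective-abstraction slides suggests that a benchmark should issue only requests that are meaningful in the current state of the data. Below is a minimal Python sketch, assuming the note refers to friend invitations (an interpretation, not stated on the slide). The function `pick_invite_pair` and its parameters are hypothetical and are not part of BG.

```python
import random
from typing import FrozenSet, List, Optional, Set, Tuple


def pick_invite_pair(member_ids: List[int],
                     friendships: Set[FrozenSet[int]],
                     rng: random.Random,
                     max_tries: int = 1000) -> Optional[Tuple[int, int]]:
    """Pick two distinct members who are not already friends.

    Returns None if no such pair is found within max_tries attempts,
    for example when every pair of members is already connected.
    """
    if len(member_ids) < 2:
        return None
    for _ in range(max_tries):
        a, b = rng.sample(member_ids, 2)  # two distinct members
        if frozenset((a, b)) not in friendships:
            return a, b
    return None


# Example: members 1-2 and 1-3 are friends, so only (2, 3) is a valid invitation.
members = [1, 2, 3]
friends = {frozenset((1, 2)), frozenset((1, 3))}
pair = pick_invite_pair(members, friends, random.Random(7))
```

Rejection sampling keeps the sketch short; a workload generator for a densely connected graph would likely need a smarter way to find eligible pairs.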

  17. Future Research: Data Sciences
     • Challenge: Wide variety of science applications with diverse debates.
     • Hypothesis: A benchmark generator.
     • Diagram labels: ER diagram; Actions & their dependencies; Key Metrics; Benchmark Generator; Application (data science) Specific Benchmark

  18. Future Research
     • Evaluate the hypothesis using BG.
     • Extend to other data science applications.
     • Diagram labels: Benchmark Generator; Unpredictable data

  19. Big Data: Operations
     • Ad-hoc vs. Pre-specified
     • Simple vs. Complex
     • Off-line vs. Interactive

  20. Big Data: Google Analytics
     • Objective: Advertising ROI; Frequency of access to pages
     • Operations: Ad-hoc vs. Pre-specified; Simple vs. Complex; Off-line vs. Interactive
     • Gather click stream data: Optimized for writes
     • Compute aggregated data: MapReduce/Hadoop

  21. Big Data: Google Analytics
     • Objective: Advertising ROI; Frequency of access to pages
     • Operations: Ad-hoc vs. Pre-specified; Simple vs. Complex; Off-line vs. Interactive
     • Gather click stream data: Optimized for writes
     • Compute aggregated data: MapReduce/Hadoop
     • Enable users to view aggregated data.

  22. Big Data: Facebook
     • Operations: Ad-hoc vs. Pre-specified; Simple vs. Complex; Off-line vs. Interactive
     • Show profile page of Farrah Fawcett
     • Follow Barack Obama
     • Friend Lady Gaga

  23. 3 Vs: Facebook
     • High Volume: 1.2 billion user profiles, 150 billion friend connections, 1.13 trillion likes, 17 billion tagged locations, 240 billion photos, …
     • High Velocity: 700 million active users daily, 4.5 billion likes daily, 350 million photos uploaded daily, …
     • High Variety: Mix of data types: structured records, multimedia content, text.
     Source: http://expandedramblings.com/index.php/by-the-numbers-17-amazing-facebook-stats/ posted on Oct 6, 2013.

  24. Expertise/Contributions
     • BG Benchmark to evaluate performance of alternative data stores: SQL, NoSQL, NewSQL. http://bgbenchmark.org
     • A high performance CASQL solution that minimizes software development life cycle.
     • KOSAR, a prototype of a CASQL solution.
     • Operations: Ad-hoc vs. Pre-specified; Simple vs. Complex; Off-line vs. Interactive

  25. BG, http://bgbenchmark.org
     • Joint work with Sumita Barahmand
     • Benchmark for interactive social networking actions.
     • Consists of 11 actions:

  26. CASQL
     • Joint work with Jason Yap.
     • Key insight: Query result lookup is faster than query processing.
     • Contribution is physical data independence in CASQL systems:
       • Transparent caching
       • Serial schedules
       • Detection of race conditions and prevention of inconsistent states.
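The key insight on the CASQL slide (looking up a cached query result is cheaper than processing the query) can be sketched in a few lines of Python. This is an illustrative sketch only, not KOSAR's design: it assumes an in-process dictionary as the cache and SQLite as the RDBMS, the class `CachedStore` is hypothetical, and it invalidates the whole cache on every write rather than detecting race conditions.

```python
import sqlite3


class CachedStore:
    """Cache-augmented SQL sketch: query results are cached by (sql, params)."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.cache = {}   # (sql, params) -> list of result rows
        self.hits = 0
        self.misses = 0

    def query(self, sql: str, params: tuple = ()):
        """Return cached rows if present; otherwise run the query and cache it."""
        key = (sql, tuple(params))
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        rows = self.conn.execute(sql, params).fetchall()
        self.cache[key] = rows
        return rows

    def write(self, sql: str, params: tuple = ()):
        """Apply a write, then drop all cached results (naive invalidation)."""
        self.conn.execute(sql, params)
        self.conn.commit()
        self.cache.clear()


# Example: the second identical read is served from the cache.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT)")
store = CachedStore(conn)
store.write("INSERT INTO members (id, name) VALUES (?, ?)", (1, "alice"))
first = store.query("SELECT name FROM members WHERE id = ?", (1,))
second = store.query("SELECT name FROM members WHERE id = ?", (1,))
```

The application calls only `query` and `write` and never touches the cache directly, which is the sense in which caching is transparent here. A real system with concurrent readers and writers would need the race-condition handling the slide lists, which this sketch omits.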

  27. KOSAR
     • Joint work with Reihane Boghrati, Lakshmy Mohanan and Neeraj Narang.
     • A software prototype of CASQL:
       • Scalable
       • Highly available
       • Elastic
     • Boosts performance of a leading industrial-strength RDBMS from 2 actions per second to more than 300,000 actions per second.

  28. Diagram labels: BG Coordinator; Delta Analyzer; Experiment; Load; BGClient 1, BGClient 2, …, BGClient N; Agile Data Loading Techniques; Data Store Server …
