
System Scalability Research

Andrew A. Chien, William Eckhardt Professor of Computer Science, The University of Chicago; Senior Scientist, Argonne National Laboratory. IRIS-HEP Kickoff, September 7, 2018.



Presentation Transcript


  1. System Scalability Research. Andrew A. Chien, William Eckhardt Professor of Computer Science, The University of Chicago; Senior Scientist, Argonne National Laboratory. IRIS-HEP Kickoff, September 7, 2018

  2. Traditional Software Scalability
     • Does the application scale to larger data units?
     • Does the application scale up over large data sets and experiments?
     [Figure: one SW App on a "much larger data unit" versus "many instances of app, data"]
     Good, but this does not take the impact on INFRASTRUCTURE into account. How do we scale bandwidth for HEP community research experiments?

  3. System Scalability Research
     • Database Example
       • Query optimization: predicate/filter push down
       • Exploit selectivity, increase scalability/performance of system
       • Hardware acceleration
     • Cloud Example
       • S3 Select
       • Optimize infrastructure use AND improve application performance
     • IRIS-HEP Examples
       • Distributed analysis, filtering
       • Optimized data movement across wide area, increase scalability/performance of system
       • Hardware acceleration
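
The predicate push down idea on this slide can be illustrated with a minimal sketch. It is not taken from the talk: the `Sale`-like rows, the `score` field, and the function names are all hypothetical, and "shipped" simply counts rows that leave the storage-side scan.

```python
from typing import Callable, Iterable, Iterator, Optional

Row = dict  # one record of a hypothetical Sale-like table


def scan(rows: Iterable[Row],
         predicate: Optional[Callable[[Row], bool]] = None) -> Iterator[Row]:
    """Storage-side scan. If a predicate was pushed down, apply it here,
    before any row leaves the storage layer."""
    for row in rows:
        if predicate is None or predicate(row):
            yield row


def count_without_pushdown(rows, predicate):
    """Ship every row, then filter at the consumer."""
    shipped = list(scan(rows))
    matches = sum(1 for row in shipped if predicate(row))
    return matches, len(shipped)


def count_with_pushdown(rows, predicate):
    """Filter inside the scan, so only matching rows are shipped."""
    shipped = list(scan(rows, predicate))
    return len(shipped), len(shipped)


def high_score(row: Row) -> bool:
    return row["score"] > 0.9


# Synthetic data: scores cycle through 0.00 .. 0.99, so 9% exceed 0.9.
SALES = [{"id": i, "score": (i % 100) / 100} for i in range(1000)]

print(count_without_pushdown(SALES, high_score))  # (matches, rows shipped)
print(count_with_pushdown(SALES, high_score))     # (matches, rows shipped)
```

Both calls return the same match count; only the number of rows shipped differs, which is where a selective predicate pays off.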

  4. Traditional Query Execution Plan
     SELECT COUNT(*), Product.color, Age FROM Sale WHERE DNN(Product.comment) AS Score > 0.9, Age != “old”, GROUP BY Product, Age, ORDER BY Product.price, Age
     [Figure: query plan over table Sale. Operator labels: Timsort; filter: Score > 0.9; DNN inference; word2vec lookup; tokenize; readP; filter: Age != “old”; decompress; protobuf decode; readA; hash aggregate; pre-hash aggregate]

  5. Query Optimized ATO Plan
     [Figure: query plan over table Sale. Operator labels: Timsort; hash aggregate; pre-hash aggregate; filter: Score > 0.9; DNN inference; readP; word2vec lookup; tokenize; filter: Age != “old”; decompress; protobuf decode; readA. Accelerator labels: Accelerated Transformation Operator (e.g. UDP); Accelerated Inference Operator (e.g. Nvidia DLA)]

  6. QO ATO Plan + Flexible Encodings
     [Figure: query plan over table Sale. Operator labels: filter: Score > 0.9; radix sort; pack (price,age); hash aggregate; pre-hash aggregate; DNN inference; word2vec lookup; readP; tokenize; huff-encode: (product,age); dict-encode: age; decompress; filter: Age != “old”; protobuf decode; pack: array of age; pack (product,age); readA; transpose; dict-encode: product. Accelerator labels: Accelerated Transformation Operator (e.g. UDP); Accelerated Inference Operator (e.g. Nvidia DLA); CPU Radix Sort Operator; Accelerated Scan Operator (e.g. SIMD [bitweaving, SIGMOD’13])]
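
The "dict-encode: age" and "filter: Age != “old”" labels in this plan suggest filtering directly on encoded values. The sketch below shows that general technique under stated assumptions: the helper names and the sample `ages` column are hypothetical, and nothing here reflects how the UDP or any accelerated scan operator actually implements it.

```python
def dict_encode(values):
    """Return (dictionary, codes): each distinct value gets a small integer
    code, assigned in order of first appearance."""
    dictionary = []   # code -> value
    index = {}        # value -> code
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def filter_not_equal(dictionary, codes, excluded):
    """Row positions whose value != excluded, comparing integer codes only."""
    if excluded not in dictionary:
        return list(range(len(codes)))   # nothing to exclude
    excluded_code = dictionary.index(excluded)
    return [pos for pos, code in enumerate(codes) if code != excluded_code]


# Hypothetical Age column.
ages = ["old", "young", "mid", "old", "young"]
dictionary, codes = dict_encode(ages)
kept = filter_not_equal(dictionary, codes, "old")
print(dictionary, codes, kept)
```

The string comparison happens once, against the dictionary; the per-row work is an integer comparison on a compact code array, which is the kind of loop that packs well into SIMD lanes.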

  7. QO ATO Plan + Flexible Encodings + Operator Fusion
     [Figure: query plan over table Sale with three fused operator groups. Operator labels: filter: Score > 0.9; radix sort; DNN inference; hash aggregate; word2vec lookup; pre-hash aggregate; filter: Age != “old”. Accelerator labels: Accelerated Transformation Operator (e.g. UDP); Accelerated Inference Operator (e.g. Nvidia DLA); CPU Radix Sort Operator; Accelerated Scan Operator (e.g. SIMD [bitweaving, SIGMOD’13])]
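
Operator fusion, as named in this slide title, can be sketched in a few lines. This is an illustration of the general idea only, with a hypothetical `"product,age"` record format and made-up function names: the unfused version materializes an intermediate list after each operator, while the fused version does decode, filter, and aggregate in one pass.

```python
from collections import Counter

# Hypothetical raw records in "product,age" form.
RAW = ["tv,old", "tv,new", "radio,new", "tv,new", "radio,old"]


def decode(record):
    """Split one raw record into (product, age)."""
    product, age = record.split(",")
    return product, age


def unfused(raw):
    """One operator at a time, each producing a full intermediate list."""
    decoded = [decode(r) for r in raw]                   # intermediate 1
    kept = [row for row in decoded if row[1] != "old"]   # intermediate 2
    return Counter(product for product, _ in kept)


def fused(raw):
    """Decode, filter, and aggregate in a single loop, no intermediates."""
    counts = Counter()
    for record in raw:
        product, age = decode(record)
        if age != "old":
            counts[product] += 1
    return counts


print(dict(unfused(RAW)))
print(dict(fused(RAW)))
```

Both functions compute the same per-product counts; the fused form just avoids writing and re-reading the intermediate results.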

  8. Example Benefit Overall query benefit can be 10-100x! (looking hard at the data that matters)

  9. A Cloud Example: Data Analysis
     • Iterators over all objects in an S3 bucket
     • S3 Select
     • Interesting: pricing and business model (when you own the endpoints and network COST)
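
For readers unfamiliar with S3 Select, the following sketch shows roughly what a filtered read looks like through boto3's `select_object_content` call. The bucket, key, and CSV column names (`product`, `age`, `score`) are hypothetical, and the helper functions are illustrative rather than anything shown in the talk. The client is passed in as a parameter so the code does not require AWS credentials to import.

```python
def build_select_request(bucket: str, key: str, min_score: float) -> dict:
    """Keyword arguments for select_object_content on a CSV object that has
    a header row. Column names here are assumptions for illustration."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": (
            "SELECT s.product, s.age FROM S3Object s "
            f"WHERE CAST(s.score AS FLOAT) > {min_score}"
        ),
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def select_rows(client, request: dict) -> str:
    """Run the query and join the record chunks from the event stream.
    Only the filtered, projected rows come back from the service."""
    response = client.select_object_content(**request)
    chunks = []
    for event in response["Payload"]:
        if "Records" in event:          # other events carry stats, progress, end
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   text = select_rows(boto3.client("s3"),
#                      build_select_request("my-bucket", "sales.csv", 0.9))
```

The contrast with the first bullet is that an iterator over all objects downloads every byte and filters locally, whereas here the filter and projection run next to the data.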

  10. Hardware Acceleration: Big Wins. Xeon E5620 (8-thread, 340 mm², 80 W) vs. UDP (64-lane, 8.7 mm², 0.86 W). UDP Hardware Implementations

  11. CSV Parsing: 1 UDP lane is 4x faster than 1 CPU thread. UDP is 1000x more energy-efficient than the CPU. 64-lane UDP: 12 GB/s; 8-thread CPU: 0.4 GB/s

  12. Snappy Compression: 1 UDP lane matches 1 CPU thread. UDP is 270x more energy-efficient than the CPU. 64(21)-lane UDP: 3.2 GB/s; 8-thread CPU: 1.0 GB/s
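
The per-lane claims on the CSV parsing and Snappy slides can be checked against the quoted throughputs with simple arithmetic. One assumption is made and should be read as such: "64(21)-lane" is taken to mean that 21 of the 64 lanes are active for Snappy. The energy-efficiency figures are not recomputed here, since the slides do not state which power numbers they are based on.

```python
def per_unit(throughput_gb_s: float, units: int) -> float:
    """Throughput per lane or per thread, in GB/s."""
    return throughput_gb_s / units


# CSV parsing: 64-lane UDP at 12 GB/s vs. 8-thread CPU at 0.4 GB/s.
csv_lane_vs_thread = per_unit(12.0, 64) / per_unit(0.4, 8)
csv_chip_speedup = 12.0 / 0.4

# Snappy: 3.2 GB/s vs. 1.0 GB/s; ASSUMES 21 active UDP lanes.
snappy_lane_vs_thread = per_unit(3.2, 21) / per_unit(1.0, 8)
snappy_chip_speedup = 3.2 / 1.0

print(f"CSV:    lane/thread = {csv_lane_vs_thread:.2f}x, chip = {csv_chip_speedup:.1f}x")
print(f"Snappy: lane/thread = {snappy_lane_vs_thread:.2f}x, chip = {snappy_chip_speedup:.1f}x")
```

Under that assumption the ratios come out close to the slides' rounded statements: a bit under 4x per lane for CSV parsing, and a little above parity for Snappy.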

  13. 64-lane UDP vs. 8-thread CPU: significant speedup on all ETL workloads, mean speedup >20x

  14. What does this mean for IRIS-HEP?
     • Distributed Data Lake, shared general data format (across experiments)
     • Scalable analysis pulls data from the Lake and ships it to computing resources [analysis]
     • Variety in analysis experiments and data use, and in availability of compute resources, IMPLIES large data movement
     <rob gardner picture>

  15. Example Research Topics
     • Vertical (distributed) partitioning and filtering
     • Programmable hardware acceleration [10-100x size reduction]
     • => Can dramatically increase system scalability and HEP application science capability
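
To make "vertical partitioning and filtering" concrete, here is a small sketch that projects a few columns and filters rows before shipping, then compares serialized sizes. The event fields (`event_id`, `pt`, `eta`, `raw_hits`), the threshold, and the use of JSON as a stand-in for bytes on the wire are all assumptions for illustration; the resulting ratio says nothing about real HEP data or about the 10-100x figure on the slide.

```python
import json


def payload_bytes(rows) -> int:
    """Size of the rows serialized as JSON, a stand-in for bytes on the wire."""
    return len(json.dumps(rows).encode("utf-8"))


def project_and_filter(rows, columns, predicate):
    """Keep only rows passing the predicate, and only the requested columns."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]


# Hypothetical events: a few small summary fields plus one bulky field.
EVENTS = [
    {"event_id": i, "pt": i % 100, "eta": 0.5, "raw_hits": list(range(50))}
    for i in range(1000)
]

full_size = payload_bytes(EVENTS)
reduced_rows = project_and_filter(EVENTS, ["event_id", "pt"],
                                  lambda event: event["pt"] > 90)
reduced_size = payload_bytes(reduced_rows)

print(f"rows shipped: {len(reduced_rows)} of {len(EVENTS)}")
print(f"size reduction: {full_size / reduced_size:.1f}x")
```

Column projection and row filtering multiply: dropping the bulky field shrinks every row, and the predicate shrinks the row count, so the combined reduction is larger than either alone.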

  16. Discussion

  17. Backup
