
Compute Intensive Research on Cloud Computing Infrastructure


Presentation Transcript


  1. Compute Intensive Research on Cloud Computing Infrastructure
  Systems Research Group, Computer Science Department, University of Illinois at Urbana-Champaign
  Roy H. Campbell, Reza Farivar, Abhishek Verma, Cristina Abad
  rhc@illinois.edu, {farivar2, verma7, cabad}@illinois.edu

  2. Motivation and Goals
  • Research teams and practitioners are embracing cloud computing technologies for compute-intensive tasks
  • E.g. genetic algorithms, financial algorithms, bioinformatics, astronomy, machine learning, web analytics
  • Many economic advantages
  • It is not clear whether such tasks perform optimally using MapReduce on COTS clusters (especially GPU clusters)
  • Research goal: investigate the bottlenecks of COTS clusters + MapReduce + compute-intensive tasks

  3. Summary
  • Financial Computations
  • Genetic Algorithms for Optimization
  • Astronomy
  • Gene Alignment
  • Partitioned Iterative Algorithms: Best Effort
  • Clouds, Machine Learning and Reliability
  • Storage Workload Characterization
  • Workload Modeling

  4. Financial Computations
  • Black-Scholes future options pricing on a MapReduce cluster
  • Using MITHRA, our modified "MapReduce on GPU clusters" middleware
  • MITHRA runs map() on GPUs as CUDA kernels
  • reduce() runs on the cluster CPUs
  • Better use of GPU hardware through increased exploitation of locality (see the pricing sketch below)
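  As a rough illustration of the map step, here is a minimal Python sketch of closed-form Black-Scholes call pricing applied per record. It is a CPU stand-in for what MITHRA would run as a CUDA kernel; map_price and the record layout are illustrative assumptions, not MITHRA's actual API.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(spot, strike, rate, sigma, t):
    # Closed-form Black-Scholes price of a European call option.
    d1 = (log(spot / strike) + (rate + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return spot * norm_cdf(d1) - strike * exp(-rate * t) * norm_cdf(d2)

def map_price(record):
    # Hypothetical map(): each record prices one option independently,
    # which is why this step offloads well to GPUs.
    key, (spot, strike, rate, sigma, t) = record
    return key, black_scholes_call(spot, strike, rate, sigma, t)

print(map_price(('opt-1', (100.0, 105.0, 0.05, 0.2, 1.0))))
```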

  5. Genetic Algorithms for Optimization
  1. Initialize the population with random individuals.
  2. Evaluate the fitness value of the individuals. (Map)
  3. Select good solutions using tournament selection without replacement. (Reduce)
  4. Create new individuals by recombining the selected population using uniform crossover. (Reduce)
  5. Repeat steps 2-4 until some convergence criterion is met.
  A toy sketch of one generation follows.
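  This single-machine sketch mirrors the slide's split: fitness evaluation plays the role of map(), while tournament selection and uniform crossover play the role of the reducer. The OneMax fitness and bitstring encoding are placeholders, not the actual optimization problems studied.

```python
import random

def evaluate_fitness(individual):
    # Map step: score each individual independently (here: OneMax,
    # a placeholder fitness that counts 1-bits).
    return sum(individual), individual

def tournament_select(scored, k=2):
    # Tournament selection without replacement: sample k distinct
    # contenders and keep the fittest.
    contenders = random.sample(scored, k)
    return max(contenders, key=lambda s: s[0])[1]

def uniform_crossover(a, b):
    # Each gene is drawn from either parent with equal probability.
    return [random.choice(pair) for pair in zip(a, b)]

def reduce_generation(scored, pop_size):
    # Reduce step: breed the next generation from selected parents.
    return [uniform_crossover(tournament_select(scored),
                              tournament_select(scored))
            for _ in range(pop_size)]

# One generation over a random bitstring population.
pop = [[random.randint(0, 1) for _ in range(32)] for _ in range(20)]
scored = [evaluate_fitness(ind) for ind in pop]    # map phase
pop = reduce_generation(scored, len(pop))          # reduce phase
```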

  6. Astronomy
  • Use Hadoop Streaming to run multiple, parallel instances of an astronomy source extraction program: SExtractor
  • Use MapReduce intermediate key grouping / sorting to help merge catalog records; merging uses (X, Y) as the key
  [Pipeline diagram: Phase 1, pre-processing / metadata generation over input files in HDFS; Phase 2, a file-fetch MapReduce job running parallel SExtractor instances to produce individual catalogs with unique IDs; Phase 3, a merging MapReduce job keyed on (X, Y); Phase 4, post-processing of the merged catalog in HDFS.]
  A minimal streaming pair for the merge phase is sketched below.
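  A minimal Hadoop Streaming pair for the Phase 3 merge, assuming (hypothetically) tab-separated catalog records whose first two fields are the X and Y coordinates. The mapper keys each record by (X, Y) so the shuffle co-locates records for the same source:

```python
#!/usr/bin/env python
# mapper.py: key each catalog record by (X, Y) so Hadoop's shuffle
# groups records describing the same source.
import sys

for line in sys.stdin:
    record = line.rstrip('\n')
    fields = record.split('\t')
    x, y = fields[0], fields[1]      # assumed field layout
    print('%s,%s\t%s' % (x, y, record))
```

  And the matching reducer, which sees each group's records contiguously thanks to the framework's sort:

```python
#!/usr/bin/env python
# reducer.py: records arrive sorted by (X, Y); merge each group into
# a single catalog entry (here, simple concatenation).
import sys
from itertools import groupby

def keyed(stream):
    for line in stream:
        key, _, value = line.rstrip('\n').partition('\t')
        yield key, value

for key, group in groupby(keyed(sys.stdin), key=lambda kv: kv[0]):
    print('%s\t%s' % (key, ' | '.join(v for _, v in group)))
```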

  7. Gene Alignment: Distributed Filtering
  [Figure: the reference TGCCTTCATTTCGTTATGTACCCAGTAGTCATAAAAGCACTAGCTTGCCAAGTT is decomposed into overlapping 6-mers (TGCCTT, GCCTTC, CCTTCA, CTTCAT, TTCATT, ...); each 6-mer is stored under mask patterns that zero out one piece (e.g. CCTTCA yields CCTT00, CC00CA, 00TTCA), building the sorted masked arrays that act as a distributed pigeon-hole filter.]

  8. Masked Read Matching
  [Figure: a short read (CCATCA) is masked under the same patterns (e.g. CC00CA) and looked up in the sorted masked arrays; since at most one piece may differ, at least one masked form of the read matches a masked reference 6-mer exactly (here CC00CA, tolerating the T-to-A mismatch in the middle piece of CCTTCA).]
  A sketch of this pigeon-hole lookup follows.
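  A small sketch of the lookup from slides 7-8, assuming 6-mers split into three 2-base pieces with one piece masked ('00') at a time; with at most one mismatching piece per read, at least one masked form of the read matches a masked reference 6-mer exactly. This per-piece masking is a simplification of the slide's mask patterns.

```python
from bisect import bisect_left

def masked_variants(kmer, piece=2):
    # Yield copies of the 6-mer with one 2-base piece zeroed out, so a
    # mismatch confined to that piece still matches exactly.
    for start in range(0, len(kmer), piece):
        yield kmer[:start] + '0' * piece + kmer[start + piece:]

def build_masked_arrays(reference_kmers):
    # One sorted array per mask position: the "sorted masked arrays".
    arrays = {}
    for kmer in reference_kmers:
        for i, masked in enumerate(masked_variants(kmer)):
            arrays.setdefault(i, []).append(masked)
    return {i: sorted(a) for i, a in arrays.items()}

def matches(read_kmer, arrays):
    # Pigeon-hole filter: accept the read if any masked form is found
    # (by binary search) in the corresponding sorted array.
    for i, masked in enumerate(masked_variants(read_kmer)):
        a = arrays[i]
        j = bisect_left(a, masked)
        if j < len(a) and a[j] == masked:
            return True
    return False

arrays = build_masked_arrays(['TGCCTT', 'GCCTTC', 'CCTTCA'])
print(matches('CCATCA', arrays))  # True: CC00CA tolerates the T->A mismatch
```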

  9. Iterative Computations
  • YouTube video suggestion
  • BFS
  • PageRank
  • Clustering
  • Pattern recognition

  10. Partitioned Iterative Convergence: Best Effort
  [Architecture diagram: an input partitioner distributes data across cluster nodes; each node runs local iterations that produce new sub-models; a shared model-management layer applies sub-model effects (model effect applicator) and performs a global model merge into the current model(s); a convergence test checks the convergence criteria to decide whether another round is needed.]
  A schematic sketch of this loop follows.
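  A schematic sketch of the loop in the diagram, using one-dimensional k-means as a stand-in computation; the partitioning, merge rule (plain averaging), and convergence criterion are illustrative assumptions, not the project's exact algorithm.

```python
import random

def local_iterate(points, centroids, rounds=5):
    # Each cluster node refines its own copy of the model (centroids)
    # against its local partition for a few best-effort rounds.
    for _ in range(rounds):
        buckets = {i: [] for i in range(len(centroids))}
        for p in points:
            i = min(range(len(centroids)), key=lambda c: (p - centroids[c]) ** 2)
            buckets[i].append(p)
        centroids = [sum(b) / len(b) if b else centroids[i]
                     for i, b in buckets.items()]
    return centroids

def global_merge(sub_models):
    # Global model merge: average the per-partition sub-models.
    return [sum(cs) / len(cs) for cs in zip(*sub_models)]

def converged(old, new, eps=1e-3):
    # Convergence test: has the model stopped moving?
    return max(abs(a - b) for a, b in zip(old, new)) < eps

data = [random.gauss(mu, 1.0) for mu in (0, 10, 20) for _ in range(100)]
partitions = [data[i::3] for i in range(3)]          # input partitioner
model = [1.0, 9.0, 15.0]
while True:
    subs = [local_iterate(part, model) for part in partitions]
    new_model = global_merge(subs)                   # global model merge
    if converged(model, new_model):
        break
    model = new_model
```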

  11. Clouds, Machine Learning and Reliability
  • Trend: clouds will expand into diverse roles
  • Big data → data mining and machine learning
  • Real-time data → streaming clouds (e.g. Storm)
  • Economic pressure: massive cloud adoption
  • Results fed into cyber-physical systems
  • Result: the reliability and security of (1) clouds and (2) ML algorithms on clouds will impact real-world phenomena
  • Current cloud solutions are orders of magnitude less dependable than the minimum requirements of cyber-physical systems

  12. Cloud Storage Workload Characterization
  • Studied how MapReduce interacts with the storage layer
  • Findings relevant to storage system design and tuning:
  • Workloads are dominated by high file churn: 80%-90% of files are accessed 1-10 times over 6 months
  • A small percentage of files are very popular
  • Young files account for a high percentage of accesses but a small percentage of bytes stored
  • Requests are bursty
  • Files are very short-lived: 90% of deletions target files less than 1.25 hours old

  13. Big Data Storage Workloads: Modeling and Synthetic Generation
  • One potential storage bottleneck: the metadata server, which must handle a large number of bursty requests
  • New schemes have been proposed, but their evaluation has been insufficient: no adequate traces or models exist
  • Mimesis: a synthetic workload generator
  • Suitable for Big Data workloads
  • Reproduces the desired statistical properties of the original trace
  • Accurate: low RMSE (root mean squared error) when used in place of the original traces
  • Used to evaluate an LRU metadata cache for HDFS (the general idea is sketched below)
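  Mimesis' internals are not specified here, so the following is only a generic sketch of the idea under assumed distributions: fit a request rate and a Zipf-like popularity skew to an original trace, then emit synthetic events that reproduce those statistics.

```python
import random

def fit(trace):
    # Fit summary statistics to the original trace, where each event is
    # (interarrival_seconds, file_rank) with rank 1 = most popular file.
    rate = len(trace) / sum(gap for gap, _ in trace)
    max_rank = max(rank for _, rank in trace)
    return rate, max_rank

def generate(rate, max_rank, n):
    # Emit n synthetic events: exponential interarrivals, plus a crude
    # continuous Zipf(1) over file ranks (density proportional to
    # 1/rank), echoing the "small % of very popular files" finding.
    events = []
    for _ in range(n):
        gap = random.expovariate(rate)
        rank = max(1, int(max_rank ** random.random()))
        events.append((gap, rank))
    return events

original = [(random.expovariate(2.0), random.randint(1, 1000))
            for _ in range(500)]
rate, max_rank = fit(original)
synthetic = generate(rate, max_rank, len(original))
```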

  14. Performance Modeling of MapReduce Environments
  • Performance modeling techniques for MapReduce environments: analytical models, simulation, experimental measurements
  • Service-level objectives:
  • Automatic resource inference and allocation for MapReduce workloads
  • Optimization of makespans for sets of jobs and DAGs
  • Comparison of hardware alternatives
  A sketch of the classic makespan bounds follows.
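  For the makespan piece, a minimal sketch of the classic greedy-scheduling bounds commonly used for MapReduce stages: n independent tasks with mean duration μ and maximum λ on k slots finish between n·μ/k and (n−1)·μ/k + λ. The task numbers below are illustrative, not measured.

```python
def makespan_bounds(durations, slots):
    # Greedy-assignment bounds on the completion time of n independent
    # tasks on k slots: lower = n*mean/k, upper = (n-1)*mean/k + max.
    n = len(durations)
    mean = sum(durations) / n
    return n * mean / slots, (n - 1) * mean / slots + max(durations)

# E.g. 100 map tasks averaging ~21 s (max 45 s) on 10 map slots:
map_tasks = [20.0] * 95 + [45.0] * 5
low, up = makespan_bounds(map_tasks, 10)
print('map stage makespan between %.1f s and %.1f s' % (low, up))
```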

  15. Comparison of Hardware Alternatives
  • Designed synthetic MapReduce applications based on the CPU, memory, disk and network resources used
  • Goal: find a minimum set (basis) of these synthetic applications onto which any MapReduce workload can be projected
  • Using the performance of the basis on old and new hardware, we estimated the performance of any workload on the new hardware within 10% error (see the projection sketch below)
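  A sketch of the projection step under the assumption that each application is profiled as a resource-usage vector over (CPU, memory, disk, network). All profiles and runtimes below are made-up numbers, and a real version might enforce non-negative weights (e.g. scipy.optimize.nnls) rather than plain least squares.

```python
import numpy as np

# Rows: resource-usage profile of each basis application on the old
# hardware (illustrative numbers): [cpu, memory, disk, network].
basis_old = np.array([
    [0.9, 0.2, 0.1, 0.1],   # CPU-bound synthetic app
    [0.2, 0.8, 0.2, 0.1],   # memory-bound
    [0.1, 0.2, 0.9, 0.2],   # disk-bound
    [0.1, 0.1, 0.2, 0.9],   # network-bound
])

# Measured profile of a real MapReduce workload on the old hardware.
workload_old = np.array([0.5, 0.4, 0.5, 0.3])

# Project the workload onto the basis: solve basis_old.T @ w = workload_old.
weights, *_ = np.linalg.lstsq(basis_old.T, workload_old, rcond=None)

# Runtimes of each basis application measured on old and new hardware.
runtime_old = np.array([100.0, 120.0, 200.0, 150.0])
runtime_new = np.array([60.0, 100.0, 110.0, 90.0])

# Estimate the workload's new-hardware runtime by scaling its measured
# old-hardware runtime (130 s, assumed) with the basis-weighted speedup.
speedup = (weights @ runtime_old) / (weights @ runtime_new)
print('estimated runtime on new hardware: %.1f s' % (130.0 / speedup))
```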
