
HPC over Cloud



  1. HPC over Cloud East-West Neo Medicinal u-Lifecare Research Center Workshop January 2014 Presented By: Muhammad Bilal Amin Cloud Computing Team, Ubiquitous Computing Lab Kyung Hee University, Global Campus, Korea

  2. Agenda • High Performance Computing over Cloud • Motivation for HPC over Cloud (HPCoC) • Related work • HPCoC Architecture • HPCoC Contribution • SPHeRe • Motivation for SPHeRe • Implementation Details • Evaluation • SPHeRe’s Contributions & Achievements • Conclusion

  3. Motivation for HPC over Cloud

  4. Motivation for HPC over Cloud

  5. Motivation for HPC over Cloud

  6. Motivation for HPC over Cloud

  7. Motivation for HPC over Cloud

  8. Related Work and Limitations

  9. HPCoC Architecture (Stack View)

  10. UCLab Cloud infrastructure

  11. UCLab Cloud Infrastructure [Diagram: six physical machines (Core i7 CPUs, 8–16 GB RAM, 1–2 TB disks) virtualized with VMware ESXi and Xen hypervisors. Guest VMs run Linux or Windows 7 with a Java Runtime and Hadoop, each with 4 GB RAM and 250 GB–1 TB of storage. One cluster contributes 4 virtual nodes and the other 16, for 20 virtual nodes in total.]

  12. HPCoC Contributions & Uniqueness • A unified Java-based high-performance platform for Grande applications (data- and computation-intensive). • Cloud-enables Java-based HPC messaging and distribution middleware, e.g., MPJ-Core: MPI-like messaging with fault tolerance incorporated from Hadoop. • Implements parallel computation-intensive and data-intensive processing on unshared data in MapReduce through in-map/in-reduce parallelism. • Green HPC: virtualized resources are a big step toward green, energy-efficient computing for HPC. • Releases the solution under an open-source license for the academic community.
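The in-map parallelism idea above can be sketched in a few lines: a single map-style call fans its record batch out to worker threads via an `ExecutorService`, so one Hadoop map task can keep every core of its VM busy. This is an illustrative stand-alone sketch, not SPHeRe's actual middleware API; the class name `InMapParallelism` and the squaring workload are invented for the example.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch of in-map parallelism: instead of relying on one mapper thread,
// a single map() invocation splits its record batch into per-record tasks
// and runs them on a thread pool sized to the available cores.
public class InMapParallelism {
    // Hypothetical map step: squares each value using all available cores.
    public static List<Integer> parallelMap(List<Integer> batch) {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        List<Future<Integer>> futures = new ArrayList<>();
        for (int v : batch) {
            futures.add(pool.submit(() -> v * v));  // each record is an independent task
        }
        List<Integer> out = new ArrayList<>();
        try {
            for (Future<Integer> f : futures) out.add(f.get());  // preserve input order
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return out;
    }
}
```

Because the futures are collected in submission order, the output order matches the input order even though the per-record work runs concurrently.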

  13. A Performance Initiative towards Large-scale Bio-medical Ontology Matching by Implementing Thread-Level Parallelism (TLP) over Multicore Platforms

  14. Motivation for SPHeRe • Effective ontology matching is a computationally intensive (processing power and memory) operation, requiring matching algorithms of quadratic complexity to be executed over the candidate ontologies • Gross et al., “On Matching Large Life Science Ontologies in Parallel”, Lecture Notes in Computer Science (LNCS), 2010 • Delays in matching results make ontology matching ill-equipped for semi-real-time, semantic-web-based systems • Stoilos et al., “A string metric for ontology alignment”, ISWC’05, Heidelberg, Germany, 2005 • The core techniques for achieving better performance involve either optimizing the matching algorithms or fragmenting the ontologies for them; utilization of parallel and distributed platforms has largely been missing • P. Shvaiko and J. Euzenat, “Ontology matching: State of the art and future challenges”, IEEE Transactions on Knowledge and Data Engineering, January 2013 • Commodity hardware capable of parallelism, i.e., multi-core processors over a distributed platform (cloud) • Amin et al., “High Performance Java Sockets (HPJS) for scientific Health Clouds”, 13th IEEE HealthCom, Beijing, 2012 • The cloud is affordable (utility-based pricing) and available (ubiquitous) • Armbrust et al., “A view of Cloud Computing”, Communications of the ACM, April 2010 • “Research opportunity: ontology matching over parallel and distributed commodity hardware”
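The quadratic cost cited above is easy to see in a minimal sketch: a naive matcher compares every concept label of one ontology against every label of the other, giving |O1| × |O2| comparisons. The case-insensitive equality check here is only a stand-in for the real string and structural metrics a matcher like SPHeRe would apply; `NaiveMatcher` is a hypothetical name.

```java
import java.util.*;

// Minimal sketch of why naive ontology matching is quadratic: every concept
// label in O1 is compared against every label in O2. With tens of thousands
// of concepts per biomedical ontology (e.g., FMA, NCI), this dominates runtime.
public class NaiveMatcher {
    public static List<String> match(List<String> o1, List<String> o2) {
        List<String> mappings = new ArrayList<>();
        for (String a : o1) {            // outer loop: |O1| iterations
            for (String b : o2) {        // inner loop: |O2| iterations
                if (a.equalsIgnoreCase(b)) {   // stand-in similarity metric
                    mappings.add(a + " = " + b);
                }
            }
        }
        return mappings;
    }
}
```

Nothing in the inner loop depends on other iterations, which is precisely what makes the workload a good candidate for the parallel distribution described on the following slides.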

  15. Implementation Challenges 1. “End-to-End Parallelism” • 1. Resolution: A methodology to exploit parallelism from loading till delivery

  16. Implementation Challenges 2. “Memory Strain” • Related information not needed at a given moment floods memory • Parsing and loading for inference vs. parsing and loading for matching • Java heap blow-up (a 2 GB heap is not enough) • Unable to iterate over the properties of FMA and NCI • Cloud instances have limited memory per instance • 2. Resolution: • Load only what we need (smaller memory footprint during execution)
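The "load what we need" resolution can be sketched as a matcher-specific projection: instead of materializing a full inference-ready ontology model, extract only the fields a given matcher consumes. The `Concept` record and `SubsetLoader` below are illustrative, not classes from an actual OWL framework.

```java
import java.util.*;

// Sketch of matcher-specific subset loading: a label matcher only needs
// labels, so the loader discards everything else the parser produced,
// keeping the in-memory footprint far below a full ontology model.
public class SubsetLoader {
    static class Concept {
        final String iri, label, comment;   // full record as parsed from the ontology
        Concept(String iri, String label, String comment) {
            this.iri = iri; this.label = label; this.comment = comment;
        }
    }
    // Projection for a label-based matcher: keep labels, drop IRIs and comments.
    public static List<String> labelSubset(List<Concept> parsed) {
        List<String> labels = new ArrayList<>();
        for (Concept c : parsed) labels.add(c.label);
        return labels;
    }
}
```

In SPHeRe's terms, such subsets can also be cached in serialized form so that subsequent runs skip parsing entirely.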

  17. Implementation Challenges 3. “Accuracy Preservation” 3. Resolution: Decoupling of Matching Algorithms from Distribution

  18. Implementation Challenges 4. “Thread Safety” • Shared ontology data among multiple threads (synchronized access leads to sequential execution) • The available OWL frameworks are not thread-safe • Result guarantees • 4. Resolution: • A thread-safe ontology model, shared among multithreaded execution
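One common way to get a thread-safe shared model without lock-induced serialization is immutability: build the concept table once, then publish a read-only view that any number of matcher threads can query concurrently. This is a sketch of that technique under assumed names (`SharedOntologyModel` is not SPHeRe's actual class).

```java
import java.util.*;

// Sketch of a thread-safe shared ontology model: the table is built once,
// defensively copied, and published as an unmodifiable map. Reader threads
// can then match concurrently without locks, so execution is not serialized.
public class SharedOntologyModel {
    private final Map<String, String> labelsByIri;

    public SharedOntologyModel(Map<String, String> labelsByIri) {
        // defensive copy + unmodifiable view: later mutation of the source
        // map cannot affect (or race with) readers of this model
        this.labelsByIri = Collections.unmodifiableMap(new HashMap<>(labelsByIri));
    }

    public String label(String iri) { return labelsByIri.get(iri); }
}
```

Because the map is never mutated after construction, the `final` field guarantees safe publication to all threads under the Java memory model.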

  19. Implementation Challenges 5. “Scalability with Optimal Resource Utilization” • Exploit the available computational resources for concurrency equally (effective load balancing) • Implementation of the right parallelism technique (partitioning) • Better reduction rate • 5. Resolution: • Effective distribution of matching requests over the available computational resources
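The load-balancing resolution can be sketched with a standard greedy heuristic: assign each matching task (weighted by an estimated cost, e.g., its number of concept comparisons) to the currently least-loaded worker, so all cores finish at roughly the same time. The slide does not say which heuristic SPHeRe uses; this is one common technique, with invented names.

```java
import java.util.*;

// Sketch of greedy load balancing: each task goes to the worker with the
// smallest accumulated cost so far. Returns, for each task index, the
// worker it was assigned to.
public class TaskBalancer {
    public static int[] balance(int[] taskCosts, int workers) {
        long[] load = new long[workers];
        int[] assignment = new int[taskCosts.length];
        for (int t = 0; t < taskCosts.length; t++) {
            int best = 0;                        // find the least-loaded worker
            for (int w = 1; w < workers; w++) {
                if (load[w] < load[best]) best = w;
            }
            assignment[t] = best;
            load[best] += taskCosts[t];
        }
        return assignment;
    }
}
```

Sorting the tasks by descending cost before assignment (the classic LPT rule) would tighten the balance further at negligible extra cost.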

  20. SPHeRe Architecture

  21. Matcher Distribution • The matching request received by the system is subdivided from the macro level (matching request) to the micro level (matching tasks)
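A minimal sketch of this macro-to-micro subdivision: one matching request over two ontologies is cut into independent matching tasks, each holding a slice of the first ontology's concepts (to be compared against the full second ontology), so tasks can run on separate cores or VMs. `MatcherDistribution` and the slicing scheme are illustrative assumptions, not SPHeRe's exact partitioner.

```java
import java.util.*;

// Sketch of macro-to-micro subdivision: split one ontology's concept list
// into roughly equal contiguous slices, one per matching task.
public class MatcherDistribution {
    public static List<List<String>> subdivide(List<String> concepts, int tasks) {
        List<List<String>> slices = new ArrayList<>();
        int chunk = (concepts.size() + tasks - 1) / tasks;  // ceiling division
        for (int i = 0; i < concepts.size(); i += chunk) {
            slices.add(concepts.subList(i, Math.min(i + chunk, concepts.size())));
        }
        return slices;
    }
}
```

Each slice, paired with the second ontology and a matcher, becomes one micro-level matching task that the distribution layer can schedule independently.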

  22. Matcher Distribution

  23. Inter-node Communication

  24. Mappings Aggregation • Responsible for accumulating the matched results, creating the corresponding bridge ontology (mapping), and delivering it
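The aggregation step can be sketched as a merge of the per-task partial results into a single duplicate-free mapping table from which the bridge ontology is serialized. The string-based mapping representation and the class name are simplifications for the example, not SPHeRe's actual data model.

```java
import java.util.*;

// Sketch of mappings aggregation: partial result lists from each matching
// task are merged into one sorted, duplicate-free table. De-duplication
// matters because several matchers (or overlapping tasks) may emit the
// same mapping for the same concept pair.
public class MappingsAggregator {
    public static SortedSet<String> aggregate(List<List<String>> partials) {
        SortedSet<String> bridge = new TreeSet<>();   // sorted, duplicate-free
        for (List<String> part : partials) bridge.addAll(part);
        return bridge;
    }
}
```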

  25. SPHeRe Performance Evaluation Large Scale Biomedical Ontology Matching tool over High Performance Computing

  26. Scenario – I: Multicore desktop

  27. Scenario – II: 4 VM Cloud

  28. Ontology Loading Time 3x faster loading time

  29. Total Memory Footprint 8x more memory-efficient

  30. Scalability (Reduction Score) Outperforms by 40%

  31. Performance Evaluation ~4–8x more performance-efficient

  32. Performance Evaluation (FMAxNCI)

  33. Performance Evaluation (FMAxSNOMED)

  34. Performance Evaluation (NCIxSNOMED)

  35. Uniqueness / Contributions Exploitation of parallel commodity hardware for matching • Implements data-parallelism-based distribution over subsets of candidate ontologies on multicore platforms and provides a collection of mappings among the ontologies as a bridge ontology file End-to-end performance initiative (from loading till delivery) • Creates subsets of ontologies depending on the needs of matching algorithms and caches them in serialized formats, providing single-step ontology loading for matching algorithms in parallel Smaller memory footprint • Each subset is lightweight due to its matcher-based, redundancy-free creation, providing a smaller memory footprint and contributing to overall system performance Better scalability • Utilizes computational resources most efficiently with the help of its matching-task distribution

  36. Achievements • OAEI 2013: evaluation at ISWC 2013 (an A-rated conference) • SPHeRe was presented and evaluated on the large-scale biomedical track • It was remarked as the first ontology matching system that utilizes distributed cloud resources • Its first release ranked among the top 15 systems of 2013 (globally) • Microsoft Research Asia Award 2013–2014 • Research funding awarded by Microsoft Research Asia for SPHeRe over the Microsoft Azure platform • Microsoft Azure4Research Award 2014–2015 • SPHeRe for large-scale biomedical ontology matching over the Microsoft Azure platform

  37. Publications • Conferences • Wajahat Ali Khan, Muhammad Bilal Amin, Asad Masood Khattak, Maqbool Hussain, and Sungyoung Lee, “System for Parallel Heterogeneity Resolution (SPHeRe) results for OAEI 2013”, 12th Int. Semantic Web Conference (ISWC), 21–25 October 2013, Sydney, Australia. • Ammar Ahmad Awan, Muhammad Bilal Amin, Shujaat Hussain, Aamir Shafi and Sungyoung Lee, “An MPI-IO Compliant Java based Parallel I/O library”, 13th IEEE CCGrid, Delft, Netherlands, May 2013. • Ammar Ahmad Awan, Muhammad Shoaib Ayub, Aamir Shafi and Sungyoung Lee, “Towards Efficient Support for Parallel I/O in Java HPC”, 13th PDCAT, Beijing, 2012. • Muhammad Bilal Amin, Wajahat Ali Khan, Shujaat Hussain and Sungyoung Lee, “High Performance Java Sockets (HPJS) for healthcare cloud systems”, 13th IEEE HealthCom, Beijing, Oct 2012. • Muhammad Bilal Amin, Wajahat Ali Khan, Ammar Ahmad Awan and Sungyoung Lee, “Intercloud Message Exchange Middleware”, 7th ICUIMC 2012, Kuala Lumpur, Malaysia, Feb 2012.

  38. Publications • Journals • Muhammad Bilal Amin, Wajahat Ali Khan and Sungyoung Lee, “SPHeRe: A performance initiative towards ontology matching by implementing parallelism over cloud platforms”, Journal of Supercomputing (SCI, IF 0.9), 2013. • Wajahat Ali Khan, Maqbool Hussain, Muhammad Afzal, Muhammad Bilal Amin, Muhammad Aamir Saleem, and Sungyoung Lee, “Personalized-Detailed Clinical Model for Data Interoperability among Clinical Standards”, Telemedicine and e-Health (SCI, IF 1.416), 2013. • Muhammad Bilal Amin, Wajahat Ali Khan and Sungyoung Lee, “Enabling Data Parallelism for Large-scale Bio-medical Ontology Matching over Multicore Platforms”, Journal of Applied Intelligence (SCI, IF 1.8) (under review), 2014.

  39. Conclusion • HPC over cloud is a very cost-effective solution, offering the capabilities otherwise provided by expensive clusters or grids • To fully exploit it, efforts are required to implement platforms and applications for computation- and data-intensive problems • Applications like SPHeRe can be built to resolve compute- and data-intensive problems over multicore platforms for performance needs • Commodity hardware consumes fewer man-hours for maintenance and far less energy, which makes it an excellent candidate for “Green Computing”

  40. Thank you

  41. References • N. Carriero, M. V. Osier, K.-H. Cheung, P. L. Miller, M. Gerstein, H. Zhao, B. Wu, S. Rifkin, J. T. Chang, H. Zhang, K. White, K. Williams, M. H. Schultz, Case report: A high productivity/low maintenance approach to high-performance computation for biomedicine: Four case studies, JAMIA 12 (1) (2005) 90–98. • G. Bueno, R. González, O. Déniz, M. García-Rojo, J. González-García, M. Fernández-Carrobles, N. Vállez, J. Salido, A parallel solution for high resolution histological image analysis, Computer Methods and Programs in Biomedicine 108 (1) (2012) 388–401. doi:10.1016/j.cmpb.2012.03.007. • F. Perez, J. Huguet, R. Aguilar, L. Lara, I. Larrabide, M. Villa-Uriol, J. López, J. Macho, A. Rigo, J. Rosselló, S. Vera, E. Vivas, J. Fernández, A. Arbona, A. Frangi, J. H. Jover, M. G. Ballester, RadStation3G: A platform for cardiovascular image analysis integrating PACS, 3D+t visualization and grid computing, Computer Methods and Programs in Biomedicine 110 (3) (2013) 399–410. doi:10.1016/j.cmpb.2012.12.002. • A. Eklund, M. Andersson, H. Knutsson, fMRI analysis on the GPU: possibilities and challenges, Computer Methods and Programs in Biomedicine 105 (2) (2012) 145–161. doi:10.1016/j.cmpb.2011.07.007. • E. I. Konstantinidis, C. A. Frantzidis, C. Pappas, P. D. Bamidis, Real time emotion aware applications: A case study employing emotion evocative pictures and neuro-physiological sensing enhanced by graphic processor units, Computer Methods and Programs in Biomedicine 107 (1) (2012) 16–27. doi:10.1016/j.cmpb.2012.03.008. • H. López-Fernández, M. Reboiro-Jato, D. Glez-Peña, F. Aparicio, D. Gachet, M. Buenaga, F. Fdez-Riverola, BioAnnote: A software platform for annotating biomedical documents with application in medical learning environments, Computer Methods and Programs in Biomedicine 111 (1) (2013) 139–147. doi:10.1016/j.cmpb.2013.03.007. • J. Cimino, X. Zhu, IMIA Yearbook of Medical Informatics 1 (1) (2006) 124–135. • D. Isern, D. Sánchez, A. Moreno, Ontology-driven execution of clinical guidelines, Computer Methods and Programs in Biomedicine 107 (2) (2012) 122–139. doi:10.1016/j.cmpb.2011.06.006. • P. De Potter, H. Cools, K. Depraetere, G. Mels, P. Debevere, J. De Roo, C. Huszka, D. Colaert, E. Mannens, R. Van De Walle, Semantic patient information aggregation and medicinal decision support, Comput. Methods Prog. Biomed. 108 (2) (2012) 724–735. doi:10.1016/j.cmpb.2012.04.002.
