
SciMark – All Compilers Summary

SciMark – All Compilers Summary. Testing done on Millennium (550MHz, Katmai), Titanium version 1.910. Except for Java testing, data collected 11/2/01; Java collected on 1/23/02. SciMark – Selected Compilers Small Dataset. Testing done on Millennium (550MHz, Katmai), Titanium version 1.910.

nituna

Presentation Transcript


  1. SciMark – All Compilers Summary. Testing done on Millennium (550MHz, Katmai), Titanium version 1.910. Except for Java testing, data collected 11/2/01; Java collected on 1/23/02.

  2. SciMark – Selected Compilers, Small Dataset. Testing done on Millennium (550MHz, Katmai), Titanium version 1.910. Except for Java testing, data collected 11/2/01; Java collected on 1/23/02.

  3. SciMark – Selected Compilers, Large Dataset. Testing done on Millennium (550MHz, Katmai), Titanium version 1.910. Except for Java testing, data collected 11/2/01; Java collected on 1/23/02.

  4. SciMark – Titanium Version Comparisons. Chart panels: Small Dataset, Large Dataset. All data collected on mm62 (550MHz, Katmai) on 1/23/02.

  5. PIER – Application Details
     • Network/Database Discrete Event Simulator
       • A query engine (relational join & group by) on top of a distributed hash table
       • Simulates end-to-end network communication (latency, bandwidth divided among flows, etc.)
       • Application written in Java for compatibility with other Berkeley database research projects
     • Software Engineering
       • Over 200 class files, heavy use of inheritance, polymorphism, etc.
       • About 25,000 lines of code (and not too many comments yet)
       • Layered, easily ported to a real, working implementation
     • Some parts of the simulation are faked for performance reasons: tuples are kept small (<100 bytes) but simulated at >1Kb
     • Primarily an object-moving program with some processing (string manipulations, basic math, etc.)
     • All objects are kept in memory; disk I/O is minimal (for result logging) and not timed in the following slides
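The simulator described on this slide can be pictured with a minimal discrete event loop. The sketch below is not PIER's code: the class `EventSimSketch`, its methods, and the equal-share bandwidth model are assumptions made purely for illustration. It shows events ordered by simulated time, and a transfer delay computed from latency plus the message size divided by the flow's share of the link. The size passed in can be the simulated size (for example, more than 1Kb) even when the in-memory payload is much smaller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a discrete event loop; names and model are illustrative only. */
class EventSimSketch {
    /** A message delivery scheduled for a simulated time (milliseconds). */
    static final class Event implements Comparable<Event> {
        final double time;
        final int destNode;
        final String payload;

        Event(double time, int destNode, String payload) {
            this.time = time;
            this.destNode = destNode;
            this.payload = payload;
        }

        @Override
        public int compareTo(Event other) {
            return Double.compare(time, other.time);
        }
    }

    private final PriorityQueue<Event> queue = new PriorityQueue<>();
    private double now = 0.0;
    final List<String> log = new ArrayList<>();

    /**
     * Assumed model: the link is shared equally among active flows, so
     * delay = latency + simulatedSize / (linkRate / activeFlows).
     */
    static double transferTimeMs(double latencyMs, double simulatedBytes,
                                 double linkBytesPerMs, int activeFlows) {
        double share = linkBytesPerMs / Math.max(1, activeFlows);
        return latencyMs + simulatedBytes / share;
    }

    /** Schedules a delivery delayMs after the current simulated time. */
    void send(int destNode, String payload, double delayMs) {
        queue.add(new Event(now + delayMs, destNode, payload));
    }

    /** Processes events in timestamp order until none remain. */
    void run() {
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            now = e.time; // advance the simulated clock to the event's time
            log.add(e.destNode + ":" + e.payload + "@" + now);
        }
    }

    double currentTime() {
        return now;
    }
}
```

Because the clock jumps from one event to the next, idle simulated time costs nothing; the run time is driven by the number of events and objects moved, which is consistent with the slide's description of the program as primarily object-moving.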

  6. PIER – Language Summary
     • Chart data labels (as extracted; the bars they belong to are not recoverable): 83.3% (0.8% faster Java), 84.0% (5.1% faster Java), 63.4% faster, 62.7%, 77.7%, 83.1%
     • Small simulation
       • 64 simulated nodes
       • 5000 tuples per table
     • Testing done on Millennium (600MHz, 2G RAM), collected on 1/28/02

  7. PIER – Memory Footprint. Memory usage and runtime grow exponentially with the primary simulation parameters. (Test parameters same as previous slide.)

  8. PIER – Parallel Attempts
     • Parallel attempt with Titanium failed miserably
       • Negative speedup (our best almost matched sequential execution)
       • Simulated nodes were divided among processes. The best version used out-of-order execution to improve performance; earlier versions used small time steps to keep all processes synchronized.
     • Problems we encountered
       • Lots of small remote accesses (when using 8 processes on 2 hosts, the MPI performance counters rolled over at least once)
       • All accesses were small, due to the movement of our objects, with sub-objects, and sub-objects, and more sub-objects
       • Globally, processes were load balanced; within time steps they were not. Various allocations of simulated nodes to processes were attempted.
       • Application is more memory intensive than computationally bound
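The point about being balanced globally but not within time steps can be made concrete with a small calculation. The sketch below is an illustration under assumed inputs, not the authors' measurement code: the class `StepImbalanceSketch`, the `work[step][node]` layout, and the barrier-per-step cost model are all hypothetical. With a barrier after every step, each step costs as much as its slowest process, so equal totals per process do not guarantee any speedup.

```java
/** Hypothetical sketch: cost of lock-step synchronization under per-step imbalance. */
class StepImbalanceSketch {
    /**
     * work[step][node] is the work done by a simulated node in a time step;
     * owner[node] is the process that node was assigned to.
     * Returns the total time when all processes wait at a barrier after each step.
     */
    static long lockStepTime(long[][] work, int[] owner, int processes) {
        long total = 0;
        for (long[] step : work) {
            long[] perProcess = new long[processes];
            for (int node = 0; node < step.length; node++) {
                perProcess[owner[node]] += step[node];
            }
            long slowest = 0;
            for (long w : perProcess) {
                slowest = Math.max(slowest, w);
            }
            total += slowest; // every process waits for the slowest one
        }
        return total;
    }

    /** Total work per process summed over all steps (the "global" balance). */
    static long[] globalLoad(long[][] work, int[] owner, int processes) {
        long[] perProcess = new long[processes];
        for (long[] step : work) {
            for (int node = 0; node < step.length; node++) {
                perProcess[owner[node]] += step[node];
            }
        }
        return perProcess;
    }
}
```

For example, with two nodes on two processes and `work = {{10, 0}, {0, 10}}`, `globalLoad` reports 10 units for each process, yet `lockStepTime` returns 20, the same as running both nodes sequentially. This cost model ignores communication entirely, so it understates the overhead the slide attributes to small remote accesses.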

  9. Parallel Speedup Graph

  10. Parallel Execution Time Breakup
      • Legend: Post Communication (Comm imbalance), Communication, Pre Communication (Execution imbalance), Execution
      • Bar labels (as extracted): Region 10ms Async 300ms Heap 300ms List 300ms Vect

  11. Titanium Wish List
      • Titanium features that would be nice for our application (yes, you can laugh at them)
        • Serialization to move objects with encapsulated objects
        • Better memory management (Regions just were not enough)
          • Global garbage collection
          • Directed memory deletion (i.e. delete object x)
        • Performance counters/profiling
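For readers unfamiliar with the first wish-list item, the sketch below shows what serializing an object together with its encapsulated objects looks like in standard Java, using `java.io.ObjectOutputStream` and `ObjectInputStream`. The slides do not say what Titanium 1.910 offered here, and the `Tuple` and `Field` classes are hypothetical stand-ins rather than PIER's types. The idea is that the whole object graph is packed into one byte array, which could then be sent as a single large message instead of many small remote accesses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of packing a nested object graph into one buffer (standard Java). */
class SerializationSketch {
    /** A sub-object encapsulated by Tuple. */
    static final class Field implements Serializable {
        final String name;
        final String value;

        Field(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** A top-level object that owns a list of sub-objects. */
    static final class Tuple implements Serializable {
        final int id;
        final List<Field> fields = new ArrayList<>();

        Tuple(int id) {
            this.id = id;
        }
    }

    /** Writes obj and everything reachable from it into a single byte array. */
    static byte[] pack(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds an independent copy of the object graph from the byte array. */
    static Object unpack(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A round trip such as `(SerializationSketch.Tuple) SerializationSketch.unpack(SerializationSketch.pack(tuple))` yields a deep copy whose sub-objects are also copies, which is the behavior the wish-list item appears to ask for when moving objects between processes.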
