River Trail: Adding Data Parallelism to JavaScript* - PowerPoint PPT Presentation

Presentation Transcript

  1. River Trail: Adding Data Parallelism to JavaScript* • Stephan Herhut, Richard L. Hudson, Tatiana Shpeisman, Jaswanth Sreeram • QCon NYC, June 19, 2012, 14:00

  2. JavaScript* – What You Need To Know • It is not Java* • Blend of many programming paradigms • Object oriented with prototypes • Higher-order functions and first class function objects • Dynamically typed and interpreted • Safety and security built in • Requirement for web programming • Managed runtime • No pointers, no overflows, … • Designed for portability • Fully abstracts hardware capabilities • No byte-codes, no dusty decks

  3. Concurrency in JavaScript* • Cooperative multi-tasking • Scripts compete with the browser for computing resources • Event driven execution model • Concurrent programming mindset • Asynchronous call-backs for latency hiding • Fully deterministic • Run-to-completion semantics • No concurrent side effects, no race conditions • No support for concurrent execution • Single threaded evaluation of JavaScript
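The run-to-completion rule on this slide can be illustrated with a minimal sketch. It assumes a host that provides `setTimeout` (a browser or Node.js); the `log` array and the `onTimer` name are made up for the illustration.

```javascript
// Run-to-completion: a queued callback cannot interrupt the code that queued it.
const log = [];

log.push("start");

// The callback is placed on the event queue. It runs only after the
// current script has finished, even with a delay of 0 ms.
setTimeout(function onTimer() {
  log.push("callback");
  console.log(log.join(" -> ")); // start -> end -> callback
}, 0);

log.push("end");

// At this point the callback has not run yet.
console.log(log.join(" -> ")); // start -> end
```

Because the callback only runs between scripts, the two `push` calls in the main script can never interleave with it, which is the source of the "no race conditions" guarantee.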

  4. Yet Another Parallel Programming API? *

  5. Design Considerations

  6. Language Design with the Web in Mind • Ease of use • Build on developers’ existing knowledge • Allow for mash-up of sequential and parallel code • Platform independent • Support all kinds of platforms, parallel or not • Perform well on different parallel architectures (multi-core, GPUs, …) • Suitable for the Open Web • Meet existing safety and security promises • Needs to be reasonably easy to implement in JavaScript JIT engines • Challenge: meet these criteria and get good performance

  7. Design Choices • Performance portability • Use High-Level Parallel Patterns • Deterministic execution model • No side effects: shared state is immutable • Require commutative and associative operators • No magic: floating point anomalies may still occur • Support mash-up coding • All code still written purely in JavaScript • Looks like JavaScript*, behaves like JavaScript* • Maintain JavaScript*’s Safety and Security • Use fully managed runtime
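The "floating point anomalies" caveat can be seen with ordinary JavaScript numbers. This is a small sketch, not River Trail code: it shows that regrouping a floating-point sum, as a parallel reduction is free to do, may change the result even though addition is commutative.

```javascript
// Floating-point addition is commutative but not associative,
// so regrouping the operands can change the last bits of a sum.
const leftToRight = (0.1 + 0.2) + 0.3; // grouping used by a sequential loop
const regrouped = 0.1 + (0.2 + 0.3);   // one grouping a parallel reduce might pick

console.log(leftToRight);               // 0.6000000000000001
console.log(regrouped);                 // 0.6
console.log(leftToRight === regrouped); // false

// Small integer sums are exact, so every grouping agrees.
console.log((1 + 2) + 3 === 1 + (2 + 3)); // true
```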

  8. River Trail API 3 Pillars: • ParallelArray • Methods • Kernel

  9. ParallelArray • Basic data type for parallel computation • Created from • A JavaScript array • Canvas • Comprehension • Immutable • Dense • Homogeneous • Single or multiple dimensions

  10. ParallelArray Methods • Provide the basic skeletons for parallel computing • Typically creates a freshly minted ParallelArray • Combine, Reduce, Scan, Scatter, Filter, Map • Plus a constructor and accessor • Others can be built on top of the above • Sum, Max, Add, Gather, Histogram, etc. • Do Few Things Well
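As a rough illustration of "others can be built on top of the above", the sketch below derives sum, max, gather, and histogram from two sequential stand-ins. The helper names, their argument order, and the use of plain arrays are assumptions made for this example; they are not the River Trail implementation.

```javascript
// Hypothetical sequential stand-ins for two basic skeletons, on plain arrays.

// reduce: fold all elements with a binary kernel
// (assumed associative and commutative, as the design requires).
function reduce(arr, kernel) {
  if (arr.length === 0) throw new RangeError("reduce of empty array");
  let acc = arr[0];
  for (let i = 1; i < arr.length; i++) acc = kernel(acc, arr[i]);
  return acc;
}

// scatter: write arr[i] to slot indices[i] of a fresh array of `length`
// slots pre-filled with `defaultValue`; `conflict` merges colliding writes.
function scatter(arr, indices, defaultValue, conflict, length) {
  const out = new Array(length).fill(defaultValue);
  const written = new Array(length).fill(false);
  arr.forEach(function (val, i) {
    const target = indices[i];
    out[target] = written[target] ? conflict(out[target], val) : val;
    written[target] = true;
  });
  return out;
}

// Derived operations built from the stand-ins above.
function sum(arr) { return reduce(arr, (a, b) => a + b); }
function max(arr) { return reduce(arr, (a, b) => (a > b ? a : b)); }
function gather(arr, indices) { return indices.map((i) => arr[i]); }
function histogram(values, buckets) {
  const ones = values.map(() => 1);
  // Each value is its own bucket index; collisions are counted by addition.
  return scatter(ones, values, 0, (a, b) => a + b, buckets);
}

const data = [3, 1, 3, 0, 1, 3];
console.log(sum(data));               // 11
console.log(max(data));               // 3
console.log(gather(data, [5, 0, 3])); // [ 3, 3, 0 ]
console.log(histogram(data, 4));      // [ 1, 2, 0, 3 ]
```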

  11. Kernel Function • Methods take a kernel function as an argument • Written purely in JavaScript, side-effect free • combine and filter arguments • index and array • get can use the index regardless of depth (dimensionality) • reduce, scan • 2 values passed in, 1 returned • scatter conflict arguments • Array of target indices, conflict function for collisions • map • Value passed as argument
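The kernel argument shapes listed on this slide can be sketched with sequential stand-ins. The free functions below and their plain-array inputs are hypothetical and only mirror the slide's description (value for map, index and array for combine and filter, two values for scan); they are not the ParallelArray methods themselves.

```javascript
// map: the kernel receives each value.
function map(arr, kernel) {
  return arr.map(function (val) { return kernel(val); });
}

// combine: the kernel receives the index and the source array.
function combine(arr, kernel) {
  return arr.map(function (_val, index) { return kernel(index, arr); });
}

// filter: the kernel receives the index and the source array
// and returns true to keep the element at that index.
function filter(arr, kernel) {
  return arr.filter(function (_val, index) { return kernel(index, arr); });
}

// scan: inclusive prefix operation; two values in, one value out.
function scan(arr, kernel) {
  const out = [];
  arr.forEach(function (val, i) {
    out.push(i === 0 ? val : kernel(out[i - 1], val));
  });
  return out;
}

const src = [1, 2, 3, 4];
const doubled = map(src, (v) => v * 2);                       // [2, 4, 6, 8]
const reversed = combine(src, (i, a) => a[a.length - 1 - i]); // [4, 3, 2, 1]
const evenSlots = filter(src, (i) => i % 2 === 0);            // [1, 3]
const prefix = scan(src, (a, b) => a + b);                    // [1, 3, 6, 10]
console.log(doubled, reversed, evenSlots, prefix);
```

None of the kernels writes to `src`, which matches the side-effect-free requirement.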

  12. Add 1 to Every Element in A • Sequential: var i; var a = new Array(...); var b = new Array(a.length); for (i = 0; i < a.length; i++) { b[i] = a[i] + 1; } • Data parallel: var a = new ParallelArray(...); var b = a.map(function (val) { return val + 1; });

  13. Sum, Reduce-Style • Sequential: var i; var a = new Array(...); var sum = 0; for (i = 0; i < a.length; i++) { sum += a[i]; } • Data parallel: var pa = new ParallelArray(...); var sum = pa.reduce(function (a, b) { return a + b; }); • Or, with an arrow function: var sum = pa.reduce((a, b) => a + b); • Data Parallelism is Beautiful • More complex example in backup slides if we have time…
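One way to see why reduce asks for an associative operator is a pairwise, tree-shaped reduction. The sketch below runs sequentially on a plain array; `treeReduce` is a hypothetical helper, and nothing here claims that River Trail schedules its reductions this way.

```javascript
// Pairwise (tree-shaped) reduction, written sequentially.
// Within one round the pairs are independent of each other,
// which is the freedom a parallel runtime could exploit.
function treeReduce(arr, kernel) {
  if (arr.length === 0) throw new RangeError("reduce of empty array");
  let level = arr.slice();
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      // An odd element at the end is carried to the next round unchanged.
      next.push(i + 1 < level.length ? kernel(level[i], level[i + 1]) : level[i]);
    }
    level = next;
  }
  return level[0];
}

const values = [1, 2, 3, 4, 5, 6, 7];
const add = (a, b) => a + b;

console.log(treeReduce(values, add)); // 28
console.log(values.reduce(add));      // 28
```

Both groupings agree here because integer addition is associative; with a non-associative kernel the two results could differ.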

  14. An Example: Grayscale Conversion • pixelData.map(toGrayScale).map(function toRGBA(color) { return [color, color, color, 255]; }); • toGrayScale: given a pixel, returns its gray value
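A runnable sketch of the two-stage pipeline follows. The slide does not define `toGrayScale` or the pixel layout, so this version assumes `[r, g, b, a]` pixels and Rec. 601 luma weights, and uses a plain array in place of a ParallelArray.

```javascript
// Assumption: a pixel is an [r, g, b, a] array and the gray value is a
// weighted sum with Rec. 601 luma weights. The slide specifies neither.
function toGrayScale(pixel) {
  const [r, g, b] = pixel;
  return Math.round(0.299 * r + 0.587 * g + 0.114 * b);
}

// Expand a gray value back into an opaque RGBA pixel.
function toRGBA(color) {
  return [color, color, color, 255];
}

// A plain array stands in for the ParallelArray of pixels.
const pixelData = [
  [255, 255, 255, 255], // white
  [0, 0, 0, 255],       // black
  [255, 0, 0, 255],     // red
];

const grayPixels = pixelData.map(toGrayScale).map(toRGBA);
console.log(grayPixels);
```

Both kernels are pure functions of their argument, so each pixel can be processed independently.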

  15. Prototype Implementation

  16. Compiling River Trail (Prototype) • Type inference • Infers array types and shapes • Checks for side effects • Representation analysis • Computes bounds on local variables • Updates type information of known Integer numbers • Static memory allocation • Bounds check elimination • Code generation • Emits OpenCL code • [Diagram labels: Script, River Trail Compiler, JavaScript Engine]

  17. Compiling River Trail (Prototype) • [Diagram labels: Script, River Trail Compiler, JavaScript Engine, OpenCL Kernel, OpenCL Runtime, Hardware (multi-core CPUs, SIMD instructions, GPU)]

  18. Performance Results: Particle Physics • Particle model (O(n²)) computed using River Trail on a 2nd Generation Core i7 with 4 cores • http://github.com/RiverTrail/RiverTrail/wiki

  19. Performance Results: Matrix Matrix Multiply • O(n³) dense matrix-matrix multiplication on 1000 x 1000 element matrices; dual-core 2nd Generation Core i5 with HyperThreading enabled and 4 GB RAM; JavaScript* benchmarks use Firefox 8

  20. Status Quo • Open source Firefox prototype available on GitHub • Pre-built binary extension for Firefox 12 • Sequential library fallback for other browsers • ECMAScript proposal of the full API published • Removes many limitations of the prototype • First sequential implementation for SpiderMonkey • Lives in Mozilla’s IonMonkey branch • Intended as API testing vehicle • http://github.com/RiverTrail/RiverTrail/wiki • http://wiki.ecmascript.org/doku.php?id=strawman:data_parallelism

  21. The other routes… Web Workers *

  22. What About Web Workers? • Good for task parallelism • Implement the actor model • No shared state • Communication using messages • Heavyweight • Typically implemented using OS threads • Marshaling / unmarshaling uses JSON (think strings)
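The cost and the isolation behind "marshaling uses JSON (think strings)" can be sketched without starting a worker. The example below only simulates the marshal and unmarshal steps with `JSON.stringify` and `JSON.parse`; the `message` object is invented for the illustration.

```javascript
// Simulates the marshal / unmarshal step of message passing with JSON.
// No Worker is started; the point is that the receiver gets a copy.
const message = { id: 7, samples: [1, 2, 3] };

const wire = JSON.stringify(message); // sender side: object -> string
const received = JSON.parse(wire);    // receiver side: string -> new object

received.samples.push(4);             // mutate the receiver's copy only

console.log(typeof wire);             // string
console.log(message.samples.length);  // 3, the sender's data is untouched
console.log(received.samples.length); // 4
```

Every message pays for a full serialization and a full parse, which is why this model suits coarse-grained tasks better than fine-grained data parallelism.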

  23. What About WebCL? • JavaScript binding for OpenCL • Provides HPC parallelism on CPU & GPGPU • Portable and efficient access to heterogeneous devices • WebCL stays close to the OpenCL standard • Preserves OpenCL familiarity to facilitate adoption • Allows developers to translate OpenCL knowledge to web • Easier to keep OpenCL and WebCL in sync as the two evolve • An interface just above OpenCL • Higher level abstractions built on top of WebCL • Intended for performance programmers • Useful HW abstraction • Allows ultimate control, performance, and access to HW

  24. Challenges • WebCL / OpenCL challenges • OpenCL standard leaves things undefined • For example, out-of-bounds accesses • OpenCL makes these the programmer’s responsibility • Not a reasonable approach for the web • Shared challenge: context management • GPUs do a poor job • Creates Denial of Service and Performance hazards • River Trail can fall back to JavaScript library or OpenCL CPU execution • Currently River Trail is focused on CPU

  25. River Trail / WebCL • River Trail: Gently extended JavaScript; Preserves Web Security; Standardization by ECMA TC39; Unified high level JavaScript programming model and tool chain; Execution determinism maintained; Apps: visual computing, physics simulation, games, augmented reality • WebCL: Gently extends C99; Retrofits Web Security; Standardization by Khronos; Bifurcated JavaScript / OpenCL-C99 programming model, multiple tool chains; OpenCL (non) deterministic model; Apps: visual computing, physics simulation, games, augmented reality

  26. Q&A

  27. Dealing with boundary conditions
      var pa = new ParallelArray(new Int32Array([2, 4, 8, 16, 32]));
      // blur averages each element with its left neighbour
      function blur(i) { return (this[i] + this[i - 1]) / 2; }
      pa.combine(blur);                          // throws error on this[-1]
      // halo wraps a kernel: indices below the boundary pass through unchanged
      function halo(boundary, work) {
        return function (i) {
          if (i < boundary) { return this[i]; }
          else { return work.call(this, i); }
        };
      }
      pa.combine(halo(3, blur));                 // -> [2 4 8 12 24]
      pa.combine(halo(1, blur));                 // -> [2 3 6 12 24]
      pa.combine(halo(pa.length - 1, blur));     // -> [2 4 8 16 24]
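The halo pattern can be tried without a River Trail build by using a small stand-in. `CheckedArray`, its `get` accessor, and its sequential `combine` are hypothetical helpers written for this sketch; they only imitate the behaviour the slide relies on (kernel called per index with `this` bound to the array, and an error on out-of-range reads).

```javascript
// Hypothetical stand-in for ParallelArray: read-only values, a get() that
// throws on out-of-range indices, and a sequential combine().
class CheckedArray {
  constructor(values) {
    this.values = Object.freeze(Array.from(values));
    this.length = this.values.length;
  }
  get(i) {
    if (i < 0 || i >= this.length) throw new RangeError("index out of range: " + i);
    return this.values[i];
  }
  // Call the kernel once per index with `this` bound to the array.
  combine(kernel) {
    const out = [];
    for (let i = 0; i < this.length; i++) out.push(kernel.call(this, i));
    return out;
  }
}

// Average each element with its left neighbour.
function blur(i) { return (this.get(i) + this.get(i - 1)) / 2; }

// Wrap a kernel so that indices below `boundary` pass through unchanged.
function halo(boundary, work) {
  return function (i) {
    return i < boundary ? this.get(i) : work.call(this, i);
  };
}

const pa = new CheckedArray(new Int32Array([2, 4, 8, 16, 32]));
console.log(pa.combine(halo(3, blur)));             // [ 2, 4, 8, 12, 24 ]
console.log(pa.combine(halo(1, blur)));             // [ 2, 3, 6, 12, 24 ]
console.log(pa.combine(halo(pa.length - 1, blur))); // [ 2, 4, 8, 16, 24 ]
// pa.combine(blur) throws a RangeError at index 0 because it reads index -1.
```

Keeping the boundary test in a wrapper leaves `blur` free of special cases, so the same kernel can be reused with different halo widths.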