
Talk on X10


linore

Presentation Transcript


  1. Talk on X10

  2. X10 Overview • Challenges with Programming Models • What is X10? • X10 Programming model • Coordination of activities • Overview of features • Hello World Program

  3. Challenges with Programming Models
     • Challenges faced by current large-scale systems
       • Frequency wall
       • Memory wall
       • Scalability wall
     • Increasing complexity of large-scale parallel systems reduces software productivity in developing, debugging and maintaining applications
     • Available programming languages: Sisal, Fortran 90, High Performance Fortran, Co-Array Fortran
     • Ultimate challenge: high-productivity, high-performance programming
     • Programming model: simple and widely usable, yet efficiently implementable on current and proposed architectures without many compilation errors
     • MPI: the most common model for high performance on large-scale systems, but with productivity limitations inherent in its use
     • Java: the most popular high-productivity language, but mainly used for single-threaded applications

  4. What is X10?
     • X10 is an experimental new language whose goal is to support adaptable, scalable systems with higher programming productivity for future systems like PERCS, without degrading performance.
     • To increase productivity: starts from an OO programming model and then raises the abstraction level
       • Atomic sections in place of locks
       • Clocks in place of barriers
       • Asynchronous operations in place of threads
     • To increase performance transparency: integrates new constructs (places, regions and distributions) to model hierarchical parallelism and non-uniform data access.
     • X10 is a strongly typed language: static type checking and static expression of program invariants improve both programmer productivity and performance.

  5. X10 Programming Model
     • A central concept in X10 is a place.
     • A place is a collection of resident light-weight threads and data. It is intended to map to a data-coherent unit in a large-scale system, such as an SMP node or a single co-processor.
     • It contains a number of activities and a bounded amount of storage.
     • Four storage classes
       • Activity-local: private to the activity, located in the place where the activity executes
       • Place-local: private to a place, can be accessed coherently by all activities executing in the same place
       • Partitioned-global: each element has a unique place, but is accessible by both local and remote activities
       • Values: immutable and stateless
     • Two types of data objects
       • Scalar
       • Aggregate (array)

  6. Two basic ideas: Places and Asynchrony
     • Fine-grained concurrency
       • async S
     • Atomicity
       • atomic S
       • when (c) S
     • Global data structures
       • points, regions, distributions, arrays
     • Ordering
       • finish S
       • clock
     • Place-shifting operations
       • at (P) S

  7. Async
     • Remote data can be accessed by spawning asynchronous activities at the places where the data is resident.
       • async (P) S
     • Asynchronous activities that return a value to the invoking activity are called ‘futures’.
     • Foreach: spawns activities in the local place, as a high-level abstraction of multithreading.
     • Ateach: a convenient mechanism for spawning activities across a set of local/remote places or objects.
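The future idea above can be approximated in plain Java. This is a rough analogue and not X10 code: `FutureSketch`, `expensiveSquare` and `spawnAndForce` are hypothetical names, and the sketch ignores places entirely (everything runs in one JVM). It only illustrates "spawn, keep working, then force the value".

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration: a Java analogue of `F = future (P) E` followed by a force.
class FutureSketch {
    // Stand-in for the expression E evaluated by the spawned activity.
    static int expensiveSquare(int x) {
        return x * x;
    }

    // Spawn the computation asynchronously, continue locally, then block for the value.
    static int spawnAndForce(int x) {
        CompletableFuture<Integer> f =
                CompletableFuture.supplyAsync(() -> expensiveSquare(x));
        int localWork = x + 1;        // the invoking activity keeps running meanwhile
        return f.join() + localWork;  // join() plays the role of forcing the future
    }

    public static void main(String[] args) {
        System.out.println(spawnAndForce(6));
    }
}
```

The caller only blocks at `join()`, which mirrors the way a forced future suspends the invoking activity until the value is available.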

  8. Coordination of activities
     • Clocks: a generalization of barriers, which have been used as a basic synchronization primitive for MPI process groups
       • Clocks are designed to offer the functionality of multiple barriers in the context of dynamic, asynchronous, hierarchical networks of activities.
       • A clock is an instance of a special value class, on which a restricted set of operations can be performed.
       • At any given time, an activity is registered with zero or more clocks.
       • An activity may register other activities with a clock, or may un-register itself from a clock.
       • An activity may quiesce on the clocks it is registered with and suspend until all of them have advanced.
     • Force operations: F = future (P) E
       • X10 does not allow the invoking activity A to register the spawned activity B with any of the clocks A is registered with.
       • E is not allowed to invoke a conditional atomic section.
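The register, quiesce and advance cycle of a clock resembles `java.util.concurrent.Phaser`. The sketch below is a Java analogue under that assumption, not X10 code; `ClockSketch` and `runTwoPhases` are hypothetical names. Each worker writes in phase 1, waits for the clock to advance, and only then reads a neighbour's value.

```java
import java.util.concurrent.Phaser;

// Hypothetical illustration: a Phaser standing in for an X10 clock.
class ClockSketch {
    static int[] runTwoPhases(int workers) {
        int[] phase1 = new int[workers];
        int[] phase2 = new int[workers];
        Phaser clock = new Phaser(workers);  // every worker is registered up front
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                phase1[id] = id * 10;                     // phase 1: write own slot
                clock.arriveAndAwaitAdvance();            // quiesce until the clock advances
                phase2[id] = phase1[(id + 1) % workers];  // phase 2: neighbour's write is visible
                clock.arriveAndDeregister();              // un-register from the clock
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return phase2;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(runTwoPhases(3)));
    }
}
```

Without the barrier in the middle, a worker could read its neighbour's slot before it was written; the phase advance is what orders the two steps.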

  9. Atomic Sections
     • Unconditional atomic sections
       • A statement block or method is atomic if it is executed by an activity as if in a single step, during which all other activities are frozen.
       • Generalization of user-controlled locking.
       • Leaves responsibility for lock management and other mechanisms for enforcing atomicity to the language implementation.
       • Avoid including long-running or blocking operations in an atomic section.
     • Conditional atomic sections: when (c) S
       • If the guard c is false in the current state, the activity executing the statement blocks until c becomes true.
       • A conditional atomic section whose condition c is statically true is considered an unconditional atomic section.

  10. Overview of Features
     • Many sequential features of Java inherited unchanged
       • Classes (w/ single inheritance)
       • Interfaces (w/ multiple inheritance)
       • Instance and static fields
       • Constructors, (static) initializers
       • Overloaded, over-rideable methods
       • Garbage collection
     • Structs
     • Closures
     • Points, Regions, Distributions, Arrays
     • Substantial extensions to the type system
       • Dependent types
       • Generic types
       • Function types
       • Type definitions, inference
     • Concurrency
       • Fine-grained concurrency: async (p,l) S
       • Atomicity: atomic (s)
       • Ordering: L: finish S
       • Data-dependent synchronization: when (c) S

  11. Classes
     • Classes
       • Single inheritance, multiple interfaces
       • May have mutable instance fields
       • Values of class types may be null
       • Heap allocated
     • Distributed object model
       • Remote references with global identity
       • Rooted state: lives in the place where the object was created
       • Global state
         • programmer-specified subset of immutable state
         • serialized with the object; available anywhere that has a remote ref
         • methods may be global as well (access only global state)

  12. Structs
     • User-defined primitives
     • No inheritance
     • May implement interfaces
     • All fields are final
     • All methods are final
     • Allocated “inline” in the containing object/array/variable
     • Headerless
     • Instances of structs may be freely copied from place to place

       struct Complex {
         val real:double;
         val img:double;
         def this(r:double, i:double) { real = r; img = i; }
         def operator + (that:Complex) {
           return Complex(real + that.real, img + that.img);
         }
         ....
       }

  13. Points and Regions
     • A point is an element of an n-dimensional Cartesian space (n>=1) with integer-valued coordinates, e.g., [5], [1, 2], …
     • A point variable can hold values of different ranks, e.g.,
       • var p: Point = [1]; p = [2,3]; ...
     • Operations
       • p1.rank
         • returns the rank of point p1
       • p1(i)
         • returns element i of p1, or element (i mod p1.rank) if i < 0 or i >= p1.rank
       • p1 < p2, p1 <= p2, p1 > p2, p1 >= p2
         • returns true iff p1 is lexicographically <, <=, >, or >= p2
         • only defined when p1.rank and p2.rank are equal
     • Regions are collections of points of the same dimension
     • Rectangular regions have a simple representation, e.g. [1..10, 3..40]
     • A rich algebra over regions is provided
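The point operations above are easy to mimic with integer arrays. The following is a Java sketch, not the X10 `Point` API: `PointSketch`, `compare` and `coord` are hypothetical names, and using a floor-style modulus for negative indices is an assumption, since the slide only says "i mod p1.rank".

```java
// Hypothetical illustration: points as int[] with lexicographic ordering.
class PointSketch {
    // Lexicographic comparison of two points of equal rank:
    // negative, zero, or positive, in the style of compareTo.
    static int compare(int[] p1, int[] p2) {
        if (p1.length != p2.length) {
            throw new IllegalArgumentException("ranks differ");
        }
        for (int i = 0; i < p1.length; i++) {
            if (p1[i] != p2[i]) {
                return Integer.compare(p1[i], p2[i]);
            }
        }
        return 0;
    }

    // Coordinate access that wraps out-of-range indices around the rank
    // (assumption: floor modulus, so -1 refers to the last coordinate).
    static int coord(int[] p, int i) {
        return p[Math.floorMod(i, p.length)];
    }

    public static void main(String[] args) {
        System.out.println(compare(new int[] {1, 2}, new int[] {1, 3}) < 0);
        System.out.println(coord(new int[] {5, 7}, 3));
    }
}
```

Comparing points of different rank throws here, matching the rule that the ordering is only defined when the ranks are equal.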

  14. Distributions and Arrays
     • Distributions specify the mapping of points in a region to places
       • E.g. Dist.makeBlock(R)
       • E.g. Dist.makeUnique()
     • Arrays are defined over a distribution and a base type
       • A:Array[T]
       • A:Array[T](d)
     • Arrays are created through initializers
       • Array.make[T](d, init)
     • Arrays are mutable (considering immutable arrays)
     • Array operations
       • A.rank ::= # dimensions in array
       • A.region ::= index region (domain) of array
       • A.dist ::= distribution of array A
       • A(p) ::= element at point p, where p belongs to A.region
       • A(R) ::= restriction of array onto region R (useful for extracting subarrays)
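A distribution is essentially a function from points to places. The sketch below shows one plausible block scheme for a 1-D region in Java; it is not the implementation of `Dist.makeBlock`, whose handling of uneven remainders is not stated in the slides. `BlockDistSketch` and `placeOf` are hypothetical names.

```java
// Hypothetical illustration: block distribution of the 1-D region lo..hi over `places` places.
class BlockDistSketch {
    // Returns the place id (0..places-1) that owns index i.
    // Each place gets a contiguous block; the first (n % places) places
    // hold one extra element (an assumed convention).
    static int placeOf(int i, int lo, int hi, int places) {
        int n = hi - lo + 1;
        int base = n / places;
        int extra = n % places;
        int offset = i - lo;
        int bigBlocks = extra * (base + 1);  // elements covered by the larger blocks
        if (offset < bigBlocks) {
            return offset / (base + 1);
        }
        return extra + (offset - bigBlocks) / base;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= 10; i++) {
            sb.append(placeOf(i, 1, 10, 4)).append(' ');
        }
        System.out.println(sb.toString().trim());
    }
}
```

For the region 1..10 over 4 places, this assigns blocks of sizes 3, 3, 2 and 2, so neighbouring indices mostly land on the same place.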

  15. Generic classes
     • Classes and interfaces may have type parameters
     • class Rail[T]
       • Defines a type constructor Rail
       • and a family of types Rail[int], Rail[String], Rail[Object], Rail[C], ...
     • Rail[C]: as if the Rail class is copied and C substituted for T
     • Can instantiate on any type, including primitives (e.g., int)

       public abstract value class Rail[T] (length: int)
           implements Indexable[int,T], Settable[int,T] {
         private native def this(n: int): Rail[T]{length==n};
         public native def get(i: int): T;
         public native def apply(i: int): T;
         public native def set(v: T, i: int): void;
       }

  16. Dependent Types
     • Classes have properties
       • public final instance fields
       • class Region(rank: int, zeroBased: boolean, rect: boolean) { ... }
     • Can constrain properties with a boolean expression
       • Region{rank==3}: type of all regions with rank 3
       • Array[int]{region==R}: type of all arrays defined over region R
         • R must be a constant or a final variable in scope at the type
     • Dependent types are checked statically.
     • Dependent types are used to statically check locality properties (place types).
     • The dependent type system is extensible.

  17. Function Types
     • (T1, T2, ..., Tn) => U
       • type of functions that take arguments Ti and return U
     • If f: (T) => U and x: T, then invoke with f(x): U
     • Function types can be used as an interface
       • Define an apply method with the appropriate signature: def apply(x:T): U
     • Closures
       • First-class functions: (x: T): U => e
       • Used in array initializers: Array.make[int]( 0..4, (p: point) => p(0)*p(0) ) gives the array [ 0, 1, 4, 9, 16 ]
     • Operators
       • int.+, boolean.&, ...
       • sum = a.reduce(int.+, 0)
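The array-initializer and reduce examples above have close counterparts in Java's functional interfaces. This is a Java analogue, not X10: `ClosureSketch`, `make` and `reduce` are hypothetical names chosen to echo `Array.make` and `a.reduce`.

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;
import java.util.function.IntUnaryOperator;
import java.util.stream.IntStream;

// Hypothetical illustration: closures as array initializers and reduction operators.
class ClosureSketch {
    // Build an array over the 1-D region lo..hi from an initializer function.
    static int[] make(int lo, int hi, IntUnaryOperator init) {
        return IntStream.rangeClosed(lo, hi).map(init).toArray();
    }

    // Fold the array with a binary operator, starting from an identity value.
    static int reduce(int[] a, IntBinaryOperator op, int identity) {
        return IntStream.of(a).reduce(identity, op);
    }

    public static void main(String[] args) {
        int[] squares = make(0, 4, p -> p * p);
        System.out.println(Arrays.toString(squares));
        System.out.println(reduce(squares, Integer::sum, 0));
    }
}
```

Here `Integer::sum` plays the part of the operator value `int.+`, passed around like any other function.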

  18. Type inference
     • Field and local variable types are inferred from the initializer type
       • val x = 1;
         • x has type int{self==1}
       • val y = 1..2;
         • y has type Region{rank==1}
     • Method return types are inferred from the method body
       • def m() { ... return true ... return false ... }
         • m has return type boolean
     • Loop index types are inferred from the region
       • R: Region{rank==2}
       • for (p in R) { ... }
         • p has type Point{rank==2}

  19. async
     Stmt ::= async(p,l) Stmt    (cf. Cilk’s spawn)
     • async S
       • Creates a new child activity that executes statement S
       • Returns immediately
       • S may reference final variables in enclosing blocks
       • Activities cannot be named
       • An activity cannot be aborted or cancelled

       // Compute the Fibonacci sequence in parallel.
       def run() {
         if (r < 2) return;
         val f1 = new Fib(r-1), f2 = new Fib(r-2);
         finish {
           async f1.run();
           f2.run();
         }
         r = f1.r + f2.r;
       }

  20. finish
     Stmt ::= finish Stmt    (cf. Cilk’s sync)
     • L: finish S
       • Execute S, but wait until all (transitively) spawned asyncs have terminated.
     • Rooted exception model
       • Trap all exceptions thrown by spawned activities.
       • Throw an (aggregate) exception if any spawned async terminates abruptly.
       • Implicit finish at main activity
     • finish is useful for expressing “synchronous” operations on (local or) remote data.

       // Compute the Fibonacci sequence in parallel.
       def run() {
         if (r < 2) return;
         val f1 = new Fib(r-1), f2 = new Fib(r-2);
         finish {
           async f1.run();
           f2.run();
         }
         r = f1.r + f2.r;
       }
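For readers more familiar with Java threads, the Fibonacci example can be transcribed roughly as follows. This is an analogue under simplifying assumptions, not X10 semantics: `FibSketch` and `fib` are hypothetical names, one thread is created per `async`, and `join()` waits only for the direct child (which is enough here, because each child joins its own children before returning).

```java
// Hypothetical illustration: async as Thread.start(), finish as Thread.join().
class FibSketch {
    int r;

    FibSketch(int r) {
        this.r = r;
    }

    void run() {
        if (r < 2) return;
        FibSketch f1 = new FibSketch(r - 1);
        FibSketch f2 = new FibSketch(r - 2);
        Thread child = new Thread(f1::run);  // async f1.run();
        child.start();
        f2.run();                            // runs in the current activity
        try {
            child.join();                    // end of the finish block
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        r = f1.r + f2.r;                     // safe: both results are complete
    }

    static int fib(int n) {
        FibSketch f = new FibSketch(n);
        f.run();
        return f.r;
    }

    public static void main(String[] args) {
        System.out.println(fib(10));
    }
}
```

A real `finish` is stronger than a single `join()`: it also waits for activities spawned indirectly, and it collects their exceptions, neither of which this sketch attempts.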

  21. at
     Stmt ::= at(p) Stmt
     • at(p) S
       • Execute statement S at place p
       • Current activity is blocked until S completes

       // Copy field f from a to b
       def copyRemoteFields(a, b) {
         at (b.loc) b.f = at (a.loc) a.f;
       }

       // Increment field f of obj
       def incField(obj, inc) {
         at (obj.loc) obj.f += inc;
       }

       // Invoke method m on obj
       def invoke(obj, arg) {
         at (obj.loc) obj.m(arg);
       }
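The blocking behaviour of `at` can be imitated inside one JVM by giving each "place" its own worker thread. The sketch below is only an analogy and not how X10 places are implemented: `AtSketch`, `Place`, `at` and `demo` are hypothetical names, and each place owns a single integer field `f`.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntUnaryOperator;

// Hypothetical illustration: a place as a single worker thread that owns its data.
class AtSketch {
    static final class Place {
        final ExecutorService worker = Executors.newSingleThreadExecutor();
        int f;  // only ever touched by this place's worker thread

        // Run the body at this place; the caller blocks until it completes.
        int at(IntUnaryOperator body) {
            try {
                return worker.submit(() -> {
                    f = body.applyAsInt(f);
                    return f;
                }).get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    static int demo() {
        Place a = new Place();
        Place b = new Place();
        try {
            a.at(old -> 41);               // at (a) a.f = 41;
            int fromA = a.at(old -> old);  // at (a) a.f
            b.at(old -> fromA);            // at (b) b.f = <value read from a>
            return b.at(old -> old + 1);   // at (b) b.f += 1;
        } finally {
            a.worker.shutdown();
            b.worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because every access to `f` goes through the owning worker, the caller never touches remote state directly, which is the discipline the `at` examples on the slide follow.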

  22. atomic
     Stmt ::= atomic Statement
     MethodModifier ::= atomic
     • atomic S
       • Execute statement S atomically
       • Atomic blocks are conceptually executed in a single step while other activities are suspended: isolation and atomicity.
     • An atomic block body (S) ...
       • must be nonblocking
       • must not create concurrent activities (sequential)
       • must not access remote data (local)

       // target defined in lexically enclosing scope.
       atomic def CAS(old:Object, n:Object) {
         if (target.equals(old)) {
           target = n;
           return true;
         }
         return false;
       }

       // push data onto concurrent list-stack
       val node = new Node(data);
       atomic {
         node.next = head;
         head = node;
       }
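The list-stack push can be written in Java with an explicit lock, which is exactly the bookkeeping that `atomic` hands over to the language implementation. This is a Java analogue, not X10: `AtomicStackSketch`, `push`, `size` and `pushFromThreads` are hypothetical names.

```java
// Hypothetical illustration: an atomic block emulated with a synchronized block.
class AtomicStackSketch {
    private static final class Node {
        final int data;
        Node next;

        Node(int data) {
            this.data = data;
        }
    }

    private final Object lock = new Object();
    private Node head;

    void push(int data) {
        Node node = new Node(data);  // allocation stays outside the critical step
        synchronized (lock) {        // atomic { node.next = head; head = node; }
            node.next = head;
            head = node;
        }
    }

    int size() {
        synchronized (lock) {
            int n = 0;
            for (Node cur = head; cur != null; cur = cur.next) {
                n++;
            }
            return n;
        }
    }

    // Push from several threads at once and report how many nodes survived.
    static int pushFromThreads(int threads, int perThread) {
        AtomicStackSketch stack = new AtomicStackSketch();
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    stack.push(i);
                }
            });
            pool[t].start();
        }
        for (Thread th : pool) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return stack.size();
    }

    public static void main(String[] args) {
        System.out.println(pushFromThreads(4, 1000));
    }
}
```

The critical section is short, sequential and local, which matches the three restrictions the slide places on an atomic block body.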

  23. when
     Stmt ::= WhenStmt
     WhenStmt ::= when ( Expr ) Stmt | WhenStmt or (Expr) Stmt
     • when (E) S
       • Activity suspends until a state in which the guard E is true.
       • In that state, S is executed atomically and in isolation.
     • Guard E is a boolean expression
       • must be nonblocking
       • must not create concurrent activities (sequential)
       • must not access remote data (local)
       • must not have side-effects (const)
     • await (E)
       • syntactic shortcut for when (E) ;

       class OneBuffer {
         var datum:Object = null;
         var filled:Boolean = false;
         def send(v:Object) {
           when ( !filled ) {
             datum = v;
             filled = true;
           }
         }
         def receive():Object {
           when ( filled ) {
             val v = datum;
             datum = null;
             filled = false;
             return v;
           }
         }
       }
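In Java, the guarded blocks of `OneBuffer` correspond to the classic monitor pattern: wait in a loop until the guard holds, then act while still holding the lock. The sketch below is a Java analogue, not X10; `OneBufferSketch`, `awaitUntil` and `relaySum` are hypothetical names.

```java
// Hypothetical illustration: `when (guard) S` emulated with wait()/notifyAll().
class OneBufferSketch {
    private Object datum;
    private boolean filled;

    // Block until the buffer's filled flag equals `wanted`; call only while holding the monitor.
    private void awaitUntil(boolean wanted) {
        try {
            while (filled != wanted) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    synchronized void send(Object v) {
        awaitUntil(false);  // when ( !filled )
        datum = v;
        filled = true;
        notifyAll();        // wake activities whose guard may now be true
    }

    synchronized Object receive() {
        awaitUntil(true);   // when ( filled )
        Object v = datum;
        datum = null;
        filled = false;
        notifyAll();
        return v;
    }

    // A producer sends 1..n through the buffer; the caller receives and sums them.
    static int relaySum(int n) {
        OneBufferSketch buf = new OneBufferSketch();
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= n; i++) {
                buf.send(i);
            }
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (Integer) buf.receive();
        }
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(relaySum(100));
    }
}
```

The explicit `while` loop and `notifyAll()` calls are what `when` hides: the X10 version states only the guard and the body.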

  24. Parallel HelloWorld

       import x10.io.Console;
       class HelloWorldPar {
         public static def main(args:Rail[String]):void {
           finish ateach (p in Dist.makeUnique()) {
             Console.OUT.println("Hello World from Place" +p);
           }
         }
       }

       (%1) x10c++ -o HelloWorldPar -O HelloWorldPar.x10
       (%2) mpirun -n 4 HelloWorldPar
       Hello World from Place(0)
       Hello World from Place(2)
       Hello World from Place(3)
       Hello World from Place(1)
       (%3)

  25. Thank You...
