

  1. A Modular Implementation of Parallel Data Structures in Bulk-Synchronous Parallel ML Frédéric Gava F. Gava, HLPP 2005

  2. Outline • Introduction; • The BSML language; • Implementation of parallel data structures in BSML: • Dictionaries; • Sets; • Load-Balancing. • Application; • Conclusion and future works. F. Gava, HLPP 2005

  3. Introduction [Diagram: Automatic Parallelization, Structured Parallelism, Concurrent Programming, Algorithmic Skeletons, BSP, Data Structures Skeletons.] • Parallel computing for speed; • Too complex for many non-computer scientists; • Need for models/tools of parallelism. F. Gava, HLPP 2005

  4. Introduction (bis) • Observations: • Data structures are as important as algorithms; • Symbolic computations make massive use of these data structures. • Suggested solution, parallel implementations of data structures: • Interfaces as close as possible to the sequential ones; • Modular implementations for straightforward maintenance; • Load-balancing of the data. F. Gava, HLPP 2005

  5. BSML Outline: • Introduction; • BSML; • Parallel Data Structures in BSML; • Application; • Conclusion and future works. F. Gava, HLPP 2005

  6. Bulk-Synchronous Parallelism + Functional Programming = BSML • Advantages of the BSP model: • Portability; • Scalability, deadlock free; • Simple cost model ⇒ performance prediction. • Advantages of functional programming: • High-level features (higher-order functions, pattern-matching, concrete types, etc.); • Safety of the environment; • Program proofs (proofs of BSML programs using Coq). F. Gava, HLPP 2005

  7. The BSML Language • Confluent language: deterministic algorithms; • Library for the « Objective Caml » language (called BSMLlib); • Operations to access the BSP parameters; • 5 primitives on a parallel data structure called parallel vector (a short sketch of the first two follows): • mkpar: create a parallel vector; • apply: parallel point-wise application; • put: send values within a vector; • proj: parallel projection; • super: BSP divide-and-conquer. F. Gava, HLPP 2005
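A minimal sketch of the first two primitives (not taken from the slides), assuming the usual BSMLlib signatures mkpar : (int -> 'a) -> 'a par and apply : ('a -> 'b) par -> 'a par -> 'b par:

    (* "this" holds, on every processor i, the local value i. *)
    let this = mkpar (fun i -> i)

    (* Point-wise application: each processor doubles its local value. *)
    let doubled = apply (mkpar (fun _ x -> 2 * x)) this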

  8. A BSML Program [Diagram: a BSML program is composed of a replicated sequential part and a parallel part holding one value per processor: f0 g0, f1 g1, …, fp-1 gp-1.] F. Gava, HLPP 2005

  9. Superthreads in BSML
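A hedged sketch of the superposition (the signature super : (unit -> 'a) -> (unit -> 'b) -> 'a * 'b is assumed from the BSML literature): the two thunks are evaluated as two superthreads whose super-steps are fused, so their communications share the same synchronization barriers.

    (* Two BSP computations evaluated as superthreads with merged super-steps. *)
    let both = super (fun () -> mkpar (fun i -> i))
                     (fun () -> mkpar (fun i -> 2 * i))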

  10. Parallel Data Structures in BSML Outline: • Introduction; • BSML; • Parallel Data Structures in BSML; • Application; • Conclusion and future works. F. Gava, HLPP 2005

  11. General Points • 5 modules: Set, Map, Stack, Queue, Hashtable; • Interfaces: • Same as the O’Caml ones; • With some specific parallel functions (skeletons) such as parallel reduction; • Pure functional implementation (for functional data); • Manual or automatic load-balancing. F. Gava, HLPP 2005

  12. Modules in O’Caml
  • Interface:
      module type Compare = sig
        type elt
        val compare : elt -> elt -> int
      end
  • Implementation:
      module CompareInt = struct
        type elt = int
        let tools = ...
        let compare = ...
      end
      module AbstractCompareInt = (CompareInt : Compare)
  • Functor:
      module Make (Ord : Compare) = struct
        type elt = Ord.elt
        type t = Empty | Node of t * elt * t * int
        let mem e s = ...
      end
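For instance, the functor above can be applied to the integer implementation; the module name IntCompareSet is only illustrative.

    (* Apply the functor to the integer instance of Compare. *)
    module IntCompareSet = Make (CompareInt)
    (* mem would then have type IntCompareSet.elt -> IntCompareSet.t -> bool. *)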

  13. Parallel Dictionaries
  • A parallel map (dictionary) = a map on each processor:
      module Make (Ord : OrderedType) (Bal : BALANCE)
                  (MakeLocMap : functor (Ord : OrderedType) -> Map.S with type key = Ord.t) =
      struct
        module Local_Map = MakeLocMap (Ord)
        type key = Ord.t
        type 'a t = ('a Local_Map.t par) * int * bool
        type 'a seq_t = 'a Local_Map.t
        (* operators as skeletons *)
      end
  • We need to re-implement all the operations (data skeletons); a hypothetical example follows.
  F. Gava, HLPP 2005
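A hypothetical illustration (not the paper's code) of one such skeleton: a membership test written over the local maps, assuming the BSMLlib primitives mkpar : (int -> 'a) -> 'a par, apply : ('a -> 'b) par -> 'a par -> 'b par, proj : 'a par -> (int -> 'a) and bsp_p : unit -> int.

    (* Ask every processor whether it holds the key locally, then gather the
       p booleans and reduce them sequentially on the replicated side. *)
    let mem k (locals, _, _) =
      let local_answers = apply (mkpar (fun _ m -> Local_Map.mem k m)) locals in
      let answer_at = proj local_answers in
      List.exists answer_at (List.init (bsp_p ()) (fun i -> i))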

  14. Insert a Binding • add : key -> 'a -> 'a t -> 'a t (a hypothetical sketch follows); [Diagram: behaviour of add when the structure has been rebalanced and otherwise.] F. Gava, HLPP 2005
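A hypothetical sketch of add (again not the paper's code): the binding is inserted on a single processor chosen from the key, here with a plain hash modulo p standing in for the BALANCE parameter of the functor.

    (* Insert the binding on the processor designated by the key. *)
    let add k v (locals, size, balanced) =
      let target = Hashtbl.hash k mod bsp_p () in
      let locals' =
        apply (mkpar (fun i m -> if i = target then Local_Map.add k v m else m))
              locals in
      (* the size field is only approximate here: a real implementation must
         check whether k was already bound *)
      (locals', size + 1, balanced)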

  15. Parallel Iterator
      let cardinal pmap = ParMap.fold (fun _ _ i -> i + 1) 0 pmap
  • Fold needs to respect the order of the keys;
  • Parallel map ⇒ sequential map;
  • Too many communications…
  • async_fold : (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b par
      let cardinal pmap =
        List.fold_left (+) 0 (total (ParMap.async_fold (fun _ _ i -> i + 1) pmap 0))
  F. Gava, HLPP 2005
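One possible reading of async_fold (an assumption, not the paper's code): every processor folds its local map with no communication, leaving one partial result per processor; total, used above, is assumed to turn a 'b par into the list of its p components.

    (* Purely local folds: the results stay distributed as a 'b par. *)
    let async_fold f (locals, _, _) init =
      apply (mkpar (fun _ m -> Local_Map.fold f m init)) locals

    (* Assumed helper: project every component of a parallel vector. *)
    let total v = List.init (bsp_p ()) (proj v)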

  16. Parallel Sets • A sub-set on each processor; • Insert/Iteration as parallel maps; • But with some binary skeletons (a sketch follows); • Load-balancing of pairs of parallel sets using the superposition. F. Gava, HLPP 2005
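A hypothetical sketch of such a binary skeleton (not the paper's code): two parallel sets are combined processor by processor with a local operation, assuming both sets distribute their elements the same way and that Local_Set is the sequential set module applied on each processor.

    (* Point-wise combination of two parallel sets; the same pattern works
       for Local_Set.union, Local_Set.inter or Local_Set.diff. *)
    let binary_skeleton local_op s1 s2 =
      apply (apply (mkpar (fun _ a b -> local_op a b)) s1) s2

    let union_sets s1 s2 = binary_skeleton Local_Set.union s1 s2

When one of the sets has been rebalanced, this point-wise combination is no longer enough, which is exactly the duplicate-element problem discussed on the next slides.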

  17. Difference • 3 cases: • Two normal parallel sets; • One of the parallel sets has been rebalanced; • The two parallel sets have been rebalanced; • This implies a problem with duplicate elements. F. Gava, HLPP 2005

  18. Difference (third case) [Diagram: computing the difference of two rebalanced parallel sets S1 and S2.] F. Gava, HLPP 2005

  19. Load-Balancing (1) • « Same sizes » for the local data structures; • Better performance for parallel iterations; • Load-balancing in 2 super-steps (M. Bamha and G. Hains) using a histogram. F. Gava, HLPP 2005

  20. Load-Balancing (2) • Generic code of the algorithm, parameterized by conversion functions between the distributed and local representations (data || ⇒ datas, datas ⇒ data ||, messages ⇒ data); • rebalance takes the distributed data and these conversion functions as parameters; [Diagram: the generic rebalance chains a histogram phase, a « select n messages » phase and a union phase over the parallel data.] F. Gava, HLPP 2005
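A highly simplified illustration of the histogram idea (not the Bamha and Hains algorithm and not the paper's generic code), reusing the BSMLlib primitives assumed earlier: the first super-step publishes the local sizes, and the resulting histogram lets every processor decide, in a replicated way, whether a rebalancing exchange is needed.

    (* Gather the size of the local structure held by each processor. *)
    let histogram locals =
      let sizes = apply (mkpar (fun _ m -> Local_Map.cardinal m)) locals in
      List.init (bsp_p ()) (proj sizes)

    (* Arbitrary imbalance test, for illustration only. *)
    let needs_rebalancing locals =
      let h = histogram locals in
      let total_size = List.fold_left (+) 0 h in
      let ideal = total_size / bsp_p () in
      List.exists (fun n -> abs (n - ideal) > ideal / 2) h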

  21. Application Outline: • Introduction; • BSML; • Parallel Data Structures in BSML; • Application; • Conclusion and future works. F. Gava, HLPP 2005

  22. Computation of the « nth » nearest-neighbor atoms in a molecule • Code from « Objective Caml for Scientists » (J. Harrop); • Molecule as an infinitely-repeated graph of atoms; • Computation of set differences (the neighbors), as sketched below; • Replace « fold » with « async_fold »; • Experiments with a silicate of 100,000 atoms and with a cluster of 5/10 machines (Pentium IV, 2.8 GHz, Gigabit Ethernet card). F. Gava, HLPP 2005
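A purely illustrative, sequential sketch of the underlying computation (not the code from « Objective Caml for Scientists »): the nth neighbors of a start atom are obtained by repeated set differences; the neighbors function, returning the atoms bonded to a given atom, is a hypothetical parameter, and atoms are represented as plain integers.

    module IntSet = Set.Make (Int)

    (* nth_neighbors neighbors start n: the atoms reachable in exactly n bonds. *)
    let nth_neighbors neighbors start n =
      let rec loop frontier visited k =
        if k = 0 then frontier
        else
          (* all atoms bonded to the current frontier *)
          let next =
            IntSet.fold (fun a acc -> IntSet.union acc (neighbors a))
              frontier IntSet.empty in
          (* the set difference keeps only atoms not reached before *)
          let frontier' = IntSet.diff next visited in
          loop frontier' (IntSet.union visited frontier') (k - 1)
      in
      loop (IntSet.singleton start) (IntSet.singleton start) n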

  23. Experiments (1)

  24. Experiments (2)

  25. Experiments (3)

  26. Conclusion and Future Works Outline: • Introduction; • BSML; • Parallel Data Structures in BSML; • Application; • Conclusion and future works. F. Gava, HLPP 2005

  27. Conclusion • BSML=BSP+ML; • Implementation of some data structures; • Modular for a simple development and maintenance; • Pure functional implementation; • Cost prediction with the BSP model; • Generic Load-balancing; • Application. F. Gava, HLPP 2005

  28. Future Works • Proofs of the implementations (pure functional); • Implementation of other data structures (trees, priority lists, etc.); • Application to other scientific problems; • Comparison with other parallel MLs (OCamlP3L, HirondML, OCaml-Flight, MSPML, etc.); • Development of a modular and parallel graph library: • Edges as parallel maps; • Vertices as parallel sets. F. Gava, HLPP 2005

  29. F. Gava, HLPP 2005
