
Experiments with Distributed Training of Neural Networks on the Grid



Maciej Malawski¹, Marian Bubak¹,², Elżbieta Richter-Wąs³,⁴, Grzegorz Sala³,⁵, Tadeusz Szymocha³

¹ Institute of Computer Science AGH, Mickiewicza 30, 30-059 Kraków, Poland
² Academic Computer Centre CYFRONET, Nawojki 11, 30-950 Kraków, Poland
³ Institute of Nuclear Physics, Polish Academy of Sciences, Kraków, Poland
⁴ Institute of Physics, Jagiellonian University, Kraków, Poland
⁵ Faculty of Physics and Applied Computer Science AGH, Kraków, Poland

{bubak,malawski}@agh.edu.pl, elzbieta.richter-was@cern.ch, sala@fatcat.ftj.agh.edu.pl, Tadeusz.Szymocha@ifj.edu.pl

Target application
• High Energy Physics
• Discrimination between signal and background events coming from a particle detector (simulated data)
• ROOT and Athena as the basic data analysis tools (a minimal ROOT training macro is sketched below)

Why neural networks
• Once trained, they are efficient and accurate
• Applicable to both classification and prediction
• Proven in a wide range of applications

Challenges
• Neural network training is a highly compute-intensive task – it may need High Performance Computing
• Finding an optimal configuration may be time-consuming: many experiments with various parameters – this may need High Throughput Computing

Solution: the Grid
• Distributing the computation over a cluster of machines can significantly reduce the computation time.
• Utilizing resources (multiple clusters) available on the Grid can make this task less time-consuming for the researcher.

Observation
• Training neural networks on the Grid requires many repeated tasks:
  • job preparation,
  • submission,
  • status monitoring,
  • gathering of results.
• Performing these tasks manually is time-consuming for the researcher
→ Tools that automate such tasks can facilitate the whole process considerably (a job-generation sketch follows the training macro below).

Our goals
• Develop tools that facilitate the use of the Grid for multiple classification experiments
• Investigate and validate algorithms for distributed neural network training (an MPI sketch closes this section)
• Allow seamless integration with data analysis tools such as ROOT

Testbed for our experiments: the EGEE project
• Virtual Organization for Central Europe
• Grid sites: CYFRONET Kraków, PSNC Poznań, KFKI Budapest, CESNET Prague, TU Košice
• Support for MPI applications
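Since ROOT serves as the basic analysis tool, a single training run can be expressed as a short ROOT macro. The sketch below uses ROOT's TMultiLayerPerceptron class; the file, tree, and branch names (events.root, events, pt/eta/missE, type) and the network layout are illustrative assumptions, not details taken from the poster.

```cpp
// train_nn.C -- minimal ROOT macro sketch: train a signal/background
// classifier with TMultiLayerPerceptron. File, tree and branch names
// are hypothetical placeholders.
#include "TFile.h"
#include "TTree.h"
#include "TMultiLayerPerceptron.h"

void train_nn() {
   TFile *f = TFile::Open("events.root");     // merged signal+background sample
   TTree *t = (TTree*) f->Get("events");      // branch "type": 1 = signal, 0 = background

   // Layout string: 3 inputs, two hidden layers (10 and 5 neurons), 1 output.
   TMultiLayerPerceptron mlp("@pt,@eta,@missE:10:5:type", t,
                             "Entry$%2==0",   // even entries -> training set
                             "Entry$%2==1");  // odd entries  -> test set
   mlp.Train(200, "text,update=20");          // 200 epochs, progress every 20
   mlp.DumpWeights("weights.txt");            // save trained weights to disk
}
```

The "@" prefix in the layout string asks ROOT to normalize each input variable; the dumped weights file is what a parameter-scan job would ship back as its result.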

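Finding a good configuration then becomes a matter of submitting many such training runs with different parameters. One way to automate the job-preparation step is to generate one EGEE job description (JDL) file per network configuration, as in the sketch below; the wrapper script run_training.sh, the scanned parameter values, and the sandbox layout are assumptions for illustration.

```cpp
// make_jobs.cxx -- sketch: emit one JDL file per (hidden size, epochs)
// combination of a parameter scan. Parameter values and file names are
// illustrative assumptions.
#include <fstream>
#include <sstream>
#include <vector>

int main() {
    std::vector<int> hidden = {5, 10, 20};   // hidden-layer sizes to scan
    std::vector<int> epochs = {100, 200};    // training lengths to scan

    for (int h : hidden)
        for (int e : epochs) {
            std::ostringstream name;
            name << "train_h" << h << "_e" << e;
            std::ofstream jdl(name.str() + ".jdl");
            jdl << "Executable    = \"run_training.sh\";\n"
                << "Arguments     = \"" << h << " " << e << "\";\n"
                << "StdOutput     = \"std.out\";\n"
                << "StdError      = \"std.err\";\n"
                << "InputSandbox  = {\"run_training.sh\", \"train_nn.C\"};\n"
                << "OutputSandbox = {\"std.out\", \"std.err\", \"weights.txt\"};\n";
        }
    return 0;
}
```

Each generated file could then be submitted with the standard EGEE command-line tools of that era (e.g. edg-job-submit on the LCG-2 middleware), and the weights.txt listed in the OutputSandbox is retrieved when the job output is fetched.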
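The testbed's MPI support also allows a single training to be parallelized across worker nodes. One common data-parallel scheme, shown here purely as an illustration and not necessarily the algorithm the authors validated, trains a replica of the network on each rank's share of the events and periodically averages the weights:

```cpp
// mpi_avg.cxx -- sketch of data-parallel training over MPI: each rank
// trains on its own data shard, then the weight vectors are averaged
// with MPI_Allreduce. The weight-vector size and epoch count are
// illustrative placeholders.
#include <mpi.h>
#include <vector>

// Placeholder for one local training pass updating the weight vector.
void local_epoch(std::vector<double>& w) { /* gradient updates on local shard */ }

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    std::vector<double> w(100, 0.0);          // network weights (size is illustrative)
    for (int epoch = 0; epoch < 50; ++epoch) {
        local_epoch(w);                       // train on this rank's shard
        // Sum the weights across all ranks in place, then divide by the
        // number of ranks to obtain the average.
        MPI_Allreduce(MPI_IN_PLACE, w.data(), static_cast<int>(w.size()),
                      MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
        for (double& x : w) x /= size;
    }
    MPI_Finalize();
    return 0;
}
```

Averaging after every epoch keeps the replicas synchronized at the cost of one collective operation per epoch; on Grid resources the averaging interval would typically be tuned to the latency between the participating nodes.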