
Goals

Volunteer Computing with BOINC Dr. David P. Anderson University of California, Berkeley SC10 Nov. 14, 2010. Goals. Explain volunteer computing Teach how to create a volunteer computing project using BOINC Target audience: High-throughput computing users Technical skills:


Presentation Transcript


  1. Volunteer Computing with BOINC. Dr. David P. Anderson, University of California, Berkeley. SC10, Nov. 14, 2010

  2. Goals • Explain volunteer computing • Teach how to create a volunteer computing project using BOINC • Target audience: high-throughput computing users • Technical skills: basic Linux/Apache sysadmin; familiarity with PHP, SQL, and XML; C/C++ (optional)

  3. Outline • Why use volunteer computing? • Basic concepts of BOINC • Developing BOINC applications • (15 minute break) • Deploying a BOINC server • Deploying applications • Submitting jobs • Organizational issues

  4. Part 1: Why use volunteer computing?

  5. The Consumer Digital Infrastructure • 1 billion PCs • Current GPUs: 1 TeraFLOPS each (1,000 ExaFLOPS total) • Storage: ~1,000 Exabytes • Commodity Internet: 10-1,000 Mbps to home • Consumers pay for: hardware, sysadmin, network costs, electricity
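
The aggregate figure on this slide follows from simple multiplication. A quick check, assuming (as the slide does) roughly one 1-TeraFLOPS GPU per PC:

```python
# Back-of-the-envelope check of the slide's aggregate GPU figure.
# Assumption (taken from the slide): ~1 billion PCs, each with a ~1 TeraFLOPS GPU.
NUM_PCS = 1_000_000_000
FLOPS_PER_GPU = 1e12   # 1 TeraFLOPS
EXA = 1e18             # FLOPS in one ExaFLOPS

total_exaflops = NUM_PCS * FLOPS_PER_GPU / EXA
print(f"Aggregate GPU capacity: {total_exaflops:,.0f} ExaFLOPS")
```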

  6. Volunteer computing • PC owners donate computing resources to projects (e.g., computational science) • Applications run at zero priority while the PC is in use, and/or while the PC is not in use

  7. Examples

     | Project         | Start | Where      | Area           | Peak #hosts |
     |-----------------|-------|------------|----------------|-------------|
     | GIMPS           | 1994  |            | math           | 10,000      |
     | distributed.net | 1995  |            | cryptography   | 100,000     |
     | SETI@home I     | 1999  | UCB        | SETI           | 600,000     |
     | Folding@home    | 1999  | Stanford   | biology        | 200,000     |
     | United Devices  | 2002  | commercial | biomedicine    | 200,000     |
     | CPDN            | 2003  | Oxford     | climate change | 150,000     |
     | LHC@home        | 2004  | CERN       | physics        | 60,000      |
     | Predictor@home  | 2004  | Scripps    | biology        | 100,000     |
     | WCG             | 2004  | commercial | biomedicine    | 200,000     |
     | Einstein@home   | 2005  | LIGO       | astrophysics   | 200,000     |
     | SETI@home II    | 2005  | UCB        | SETI           | 850,000     |
     | Rosetta@home    | 2005  | U. Wash    | biology        | 100,000     |
     | SIMAP           | 2005  | T.U. Munich| bioinformatics | 10,000      |
     | ...             | ...   | ...        | ...            | ...         |

  8. Current status • ~50 projects • 500,000 volunteers • 800,000 computers

  9. [Figure: computing platforms arranged by number of processors (1, 100, 1000, 10K-1M) and by workload type. High-throughput computing (multiple jobs): cluster (batch), Grid, commercial cloud, volunteer computing. High-performance computing (single job): cluster (MPI), supercomputer.]

  10. Volunteer computing is different • You don’t buy resources; you ask for them • Resources are: • heterogeneous • sporadically available and connected • untrusted and not private • behind firewalls/NATs/proxies

  11. Part 2: Basic concepts of BOINC

  12. About BOINC • Funded by NSF since 2002 • Open-source (LGPL) • Based at UC Berkeley • Few staff, but lots of volunteers: • software testing • translation • documentation • support (email lists, message boards, Skype)

  13. Volunteers and projects [Diagram: volunteers linked by attachments to projects such as LHC@home, CPDN, and WCG]

  14. BOINC software overview [Diagram: project server (MySQL, daemons, scheduler, data server) connected over HTTP to the volunteer host (client, GUI, screensaver, apps)]

  15. BOINC scheduler • Request: • HW, SW description • existing workload • per resource type: # of instances requested, # of seconds requested • Reply: • app version descriptions • job descriptions • [Diagram: applications; app versions (Win32, Win32 + NVIDIA, Win64, Win32 N-core, Mac OS X); jobs; instances]

  16. Job replication • Job instances may fail or return wrong results • Job replication: do 2, see if they agree • “agree” may be fuzzy • Homogeneous replication • numerical equivalence of hosts • Adaptive replication • reduce replication for hosts that seem trustworthy
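
The replication idea above can be sketched in a few lines. This is an illustrative sketch only, not BOINC's validator code: the function names, the relative tolerance, and the trust threshold are all hypothetical choices made for this example.

```python
import math
import random

def results_agree(a, b, rel_tol=1e-6):
    """Fuzzy agreement: two numeric results match if within a relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)

def find_canonical(results, min_quorum=2, rel_tol=1e-6):
    """Return a result that at least `min_quorum` instances agree on, else None."""
    for candidate in results:
        matches = [r for r in results if results_agree(candidate, r, rel_tol)]
        if len(matches) >= min_quorum:
            return candidate
    return None

def should_replicate(consecutive_valid, threshold=10, spot_check_prob=0.1,
                     rng=random.random):
    """Adaptive replication (hypothetical policy): always replicate jobs sent to
    hosts without a track record; spot-check trusted hosts occasionally."""
    if consecutive_valid < threshold:
        return True
    return rng() < spot_check_prob

# Two instances that differ only by rounding noise reach a quorum of 2.
print(find_canonical([3.14159265, 3.14159266]))
# Two instances that disagree do not; a third instance would be needed.
print(find_canonical([3.14, 2.71]))
```

Homogeneous replication sidesteps the fuzzy comparison by sending all instances of a job to numerically equivalent hosts, so results can be compared exactly.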

  17. The job pipeline [Diagram: work generator → BOINC → validator → assimilator]
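
A toy model of the pipeline stages may help fix the roles. Everything here is a stand-in written for illustration (the job format, the squaring "computation", and the function names are assumptions), and the real stages run as separate server daemons rather than in one loop.

```python
def work_generator(n):
    """Create n job descriptions (toy job: square an integer)."""
    return [{"id": i, "input": i} for i in range(n)]

def compute_on_volunteer(job):
    """Stand-in for BOINC dispatching the job and a volunteer host running it."""
    return {"job_id": job["id"], "output": job["input"] ** 2}

def validator(result):
    """Accept a result only if it looks plausible (toy check: non-negative int)."""
    out = result.get("output")
    return isinstance(out, int) and out >= 0

def assimilator(store, result):
    """Hand a validated result to the project (toy: store it in a dict)."""
    store[result["job_id"]] = result["output"]

def run_pipeline(n):
    store = {}
    for job in work_generator(n):
        result = compute_on_volunteer(job)
        if validator(result):
            assimilator(store, result)
    return store

print(run_pipeline(4))
```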

  18. The BOINC data model • App versions, job inputs, and job outputs can consist of arbitrarily many files • Each file has a physical name (unique, immutable); each reference to a file has a “logical name” • Files have various attributes (e.g., sticky) • Each file can have one or more URLs and is transferred via HTTP • App version files are digitally signed
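
The split between physical and logical names can be modeled as a lookup table per job. The classes below are hypothetical and exist only for this sketch; the URL is a placeholder, and the fallback of returning the logical name unchanged when no reference exists is an assumption of the sketch.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class FileInfo:
    """A file as the project sees it: immutable physical name plus attributes."""
    physical_name: str
    urls: tuple = ()
    sticky: bool = False

@dataclass
class Job:
    """Maps the logical names an app opens to the physical files behind them."""
    file_refs: dict = field(default_factory=dict)

    def add_ref(self, logical_name, info):
        self.file_refs[logical_name] = info

    def resolve(self, logical_name):
        # Assumption for this sketch: unknown names resolve to themselves.
        info = self.file_refs.get(logical_name)
        return info.physical_name if info else logical_name

job = Job()
job.add_ref("in", FileInfo("12ja04aa", urls=("http://example.org/download/12ja04aa",)))
print(job.resolve("in"))   # the app opens "in"; the file on disk is 12ja04aa
```

Because the application only ever sees logical names such as `in` and `out`, the same executable can process any number of jobs without knowing how the project names its files.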

  19. What kinds of jobs can BOINC handle? • Pretty much anything you’d run on a Grid • Bag of tasks (but IPC support soon) • Short/long jobs • Data intensive, up to a point • Geared towards • Few apps, many jobs (high startup cost per app) • Jobs with high slack time

  20. Part 3: Application development for BOINC

  21. The BOINC runtime environment [Diagram: processes and files]

  22. Native BOINC applications • boinc_init() • create runtime system thread • boinc_finish() • write finish file • boinc_resolve_filename(logical, physical) • boinc_fraction_done(x)

  23. Checkpointing • bool boinc_time_to_checkpoint() • call when in checkpointable state • boinc_checkpoint_done()
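
The checkpointing pattern is the same in any language: reach a consistent state, ask whether it is time to checkpoint, write the state atomically, and report completion. The sketch below imitates that loop in plain Python without the BOINC library; the `i % checkpoint_every` test stands in for `boinc_time_to_checkpoint()`, and the file format and function names are assumptions of this example.

```python
import json
import os
import tempfile

def load_checkpoint(path):
    """Return (next_index, running_total), or a fresh start if no usable checkpoint."""
    try:
        with open(path) as f:
            state = json.load(f)
        return state["i"], state["total"]
    except (FileNotFoundError, ValueError, KeyError):
        return 0, 0

def save_checkpoint(path, i, total):
    """Write to a temp file, then rename, so a crash never leaves a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"i": i, "total": total}, f)
    os.replace(tmp, path)

def run(n, path, checkpoint_every=100, stop_after=None):
    """Sum 0..n-1, resuming from `path`. `stop_after` simulates being preempted."""
    i, total = load_checkpoint(path)
    steps = 0
    while i < n:
        total += i
        i += 1
        steps += 1
        if i % checkpoint_every == 0:        # stand-in for boinc_time_to_checkpoint()
            save_checkpoint(path, i, total)  # then the app would call boinc_checkpoint_done()
        if stop_after is not None and steps >= stop_after:
            return None                      # preempted before finishing
    return total

with tempfile.TemporaryDirectory() as d:
    ckpt = os.path.join(d, "checkpoint.json")
    run(1000, ckpt, stop_after=250)   # interrupted; last checkpoint was at i=200
    print(run(1000, ckpt))            # resumes from the checkpoint and finishes
```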

  24. The BOINC wrapper • Can use for legacy apps • XML input file lists sub-jobs • executable, input files • What it does: • interfaces to BOINC client • copies files to/from slot directory • runs executables • does checkpointing at sub-job level

  25. Building app versions • Linux • gcc • Windows • Visual Studio • MinGW (gcc) • Mac OS X • Xcode

  26. Multithread apps • boinc_init_parallel() • Allows suspend/resume of all threads • Unix: fork/exec • Windows: direct thread control

  27. GPU app versions • Develop for NVIDIA or ATI, with CUDA, CAL, OpenCL, etc. (BOINC supplies samples) • Each version has a “plan class” • For each plan class, supply a function that determines • can app run on this host? • hardware, driver version, etc. • what resources will it use? • #CPUs, #GPUs, GPU RAM, etc.
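
A plan class function answers the two questions on this slide: can the app version run on this host, and if so, what will it use? The sketch below shows the shape of such a function. It is not BOINC's interface: the `Host` and `ResourceUsage` types, the field names, and every threshold are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Host:
    gpu_vendor: Optional[str]   # e.g. "nvidia", "ati", or None
    gpu_ram_mb: int
    driver_version: int
    ncpus: int

@dataclass
class ResourceUsage:
    ncpus: float
    ngpus: float
    gpu_ram_mb: int

def cuda_plan_class(host, min_driver=17700, min_gpu_ram_mb=256):
    """Return the resources a CUDA app version would use on `host`,
    or None if the version cannot run there. Thresholds are made up."""
    if host.gpu_vendor != "nvidia":
        return None
    if host.driver_version < min_driver:
        return None
    if host.gpu_ram_mb < min_gpu_ram_mb:
        return None
    # One GPU plus a fraction of a CPU core to feed it.
    return ResourceUsage(ncpus=0.1, ngpus=1.0, gpu_ram_mb=min_gpu_ram_mb)

print(cuda_plan_class(Host("nvidia", 512, 19000, 4)))
print(cuda_plan_class(Host("ati", 512, 19000, 4)))
```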

  28. VM apps • Develop apps on your favorite OS • Create a VirtualBox VM image • App version consists of • VM wrapper (supplied by BOINC) • VM image • app executable

  29. Part 4: Deploying a BOINC server

  30. Hardware options • Native Linux host • download/compile BOINC software • BOINC server VM (VMware/Debian) • BOINC Amazon EC2 image

  31. Components of a project • Master URL • name • MySQL database • Directory hierarchy • A set of daemon processes and cron jobs

  32. Processes [Diagram: clients talk to the scheduler; the feeder, transitioner, work generator, validator, assimilator, file deleter, and DB purger all operate on the MySQL DB]

  33. Project directory hierarchy • apps/ application files • bin/ daemon programs • cgi-bin/ BOINC scheduler and upload CGI • config.xml configuration file • download/ downloadable files • html/ web site; master URL points here • keys/ keys for code signing, upload auth • log_(hostname) daemon log files • project.xml list of platforms and apps • upload/ uploaded files

  34. BOINC database • platform • app • app_version • user • host • workunit • result • ...

  35. Creating a project • make_project name • creates • directory hierarchy • DB • mods for httpd.conf • crontab entry

  36. Project configuration and control • config.xml • scheduling and other options • list of daemons • list of periodic tasks • project control • bin/start: start daemons, enable scheduler • bin/stop: stop daemons, disable scheduler • bin/status

  37. Scaling a BOINC server • Components can run on different machines sharing a file system • Each component can be distributed • MySQL server is typically the bottleneck • 1 server machine can issue ~100K jobs/day; 4 machines can issue > 1 million

  38. Part 5: Deploying applications

  39. Adding an application • Edit project.xml:
          <app>
              <name>multi_thread</name>
              <user_friendly_name>Test multi-thread apps</user_friendly_name>
          </app>
      • Run bin/xadd

  40. Adding an application version • Create application version directory • Sign files on offline computer • Run bin/update_versions
          apps/
              uppercase/
                  uppercase_6.14_windows_intelx86__cuda.exe/
                      uppercase_6.14_windows_intelx86__cuda.exe
                      graphics_app=uppercase_graphics_6.14_windows_intelx86.exe
                      logo.jpg
                      Helvetica.txf

  41. Part 6: Submitting jobs

  42. Describing job inputs • Input template file
          <file_info>
              <number>0</number>
          </file_info>
          <workunit>
              <file_ref>
                  <file_number>0</file_number>
                  <open_name>in</open_name>
              </file_ref>
              <target_nresults>1</target_nresults>
              <min_quorum>1</min_quorum>
              <command_line>-cpu_time 60</command_line>
              <rsc_fpops_bound>446797000000000</rsc_fpops_bound>
              <rsc_fpops_est>279248000000000</rsc_fpops_est>
          </workunit>
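
The two `rsc_fpops` values in this template are counts of floating-point operations: `rsc_fpops_est` is the estimated size of the job, and `rsc_fpops_bound` is the limit beyond which the job is aborted. Dividing them by a host's speed gives rough wall-clock figures. The host speed below (3 GFLOPS) is an assumed value for illustration, not something the slide states.

```python
# Values copied from the input template above.
RSC_FPOPS_EST = 279_248_000_000_000
RSC_FPOPS_BOUND = 446_797_000_000_000

# Assumption for this example: a host that sustains 3 GFLOPS on this app.
HOST_FLOPS = 3e9

est_hours = RSC_FPOPS_EST / HOST_FLOPS / 3600
bound_hours = RSC_FPOPS_BOUND / HOST_FLOPS / 3600

print(f"Estimated runtime: {est_hours:.1f} h")
print(f"Aborted after:     {bound_hours:.1f} h")
print(f"Safety margin:     {RSC_FPOPS_BOUND / RSC_FPOPS_EST:.2f}x the estimate")
```

On the assumed host the job is estimated at roughly a day of computing, and the bound leaves a margin of about 1.6 times the estimate before the job is given up on.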

  43. Describing job outputs • Output template file
          <file_info>
              <name><OUTFILE_0/></name>
              <generated_locally/>
              <upload_when_present/>
              <max_nbytes>5000000</max_nbytes>
              <url><UPLOAD_URL/></url>
          </file_info>
          <result>
              <file_ref>
                  <file_name><OUTFILE_0/></file_name>
                  <open_name>out</open_name>
              </file_ref>
          </result>

  44. Submitting a job • Stage input files
          cp test_files/12ja04aa `bin/dir_hier_path 12ja04aa`
      • Submit job
          create_work --appname A --wu_name B --wu_template C --result_template D
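
In practice a project submits many jobs at once, usually from a script. The sketch below only builds the `create_work` command lines, one per input file, and prints them instead of running them. Several things are assumptions of this example: the `bin/create_work` path, the double-dash option spelling, the convention of appending the input file name as a positional argument, the job naming scheme, and the second file name `12ja04ab`. Input files are assumed to have been staged already.

```python
import shlex

def create_work_cmd(appname, wu_name, wu_template, result_template, infiles):
    """Build (but do not run) one create_work invocation as an argv list."""
    return [
        "bin/create_work",
        "--appname", appname,
        "--wu_name", wu_name,
        "--wu_template", wu_template,
        "--result_template", result_template,
        *infiles,
    ]

def batch_commands(appname, wu_template, result_template, input_files):
    """One job per input file; the job name is derived from the file name."""
    return [
        create_work_cmd(appname, f"{appname}_{name}", wu_template,
                        result_template, [name])
        for name in input_files
    ]

for argv in batch_commands("A", "C", "D", ["12ja04aa", "12ja04ab"]):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping each command as an argv list, rather than a formatted string, means it can later be passed to `subprocess.run` without shell-quoting problems.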

  45. Part 7: Organizational issues

  46. Single-scientist projects • Need to: • port apps • get publicity • interface with the public • maintain servers • Not many research groups have the resources • And it creates a lot of competing “brands”

  47. Umbrella projects • Example: IBM World Community Grid • [Diagram: the umbrella project provides publicity, web development, sysadmin, and app porting]

  48. The Berkeley@home model • A university has • scientists • a powerful “brand” • PR resources • IT infrastructure • lots of alumni (UCB: 500,000)

  49. Hubs • nanoHUB: “science portal” for nanoscience • social network + “app store” • sharing of ideas, data, software • computational portal • HUBzero: generalization to other areas • currently ~20 hubs • Integration of BOINC with HUBzero • each hub has a volunteer computing project
