Explorations into Internet Distributed Computing

Kunal Agrawal, Ang Huey Ting, Li Guoliang, and Kevin Chu

Project Overview

  • Design and implement a simple internet distributed computing framework
  • Compare application development for this environment with a traditional parallel computing environment

Grapevine

An Internet Distributed Computing Framework

- Kunal Agrawal, Kevin Chu

Motivation
  • Supercomputers are very expensive
  • Large numbers of personal computers and workstations around the world are naturally networked via the internet
  • Huge amounts of computational resources are wasted because many computers spend most of their time idle
  • Growing interest in grid computing technologies
Internet Distributed Computing Issues
  • Node reliability
  • Network quality
  • Scalability
  • Security
  • Cross-platform portability of object code
  • Computing Paradigm Shift
[Diagram: a Client Application, a Grapevine Server, and three Grapevine Volunteers]

Grapevine Features
  • Written in Java
  • Parametrized Tasks
  • Inter-task communication
  • Result Reporting
  • Status Reporting
Unaddressed Issues
  • Node reliability
  • Load Balancing
  • Unintrusive Operation
  • Interruption Semantics
  • Deadlock
Meta Classifier

- Ang Huey Ting, Li Guoliang

Classifier
  • Function(instance) → True or False
  • Machine Learning Approach
    • Build a model on the training set
    • Use the model to classify new instances
  • Publicly available packages: WEKA (in Java), MLC++
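
The two steps above (build a model on the training set, then use it on new instances) can be sketched in a few lines. This is only an illustration: the nearest-centroid learner, the function name `train_centroid_classifier`, and the toy data are assumptions made for the example, not something taken from the slides, WEKA, or MLC++.

```python
from typing import Callable, List, Tuple

Instance = List[float]
Example = Tuple[Instance, bool]


def train_centroid_classifier(training_set: List[Example]) -> Callable[[Instance], bool]:
    """Build a model (one centroid per class) and return a classify function.

    Assumes the training set contains at least one True and one False example.
    """
    def centroid(rows: List[Instance]) -> Instance:
        # Column-wise mean of the rows.
        return [sum(col) / len(rows) for col in zip(*rows)]

    pos = centroid([x for x, label in training_set if label])
    neg = centroid([x for x, label in training_set if not label])

    def sq_dist(a: Instance, b: Instance) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))

    def classify(instance: Instance) -> bool:
        # True when the instance lies closer to the positive centroid.
        return sq_dist(instance, pos) < sq_dist(instance, neg)

    return classify


# Hypothetical toy training set: two True and two False examples.
training = [([1.0, 1.0], True), ([1.2, 0.8], True),
            ([5.0, 5.0], False), ([4.8, 5.2], False)]
classify = train_centroid_classifier(training)
print(classify([0.9, 1.1]))  # a new instance near the True examples
```

The returned `classify` function plays the role of Function(instance): it takes one instance and answers True or False.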
Meta Classifier
  • Assembly (ensemble) of classifiers
  • Often gives better classification performance than a single classifier
  • Two ways of generating an assembly of classifiers
    • Different training data sets
    • Different algorithms
  • Voting
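
Voting can be illustrated with a simple majority rule. The sketch below assumes each classifier is a callable that returns True or False and that ties count as False; the slides do not say which voting scheme was used, so both choices are assumptions, as are the function name `majority_vote` and the three threshold classifiers.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(classifiers: Sequence[Callable[[float], bool]], instance: float) -> bool:
    """Return True when strictly more classifiers vote True than False."""
    votes = Counter(clf(instance) for clf in classifiers)
    return votes[True] > votes[False]  # a tie counts as False (assumption)


# Three hypothetical classifiers that disagree near their thresholds.
classifiers = [lambda x: x > 1, lambda x: x > 2, lambda x: x > 3]

print(majority_vote(classifiers, 2.5))  # votes: True, True, False
print(majority_vote(classifiers, 1.5))  # votes: True, False, False
```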
Building Meta Classifier
  • Different Training Datasets: Bagging
    • Randomly generated ‘bags’
    • Selection with replacement
    • Create different ‘flavors’ of the training set
  • Different Algorithms
    • E.g. Naïve Bayesian, Neural Net, SVM
    • Different algorithms work well on different training sets
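
Bag generation by selection with replacement can be sketched as follows. The helper name `make_bags`, the choice to make each bag the same size as the training set, and the fixed seed are assumptions for illustration; the slides do not specify bag sizes.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def make_bags(training_set: Sequence[T], n_bags: int, seed: int = 0) -> List[List[T]]:
    """Draw n_bags bootstrap samples, each the size of the training set."""
    rng = random.Random(seed)  # seeded so the bags are reproducible
    size = len(training_set)
    # rng.choice picks with replacement, so an example can appear
    # several times in one bag and not at all in another.
    return [[rng.choice(training_set) for _ in range(size)]
            for _ in range(n_bags)]


data = list(range(100))           # stand-in for 100 training examples
bags = make_bags(data, n_bags=10)
print(len(bags), len(bags[0]))    # number of bags, size of each bag
print(len(set(bags[0])))          # distinct examples in the first bag
```

Because some examples repeat and others are left out, each bag is a slightly different "flavor" of the training set, which is what makes the resulting classifiers differ.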
Why Parallelise?
  • Computationally intensive
    • One classifier = 0.5 hr
    • Meta classifier (assembly of 10 classifiers) = 10 × 0.5 hr = 5 hr
  • Distributed Environment - Grapevine
    • Build classifiers in parallel independently
    • Little communication required
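
The arithmetic above extends naturally to the parallel case. The sketch below is an idealized estimate only: it assumes identical nodes, no communication or scheduling overhead, and no node failures, none of which is guaranteed on the internet. The function name `build_time_hours` is hypothetical.

```python
import math


def build_time_hours(n_classifiers: int, hours_each: float, n_nodes: int) -> float:
    """Idealized wall-clock time to build all classifiers on n_nodes nodes."""
    # Each node builds one classifier at a time, so the work proceeds in rounds.
    rounds = math.ceil(n_classifiers / n_nodes)
    return rounds * hours_each


print(build_time_hours(10, 0.5, 1))   # sequential case from the slide: 5.0
print(build_time_hours(10, 0.5, 10))  # one node per classifier: 0.5
print(build_time_hours(10, 0.5, 4))   # four nodes, three rounds: 1.5
```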
Distributed Meta Classifiers
  • WEKA: machine learning package
    • University of Waikato, New Zealand
    • http://www.cs.waikato.ac.nz/~ml/weka/
    • Implemented in Java
    • Includes most popular machine learning tools
Distributed Meta-Classifiers on Grapevine

Distributed Bagging

  • Generate different bags
  • Define the bag and algorithm for each task
  • Submit tasks to Grapevine
  • Nodes build classifiers
  • Receive results
  • Perform voting
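
The six steps above can be sketched end to end. Grapevine's actual API is not shown in the slides, so this example uses Python's `concurrent.futures.ThreadPoolExecutor` as a stand-in for submitting tasks to volunteer nodes. The base learner `train_stump`, the function names, and the toy data are all hypothetical.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def make_bag(training_set, rng):
    # Bootstrap sample: same size as the training set, drawn with replacement.
    return [rng.choice(training_set) for _ in training_set]


def train_stump(bag):
    """Hypothetical base learner: threshold halfway between the class means."""
    pos = [x for x, label in bag if label]
    neg = [x for x, label in bag if not label]
    if not pos or not neg:
        only = bool(pos)  # degenerate bag: always predict its single class
        return lambda x: only
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (pos_mean + neg_mean) / 2
    pos_is_high = pos_mean > neg_mean
    return lambda x: (x > threshold) == pos_is_high


def distributed_bagging(training_set, n_bags=10, seed=0):
    rng = random.Random(seed)
    bags = [make_bag(training_set, rng) for _ in range(n_bags)]  # generate bags
    tasks = [(train_stump, bag) for bag in bags]       # bag + algorithm per task
    with ThreadPoolExecutor(max_workers=n_bags) as pool:
        # Submit tasks; the worker threads stand in for volunteer nodes.
        futures = [pool.submit(algo, bag) for algo, bag in tasks]
        classifiers = [f.result() for f in futures]    # receive results

    def vote(x):                                       # perform voting
        counts = Counter(clf(x) for clf in classifiers)
        return counts[True] > counts[False]

    return vote


training = [(float(x), x >= 5) for x in range(10)]     # toy 1-D data
meta_classifier = distributed_bagging(training)
print(meta_classifier(9.0), meta_classifier(0.0))
```

Each task carries everything a node needs (its bag and its algorithm), so the tasks are independent and only the finished classifiers travel back, which matches the "little communication required" observation.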
Preliminary Study
  • Bagging with Quick Propagation in OpenMP
    • Implemented in C
Trial Domain
  • Benchmark corpus Reuters-21578 for Text Categorization
    • 9000+ training documents
    • 3000+ test documents
    • 90+ categories
    • Perform feature selection
    • Preprocess documents into feature vectors
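
The last two steps, feature selection and conversion to feature vectors, might look like the sketch below. The slides do not say which selection criterion or term weighting was used; document frequency and raw term counts are stand-ins chosen for simplicity, and the three short documents are made up rather than taken from Reuters-21578.

```python
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def select_features(documents: List[str], max_features: int) -> List[str]:
    """Keep the terms that appear in the most documents (a stand-in criterion)."""
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    # Highest document frequency first; alphabetical order breaks ties.
    ranked = sorted(doc_freq, key=lambda term: (-doc_freq[term], term))
    return ranked[:max_features]


def to_feature_vector(document: str, vocabulary: List[str]) -> List[int]:
    """Represent a document as term counts over the selected vocabulary."""
    counts = Counter(tokenize(document))
    return [counts[term] for term in vocabulary]


docs = ["oil prices rise", "oil exports fall", "wheat prices fall"]
vocabulary = select_features(docs, max_features=3)
print(vocabulary)                                        # ['fall', 'oil', 'prices']
print(to_feature_vector("oil oil prices", vocabulary))  # [0, 2, 1]
```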
Summary
  • Successful internet distributed computing requires addressing many issues outside of traditional computer science
  • Distributed computing is not for everyone