Explorations into Internet Distributed Computing Kunal Agrawal, Ang Huey Ting, Li Guoliang, and Kevin Chu.


1 Explorations into Internet Distributed Computing Kunal Agrawal, Ang Huey Ting, Li Guoliang, and Kevin Chu

2 Project Overview Design and implement a simple internet distributed computing framework. Compare application development in this environment with a traditional parallel computing environment.

3 Grapevine An Internet Distributed Computing Framework - Kunal Agrawal, Kevin Chu

4 What is Internet Distributed Computing?

5 Motivation Supercomputers are very expensive Large numbers of personal computers and workstations around the world are naturally networked via the internet Huge amounts of computational resources are wasted because many computers spend most of their time idle Growing interest in grid computing technologies

6 Other Distributed Computing Efforts

7 Internet Distributed Computing Issues Node reliability Network quality Scalability Security Cross-platform portability of object code Computing paradigm shift

8 Overview Of Grapevine

9 Architecture (diagram): a Client Application submits work to the Grapevine Server, which farms tasks out to multiple Grapevine Volunteers.

10 Grapevine Features Written in Java Parameterized tasks Inter-task communication Result reporting Status reporting
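
The slides do not show Grapevine's actual API; as an illustration only, a parameterized task with result reporting might be modeled as a small generic interface (all names below are assumptions, not the real Grapevine classes):

```java
// Hypothetical sketch of a Grapevine-style parameterized task (names are
// assumptions, not the actual Grapevine interfaces).
public class TaskSketch {
    // The server ships the parameter to a volunteer; the volunteer runs
    // the task and reports the result back.
    interface Task<P, R> {
        R run(P parameter);
    }

    // Example task: sum the integers in a half-open sub-range, a trivially
    // parallelizable unit of work.
    static class RangeSumTask implements Task<int[], Long> {
        public Long run(int[] range) {
            long sum = 0;
            for (int i = range[0]; i < range[1]; i++) sum += i;
            return sum;
        }
    }

    public static void main(String[] args) {
        Task<int[], Long> task = new RangeSumTask();
        System.out.println(task.run(new int[] {0, 10})); // 0+1+...+9 = 45
    }
}
```

Each sub-range would become one task submitted to the server, so volunteers can run them independently.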

11 Unaddressed Issues Node reliability Load balancing Unintrusive operation Interruption semantics Deadlock

12 Meta Classifier - Ang Huey Ting, Li Guoliang

13 Classifier Function(instance) = {True, False} Machine learning approach: build a model on the training set, then use the model to classify new instances. Publicly available packages: WEKA (in Java), MLC++.

14 Meta Classifier An assembly of classifiers Typically gives better performance than a single classifier Two ways of generating the assembly: different training data sets, or different algorithms Voting combines the individual predictions
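
The voting step above can be sketched in a few lines: each classifier in the assembly predicts a label, and the assembly returns the label with the most votes (a minimal majority-vote sketch, not the slides' actual implementation):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the voting step: the assembly's prediction is the label
// predicted by the largest number of individual classifiers.
public class Voting {
    static <L> L majorityVote(List<L> predictions) {
        Map<L, Integer> counts = new HashMap<>();
        L winner = predictions.get(0);
        for (L p : predictions) {
            int c = counts.merge(p, 1, Integer::sum);   // tally this vote
            if (c > counts.get(winner)) winner = p;     // track the leader
        }
        return winner;
    }

    public static void main(String[] args) {
        // Three classifiers say true, two say false: the assembly says true.
        System.out.println(majorityVote(List.of(true, false, true, true, false)));
    }
}
```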

15 Building a Meta Classifier Different training datasets - Bagging Randomly generated 'bags' Selection with replacement Creates different 'flavors' of the training set Different algorithms E.g. Naïve Bayesian, Neural Net, SVM Different algorithms work well on different training sets
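
Bag generation as described above (selection with replacement) can be sketched as follows; this is a generic illustration of bagging, not the project's code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of 'bag' generation: sample the training set WITH replacement to
// create a different flavor of the data for each classifier.
public class Bagging {
    static <T> List<T> makeBag(List<T> trainingSet, Random rng) {
        List<T> bag = new ArrayList<>(trainingSet.size());
        for (int i = 0; i < trainingSet.size(); i++) {
            // With replacement: the same instance may appear several times,
            // and some instances may be left out entirely.
            bag.add(trainingSet.get(rng.nextInt(trainingSet.size())));
        }
        return bag;
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4, 5);
        // Each call with a different seed yields a different 'flavor'.
        System.out.println(makeBag(data, new Random(42)));
    }
}
```

Training one classifier per bag is what makes the individual models differ, which is what voting then exploits.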

16 Why Parallelise? Computationally intensive One classifier = 0.5 hr Meta classifier (assembly of 10 classifiers) = 10 *0.5 = 5 hr Distributed Environment - Grapevine Build classifiers in parallel independently Little communication required

17 Distributed Meta Classifiers WEKA - machine learning package from the University of Waikato, New Zealand http://www.cs.waikato.ac.nz/~ml/weka/ Implemented in Java Includes the most popular machine learning tools

18 Distributed Meta-Classifiers on Grapevine Distributed bagging Generate the different bags Define the bag and algorithm for each task Submit the tasks to Grapevine Nodes build the classifiers Receive the results Perform voting
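
The workflow above (one task per bag, collect results, then vote) can be sketched with a local thread pool standing in for Grapevine volunteers; the Grapevine submission API and the classifier itself are placeholders here, not the real system:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the distributed-bagging workflow. A thread pool stands in for
// Grapevine volunteers; classifierPredicts() stands in for "train a
// classifier on bag b and classify the instance".
public class DistributedBagging {
    static boolean classifyByVote(int numBags) {
        ExecutorService volunteers = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            // Submit one task per bag; each runs independently.
            for (int bag = 0; bag < numBags; bag++) {
                final int b = bag;
                results.add(volunteers.submit(() -> classifierPredicts(b)));
            }
            // Receive results and perform voting.
            int yes = 0;
            for (Future<Boolean> r : results) if (r.get()) yes++;
            return yes * 2 > numBags;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            volunteers.shutdown();
        }
    }

    // Placeholder prediction: bag index stands in for a trained classifier.
    static boolean classifierPredicts(int bag) { return bag % 3 != 0; }

    public static void main(String[] args) {
        System.out.println(classifyByVote(10)); // 6 of 10 placeholder votes are yes
    }
}
```

Because the tasks share nothing until the final vote, this matches the "little communication required" point made earlier.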

19 Preliminary Study Bagging on Quick Propagation in OpenMP Implemented in C

20 Trial Domain Benchmark corpus: Reuters-21578 for text categorization 9000+ training documents 3000+ test documents 90+ categories Perform feature selection Preprocess documents into feature vectors
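
The preprocessing step can be illustrated as follows: after feature selection fixes a vocabulary, each document becomes a vector of term counts over that vocabulary (a minimal bag-of-words sketch; the project's actual feature-selection method is not described in the slides):

```java
import java.util.List;

// Sketch of turning a document into a feature vector: count occurrences of
// each selected vocabulary term, ignoring everything else.
public class FeatureVectors {
    static int[] toVector(List<String> vocabulary, String document) {
        String[] tokens = document.toLowerCase().split("\\s+");
        int[] vector = new int[vocabulary.size()];
        for (String t : tokens) {
            int idx = vocabulary.indexOf(t);
            if (idx >= 0) vector[idx]++;  // count only selected features
        }
        return vector;
    }

    public static void main(String[] args) {
        List<String> vocab = List.of("oil", "price", "market");
        int[] v = toVector(vocab, "Oil price rose as the oil market tightened");
        System.out.println(java.util.Arrays.toString(v)); // [2, 1, 1]
    }
}
```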

21 Summary Successful internet distributed computing requires addressing many issues outside of traditional computer science Distributed computing is not for everyone

