Slide 1: Explorations into Internet Distributed Computing
Kunal Agrawal, Ang Huey Ting, Li Guoliang, and Kevin Chu
Slide 2: Project Overview
- Design and implement a simple internet distributed computing framework
- Compare application development in this environment with a traditional parallel computing environment
Slide 3: Grapevine: An Internet Distributed Computing Framework
- Kunal Agrawal, Kevin Chu
Slide 4: What is Internet Distributed Computing?
Slide 5: Motivation
- Supercomputers are very expensive
- Large numbers of personal computers and workstations around the world are already networked via the internet
- Huge amounts of computational resources are wasted because many computers sit idle most of the time
- Growing interest in grid computing technologies
Slide 6: Other Distributed Computing Efforts
Slide 7: Internet Distributed Computing Issues
- Node reliability
- Network quality
- Scalability
- Security
- Cross-platform portability of object code
- Computing paradigm shift
Slide 8: Overview of Grapevine
Slide 9: Architecture
[Diagram: a client application submits work to the Grapevine server, which distributes it among Grapevine volunteers]
Slide 10: Grapevine Features
- Written in Java
- Parametrized tasks
- Inter-task communication
- Result reporting
- Status reporting
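The feature list suggests a task abstraction at Grapevine's core. A minimal sketch of what a parametrized task might look like; the interface and method names here are assumptions for illustration, not the actual Grapevine API:

```java
import java.io.Serializable;

// Hypothetical sketch: a parametrized task that a volunteer node can run
// and whose result is reported back to the server. Names are assumptions.
public interface Task extends Serializable {
    Serializable run(Serializable parameter);
}

// Example parametrized task: square an integer parameter.
class SquareTask implements Task {
    public Serializable run(Serializable parameter) {
        int n = (Integer) parameter;
        return n * n;
    }
}
```

Making the task serializable is the key design point: both the task's code-independent state (its parameter) and its result must travel over the network between server and volunteer.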
Slide 11: Unaddressed Issues
- Node reliability
- Load balancing
- Unintrusive operation
- Interruption semantics
- Deadlock
Slide 12: Meta Classifier
- Ang Huey Ting, Li Guoliang
Slide 13: Classifier
- A classifier is a function: classify(instance) → {True, False}
- Machine learning approach: build a model on the training set, then use the model to classify new instances
- Publicly available packages: WEKA (in Java), MLC++
Slide 14: Meta Classifier
- An assembly of classifiers
- Gives better performance than a single classifier
- Two ways of generating the assembly: different training data sets, or different algorithms
- Predictions are combined by voting
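The voting step described above can be sketched in a few lines; this is a simple strict-majority combiner over boolean predictions, one per classifier in the assembly:

```java
import java.util.List;

// Sketch of the voting step: combine the True/False predictions of the
// assembly of classifiers by strict majority vote.
public class MajorityVote {
    public static boolean vote(List<Boolean> predictions) {
        int trueVotes = 0;
        for (boolean p : predictions) {
            if (p) trueVotes++;
        }
        // More than half the classifiers must predict True.
        return 2 * trueVotes > predictions.size();
    }
}
```

Using an odd number of classifiers (such as the 10-classifier assembly mentioned later would not be) avoids ties; with an even assembly, a tie here resolves to False.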
Slide 15: Building a Meta Classifier
- Different training datasets (bagging): randomly generated 'bags', selected with replacement, create different 'flavors' of the training set
- Different algorithms: e.g. Naive Bayesian, neural net, SVM; different algorithms work well on different training sets
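The bag-generation step (selection with replacement) can be sketched generically; a plain Java list stands in for a real training set here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of bagging's sampling step: draw a "bag" the same size as the
// training set by selecting instances with replacement, so each bag is a
// different "flavor" of the training data (some instances repeated, some
// left out).
public class BagSampler {
    public static <T> List<T> sampleBag(List<T> training, Random rng) {
        List<T> bag = new ArrayList<>(training.size());
        for (int i = 0; i < training.size(); i++) {
            bag.add(training.get(rng.nextInt(training.size())));
        }
        return bag;
    }
}
```

Because each draw is independent, roughly 1 - 1/e ≈ 63% of distinct training instances appear in any given bag on average, which is what makes the resulting classifiers differ.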
Slide 16: Why Parallelise?
- Computationally intensive: one classifier = 0.5 hr, so a meta classifier (assembly of 10 classifiers) = 10 × 0.5 = 5 hr
- Distributed environment (Grapevine): build the classifiers in parallel, independently
- Little communication required
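The independence claim can be illustrated locally: since the classifier builds share nothing, they can all be launched at once. Here a thread pool stands in for Grapevine volunteers, and a placeholder result stands in for the half-hour WEKA build (there is no real training in this sketch):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: build all classifiers of the assembly concurrently. Each submitted
// job is independent, mirroring the "little communication required" point.
public class ParallelBuild {
    public static List<Integer> buildAll(int numClassifiers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < numClassifiers; i++) {
            final int id = i;
            // A real task would train classifier `id` on its bag; here the
            // job just returns its id as a placeholder result.
            futures.add(pool.submit(() -> id));
        }
        List<Integer> results = new ArrayList<>();
        for (Future<Integer> f : futures) {
            results.add(f.get());   // gather results as they complete
        }
        pool.shutdown();
        return results;
    }
}
```

With 10 independent half-hour builds and enough workers, wall-clock time approaches 0.5 hr instead of 5 hr, which is the whole case for distributing this workload.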
Slide 17: Distributed Meta Classifiers
- WEKA: machine learning package from the University of Waikato, New Zealand
- http://www.cs.waikato.ac.nz/~ml/weka/
- Implemented in Java
- Includes most popular machine learning tools
Slide 18: Distributed Meta Classifiers on Grapevine
Distributed bagging:
- Generate the different bags
- Define the bag and algorithm for each task
- Submit the tasks to Grapevine
- Nodes build the classifiers
- Receive the results
- Perform voting
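The steps above can be sketched end to end, run sequentially in one process for illustration. Each loop iteration would become a Grapevine task in the real setup, and the toy mean-threshold "classifier" is a stand-in for a WEKA learner, not anything from the actual system:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sequential sketch of the distributed-bagging workflow: generate bags,
// build one classifier per bag (each would be a separate Grapevine task),
// collect the results, and vote. The "classifier" is a toy stand-in that
// predicts True when the instance is at least its bag's mean.
public class DistributedBagging {
    static boolean toyClassifier(List<Integer> bag, int instance) {
        double mean = bag.stream().mapToInt(Integer::intValue).average().orElse(0);
        return instance >= mean;
    }

    public static boolean classify(List<Integer> training, int instance,
                                   int numBags, Random rng) {
        int trueVotes = 0;
        for (int b = 0; b < numBags; b++) {
            // Generate one bag by sampling with replacement.
            List<Integer> bag = new ArrayList<>(training.size());
            for (int i = 0; i < training.size(); i++) {
                bag.add(training.get(rng.nextInt(training.size())));
            }
            // In the real setup this build would be submitted to Grapevine
            // and its result received back from a volunteer node.
            if (toyClassifier(bag, instance)) {
                trueVotes++;
            }
        }
        // Perform voting over the collected results.
        return 2 * trueVotes > numBags;
    }
}
```

The only communication per task is its input (bag plus algorithm choice) and its output (a trained classifier or its prediction), matching the low-communication profile that makes this workload a good fit for Grapevine.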
Slide 19: Preliminary Study
- Bagging on quick propagation in OpenMP
- Implemented in C
Slide 20: Trial Domain
- Benchmark corpus: Reuters-21578 for text categorization
- 9000+ training documents, 3000+ test documents, 90+ categories
- Perform feature selection
- Preprocess documents into feature vectors
Slide 21: Summary
- Successful internet distributed computing requires addressing many issues outside of traditional computer science
- Distributed computing is not for everyone