Machine Learning Applications in Grid Computing


Machine Learning Applications in Grid Computing
George Cybenko, Guofei Jiang and Daniel Bilar
Thayer School of Engineering, Dartmouth College
37th Allerton Conference, Urbana-Champaign, Illinois, 22nd September 1999
Acknowledgements: This work was partially supported by AFOSR grant F49620-97-1-0382, NSF grant CCR-9813744 and DARPA contract F30602-98-2-0107.

Grid vision
Grid computing refers to computing in a distributed, networked environment in which computing and data resources are located throughout the network. Grid infrastructures provide the basic support for computations that integrate geographically disparate resources and create a universal source of computing power, supporting dramatically new classes of applications. Several efforts are under way to build computational grids, such as Globus, Infospheres and the DARPA CoABS program.

Grid services
A fundamental capability required in grids is a directory service, or broker, that dynamically matches user requirements with available resources.
(Diagram: prototype of grid services. Service providers advertise their services to a matchmaker; a client sends the matchmaker a service-location request and receives a reply naming a server; the client then sends its service request directly to that server.)
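The broker pattern sketched in the diagram can be illustrated with a minimal Python toy (all class names and service locations below are hypothetical, not the Globus or Infospheres API):

```python
# Minimal keyword-based matchmaker sketch: providers advertise services
# under keywords; clients look up a keyword and get candidate locations.
class Matchmaker:
    def __init__(self):
        self._catalog = {}  # keyword -> list of advertised service locations

    def advertise(self, keyword, location):
        """A service provider registers its service under a keyword."""
        self._catalog.setdefault(keyword, []).append(location)

    def lookup(self, keyword):
        """A client asks for all candidate services matching a keyword."""
        return list(self._catalog.get(keyword, []))

mm = Matchmaker()
mm.advertise("fft", "server-a:9000")
mm.advertise("fft", "server-b:9000")
candidates = mm.lookup("fft")  # both FFT providers are returned
```

Note that the matchmaker matches on keywords only; it says nothing about whether either candidate actually computes the FFT the client expects, which is exactly the gap the following slides address.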

Matching conflicts
Brokers and matchmakers use keywords and domain ontologies to specify services. Keywords and ontologies cannot be defined and interpreted precisely enough to make brokering or matchmaking between grid services robust in a truly distributed, heterogeneous computing environment. Matching conflicts arise between the client's requested functionality and the service provider's actual functionality.

An example
A client requires a three-dimensional FFT. A request is made to a broker or matchmaker for an FFT service, based on keywords and possibly parameter lists. The broker or matchmaker uses the keywords to search its catalog of services and returns the candidate remote services. There are literally dozens of different FFT algorithms with different assumptions, dimensions, accuracy, input-output formats and so on. The client must validate the actual functionality of these remote services before committing to use one.

Functional validation
Functional validation means that a client presents a prospective service provider with a sequence of challenges. The service provider replies to these challenges with corresponding answers. Only after the client is satisfied that the service provider's answers are consistent with the client's expectations is an actual commitment made to using the service. Three steps:
–Service identification and location.
–Service functional validation.
–Commitment to the service.

Our approach
Challenge the service provider with test cases x_1, x_2, ..., x_k. The remote service provider R offers the corresponding answers f_R(x_1), f_R(x_2), ..., f_R(x_k). The client C may or may not have independent access to the answers f_C(x_1), f_C(x_2), ..., f_C(x_k). Possible situations and machine learning models:
–C "knows" f_C(x) and R provides f_R(x): PAC learning and Chernoff bounds theory.
–C "knows" f_C(x) and R does not provide f_R(x): zero-knowledge proof.
–C does not "know" f_C(x) and R provides f_R(x): simulation-based learning and reinforcement learning.
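The first situation (C "knows" f_C(x) and R provides f_R(x)) can be sketched as a simple challenge-response loop; `remote_service` below is a hypothetical stand-in for an actual remote call:

```python
# Challenge-response sketch: the client knows its own answers f_C(x_i)
# and compares them against the provider's replies f_R(x_i).
def validate(remote_service, test_cases, expected, tol=1e-9):
    """Return True iff the provider's answer matches on every challenge."""
    for x, f_c in zip(test_cases, expected):
        f_r = remote_service(x)       # challenge the remote provider
        if abs(f_r - f_c) > tol:      # answer inconsistent with expectation
            return False
    return True

# An honest provider of f(x) = x^2 passes the client's challenges:
honest = lambda x: x * x
ok = validate(honest, [1.0, 2.0, 3.0], [1.0, 4.0, 9.0])
```

Passing k challenges does not prove the service is correct on all inputs; quantifying what it does guarantee is the subject of the mathematical framework on the next slides.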

Mathematical framework
The goal of PAC learning is to use as few examples and as little computation as possible to pick a hypothesis that is a close approximation to the target concept. Define a concept to be a boolean mapping c: X → {0,1}, where X is the input space; c(x) = 1 indicates that x is a positive example, i.e. the service provider can offer the "correct" service for challenge x. Define the indicator function I(h(x) ≠ c(x)), equal to 1 when h(x) ≠ c(x) and 0 otherwise. Now define the error between the target concept c and the hypothesis h as error(h) = Pr_{x~P}[h(x) ≠ c(x)].

Mathematical framework (cont'd)
The client can randomly pick m samples to PAC-learn a hypothesis h about whether the service provider can offer the "correct" service.
Theorem 1 (Blumer et al.): Let H be any hypothesis space of finite VC dimension d contained in 2^X, let P be any probability distribution on X, and let the target concept c be any Borel set contained in X. Then for any 0 < ε, δ < 1, given
m ≥ max( (4/ε) log_2(2/δ), (8d/ε) log_2(13/ε) )
independent random examples of c drawn according to P, with probability at least 1 − δ every hypothesis in H that is consistent with all of these examples has error at most ε.
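Plugging numbers into the bound of Theorem 1 shows how the sample size scales with the VC dimension d and the accuracy/confidence parameters; the helper name below is hypothetical and the constants are the standard form of the Blumer et al. bound:

```python
import math

# Sample size sufficient for PAC learning per Theorem 1:
#   m >= max( (4/eps) * log2(2/delta), (8d/eps) * log2(13/eps) )
def blumer_sample_size(d, eps, delta):
    """Smallest integer m satisfying the Blumer et al. bound."""
    a = (4.0 / eps) * math.log2(2.0 / delta)
    b = (8.0 * d / eps) * math.log2(13.0 / eps)
    return math.ceil(max(a, b))

m_simple = blumer_sample_size(1, 0.1, 0.05)    # low-complexity hypothesis space
m_rich = blumer_sample_size(10, 0.1, 0.05)     # richer space needs more samples
```

The bound is linear in d, so the richer the hypothesis space the client uses to model the service, the more challenges it must issue.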

Simplified results
Assume that, for a given concept, every test case has the same probability p that the service provider offers the "correct" service.
Theorem 2 (Chernoff bounds): Consider m independent identically distributed samples x_1, ..., x_m from a Bernoulli distribution with expectation p. Define the empirical estimate of p based on these samples as p̂ = (1/m) Σ_{i=1}^m x_i. Then for any 0 < ε, δ < 1, if the sample size m ≥ ln(2/δ) / (2ε²), then Pr(|p̂ − p| ≤ ε) ≥ 1 − δ.
Corollary 2.1: For the functional validation problem described above, given any 0 < ε, δ < 1, if the service provider answers all m challenges correctly (p̂ = 1) and m ≥ ln(2/δ) / (2ε²), then with probability at least 1 − δ the provider's true success probability satisfies p ≥ 1 − ε.
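The sample size in Theorem 2 is straightforward to compute; the sketch below assumes the Hoeffding form of the Chernoff bound, m ≥ ln(2/δ)/(2ε²), used in the reconstruction above (function name hypothetical):

```python
import math

# Number of i.i.d. Bernoulli samples sufficient so that the empirical
# estimate is within eps of the true p with probability at least 1-delta:
#   m >= ln(2/delta) / (2 * eps^2)
def chernoff_sample_size(eps, delta):
    """Smallest integer m satisfying the Hoeffding-form Chernoff bound."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

m = chernoff_sample_size(0.1, 0.05)  # eps = 0.1, delta = 0.05 -> m = 185
```

Note the 1/ε² dependence: halving ε quadruples the number of challenges, while tightening δ costs only logarithmically.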

Simplified results (cont'd)
Given a target probability P, the client needs to know how many consecutive positive samples are required so that the next request to the service will be answered correctly with probability at least P. By Corollary 2.1, the probabilities ε, δ and P satisfy the inequality
(1 − δ)(1 − ε) ≥ P.
Formulate the sample-size problem as the following nonlinear optimization problem:
minimize m(ε, δ) = ln(2/δ) / (2ε²)
s.t. (1 − δ)(1 − ε) ≥ P and 0 < ε, δ < 1.

Simplified results (cont'd)
From the constraint inequality, δ ≤ 1 − P/(1 − ε), and m decreases as δ grows, so take δ = 1 − P/(1 − ε). This transforms the two-dimensional optimization problem into a one-dimensional one:
minimize m(ε) = ln( 2 / (1 − P/(1 − ε)) ) / (2ε²)
s.t. 0 < ε < 1 − P,
which can be solved with elementary nonlinear optimization methods.
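Since m(ε) is a one-dimensional function on the open interval (0, 1 − P), even a simple grid search suffices as an "elementary" solver. The sketch below assumes the Hoeffding-form bound and the constraint (1 − δ)(1 − ε) ≥ P from the reconstruction above; the function name and grid resolution are hypothetical:

```python
import math

# Grid-search sketch for the one-dimensional sample-size problem:
#   minimize m(eps) = ln(2 / (1 - P/(1-eps))) / (2*eps^2),  0 < eps < 1-P,
# where delta has been eliminated via delta = 1 - P/(1-eps).
def min_samples(P, grid=10000):
    """Return (m, eps) approximately minimizing the required sample size."""
    best_m, best_eps = None, None
    for i in range(1, grid):
        eps = (1.0 - P) * i / grid        # stays inside the feasible region
        delta = 1.0 - P / (1.0 - eps)     # largest delta the constraint allows
        m = math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))
        if best_m is None or m < best_m:
            best_m, best_eps = m, eps
    return best_m, best_eps

m_mod, eps_mod = min_samples(0.5)    # modest reliability target
m_high, eps_high = min_samples(0.95) # demanding reliability target
```

As the target probability P approaches 1, the feasible interval for ε shrinks and the required number of consecutive correct answers grows sharply.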

Mobile Functional Validation Agent
(Diagram.) Through a user interface, a user agent creates a mobile agent (MA) and sends it to computing server A on machine A. The MA challenges A's service through its interface agent; if the answers are correct, the MA reports a correct service. If they are incorrect, it jumps to computing server B on machine B, and then on to machines C, D, E, ..., until a service validates correctly.

Future work and open questions
–Integrate functional validation into the grid computing infrastructure as a standard grid service.
–Extend to the other situations described (zero-knowledge proofs, etc.).
–Formulate functional validation problems in more appropriate mathematical models.
–Explore solutions for more difficult and complicated functional validation situations.
Thanks!!