Meta Learning and Active Learning: Collaborative Knowledge Discovery in Distributed Systems Dr Yonghong Peng Department.

Presentation transcript:

Meta Learning and Active Learning: Collaborative Knowledge Discovery in Distributed Systems
Dr Yonghong Peng, Department of Computing, School of Informatics, University of Bradford
A+B=?

Meta Learning and Active Learning: Collaborative Knowledge Discovery in Distributed Systems
- Knowledge Communication in Distributed Systems
- Knowledge Discovery/Management
- Meta-Learning and Active Learning
- CKD framework and Key Techniques

Systems are becoming distributed
Centralized systems: the components coordinate their actions by communicating with a single control centre. This kind of centralized communication is usually less efficient.

Systems are becoming distributed
Decentralized systems: all components are autonomous and coordinate their actions by communicating with one another, free of central control.

Characteristics of Distributed Systems
The effectiveness and efficiency of a distributed system rely on the ability of its components (agents) to collaborate. That ability comes from the communication between all the components.

Issues in Distributed Systems
Data communication:
- Data communication is time-consuming and expensive.
- Data collected from a variety of sources are always heterogeneous.
Knowledge sharing, rather than data communication, is the key to a successful distributed system. A new concept here is knowledge mobility.

Communication via Data or Knowledge
- Communication via data: "This is what I have! I do not know what it is."
- Communication via knowledge: "This is what you need! It is A, not B, and A is 20% bigger than B."

Knowledge is not easy to obtain
Data is available everywhere but is difficult to use:
- Data is easy to collect: large amounts of data are available.
Knowledge is easy to use but is difficult to obtain:
- There is a shortage of domain experts.
- Knowledge obtained from different experts may be inconsistent.

Knowledge Discovery and Management
Knowledge discovery: to extract knowledge from data using machine learning and data mining techniques.
Knowledge management: to make use of knowledge effectively. Strategies:
- Knowledge verification
- Knowledge updating
- Knowledge reuse

Knowledge Discovery and Management
Drawback of the current techniques: they are inefficient because they work in a passive style.
- Data mining: we do not know what we will get.
- There is no interaction between the existing knowledge and upcoming mining activities.
- Knowledge management is performed after the knowledge has been extracted, not within the process of learning.

Collaborative Knowledge Discovery (CKD)
New strategy: manage the knowledge while learning.
Collaborative learning: learning how to work with others.
- Learn according to what I want to know.
- Learn according to what the partners want to know.
Idea: collaborative knowledge discovery based on meta-learning and active learning.

Meta-Learning and Active Learning
What is meta-learning?
- Meta-learning is learning how a learner works.
Applications:
- To select the suitable algorithm(s) (current);
- MetaL European Project (
- To learn the collaborative knowledge (new).

Meta-Learning Process
[Diagram labels: Data Collection; Data pre-processing; Machine Learner; Knowledge generation; Data Characterisation; Model/Algorithm Characterisation; Meta-Data; Meta-Learner; Meta-Knowledge; Expected Performances; Application Objectives; End-users]

Meta-Learning for Algorithm Selection
[Diagram labels: New Data; Pre-processing; Machine Learner; Post-processing; Data Characterisations; Ranker; Meta-Knowledge]
Meta-knowledge records: [LA1, f1, f2, ..., Acc1, Time1] ... [LAn, f1, f2, ..., Acc_n, Time_n]
Ranking LAs: LA1; LA3; ...
Ranking LAs: LA2; LA1; ...

Active Learning
What is active learning?
- Passive learning: data-driven learning.
- Active learning: objective-driven learning.
[Diagram labels: Data, DM (Data Miner), Knowledge; Data, Active Learner, Knowledge]

Active Learning
Tasks of active learning: keep answering
1) What can I learn?
2) How to learn?
Active learning process:

Meta-Learning based Active Learning
Approaches:
1) Find out what I can learn from the data:
- Data Characterisation Techniques (DCT);
- Perform rough mining (RM) on sampled data.
2) Select appropriate methods:
- Select the target models according to the objectives;
- Use meta-learning to select the learning methods.

Data Characterisation Techniques (DCT)
- StatLog-type DCT:
  - Simple measures (e.g. number of attributes, number of classes, etc.)
  - Statistical measures (e.g. mean of numerical attributes)
  - Information-based measures (e.g. entropy of classes)
- Histogram-based DCT: information regarding the distribution of values of attributes with relational nature (e.g. mutual information between symbolic attributes and class)
- Landmarking: use the performance of simple (fast) learners to predict the performance of candidate algorithms
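The StatLog-type measures listed above can be illustrated with a small sketch. This is only an illustration under assumptions: the function name `characterise`, the toy data, and the choice of exactly these measures are hypothetical and not taken from the slides.

```python
import math
from collections import Counter
from statistics import mean


def characterise(rows, labels):
    """Compute a few StatLog-style measures for a small tabular data set.

    rows   -- list of equal-length lists of attribute values
    labels -- list of class labels, one per row
    """
    n_examples = len(rows)
    n_attributes = len(rows[0]) if rows else 0
    class_counts = Counter(labels)

    # Statistical measure: mean of every purely numerical attribute.
    numeric_means = {}
    for j in range(n_attributes):
        column = [row[j] for row in rows]
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in column):
            numeric_means[j] = mean(column)

    # Information-based measure: entropy of the class distribution, in bits.
    class_entropy = -sum(
        (c / n_examples) * math.log2(c / n_examples) for c in class_counts.values()
    )

    return {
        "n_examples": n_examples,          # simple measure
        "n_attributes": n_attributes,      # simple measure
        "n_classes": len(class_counts),    # simple measure
        "numeric_means": numeric_means,    # statistical measure
        "class_entropy": class_entropy,    # information-based measure
    }


# Toy data set (invented): one numerical and one symbolic attribute.
rows = [[1.0, "a"], [3.0, "b"], [5.0, "a"], [7.0, "b"]]
labels = ["yes", "no", "yes", "no"]
meta = characterise(rows, labels)
```

The resulting dictionary plays the role of the feature vector f1, f2, ... that a meta-learner would consume; histogram-based measures and landmarking are not covered by this sketch.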

New Data Characterisation Techniques
- Idea: capture information from the standard decision tree model (or other models) [Peng, IDDM2002, DS2002].
- Approach:
  - use a standard decision tree method: C5.0;
  - measure the size, structure and shape of the tree.
[Diagram: example decision tree with internal nodes x1, x2, x3, x2, x4, x5 and leaves C1, C3, C1, C2, C3, C1, C4]
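As a rough sketch of what measuring the size and shape of a tree could look like, the code below walks a hand-written tree and counts nodes, leaves, and depth. It does not call C5.0; the tuple representation, the function name `tree_measures`, and the example tree are assumptions made for illustration, and the measures actually used in the cited work may differ.

```python
def tree_measures(tree):
    """Return (n_nodes, n_leaves, depth) for a decision tree.

    A leaf is a class-label string; an internal node is a tuple
    (attribute_name, [child, child, ...]).
    """
    if isinstance(tree, str):
        return 1, 1, 0  # a leaf: one node, one leaf, depth 0
    _, children = tree
    nodes, leaves, depth = 1, 0, 0
    for child in children:
        c_nodes, c_leaves, c_depth = tree_measures(child)
        nodes += c_nodes
        leaves += c_leaves
        depth = max(depth, c_depth + 1)
    return nodes, leaves, depth


# Hypothetical tree: x1 splits into x2 and x3; x3 splits into x4 and a leaf.
example_tree = (
    "x1",
    [
        ("x2", ["C1", "C3"]),
        ("x3", [("x4", ["C1", "C2"]), "C4"]),
    ],
)

n_nodes, n_leaves, depth = tree_measures(example_tree)
```

Numbers such as these could be appended to the data characterisation vector alongside the StatLog-type measures.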

Meta-Learning for Algorithm Selection
Advisor system: given a new data set,
1. Characterize it:
- general (# attributes, # examples, ...)
- statistical (skewness, kurtosis, ...)
- information-theoretic (class entropy, ...)
2. Select k neighbors.
3. Retrieve performance information (accuracy + time).
4. Rank LAs according to accuracy and time, e.g. C5.0: 13%, 30s; Ltree: 8%, 35s; ...
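Steps 2 to 4 above can be sketched as a nearest-neighbour lookup over stored meta-knowledge. Everything concrete here is an assumption for illustration: the meta-knowledge entries and their numbers are invented, the features are assumed to be already scaled to comparable ranges, and combining accuracy and time through a linear penalty (`time_weight`) is a simple stand-in, not necessarily the scoring used by the system on the slide.

```python
import math

# Hypothetical meta-knowledge base: one entry per previously seen data set.
# "features" are (pre-scaled) data characteristics; "results" maps a learning
# algorithm (LA) to (accuracy, run time in seconds). All values are invented.
META_KNOWLEDGE = [
    {"features": [0.2, 0.5, 0.9],
     "results": {"C5.0": (0.87, 30), "Ltree": (0.92, 35), "NaiveBayes": (0.80, 5)}},
    {"features": [0.3, 0.4, 0.8],
     "results": {"C5.0": (0.85, 20), "Ltree": (0.90, 40), "NaiveBayes": (0.78, 4)}},
    {"features": [0.9, 0.1, 0.2],
     "results": {"C5.0": (0.70, 60), "Ltree": (0.65, 80), "NaiveBayes": (0.88, 6)}},
]


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_algorithms(new_features, k=2, time_weight=0.001):
    """Rank learning algorithms for a new data set, best first."""
    # Step 2: select the k most similar previously seen data sets.
    neighbours = sorted(
        META_KNOWLEDGE, key=lambda e: distance(e["features"], new_features)
    )[:k]

    # Step 3: retrieve accuracy and time of every LA on those neighbours.
    scores = {}
    for entry in neighbours:
        for la, (accuracy, seconds) in entry["results"].items():
            # Assumed trade-off: accuracy minus a small penalty per second.
            scores.setdefault(la, []).append(accuracy - time_weight * seconds)

    # Step 4: rank LAs by their average score over the neighbours.
    average = {la: sum(v) / len(v) for la, v in scores.items()}
    return sorted(average, key=average.get, reverse=True)


ranking = rank_algorithms([0.25, 0.45, 0.85])
```

With these invented numbers the two nearest data sets favour Ltree, so it is ranked first; raising `time_weight` shifts the ranking toward faster algorithms.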

Collaborative Knowledge Discovery Framework
Application of meta-learning and active learning:
1) Actively control the process of each learning agent.
2) Coordinate the activities of multiple learning agents.
3) Synthesize the outcomes of the distributed learning agents.

Collaborative Knowledge Discovery Framework
Key techniques in collaborative learning:
1) To improve my situation:
  a) What do I need? (Objective-driven problem solving)
  b) Where can I get it? (On-line information retrieval)
  c) How to get it? (Meta-learning and meta-knowledge)
2) To improve the partners' situation:
  a) What do others need? (Knowledge communication)
  b) Do I possibly have it? (On-line DCT or data probe)
  c) How can I get it efficiently? (Meta-learning and meta-knowledge)

Collaborative Knowledge Discovery: Summary
- Knowledge discovery deals with extracting interesting, previously unknown associations, classifiers, clusters and patterns from data.
- The emergence of network-based computing has introduced a new dimension to this problem: distributed sources of data and computing.
- Advanced analysis of distributed data for extracting useful knowledge is the next natural step.
- Existing data mining algorithms are designed to work on centralized data, and they often do not pay attention to distributed resources.
- Collaborative Knowledge Discovery is a new strategy and approach.

Meta Learning and Active Learning: Collaborative Knowledge Discovery in Distributed Systems
Thanks!