Instance Construction via Likelihood-Based Data Squashing. Madigan D. et al. (Ch 12, Instance Selection and Construction for Data Mining)

Presentation transcript:

Instance Construction via Likelihood-Based Data Squashing. Madigan D. et al. (Ch 12, Instance Selection and Construction for Data Mining (2001), Kluwer Academic Publishers). Summarized by Jinsan Yang, SNU Biointelligence Lab

 Abstract: a data compression method (squashing); LDS: likelihood-based data squashing
 Keywords: Instance Construction, Data Squashing

Outline
 Introduction
 The LDS Algorithm
 Evaluation: Logistic Regression
 Evaluation: Neural Networks
 Iterative LDS
 Discussion

Introduction
 Massive data examples: large-scale retailing, telecommunications, astronomy, computational biology, Internet logging
 Some computational challenges: multiple passes over the data are needed, and disk access is 10^5 to 10^6 times slower than main memory
 Current solution: scaling up existing algorithms. Here instead: scaling down the data
 Data squashing: DuMouchel et al. (1999) squashed a mother data set down to 8,443 pseudo data points, outperforming a simple random sample of size 7,543 by a factor of 500 in MSE

LDS Algorithm
 Motivation: Bayes rule. Given three data points d1, d2, d3, estimate the parameter θ via p(θ | d1, d2, d3) ∝ p(θ) p(d1 | θ) p(d2 | θ) p(d3 | θ). If two points have nearly proportional likelihood terms, merging them into one weighted point barely changes the posterior.
 Cluster by likelihood profile: group data points whose likelihoods p(di | θ), evaluated over a set of θ values, are similar.
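As a rough sketch (not the chapter's code), assuming a logistic regression likelihood, the profile of a single labeled point is just its likelihood evaluated at a handful of candidate parameter vectors; the function name and vectorized form below are illustrative:

```python
import numpy as np

def log_likelihood_profile(x, y, thetas):
    """Log-likelihood of one labeled point (x, y), with y in {0, 1},
    under logistic regression, at each parameter vector in `thetas` (K, d)."""
    p = 1.0 / (1.0 + np.exp(-(thetas @ x)))   # P(y=1 | x, theta_k), shape (K,)
    p = np.clip(p, 1e-12, 1.0 - 1e-12)        # guard against log(0)
    return np.where(y == 1, np.log(p), np.log(1.0 - p))
```

Points with nearly identical profiles contribute nearly identically to the posterior, so replacing each such group with one weighted pseudo point should change the fit very little.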

LDS Algorithm
 Details of the LDS algorithm
[Select] Choose the θ values at which the likelihood profiles are evaluated, using a central composite design around an initial estimate of θ
(Figure: central composite design for 3 factors)
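A minimal sketch of generating central composite design points in coded units; mapping them into parameter space around the initial estimate of θ, and the scaling used, are assumptions here rather than the chapter's exact recipe:

```python
import itertools
import numpy as np

def central_composite(d, alpha=None):
    """Central composite design in d factors (coded units):
    2^d factorial corners, 2d axial points at distance alpha, one center.
    For d = 3 this gives 8 + 6 + 1 = 15 design points."""
    if alpha is None:
        alpha = (2 ** d) ** 0.25              # standard rotatable choice
    corners = np.array(list(itertools.product([-1.0, 1.0], repeat=d)))
    axial = np.vstack([alpha * np.eye(d), -alpha * np.eye(d)])
    center = np.zeros((1, d))
    return np.vstack([corners, axial, center])
```

LDS would then evaluate profiles at something like `theta_hat + scale * central_composite(d)`.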

LDS Algorithm
[Profile] Evaluate the likelihood profile of each data point at the selected θ values
[Cluster] Cluster the mother data in a single pass:
- Select n' random samples as initial cluster centers
- Assign each remaining point to the cluster whose center has the nearest likelihood profile
[Construct] Construct the pseudo data from each cluster's center, weighted by cluster size (a sketch follows below)
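Putting the steps together, here is a one-pass squashing sketch under the same assumptions (nearest-center assignment in profile space; pseudo point = cluster mean, weight = cluster size); the actual algorithm refines both of those choices:

```python
import numpy as np

def lds_squash(X, profiles, n_clusters, seed=0):
    """One-pass LDS-style squashing sketch.
    X: (n, d) mother data; profiles: (n, K) likelihood profiles.
    Returns pseudo data points and their weights."""
    rng = np.random.default_rng(seed)
    n = len(X)
    centers = rng.choice(n, size=n_clusters, replace=False)  # initial centers
    assign = np.empty(n, dtype=int)
    for i in range(n):                        # single pass over the mother data
        dist = np.linalg.norm(profiles[i] - profiles[centers], axis=1)
        assign[i] = np.argmin(dist)
    pseudo, weights = [], []
    for c in range(n_clusters):
        members = X[assign == c]
        if len(members):
            pseudo.append(members.mean(axis=0))   # pseudo point = cluster mean
            weights.append(len(members))          # weight = cluster size
    return np.array(pseudo), np.array(weights)
```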

Evaluation: Logistic Regression
 Small-scale simulations: 100 mother data points squashed to 48 pseudo data points; three methods for the initial estimate of θ
(Plot: log(error ratio) for the three initial-parameter-estimation methods)
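To use squashed data, the downstream model only has to accept case weights. A bare-bones weighted logistic regression fit by gradient ascent, illustrative only and not the evaluation code from the chapter:

```python
import numpy as np

def fit_logistic(X, y, w=None, lr=0.5, steps=500):
    """Weighted logistic regression via plain gradient ascent (no intercept,
    no regularization), enough to fit pseudo data with case weights."""
    n, d = X.shape
    w = np.ones(n) if w is None else np.asarray(w, dtype=float)
    theta = np.zeros(d)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))
        theta += lr * X.T @ (w * (y - p)) / w.sum()  # weighted score step
    return theta
```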

Evaluation: Logistic Regression
 Medium scale; baseline: 1% simple random sampling

Evaluation: Logistic Regression
 Large scale; baseline: 1% simple random sampling

Evaluation: Neural Networks
 Feed-forward network: two input nodes, one hidden layer with 3 units, a single binary output (sketched below)
 Mother data: 10,000; squashed data: 1,000; repetitions: 30; test data: 1,000 drawn from the same network
 Comparison criterion: P(whole) - P(reduced)
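For concreteness, the 2-3-1 architecture described above looks like this; the random weights stand in for whatever the study actually fitted, and the sigmoid activations are an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)   # 2 inputs -> 3 hidden units
W2, b2 = rng.normal(size=3), 0.0                # 3 hidden -> 1 binary output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(x):                           # x: (2,) input vector
    h = sigmoid(W1 @ x + b1)
    return sigmoid(W2 @ h + b2)                 # P(y = 1 | x)

# The slide's criterion: for each test point, P(whole) - P(reduced) compares
# the prediction of a net trained on all 10,000 points with that of a net
# trained on the 1,000 squashed points.
```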

Evaluation: Neural Networks (results figure)

Iterative LDS
 For when the initial estimate of θ is not accurate:
1. Set θ from a simple random sample
2. Squash the data by LDS
3. Re-estimate θ from the squashed data
4. Go to 2
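Reusing the hypothetical helpers sketched above, the loop might look like the following; the per-class squashing, the design scale, and the fixed round count are my assumptions, not the chapter's:

```python
import numpy as np

def iterative_lds(X, y, n_clusters, n_rounds=3, scale=0.5, seed=1):
    rng = np.random.default_rng(seed)
    srs = rng.choice(len(X), size=max(len(X) // 100, 20), replace=False)
    theta = fit_logistic(X[srs], y[srs])                 # 1. SRS estimate of theta
    for _ in range(n_rounds):                            # 2-4. squash, refit, repeat
        thetas = theta + scale * central_composite(X.shape[1])
        Xp, yp, wp = [], [], []
        for label in (0, 1):                             # squash within each class
            mask = y == label
            prof = np.stack([log_likelihood_profile(xi, label, thetas)
                             for xi in X[mask]])
            pts, wts = lds_squash(X[mask], prof, n_clusters)
            Xp.append(pts); wp.append(wts); yp.append(np.full(len(wts), label))
        theta = fit_logistic(np.vstack(Xp), np.concatenate(yp),
                             np.concatenate(wp))         # 3. re-estimate theta
    return np.vstack(Xp), np.concatenate(yp), np.concatenate(wp), theta
```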