Akbar Akbari Esfahani (1), Theodor Asch (2)


Classifying Overlapping Data by Combining Meta Learners and Bayesian Networks
University of Colorado Denver (1), USGS – Crustal Geophysics and Geochemistry Science Center (1, 2), aesfahani@usgs.gov

Abstract
The US Department of Defense is interested in classifying types of unexploded ammunition versus clutter at the Aberdeen Proving Grounds, MD. To this end, a hybrid model using numerical inversion and Kohonen's Self-Organizing Maps (SOM) was developed [1]. While the hybrid approach has been successful, the numerical inversion is computationally intensive and therefore time consuming. To overcome this problem, I use a single neural network model that combines meta learners with Bayesian networks to achieve acceptable accuracy while remaining computationally tractable. So far, the combination of Dagging with a BayesNet algorithm classifies the ammunition 99.9% correctly. Training and testing the network model takes less than 30 seconds, versus the 3-week runtime of the numerical inversion.

Training Site and Data Acquisition
There are 6 types of ordnance, plus clutter, dispersed throughout the field. ALLTEM is an on-time time-domain EM system that uses a continuous triangle-wave excitation to measure the target step response rather than the traditional impulse response. The system multiplexes through three orthogonal transmitting loops and records a number of different transmitting- and receiving-loop combinations with a spatial data sampling interval of 20 cm.

Results from the Training Site
Time to train the network model: about 1 sec. Time to perform a 10-fold cross validation on the data: about 14 sec. Time for complete model building: ~15 sec. (A confusion matrix of the results is shown on the poster.)

Conclusion
We can train a model in approximately 15 seconds with 100% accuracy. The data used in the neural net model is the field-generated time series. Training and testing on a blind set requires less than 30 seconds and can be performed on almost any field laptop.
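The Dagging-plus-BayesNet combination described above can be sketched as follows. This is a minimal illustration, not the poster's actual Weka pipeline: scikit-learn's GaussianNB stands in for the BayesNet base learner, and the data is synthetic rather than the ALLTEM time series.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import GaussianNB

def dagging_fit(X, y, n_folds=10):
    """Train one base classifier per disjoint stratified fold of (X, y)."""
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=0)
    # Each fold's held-out indices form one disjoint training subset.
    return [GaussianNB().fit(X[idx], y[idx]) for _, idx in skf.split(X, y)]

def dagging_predict(models, X):
    """Combine the per-fold classifiers by majority vote."""
    votes = np.stack([m.predict(X) for m in models])  # (n_models, n_samples)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)

# Synthetic stand-in for the field data: 13 variables, two classes
# (ordnance vs. clutter).
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 13))
y = (X[:, 0] > 0).astype(int)

models = dagging_fit(X, y)
acc = (dagging_predict(models, X) == y).mean()
```

Because each base model sees only one small disjoint fold, training cost stays low even when the base learner scales poorly with training-set size, which matches the poster's sub-minute timing budget.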
The Network
Dagging: Dagging builds its ensemble from disjoint stratified folds of the training data [2], [3]. According to the literature, Dagging is particularly useful for base classifiers that have poor time complexity in the number of readings.
BayesNet: Bayesian networks are probabilistic graphical models, structured as directed acyclic graphs, that represent a set of random variables and their dependencies [4]. A Bayesian network captures the probabilistic relationship between causes and effects: given the observed effects, the network computes the probabilities of the various causes. Learning a Bayesian network consists of two parts: an evaluation function that scores a candidate network against the data, and a search algorithm that explores the space of possible networks. The K2 algorithm [5] was chosen for the structure search. It starts with an ordering of the attributes and processes each node in turn, using a greedy procedure that considers adding edges from previously processed nodes to the current one. At each step it adds the edge that most improves the network's score, based on the AIC statistic. Once a node cannot be refined any further, the algorithm moves on to the next node. After K2 has learned the structure, the conditional probability tables of the network are estimated directly from the data.

Goals and Objectives
Distinguish clutter from ordnance.
Discover an algorithm for real-time field application.
The algorithm should not rely on input from the inversion model.
Network performance must be suitable for field use (time < 5 min).

Future Application
The algorithm remains to be tested in the field on a blind data set, where a scoring can be assigned.

References
[1] Friedel, M. J., Asch, T., & Oden, C. (2012). Hybrid analysis of multi-axis electromagnetic data for discrimination of munitions and explosives of concern. Geophysical Journal International.
[2] Ting, K. M., & Witten, I. H. (1997).
Stacking Bagged and Dagged Models. Fourteenth International Conference on Machine Learning (pp. 367-375). San Francisco: Morgan Kaufmann Publishers Inc.
[3] Breiman, L. (1994). Bagging Predictors (Technical Report No. 421). Berkeley: University of California, Berkeley.
[4] Witten, I. H., Frank, E., & Hall, M. A. (2011). Data Mining: Practical Machine Learning Tools and Techniques (3rd ed.). Burlington, MA: Morgan Kaufmann.
[5] Cooper, G. F., & Herskovits, E. (1992). A Bayesian Method for the Induction of Probabilistic Networks from Data. Machine Learning, 9, 309-347.

The Mathematical Problem
11 of the 13 independent variables of the data set are collinear, with pairwise correlation magnitudes ranging from roughly 0.92 to 0.999 (full correlation table shown on the poster).
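A collinearity finding like the one above can be reproduced with a simple correlation scan. The sketch below uses synthetic data (11 near-duplicate variables plus 2 independent ones), not the actual ALLTEM features, and a hypothetical 0.9 threshold for flagging a pair as collinear.

```python
import numpy as np

rng = np.random.default_rng(2)
base = rng.normal(size=500)
# 13 variables: 11 noisy copies of one underlying signal (collinear)
# plus 2 independent variables.
X = np.column_stack(
    [base + 0.05 * rng.normal(size=500) for _ in range(11)]
    + [rng.normal(size=500) for _ in range(2)])

corr = np.corrcoef(X, rowvar=False)      # 13 x 13 correlation matrix
i, j = np.triu_indices_from(corr, k=1)   # upper triangle, excluding diagonal
collinear_pairs = [(a, b) for a, b in zip(i, j) if abs(corr[a, b]) > 0.9]
print(f"{len(collinear_pairs)} strongly correlated variable pairs")
```

Here the 11 near-duplicates produce all 55 of their mutual pairs above the threshold, while the 2 independent variables flag none, mirroring the "11 of 13" structure reported on the poster.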