CS 8520: Artificial Intelligence

CS 8520: Artificial Intelligence
Weka Lab
Paula Matuszek
Spring, 2013
CSC 8520 Spring 2013. Paula Matuszek

Weka is
- Waikato Environment for Knowledge Analysis
- Machine learning software suite from the University of Waikato
- Under development for 20 years
- Well developed, maintained, and supported
- Open source
- Windows, Mac, and Unix versions
- http://www.cs.waikato.ac.nz/ml/weka/index.html
- Lots of help available at the wiki: http://weka.wikispaces.com/

ROC Curve
- {Receiver|Relative} Operating Characteristic curve
- Name derives from signal detection theory
- Plots sensitivity (true positive rate) on the Y axis against 1 - specificity (false positive rate) on the X axis
- The ideal point is (0, 1). A random classifier falls on the diagonal, e.g. (0.5, 0.5) in a balanced domain
- Useful for:
  - evaluating a classifier
  - comparing classifiers
  - setting cutoffs for class membership
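
The construction of an ROC curve can be sketched in a few lines. The example below is a minimal illustration, not Weka's implementation: the function names `roc_points` and `auc` and the toy labels and scores are assumptions made up for this sketch. It sweeps the class-membership cutoff from high to low and records one (1 - specificity, sensitivity) point per distinct cutoff.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) points from (0, 0) to (1, 1), one per distinct cutoff.

    labels: 1 for positive, 0 for negative (hypothetical encoding for this sketch).
    scores: classifier confidence that each instance is positive.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Highest scores first: lowering the cutoff admits instances in this order.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        cutoff = ranked[i][0]
        # Instances with tied scores cross the cutoff together.
        while i < len(ranked) and ranked[i][0] == cutoff:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))  # (1 - specificity, sensitivity)
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data, invented for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
```

Each point on the curve corresponds to one possible cutoff, which is why the curve helps with setting cutoffs for class membership. A classifier that ranks every positive above every negative passes through the ideal point (0, 1).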

Figure: ROC space. http://en.wikipedia.org/wiki/File:ROC_space-2.png

More Weka
- Last week: cross-validated decision tree.
- Go through section 4.2 of the tutorial.
  - What data set did you use?
  - Which classifier did better based on the confusion matrix?
  - What about the ROC curve?
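
When comparing classifiers by their confusion matrices, it can help to reduce each matrix to a few rates. The following is a hedged sketch in plain Python, independent of Weka: the helper names `confusion_matrix` and `summarize`, and the two sets of predictions, are hypothetical and exist only for this illustration.

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def summarize(cm):
    """Derive accuracy, sensitivity, and specificity from the four counts."""
    total = cm["tp"] + cm["fp"] + cm["tn"] + cm["fn"]
    return {
        "accuracy": (cm["tp"] + cm["tn"]) / total,
        "sensitivity": cm["tp"] / (cm["tp"] + cm["fn"]),  # true positive rate
        "specificity": cm["tn"] / (cm["tn"] + cm["fp"]),  # true negative rate
    }


# Invented example: 4 positive and 6 negative instances, two classifiers.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred_a = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
pred_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

cm_a = confusion_matrix(actual, pred_a)
cm_b = confusion_matrix(actual, pred_b)
stats_a = summarize(cm_a)
stats_b = summarize(cm_b)
```

In this made-up case both classifiers have the same accuracy, but the first has higher sensitivity and the second higher specificity, so which one "did better" depends on which kind of error matters more for the data set.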

Trying a Support Vector Classifier
- SMO is a support vector classifier: http://weka.sourceforge.net/doc/weka/classifiers/functions/SMO.html
- libSVM is a faster SVM, but it is not installed with Weka; all that is there is a wrapper.

Decision Tree vs SMO
- Repeat section 4.2, replacing the RandomForest classifier with SMO.
- What were the results for your data source?

Moving on to the Weka Explorer
- Explore some of the data sets included with Weka.
- Restart Weka, using the Explorer instead of the KnowledgeFlow.
- Make sure the Preprocess step is highlighted.
- Use the Open File option to look at some of the data sets.
- Choose one which is binary (usually there is a feature just labeled class) and which looks interesting.

Exploring with Weka
- Going to go through a different tutorial, which uses the Explorer interface.
- The tutorial is at http://www.ibm.com/developerworks/opensource/library/os-weka2/index.html
- It uses data which can be downloaded at the Download section, about 2/3 of the way down the page.

Decision Tree Again
- The first part of the tutorial creates a decision tree using J48, as in the KnowledgeFlow tutorial.
- This should give exactly the same results as the KnowledgeFlow approach; it's just a different interface. Which did you find easier?
- Try it on the data set you chose earlier. How well did it do?

Clustering
- The second part of the tutorial uses the SimpleKMeans clustering algorithm.
- Try it on the sample data they provide. Do the results for their data make sense?
- Set the number of clusters to 2 and try it on the data set you chose. Do the results make sense? Do the two clusters match the two classes in your data?
- Try it again after removing the "class" feature. Do you still get reasonable results?
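
To make the "do the two clusters match the two classes?" question concrete, here is a minimal k-means sketch (Lloyd's algorithm) in plain Python. It is not Weka's SimpleKMeans: the function `kmeans`, the explicit `initial_centroids` argument, and the six toy points are all assumptions for illustration. The class labels are kept out of the features and used only afterwards to cross-tabulate clusters against classes.

```python
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable.

    initial_centroids is passed explicitly (an assumption of this sketch) so
    the result is reproducible; k-means can converge to a poor local optimum
    from a bad starting position.
    """
    centroids = list(initial_centroids)
    k = len(centroids)
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return assignment, centroids


# Invented data: two well-separated groups with known classes "a" and "b".
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
classes = ["a", "a", "a", "b", "b", "b"]

assignment, centroids = kmeans(points, [points[0], points[-1]])

# Cross-tabulate cluster id against class label, computed after clustering.
crosstab = Counter(zip(assignment, classes))
```

If the clusters match the classes, each cluster id pairs with only one class label in the cross-tabulation. Because the class label is not among the features here, this corresponds to the "removing the class feature" variant of the exercise.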

Explore!
- Go ahead and try some of the other capabilities in Weka.