Mining Feature Importance: Applying Evolutionary Algorithms within a Web-based Educational System CITSA 2004, Orlando, July 2004 Behrouz Minaei-Bidgoli,


Mining Feature Importance: Applying Evolutionary Algorithms within a Web-based Educational System
CITSA 2004, Orlando, July 2004
Behrouz Minaei-Bidgoli, Gerd Kortemeyer, William F. Punch
Michigan State University

Topics
– Problem Overview
– Classification Methods
– Combination of Classifiers
– Weighting the features
– Using GA to choose the best set of weights
– Experimental Results
– Feature Importance

Problem Overview
– This research is part of the latest online educational system developed at Michigan State University (MSU), the Learning Online Network with Computer-Assisted Personalized Approach (LON-CAPA).
– In LON-CAPA, we are involved with two kinds of large data sets:
  – Educational resources: web pages, demonstrations, simulations, individualized problems, quizzes, and examinations.
  – Information about users who create, modify, assess, or use these resources.
– Find classes of students: groups of students who use these online resources in a similar way.
– Predict for any individual student to which class he/she belongs.

Classifiers
Non-Tree Classifiers (using MATLAB)
– Bayesian Classifier
– 1NN
– kNN
– Multi-Layer Perceptron
– Parzen Window
– Combination of Multiple Classifiers (CMC)
– Genetic Algorithm (GA)
Decision Tree-Based Software
– C5.0 (RuleQuest; successor to C4.5 and ID3)
– CART (Salford Systems)
– QUEST (Univ. of Wisconsin)
– CRUISE [use an unbiased variable selection technique]

Statement of Problem (1)
– Our claim is that data mining can help to design better and more intelligent web-based educational environments.
– It can help instructors to design courses more effectively and to detect anomalies.
– It can help students to use the resources more efficiently.

Statement of Problem (2)
– Can find classes of students: groups of students who use these online resources in a similar way.
– Can predict for any individual student to which class he/she belongs.
– Can help the instructor provide appropriate advising in a timely manner.

Data Sets: MSU online courses

Extracted Features
1. Total number of attempts
2. Total number of correct answers (success rate)
3. Success on the first try
4. Success on the second try
5. Success after 3 to 9 attempts
6. Success after 10 or more attempts
7. Total time until the correct answer
8. Total time spent, regardless of success
9. Participation in online communication
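As an illustration of how the nine features above might be derived for one student, here is a minimal Python sketch. The `ProblemLog` record, its field names, and the `posts` count are hypothetical and are not the LON-CAPA log schema; the slides do not say how the raw logs are stored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProblemLog:
    """Hypothetical per-problem record for one student (not the LON-CAPA schema)."""
    attempts: int                               # submissions made on this problem
    solved: bool                                # True if a correct answer was eventually given
    seconds_total: float                        # time spent on the problem, solved or not
    seconds_to_correct: Optional[float] = None  # time until the correct answer, if solved


def extract_features(logs: List[ProblemLog], posts: int) -> List[float]:
    """Return the nine features in the order listed on the slide."""
    solved = [p for p in logs if p.solved]
    return [
        float(sum(p.attempts for p in logs)),                     # 1. total attempts
        float(len(solved)),                                       # 2. total correct answers
        float(sum(1 for p in solved if p.attempts == 1)),         # 3. success on first try
        float(sum(1 for p in solved if p.attempts == 2)),         # 4. success on second try
        float(sum(1 for p in solved if 3 <= p.attempts <= 9)),    # 5. success after 3 to 9 attempts
        float(sum(1 for p in solved if p.attempts >= 10)),        # 6. success after 10 or more
        float(sum(p.seconds_to_correct or 0.0 for p in solved)),  # 7. time until correct answer
        float(sum(p.seconds_total for p in logs)),                # 8. time spent regardless of success
        float(posts),                                             # 9. online communication (assumed: post count)
    ]
```

Each student then becomes one nine-element vector, which is the input the classifiers operate on.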

GA Optimizer vs. Classifier
– Apply GA directly as a classifier
– Use GA as an optimization tool for resetting the parameters in other classifiers
Most applications of GA in pattern recognition use GA as an optimizer for some parameters in the classification process. Here, GA is applied to find an optimal set of feature weights that improves classification accuracy.
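To make the role of feature weights concrete, the following Python sketch shows one common way a weight vector can enter a 1NN classifier: each feature difference is scaled by its weight before the Euclidean distance is taken. This is an assumption for illustration; the slides do not state how the weights are applied inside each classifier, and the function names are made up.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)


def weighted_distance(a: Sequence[float], b: Sequence[float], w: Sequence[float]) -> float:
    """Euclidean distance after scaling each feature difference by its weight."""
    return math.sqrt(sum((wi * (ai - bi)) ** 2 for ai, bi, wi in zip(a, b, w)))


def predict_1nn(train: List[Sample], x: Sequence[float], w: Sequence[float]) -> int:
    """Label of the training sample closest to x under the weighted distance."""
    nearest = min(train, key=lambda s: weighted_distance(s[0], x, w))
    return nearest[1]


# Two training points; the query matches the second one on feature 0 only.
train = [((0.0, 0.0), 0), ((1.0, 5.0), 1)]
query = (1.0, 0.0)
print(predict_1nn(train, query, (1.0, 1.0)))   # equal weights: nearest is label 0
print(predict_1nn(train, query, (10.0, 0.1)))  # feature 0 emphasized: nearest is label 1
```

A weight near zero effectively removes a feature, so feature selection is the special case in which every weight is either 0 or 1.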

Fitness/Evaluation Function
– 5 classifiers:
  1. Multi-Layer Perceptron (2 minutes)
  2. Bayesian Classifier
  3. 1NN
  4. kNN
  5. Parzen Window
  CMC (3 seconds)
– Divide data into training and test sets (10-fold cross-validation)
– Fitness function: performance achieved by the classifier
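A fitness function of this kind can be sketched in Python as the k-fold cross-validation error of a classifier run with a candidate weight vector. The sketch below uses a weighted 1NN classifier as a stand-in; the deck evaluates several classifiers and their combination, so the classifier choice, the interleaved fold assignment, and the function names are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)


def nearest_label(train: List[Sample], x: Sequence[float], w: Sequence[float]) -> int:
    """Weighted 1NN: label of the closest training sample."""
    def dist(a: Sequence[float]) -> float:
        return math.sqrt(sum((wi * (ai - xi)) ** 2 for ai, xi, wi in zip(a, x, w)))
    return min(train, key=lambda s: dist(s[0]))[1]


def cv_error(data: List[Sample], w: Sequence[float], k: int = 10) -> float:
    """k-fold cross-validation error rate for weight vector w (lower is better).

    Fold f holds the samples at positions f, f + k, f + 2k, ...
    Assumes k >= 2 and len(data) >= 2, so every training set is non-empty.
    """
    errors = 0
    for fold in range(k):
        test = data[fold::k]
        train = [s for i, s in enumerate(data) if i % k != fold]
        errors += sum(1 for x, y in test if nearest_label(train, x, w) != y)
    return errors / len(data)
```

Returning an error rate suits a minimizing GA; an accuracy-maximizing GA would use `1 - cv_error(...)` instead. Because every candidate weight vector triggers a full cross-validation run, the per-classifier run time matters a great deal.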

Individual Representation
– The GA Toolbox supports binary, integer, and floating-point chromosome representations.
– Chrom = crtrp(NIND, FieldDR) creates a random real-valued matrix of size NIND x d, where NIND specifies the number of individuals in the population.
FieldDR = [ ; % lower bound
            ]; % upper bound
Chrom =
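For readers without the MATLAB toolbox, a rough Python analogue of creating a real-valued population is sketched below. It is not the toolbox's `crtrp` implementation; the function name is hypothetical, and the bounds of 0 and 1 in the usage lines are assumed because the actual FieldDR values are missing from the transcript.

```python
import random
from typing import List, Optional, Sequence


def create_real_population(
    n_ind: int,
    lower: Sequence[float],
    upper: Sequence[float],
    rng: Optional[random.Random] = None,
) -> List[List[float]]:
    """Return n_ind individuals, each with one uniform random gene per (lower, upper) pair."""
    if len(lower) != len(upper):
        raise ValueError("lower and upper must have the same length")
    rng = rng or random.Random()
    return [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(n_ind)
    ]


# Assumed bounds: one weight in [0, 1] for each of the nine extracted features.
population = create_real_population(6, [0.0] * 9, [1.0] * 9, random.Random(0))
print(len(population), len(population[0]))
```

Each row plays the role of one chromosome: a candidate vector of feature weights.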

Simple GA
% Create population
Chrom = crtrp(NIND, FieldD);
% Evaluate initial population
ObjV = ObjFn2(data, Chrom);
gen = 0;
while gen < MAXGEN
    % Assign fitness value to entire population
    FitnV = ranking(ObjV);
    % Select individuals for breeding
    SelCh = select(SEL_F, Chrom, FitnV, GGAP);
    % Recombine selected individuals (crossover)
    SelCh = recombin(XOV_F, SelCh, XOVR);
    % Perform mutation on offspring
    SelCh = mutate(MUT_F, SelCh, FieldD, MUTR);
    % Evaluate offspring, call objective function
    ObjVSel = ObjFn2(data, SelCh);
    % Reinsert offspring into current population
    [Chrom, ObjV] = reins(Chrom, SelCh, 1, 1, ObjV, ObjVSel);
    gen = gen + 1;
    MUTR = MUTR + 0.001;
end
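The loop above can be approximated in plain Python as follows. This is a simplified sketch, not a port of the toolbox: the rank weighting, single-point crossover, uniform-redraw mutation, and keep-the-best reinsertion are stand-ins chosen for brevity, and `run_ga` and all of its parameter names are hypothetical. Only the overall sequence of steps and the 0.001 per-generation increase in mutation rate follow the slide.

```python
import random
from typing import Callable, List, Sequence, Tuple


def run_ga(
    objective: Callable[[Sequence[float]], float],  # value to minimize, e.g. an error rate
    lower: Sequence[float],
    upper: Sequence[float],
    n_ind: int = 20,
    max_gen: int = 50,
    xov_rate: float = 0.7,
    mut_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    d = len(lower)
    # Create and evaluate the initial population.
    pop = [[rng.uniform(lower[j], upper[j]) for j in range(d)] for _ in range(n_ind)]
    obj = [objective(ind) for ind in pop]
    for _ in range(max_gen):
        # Rank-based fitness: the lowest objective value gets the largest weight.
        order = sorted(range(n_ind), key=lambda i: obj[i])
        fitness = [0.0] * n_ind
        for rank, i in enumerate(order):
            fitness[i] = float(n_ind - rank)
        # Select parents with probability proportional to rank fitness.
        parents = rng.choices(pop, weights=fitness, k=n_ind)
        # Recombine pairs of parents with single-point crossover (on copies).
        offspring: List[List[float]] = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            if d > 1 and rng.random() < xov_rate:
                cut = rng.randrange(1, d)
                a[cut:], b[cut:] = b[cut:], a[cut:]
            offspring += [a, b]
        # Mutate: redraw a gene uniformly within its bounds.
        for ind in offspring:
            for j in range(d):
                if rng.random() < mut_rate:
                    ind[j] = rng.uniform(lower[j], upper[j])
        off_obj = [objective(ind) for ind in offspring]
        # Reinsert: keep the n_ind best of parents and offspring together.
        merged = sorted(zip(obj + off_obj, pop + offspring), key=lambda t: t[0])[:n_ind]
        obj = [o for o, _ in merged]
        pop = [ind for _, ind in merged]
        mut_rate += 0.001  # mirrors MUTR = MUTR + 0.001 on the slide
    best = min(range(n_ind), key=lambda i: obj[i])
    return pop[best], obj[best]
```

In the setting of these slides, `objective` would be a classifier's cross-validated error for a given weight vector, and the returned individual would be the best set of feature weights found.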

Results without GA

Results of using GA

GA Optimization Results

Feature Importance

Conclusion
– Four classifiers were used to segregate the students.
– CMC improves accuracy significantly.
– Weighting the features and using a genetic algorithm to minimize the error rate improves the prediction accuracy by at least 10% in all cases.
– When the number of features is low, feature weighting works better than feature selection.

Contribution
– A new approach to evaluating student usage of web-based instruction
– An approach that is easily adaptable to different types of courses, different population sizes, and different attributes to be analyzed
– Rigorous application of known classifiers as a means of analyzing and comparing use and performance of students who have taken a technical course that was partially/completely administered via the web

Questions
garage.cse.msu.edu