Presented by Khawar Shakeel

Presentation transcript:

Educational Data Mining to inspect low performance academic areas of the students using ensemble classification. Presented by Khawar Shakeel. Authors: Khawar Shakeel, Department of Computer Science, University of Gujrat, Pakistan (Email: khawarshakeel@gmail.com); Naveed Anwer Butt, Department of Computer Science, University of Gujrat, Pakistan (Email: naveed@uog.edu.pk)

Outline: Introduction (What is Educational Data Mining (EDM)? What problems can we solve using EDM? Stakeholders); Design Goal; Related Work; Design Approach; Experimental Results; Suggestions & Future Work

Educational Data Mining (EDM) - Introduction: What is Data Mining (DM)? Data mining is a method for uncovering hidden patterns in large volumes of raw data; it is applied when the data is too large to inspect directly and little is known about it in advance. Educational Data Mining is a growing research area within Data Mining (DM) that applies statistical methods to education-related data in order to improve the systems and quality of higher education institutions.

Possible Questions to be solved by EDM: How can students' learning behavior be predicted? How can students be grouped according to their interests? What are the students' strong and weak areas of study? How can the students who need more help be identified? Which group(s) of students are likely to drop out or be promoted? What kinds of educational resources need to be allocated, and why?

EDM - Stakeholders: Administration: Administrators use EDM to ensure that useful resources are allocated to improve institutional education; faculty and advisors are becoming more proactive in identifying and addressing at-risk students. Educators: Educators attempt to understand the learning process and the methods they can use to improve their teaching. Researchers: Researchers focus on developing data mining techniques and evaluating their effectiveness.

Our study - Design Goal: Design a predictive model capable of exploring the reason(s) for the poor performance of the majority of students in specific course(s) or domains, so that the administration can be informed and take the necessary action. Main tasks: Extraction of predictable attributes from the data source. Identification of the attributes that may determine the learning behavior of the student. Construction of a prediction model based on the selected predictable variables using existing ensemble classification algorithms. Reporting the findings to the administration.

Previous Work: Although data mining in education is not yet a mature field, a lot of work has been done in this area because of its potential value to educational institutions.

Design Approach: (1) Data Collection: Student Information System of the university. (2) Data Pre-processing: Selection, Cleaning, Transformation. (3) Development of a model based on ensemble classification algorithms: Bagging and Boosting (J48 Decision Tree algorithm as base classifier). (4) Useful patterns leading to better decision making.

Proposed Design Overview

Design Approach – Data Collection: Secondary data was initially collected from the semester system through the University Information System. The targeted students are from Master's and BS (HONS) degree programs, registered in different departments across all faculties of a public sector university.

Design Approach – Data Introduction: The attributes to be examined are the students' marks in each assessment category of a course: assignment, quiz, presentation, midterm, final subjective and final objective. The final data for the model included 3130 instances and 7 variables.

Design Approach – Dataset

Design Approach – Data Preprocessing: Data Selection: Two departments from each faculty are selected. The extracted data comes from the 2008 and 2009 batches of BS (HONS) and the 2010 and 2011 batches of master's degree programs. Only academic activity values are recorded as variables; other student information, such as demographic and financial data, is ignored. Data Cleaning: Records of students with missing marks in any exam category of any course are removed, because they can sometimes lead to biased decisions.
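The cleaning rule on this slide (drop any record with a missing mark in any exam category) can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the field names, the record layout and the sample values are all assumptions made for the example.

```python
# Exam categories listed on the "Data Introduction" slide.
# The key spellings are assumptions chosen for this sketch.
CATEGORIES = [
    "assignment", "quiz", "presentation",
    "midterm", "final_subjective", "final_objective",
]

def drop_incomplete(records):
    """Keep only records that have a mark for every exam category."""
    return [
        r for r in records
        if all(r.get(category) is not None for category in CATEGORIES)
    ]

# Hypothetical sample records (not taken from the study's dataset).
records = [
    {"student": "s1", "assignment": 8, "quiz": 7, "presentation": 9,
     "midterm": 20, "final_subjective": 25, "final_objective": 12},
    {"student": "s2", "assignment": 6, "quiz": None, "presentation": 7,
     "midterm": 14, "final_subjective": 18, "final_objective": 10},
]

cleaned = drop_incomplete(records)
```

Here the record for `s2` is dropped because its quiz mark is missing, while `s1` is kept.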

Design Approach – Data Preprocessing: Many kinds of courses are taught at the university: general, elective, compulsory and core courses. The grade for a course depends on which grading range the obtained marks fall into. Here we consider C, D and F as low grades, where the marks tend to be below 60. These grades are considered low because they affect the student's GPA negatively. Courses with a higher percentage of low grades are selected for analysis, and for the selected courses the data of all students is collected.
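The course-selection step can be sketched as follows. The cutoff of 60 marks comes from the slide; the share of low grades that counts as "higher" is not stated in the slides, so `min_share` is a hypothetical parameter, and the course names and marks are invented for illustration.

```python
from collections import defaultdict

LOW_GRADE_CUTOFF = 60  # from the slide: C, D and F correspond to marks below 60

def is_low_grade(total_marks):
    """Return True when the total marks fall in the C, D or F range."""
    return total_marks < LOW_GRADE_CUTOFF

def low_grade_share(results):
    """Map each course to the fraction of its results that are low grades.

    `results` is an iterable of (course, total_marks) pairs.
    """
    totals = defaultdict(int)
    lows = defaultdict(int)
    for course, marks in results:
        totals[course] += 1
        if is_low_grade(marks):
            lows[course] += 1
    return {course: lows[course] / totals[course] for course in totals}

def select_courses(results, min_share=0.5):
    """Return the courses whose low-grade share is at least `min_share`.

    `min_share` is a hypothetical threshold; the slides do not give one.
    """
    shares = low_grade_share(results)
    return sorted(course for course, share in shares.items() if share >= min_share)

# Hypothetical results (not taken from the study's dataset).
results = [
    ("Calculus", 45), ("Calculus", 58), ("Calculus", 72),
    ("Programming", 81), ("Programming", 55), ("Programming", 90),
]

selected = select_courses(results)
```

With these invented marks, two of three Calculus results are below 60, so only Calculus passes the assumed 50% threshold.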

Our Approach –Ensemble Classification

Our Approach – Ensemble Classification: Ensemble classification techniques combine several classifiers in order to obtain more reliable results. The most common model-combining approaches in data mining are bagging and boosting. Bagging uses a voting structure in which n models, generally of the same type, are built. For an unseen instance, each model's prediction is collected, and the class that receives the majority of the votes is assigned. Boosting is almost the same as bagging; only the model-building stage changes. Here, instances that are repeatedly misclassified are allowed to contribute more often to training. There are normally n classifiers, each weighted according to its accuracy. Finally, the class with the maximum total weight is assigned.
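The two combination rules described on this slide can be illustrated with a small Python sketch. It covers only the final voting step, not the training of the base classifiers (the study uses J48 decision trees), and the labels and weights below are invented for the example.

```python
from collections import Counter, defaultdict

def bagging_vote(predictions):
    """Bagging-style combination: every model has one equal vote."""
    return Counter(predictions).most_common(1)[0][0]

def boosting_vote(predictions, weights):
    """Boosting-style combination: each vote counts with its model's weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Hypothetical predictions from three models for one unseen student record.
predictions = ["Low", "High", "Low"]

# Hypothetical accuracy-based weights for the same three models.
weights = [0.2, 0.9, 0.3]

bagged = bagging_vote(predictions)             # "Low": two votes against one
boosted = boosting_vote(predictions, weights)  # "High": weight 0.9 against 0.5
```

The same three predictions give different answers under the two rules, which is the practical difference between an unweighted majority vote and a weighted one.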

Results – Basic statistical Facts

Academic Facts

Results – Technical Facts

Results – Statistical Summary

Technical Bottom line: The results show that the boosted tree outperforms the bagged tree when the standard deviations are in a higher range and, at the same time, the data size is small.

Suggestions: Summarizing these facts, it is concluded that students need to improve their performance in subjective assessments such as “Mid Term” and “Final Subjective” in order to earn “High Grades” and be promoted to the next semester. There should also be balance in the evaluation of students at different levels.

Future Work: As future work, other factors related to our research questions, such as financial and behavioral factors, will be included; these may lead to better classification and help answer new real-time questions from the educational environment. More facts can be captured by enlarging the dataset and including other variables that point to new directions in decision making. Other mining techniques can also be applied to discover findings beyond classification.

Thank You. khawarshakeel@gmail.com My presentation is over. Thank you very much. Do you have any questions?