MCMS: Mining Course Management Systems
Samia Oussena, Thames Valley University, Samia.oussena@tvu.ac.uk
Project Aim
MCMS is a JISC-funded project that aims to use data mining to support TVU's strategy on student retention and course monitoring.
Project Overview
MCMS monitors student information sources and generates events, reports, advice, etc. to identify potential divergence from prescribed educational processes and to recommend remedial actions.
[Diagram labels: BB etc.; student information data sources; events, records etc.; MCMS; events, reports, advice; educational process descriptions]
Project Objectives
Conduct a detailed survey of the stakeholders' main areas of concern and of good intervention practices.
Conduct a data analysis of the institution's database systems relevant to the problem areas.
Propose and implement a data integration model.
Build and evaluate data mining models.
Build an application that uses the data mining models and implements the intervention requirements.
Collecting Data Sources
Process: Collecting Data Sources -> Data Integration -> Data Mining -> Student Intervention
TVU Data Sources
Systems: UNIT-E, TALIS-LIST, MSG, Blackboard, Progress File, PS, TALIS, Marketing System, Shibboleth
Data held: student background profile, course/module, enrolment and assessment results; reading lists; loan history; module profiles; student online activities and module online content size; student basic skills in English, maths and IT; course profiles; e-resource access logs; student loan history; course offering details
MCMS Data Warehouse
Model Driven Data Merging
[Diagram: data sources are merged into a data target. Meta model: AWM (ATLAS Weaving Model). Logical model: UML-based data source model, UML-based merging model, UML-based integrated model. Physical model: DB model, OWB TM (Oracle Warehouse Builder Transformation Model). Real code: flat files/DB data, SQL Loader, PL/SQL, DDL.]
Design of the course and module cubes
[Diagrams: Course Cube; Module Cube]
Example of a Cube
[Figure: sample query results showing dropout rates; dimensions shown: study mode, school, year, semester]
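A roll-up of dropout rates along cube dimensions can be sketched in a few lines. This is only an illustration: the records, the field names (`study_mode`, `school`, `year`, `dropped_out`) and all values are invented, and the project's cubes were built in its data warehouse rather than in Python.

```python
from collections import defaultdict

# Hypothetical enrolment facts; every value here is made up for illustration.
FACTS = [
    {"study_mode": "FT", "school": "Computing", "year": 2008, "dropped_out": True},
    {"study_mode": "FT", "school": "Computing", "year": 2008, "dropped_out": False},
    {"study_mode": "PT", "school": "Computing", "year": 2008, "dropped_out": False},
    {"study_mode": "FT", "school": "Business", "year": 2008, "dropped_out": False},
    {"study_mode": "PT", "school": "Business", "year": 2008, "dropped_out": True},
    {"study_mode": "PT", "school": "Business", "year": 2008, "dropped_out": False},
]


def dropout_rate(facts, dimensions):
    """Roll the facts up along the given dimensions and return dropout rates."""
    totals = defaultdict(int)
    dropouts = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        totals[key] += 1
        dropouts[key] += row["dropped_out"]  # True counts as 1, False as 0
    return {key: dropouts[key] / totals[key] for key in totals}


# Two views of the same facts: a finer and a coarser aggregation level.
by_mode_and_school = dropout_rate(FACTS, ["study_mode", "school"])
by_mode = dropout_rate(FACTS, ["study_mode"])
```

Passing a shorter dimension list corresponds to rolling the cube up to a coarser level; a longer list drills down.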
Data Mining Process
First, transform the data to fit the data mining models. Then apply feature importance and association rules to find relations among data features. Next, classify the data and extract human-friendly rules and patterns. Finally, apply regression to predict future behaviours.
1. Pre-process the data (Pre-Processing)
2. Find feature relations (Feature Importance / Association Rules)
3. Group data and extract possible rules (Classification / Clustering)
4. Predict future behaviours (Regression / Prediction)
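To make the association-rule step concrete, here is a minimal sketch of the two standard rule measures, support and confidence. The attribute names and the five student records are hypothetical, and the slides do not say which tool or thresholds the project used.

```python
# Hypothetical per-student attribute sets; all values are invented.
STUDENTS = [
    {"part_time", "low_bb_usage", "passed"},
    {"part_time", "low_bb_usage", "passed"},
    {"full_time", "high_bb_usage", "passed"},
    {"full_time", "low_bb_usage", "failed"},
    {"full_time", "high_bb_usage", "failed"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent); assumes the antecedent occurs at least once."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


# Rule under test: part_time => passed
rule_support = support({"part_time", "passed"}, STUDENTS)
rule_confidence = confidence({"part_time"}, {"passed"}, STUDENTS)
```

A rule is usually kept only if both measures exceed chosen thresholds; the thresholds themselves are a modelling decision not covered here.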
Data Mining Pre-Processing
Summarize data at different levels (e.g. overall module average mark, total number of resits, total book loans, etc.).
Discard short-course data (150 courses 100).
Map the entry certificate to a numeric value.
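The summarising and mapping steps might look like the sketch below. The certificate ranking, the field names and the sample rows are all assumptions made for illustration; the slides do not give the actual mapping or schema.

```python
from statistics import mean

# Hypothetical ordinal encoding of entry certificates; the real mapping
# used by the project is not given in the slides.
CERTIFICATE_RANK = {"GCSE": 1, "A-Level": 2, "HND": 3, "Degree": 4}

# Hypothetical module-level records for two students.
RECORDS = [
    {"student": "s1", "certificate": "A-Level", "mark": 62, "resits": 0, "loans": 3},
    {"student": "s1", "certificate": "A-Level", "mark": 48, "resits": 1, "loans": 1},
    {"student": "s2", "certificate": "Degree", "mark": 71, "resits": 0, "loans": 0},
]


def summarise(records):
    """Collapse module-level rows into one summary row per student."""
    by_student = {}
    for row in records:
        by_student.setdefault(row["student"], []).append(row)
    summary = {}
    for student, rows in by_student.items():
        summary[student] = {
            "average_mark": mean(r["mark"] for r in rows),
            "total_resits": sum(r["resits"] for r in rows),
            "total_loans": sum(r["loans"] for r in rows),
            # Map the categorical certificate to its numeric rank.
            "certificate_rank": CERTIFICATE_RANK[rows[0]["certificate"]],
        }
    return summary


SUMMARY = summarise(RECORDS)
```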
Finding Relations: Student Data
“Is student performance (average mark, dropout, pass/fail) related to the student's background profile?”
“Is student performance (average mark, dropout, pass/fail) related to Blackboard and library usage?”
Finding Relations: Student Data
Student performance is not related to gender, race, age, disability, nationality, etc. It is related to the student's current year of study (Current_StudyYear) and Blackboard usage (BB_Usage), and slightly related to library usage (Library_Usage).
However, the frequency of BB access is not related to academic performance. Even within the same module, some students with very high marks use BB very rarely, whereas some frequent users have very low marks.
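One simple way to check whether a feature moves with average mark is a correlation coefficient. The slides do not say which measure the project used, so Pearson correlation here is a stand-in, and every value in the sample data is invented to show one related and one unrelated feature.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-student values; none of these come from TVU data.
average_mark = [40, 50, 60, 70, 80]
study_year = [1, 1, 2, 3, 3]
bb_logins = [90, 10, 50, 10, 90]

r_year = pearson(study_year, average_mark)    # strongly positive in this toy data
r_logins = pearson(bb_logins, average_mark)   # zero in this toy data
```

Pearson correlation only captures linear relations between numeric features, so categorical attributes such as gender or nationality would need a different measure.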
Finding Patterns: Student Data
“Do part-time students behave differently from full-time students?”
Part-time students enrol with higher entry certificates, get higher marks, have fewer resits, and drop out less, but use the library and BB less.
Prediction Result
“Will Student A drop out or not?”
Naïve Bayes: the Naïve Bayes algorithm is based on conditional probabilities. It uses Bayes' theorem, which expresses the posterior probability of a hypothesis in terms of its prior probability and the likelihood of the observed evidence.
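A minimal categorical Naïve Bayes classifier shows how the prior and the per-feature conditional probabilities combine into a dropout prediction. This is a sketch under stated assumptions, not the project's model: the features (`mode`, `bb_usage`), the training rows and the add-one smoothing are all choices made here for illustration.

```python
from collections import Counter, defaultdict


def train(rows, label_key):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(row[label_key] for row in rows)
    value_counts = defaultdict(Counter)   # (label, feature) -> Counter of values
    feature_values = defaultdict(set)     # feature -> set of values seen
    for row in rows:
        label = row[label_key]
        for feature, value in row.items():
            if feature == label_key:
                continue
            value_counts[(label, feature)][value] += 1
            feature_values[feature].add(value)
    return class_counts, value_counts, feature_values


def posterior(model, features):
    """Return P(label | features) for each label, using add-one smoothing."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(label)
        for feature, value in features.items():
            seen = value_counts[(label, feature)][value]
            # Smoothed estimate of P(value | label)
            score *= (seen + 1) / (count + len(feature_values[feature]))
        scores[label] = score
    norm = sum(scores.values())  # normalise so the posteriors sum to 1
    return {label: s / norm for label, s in scores.items()}


# Hypothetical training rows; the attributes and outcomes are invented.
ROWS = [
    {"mode": "FT", "bb_usage": "low", "dropout": "yes"},
    {"mode": "FT", "bb_usage": "low", "dropout": "yes"},
    {"mode": "FT", "bb_usage": "high", "dropout": "no"},
    {"mode": "PT", "bb_usage": "high", "dropout": "no"},
    {"mode": "PT", "bb_usage": "low", "dropout": "no"},
    {"mode": "PT", "bb_usage": "high", "dropout": "no"},
]

MODEL = train(ROWS, "dropout")
STUDENT_A = posterior(MODEL, {"mode": "FT", "bb_usage": "low"})
```

The "naïve" part is the assumption that features are independent given the class, which is why the conditional probabilities are simply multiplied together.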
Conclusion and Future Work
The JISC-funded MCMS project at Thames Valley University aims to apply data mining technology to institutional data sources in order to identify predictive rules that can be used to detect and address issues related to student retention.
The project has addressed data integration issues, including technical, organizational and legal issues.
The project built and evaluated data mining models that identify student patterns and can predict behaviour.
Future work:
Build a personalised intervention system.
Run a pilot in the next academic year.