Problem
- MATLAB is primarily a command-line environment.
- No single tool brings many data mining functions together.
- The command line is hard for novice users to work with.
- The data mining functions need user interfaces.
Solution
- Design a data mining environment within MATLAB that combines many data mining functions.
- Use MATLAB's GUI Design Environment (GUIDE) to design the interfaces.
The Project
- This study continues last year's student project, “Developing Data Mining Platform”.
- In this project, the data mining functions added to MATLAB are given graphical user interfaces, so that all of them can be used from those interfaces.
Methodology
CRISP-DM Methodology
- Data Understanding
- Data Preparation
- Modeling
- Evaluation
MATLAB Environment
- High-level language for technical computing
- Development environment for managing code, files, and data
- Interactive tools for iterative exploration, design, and problem solving
- Mathematical functions for linear algebra, statistics, Fourier analysis, filtering, optimization, and numerical integration
- 2-D and 3-D graphics functions for visualizing data
- Tools for building custom graphical user interfaces
- Functions for integrating MATLAB-based algorithms with external applications and languages, such as C, C++, Fortran, Java, COM, and Microsoft Excel
Menu Structures
File
- Read
  - Read_From_File
  - Read_From_ODBC
- Save
- Save As...
- Exit
Menu Structures - File
- Read_From_File: Retrieves data from text files and writes it to the tool's spreadsheet format.
- Read_From_ODBC: Retrieves data from a data source via an ODBC driver and writes it to the tool's spreadsheet format.
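The idea behind Read_From_File can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the tool's MATLAB implementation. The function name `read_from_text`, the comma delimiter, the header row, and the sample data are all assumptions.

```python
import csv
import io


def read_from_text(stream, delimiter=","):
    """Parse delimited text into a header list and a list of rows.

    Assumption: the first line holds the variable names.
    Numeric fields become floats, empty fields become None (missing),
    and anything else is kept as a string.
    """
    reader = csv.reader(stream, delimiter=delimiter)
    header = next(reader)
    rows = []
    for record in reader:
        row = []
        for field in record:
            field = field.strip()
            if field == "":
                row.append(None)
            else:
                try:
                    row.append(float(field))
                except ValueError:
                    row.append(field)
        rows.append(row)
    return header, rows


# Made-up sample data standing in for a text file on disk.
sample = io.StringIO("age,income,city\n34,52000,Izmir\n41,,Ankara\n")
header, rows = read_from_text(sample)
```

Keeping missing fields as an explicit marker at load time lets a later preparation step decide how to handle them.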
Menu Structures
Data
- Run Matlab Command
- Create List
- Add List
- Remove List
- Set Meta
- Add Meta
- List Meta
- Descriptives
Menu Structures - Data
- Run Matlab Command: Works like the MATLAB Command Window.
- Create List: Creates variable lists for the data mining functions.
- Add List: Adds new variable names to a variable list and merges lists.
- Remove List: Removes variable names from a variable list.
Menu Structures - Data
- Set Meta: Sets the metadata value of a variable.
- Add Meta: Adds new values to the metadata of a variable.
- List Meta: Shows the values and their metadata values.
- Descriptives: Displays statistics of the selected variable.
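As a rough illustration of what Descriptives might report for a selected variable, here is a small Python sketch. The slides do not list the statistics shown by the tool, so the set below (count, missing, mean, standard deviation, minimum, maximum) and the function name `descriptives` are assumptions.

```python
import statistics


def descriptives(values):
    """Summarize one variable; None entries are counted as missing."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("variable has no non-missing values")
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(present) if len(present) > 1 else 0.0,
        "min": min(present),
        "max": max(present),
    }


# Made-up values for a single variable, with one missing entry.
summary = descriptives([2.0, 4.0, None, 4.0, 6.0])
```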
Menu Structures
Preparation
- Missing_Value
- Sampling
- Transformation
- Discretization
Menu Structures - Preparation
- Missing_Value: Replaces the missing values of variables, or removes rows according to the number of missing values in the row.
- Sampling: Selects samples from the specified data set with the selected sampling method.
- Transformation: Transforms the columns into specified ranges.
- Discretization: Transforms the data into discrete values according to given intervals.
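The four preparation steps can be sketched as follows. This is an illustrative Python sketch, not the tool's MATLAB code. The function names are hypothetical, and the specific choices (mean replacement, simple random sampling, linear rescaling, half-open intervals) are assumptions, since the slides do not say which methods the tool offers.

```python
import bisect
import random


def replace_missing(column):
    """Replace None entries with the mean of the present values."""
    present = [v for v in column if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in column]


def drop_sparse_rows(rows, max_missing):
    """Keep only rows that have at most max_missing None entries."""
    return [r for r in rows if sum(v is None for v in r) <= max_missing]


def simple_random_sample(rows, n, seed=0):
    """Draw n rows without replacement; the seed makes the draw repeatable."""
    return random.Random(seed).sample(rows, n)


def rescale(column, low=0.0, high=1.0):
    """Linearly map the column onto the range [low, high]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [low for _ in column]
    return [low + (v - lo) * (high - low) / (hi - lo) for v in column]


def discretize(column, edges):
    """Map each value to an interval index.

    edges must be sorted ascending; index i covers [edges[i-1], edges[i]).
    """
    return [bisect.bisect_right(edges, v) for v in column]


# Made-up data for illustration.
column = [10.0, None, 30.0, 20.0]
filled = replace_missing(column)
scaled = rescale(filled)
binned = discretize(filled, [15.0, 25.0])
kept = drop_sparse_rows([[1.0, None, None], [1.0, 2.0, None]], max_missing=1)
picked = simple_random_sample([[1], [2], [3], [4]], 2)
```

Each step takes plain lists and returns new ones, so the steps can be chained in any order.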
Menu Structures
Functionality
- Association
- Classification
- Clustering
- Regression
Menu Structures - Functionality
- Association: Extracts association rules from the specified data set.
- Classification: Trains a neural network, computes the errors, and returns the trained network and the errors in a structure. Supports cross-validation and bootstrap techniques.
- Clustering: Performs K-means clustering and reports the distances between clusters and the cluster sizes.
- Regression: Applies multiple linear regression, computes the beta values and errors, and returns them in a structure. Supports cross-validation and bootstrap techniques.
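To make the Clustering item concrete, here is a minimal K-means sketch in Python. It is an illustration under stated assumptions, not the tool's MATLAB implementation: the function name `kmeans` is hypothetical, the initial centers are simply the first k points, and "distance between clusters" is taken to mean the Euclidean distance between cluster centers.

```python
import math


def kmeans(points, k, iterations=100):
    """Cluster points with plain K-means (Lloyd's algorithm).

    Returns a dict with the centers, the cluster index of each point,
    the cluster sizes, and the pairwise distances between centers.
    """
    # Assumption: deterministic start from the first k points.
    centers = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        # Assign every point to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    sizes = [assignment.count(c) for c in range(k)]
    center_distances = [[math.dist(a, b) for b in centers] for a in centers]
    return {
        "centers": centers,
        "assignment": assignment,
        "sizes": sizes,
        "center_distances": center_distances,
    }


# Made-up two-dimensional points forming two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0), (2.0, 1.0), (9.0, 8.0)]
result = kmeans(points, k=2)
```

Returning everything in one dictionary mirrors the slides' description of results being returned within a structure.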
Conclusion
- User interfaces were designed for the data mining functions.
- Unlike the previous work, this study also covers some pre-processing functions and data models, such as association.
- The tool gives the data mining functions a visual front end and increases user flexibility by embedding different data mining functions and models.
Recommendations
- The association tool can be embedded into the tool with some modification, and the set of data models and data mining functions can be extended.
- The reporting capabilities of the tool can be improved, and the functions and reports can be served over the internet through web services.
Thank you...