WEKA - Machine Learning On Science Gateway
Stephan Nathanael Mgaya
TERNET

Sci-GaIA Workshop, Dar es Salaam (Tanzania), February 3, 2017

WEKA - Machine Learning On Science Gateway - Intro

Introduction
- WEKA is free software with a collection of tools and algorithms for data analysis and predictive modelling.
- WEKA supports several standard data mining tasks: data pre-processing and feature selection, clustering, classification, regression, and visualization.
- WEKA is a very useful tool to experiment with, test, and compare the performance of various data mining algorithms.

Aim
- Port WEKA to run on the Science Gateway.
- Develop a web interface to access and use WEKA features.
- Test the WEKA web interface using the Breast Cancer use case.

Benefits
- Access to high computing power.
- No need to install WEKA on the user's computer.
- User-friendly web interface and visualization.

Limitations
- WEKA is widely used as a desktop application.
- Its performance is limited by the host computer's resources.

@ei4africa | #SciGaIA
e-Infrastructures for Africa Community
This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement n°654237

WEKA - What has been done

What has been done
- Learned the Future Gateway architecture during the hackfest in Italy.
- Developed portlet user support on GitHub.
- Tested the WEKA web interface using the Breast Cancer use case.
- Added the portlet to the African Science Gateway.
- Added more algorithms to the portlet.
- Ran some tests.

Classifiers added
- Support Vector Machines (also called SVM): functions.SMO
- k-Nearest Neighbors (also called KNN): lazy.IBk
- Decision Tree (specifically the C4.5 variety): trees.J48
- Naive Bayes: bayes.NaiveBayes
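
The short names listed under "Classifiers added" are WEKA class names inside the weka.classifiers package. As a minimal sketch, assuming a local weka.jar and an ARFF dataset file (both paths are placeholders, not taken from the slides), the following builds the WEKA command line for a cross-validated run of one of those classifiers. It only constructs the command; how the portlet actually invokes WEKA is not described in the slides.

```python
# Map the classifier labels from the slide to fully qualified WEKA class names.
CLASSIFIERS = {
    "SVM": "weka.classifiers.functions.SMO",
    "KNN": "weka.classifiers.lazy.IBk",
    "C4.5": "weka.classifiers.trees.J48",
    "NaiveBayes": "weka.classifiers.bayes.NaiveBayes",
}


def build_weka_command(label, arff_path, weka_jar="weka.jar", folds=10):
    """Return the argument list for running a WEKA classifier from the CLI.

    weka_jar and arff_path are placeholder paths (assumptions for illustration).
    -t names the training file and -x sets the number of cross-validation folds.
    """
    if label not in CLASSIFIERS:
        raise ValueError(f"unknown classifier label: {label}")
    return [
        "java", "-cp", weka_jar,
        CLASSIFIERS[label],
        "-t", arff_path,
        "-x", str(folds),
    ]


if __name__ == "__main__":
    # Print the command for the decision tree on a hypothetical dataset file.
    print(" ".join(build_weka_command("C4.5", "breast-cancer.arff")))
```

The argument list can be passed to subprocess.run on a machine where Java and WEKA are installed; the labels in CLASSIFIERS are illustrative keys chosen here, not names used by the portlet.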

WEKA - Architecture

Architecture of Future Gateway
A framework to build science gateways.
- Portlet: contains the GUI for SG applications.
- Grid&Cloud Engine (GridEngine): system to manage jobs running on Grid and Cloud infrastructures.
- SAGA/JSAGA: SAGA is an OGF standard to manage jobs on distributed infrastructures; JSAGA is a Java implementation of SAGA.
- REST APIs: communication protocol over HTTP to manage network applications.
- APIServer: the engine implementing the REST APIs.
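
To illustrate how a portlet might talk to the APIServer over REST, here is a minimal sketch that builds, but does not send, an HTTP POST request describing a job. The server address, the /tasks path, and the JSON field names (application, description, arguments) are all assumptions made for illustration; the slides do not specify the actual API.

```python
import json
import urllib.request

# Hypothetical values: the slides do not give the API server address,
# the endpoint path, or the JSON field names used below.
API_BASE = "http://localhost:8888/v1.0"


def build_task_request(application_id, arguments, description="WEKA run"):
    """Build (but do not send) an HTTP POST request describing a job."""
    payload = {
        "application": application_id,
        "description": description,
        "arguments": list(arguments),
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/tasks",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "weka" is a placeholder application identifier.
    req = build_task_request("weka", ["trees.J48", "breast-cancer.arff"])
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect; an actual submission would pass the request to urllib.request.urlopen against a real server.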

WEKA - Advantages

Advantages
- Wide access
- Same result
- Same edu id
- High performance
- Reusability
- User friendly

WEKA - End

Thank you