Asst. Prof. Sotarat Thammaboosadee, Ph.D.
EGIT532 - Data Science and Big Data Analytics
Individual Project Specification
Project
- Individual project.
- Submit the report as a PDF via email to zotarat@gmail.com.
- Deadline: before 12 May 2019.
- Email subject: project-61xxxxx
Topics
- Problems
- Source of Data
- Data Mining Tasks
- Data Mining Process
  - Business understanding
  - Data understanding
  - Data preprocessing
  - Model building
  - Model evaluation
  - Deployment
Problems
- What is your motivation for applying data science to your data?
Source of Data
- Any data source is acceptable.
- At least 10,000 examples.
- At least 8 attributes, or text data.
- If you put extra effort into this stage, the data may become part of your thesis or thematic paper.
Data Mining Tasks
- Classification, clustering, association, etc.
- What task did you choose? Why?
Business Understanding
- Provide one or more paragraphs to introduce your work.
Presentation
- Provide one or more flowcharts of your data mining process.
- You may capture the RapidMiner workflow.
- Rename each box with a meaningful name.
Data Understanding
- Type of each attribute
- Example data set
- Meaning?
- Statistical report
- Data profile
- Visualization
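A data profile of the kind listed above can be produced with a few lines of code as well as with RapidMiner. The sketch below is only an illustration: the function `profile`, the rule for deciding whether an attribute is numeric, and the three records are all made up for this example.

```python
import statistics
from collections import Counter


def _to_float(value):
    """Return value as a float, or None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def profile(rows):
    """Summarize each attribute of a list of dict records.

    Empty strings and None count as missing. An attribute is treated as
    numeric only if every non-missing value parses as a float.
    """
    report = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [r.get(col) for r in rows]
        present = [v for v in values if v not in ("", None)]
        numbers = [_to_float(v) for v in present]
        summary = {"missing": len(values) - len(present)}
        if present and all(n is not None for n in numbers):
            summary.update(
                type="numeric",
                min=min(numbers),
                max=max(numbers),
                mean=statistics.mean(numbers),
            )
        else:
            summary.update(
                type="nominal",
                distinct=len(set(present)),
                most_common=Counter(present).most_common(1),
            )
        report[col] = summary
    return report


# Made-up records for illustration only.
records = [
    {"age": "30", "city": "Bangkok"},
    {"age": "40", "city": "Bangkok"},
    {"age": "", "city": "Chiang Mai"},
]
for name, summary in profile(records).items():
    print(name, summary)
```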
Data preprocessing
- Use more than one method.
- State the reason why you chose each one.
- Show a data visualization or profile for each processing step.
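As an illustration of "more than one method" with a profile after each step, the sketch below applies two common scaling methods to one numeric attribute. Min-max scaling and z-score standardization are chosen here only as examples and are not prescribed by the specification; the helper names and the input values are hypothetical.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Center on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def describe(values):
    """Tiny profile to report after each preprocessing step."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
    }


raw = [2.0, 4.0, 6.0, 8.0]  # made-up values for illustration
steps = [("raw", list), ("min-max", min_max_scale), ("z-score", z_score_scale)]
for name, step in steps:
    print(name, describe(step(raw)))
```

Reporting the same small profile after every step makes it easy to justify each method: the reader can see what the step changed.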
Model building
- Use more than one algorithm.
- A single model may combine several algorithms, depending on your design.
- A more complex (but reasonable) process will earn more points.
Model Evaluation
- Compare the results of each preprocessing method and each algorithm.
- Select appropriate evaluation criteria.
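The comparison described above amounts to a grid: one score per (preprocessing method, algorithm) pair. The following is a minimal sketch under several assumptions: hold-out accuracy stands in for whatever criterion fits your task, a majority-class baseline and 1-nearest-neighbor stand in for your real algorithms, and the six data points are invented and deliberately contrived so that scaling changes the result. All function names are hypothetical.

```python
import math
from collections import Counter


def identity(train, test):
    """No preprocessing."""
    return train, test


def min_max(train, test):
    """Rescale each column to [0, 1] using the training split only."""
    cols = list(zip(*train))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]

    def scale(rows):
        return [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]

    return scale(train), scale(test)


def majority_baseline(train_x, train_y, test_x):
    """Always predict the most frequent training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return [label for _ in test_x]


def one_nearest_neighbor(train_x, train_y, test_x):
    """Predict the label of the closest training example (Euclidean)."""
    preds = []
    for x in test_x:
        nearest = min(range(len(train_x)), key=lambda i: math.dist(x, train_x[i]))
        preds.append(train_y[nearest])
    return preds


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare(train_x, train_y, test_x, test_y, preprocessors, algorithms):
    """Hold-out accuracy for every (preprocessing, algorithm) pair."""
    results = {}
    for p_name, prep in preprocessors.items():
        tr, te = prep(train_x, test_x)
        for a_name, algo in algorithms.items():
            results[(p_name, a_name)] = accuracy(test_y, algo(tr, train_y, te))
    return results


# Invented data: the first attribute carries the class, the second is
# large-valued noise that dominates unscaled distances.
train_x = [[0.1, 100.0], [0.2, 900.0], [0.9, 200.0], [0.8, 800.0]]
train_y = ["A", "A", "B", "B"]
test_x = [[0.15, 810.0], [0.85, 110.0]]
test_y = ["A", "B"]

results = compare(
    train_x, train_y, test_x, test_y,
    preprocessors={"raw": identity, "min-max": min_max},
    algorithms={"baseline": majority_baseline, "1-NN": one_nearest_neighbor},
)
for pair, score in results.items():
    print(pair, score)
```

Accuracy is only one possible criterion; for imbalanced classes, measures such as precision, recall, or F1 may be more appropriate, and the `accuracy` function can be swapped for any of them.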
Deployment
- What do you obtain from the results?
- Use visualization to present them.
- Knowledge, application, policy, etc.