CALIFORNIA STATE UNIVERSITY, SACRAMENTO

Slides:



Advertisements
Similar presentations
Web Mining.
Advertisements

Market Research Ms. Roberts 10/12. Definition: The process of obtaining the information needed to make sound marketing decisions.
By Klejdi Muca & Stephen Quinn. A method used by companies like IMDB or Netlfix to turn raw data into useful information, for example It helps companies.
1 Evaluation Rong Jin. 2 Evaluation  Evaluation is key to building effective and efficient search engines usually carried out in controlled experiments.
Introduction to Research Methodology
Assessment. Schedule graph may be of help for selecting the best solution Best solution corresponds to a plateau before a high jump Solutions with very.
Memory-Based Recommender Systems : A Comparative Study Aaron John Mani Srinivasan Ramani CSCI 572 PROJECT RECOMPARATOR.
Improving the Accuracy of Continuous Aggregates & Mining Queries Under Load Shedding Yan-Nei Law* and Carlo Zaniolo Computer Science Dept. UCLA * Bioinformatics.
Data Mining – Intro.
ROLE AND IMPORTANCE OF MARKET RESEARCH  Process of collecting and analyzing data for a good/service in a market  Analyzing consumer reaction to eg.
Proposal in Detail – Part 2
Steps in a Marketing Research Project
Web Usage Mining Sara Vahid. Agenda Introduction Web Usage Mining Procedure Preprocessing Stage Pattern Discovery Stage Data Mining Approaches Sample.
Attention Deficit Hyperactivity Disorder (ADHD) Student Classification Using Genetic Algorithm and Artificial Neural Network S. Yenaeng 1, S. Saelee 2.
VIRTUAL BUSINESS RETAILING
Repository Method to suit different investment strategies Alma Lilia Garcia & Edward Tsang.
1 Data Mining DT211 4 Refer to Connolly and Begg 4ed.
Kansas State University Department of Computing and Information Sciences CIS 830: Advanced Topics in Artificial Intelligence From Data Mining To Knowledge.
Data Mining Chun-Hung Chou
Intrusion Detection Jie Lin. Outline Introduction A Frame for Intrusion Detection System Intrusion Detection Techniques Ideas for Improving Intrusion.
CIS 9002 Kannan Mohan Department of CIS Zicklin School of Business, Baruch College.
Lecture 9: Knowledge Discovery Systems Md. Mahbubul Alam, PhD Associate Professor Dept. of AEIS Sher-e-Bangla Agricultural University.
Recommendation system MOPSI project KAROL WAGA
Market Research Lesson 6. Objectives Outline the five major steps in the market research process Describe how surveys can be used to learn about customer.
Data Mining Chapter 1 Introduction -- Basic Data Mining Tasks -- Related Concepts -- Data Mining Techniques.
ITS-VIP SPRING 2012 FINAL PRESENTATION DATA MINING GROUP PHP?HTML INTERFACE Mide Ajayi Nakul Dureja Data Miners Rakesh Kumar David Fleischhauer.
Data MINING Data mining is the process of extracting previously unknown, valid and actionable information from large data and then using the information.
Data Mining By Dave Maung.
Performance Prediction for Random Write Reductions: A Case Study in Modelling Shared Memory Programs Ruoming Jin Gagan Agrawal Department of Computer and.
Vision + Focus + Execution Meiliu Lu, RVR 5016, For CSc 209 Spring 2003, 5/6/03.
1 CS 391L: Machine Learning: Experimental Evaluation Raymond J. Mooney University of Texas at Austin.
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
Pin-Yun Tarng / An Analysis of WoW Players’ Game Hours Network and Systems Laboratory nslab.ee.ntu.edu.tw IEEE/IFIP DSN 2008 Network and Systems Laboratory.
1 1 Slide Simulation Professor Ahmadi. 2 2 Slide Simulation Chapter Outline n Computer Simulation n Simulation Modeling n Random Variables and Pseudo-Random.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
WHAT IS DATA MINING?  The process of automatically extracting useful information from large amounts of data.  Uses traditional data analysis techniques.
Lecture-6 Bscshelp.com. Todays Lecture  Which Kinds of Applications Are Targeted?  Business intelligence  Search engines.
DATA MINING and VISUALIZATION Instructor: Dr. Matthew Iklé, Adams State University Remote Instructor: Dr. Hong Liu, Embry-Riddle Aeronautical University.
Data Mining - Introduction Compiled By: Umair Yaqub Lecturer Govt. Murray College Sialkot.
Profiling: What is it? Notes and reflections on profiling and how it could be used in process mining.
Prepared by Fayes Salma.  Introduction: Financial Tasks  Data Mining process  Methods in Financial Data mining o Neural Network o Decision Tree  Trading.
Data mining in web applications
Market research THE TIMES 100.
Data Collection Techniques
Collage Score Card & Software defect prediction
Data Mining – Intro.
OPERATING SYSTEMS CS 3502 Fall 2017
By Arijit Chatterjee Dr
Presented by Khawar Shakeel
Market Intelligence Analysis
A Methodology for Finding Bad Data
Statistical Process Control (SPC)
Complete Buying Process
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Predict House Sales Price
4.1.
Handling Data Using Databases
What does science mean to you?
Data Science introduction.
iSRD Spam Review Detection with Imbalanced Data Distributions
Statistical Reasoning
Course Introduction CSC 576: Data Mining.
Statistics · the study of information Data · information
Intro to Machine Learning
Nearest Neighbors CSC 576: Data Mining.
Data Warehousing Data Mining Privacy
Analysis for Predicting the Selling Price of Apartments Pratik Nikte
Happiness and Stocks Ali Javed, Tim Stevens
Credit Card Fraudulent Transaction Detection
Austin Karingada, Jacob Handy, Adviser : Dr
Presentation transcript:

CALIFORNIA STATE UNIVERSITY, SACRAMENTO PRESENTATION on RESEARCH PAPER AND PROJECT PROPOSAL Presented by DEVESH BINJOLA NICHOLAS HANKS Under the Guidance of Dr. Meiliu Lu Dept of Engineering And Computer Science

A RESEARCH PAPER ON Mining Game Statistics from Web Services: A World of Warcraft Armory case study Chris Lewis - PhD, Computer Science Noah Wardrip-Fruin - Professor of Computer Science University of California, Santa Cruz

ABSTRACT Collecting quantitative data sets from publicly available "web services" (in this case, the WoW Armory website) and mining the data for qualitative study This study analyzed the following: (1) the ability of the constructed classifier to accurately predict character class based on equipment (2) the correlation between the time it takes players to reach the maximum level (80) and their chosen character class (3) the correlation between the number of deaths of a character and its class (4) the most popular items equipped among the sample

Process and Results To gather data, develop and implement a web crawler (WoWSpyder) to crawl WoW Armory website and download character data into database Filter out characters with errors (detected through “impossible” stats, such as negative deaths) Total sample size: 136,047 characters from the US and European servers

Process and Results Naive Bayes classifier, 10-fold cross validation - 94.70% accuracy Confusion Matrix Purpose: discover deviations from designed “norms” in character equipment (class-targeted equipment)

Process and Results Calculate time to reach maximum level (80) from level 10 for each character Construct box plot - Time (days) to Level 80 vs. Class Analysis suggests there is no significant advantage/disadvantage among character classes

Process and Results Construct box plot - Number of Deaths vs. Class Analysis suggests there is imbalance among character classes

Process and Results Construct list of most popular items (top 5 shown) Graph Item Popularity vs. Number Wearing Analysis suggests there is a trade-off between item power and rarity

Limitations Web service query processing limitations (crawler restrictions) Workaround: multiple threads, built-in delays and error handling Financial limitations ($ no funding $) Technological limitations (1 machine collecting data) Data limitations (limited sample size, breadth of character data) Trimmed data records, focus on pertinent attributes

OPINION ON RESEARCH PAPER A good study on how to gather and mine video game data from widely available web services. Video game player data can be difficult to obtain for casual researchers and those without industry connections. Given adequate resources, this study could be scaled to handle a larger volume of data, as well as more investment into discovering other problems that could be solved using data mining techniques.

DATA MINING PROJECT PROPOSAL “STOCK MARKET PREDICTION USING DATA MINING” PROJECT TEAM : DEVESH BINJOLA NICHOLAS HANKS

OBJECTIVE The goal of this data mining project is to predict the behavior of the stock market and to trace out the trend such as high price, low price, volume etc. To understand these large data sets we make use of clustering algorithms. The algorithm used here is K-means. This project’s aim is to find the best cluster since cluster points change whenever the application is executed. This is done by using the multiple random normalizations where it chooses the best initial clusters.

SCHEDULE Total working of data mining may take up to a week. The tentative schedule would be like as follows. 1.Stock data collection – 2 days 2.Implementing the algorithm – 2 days 3.Analysis – 2 days

ALGORITHM USED The algorithm used here is K-means. This project’s aim is to find the best cluster since cluster points change whenever the application is executed. This is done by using the multiple random normalizations where it chooses the best initial clusters.

DATA WAREHOUSING Project Proposal “DATAMART FOR GOLF DATABASE ” PROJECT TEAM : DEVESH BINJOLA NICHOLAS HANKS

OBJECTIVE Main objective of this data warehousing project is to implement data mart which can help in searching the golf history from its beginning to till date(men’s golf history). This data mart can help to provide answers for the queries based on the database which are derived from the golf dataset: The queries we are planning to implementing are: Winner of majors by year Count of majors by player name Count of majors by major name Winner of majors by major name

SCHEDULE Total working of data ware housing may take upto 2 weeks. The tentative schedule would be like as follows. golf data collection – 5 days Data cleaning and preprocessing – 4 days Implementation using PHP, MySQL – 5 days

REFERENCES WIKIPEDIA http://www.databasegolf.com/