Published by Rosamund Walsh. Modified over 8 years ago.
1
Data Mining Project Presentation
Group A: Saurav Das, Guanghao Lin, Yi-Chiang Lin, Sameer Patil
2
We Are Group A (First Team to Present on the First Day)
Saurav Das: 1st MBA, MIS
Sameer Patil: 1st MBA, MIS
Guanghao Lin: 2nd MBA, Marketing & MIS
Yi-Chiang Lin: 1st MBA, Finance & MIS
3
Agenda
Project Introduction
Datasets
Methodology
Results
Next Steps
4
About Yelp
Mission: To connect people with great local businesses.
Yelp was founded in 2004 to help people find great local businesses like dentists, hair stylists and mechanics.
Yelp had an average of approximately 135 million monthly unique visitors in Q4 2014.
Yelpers have written over 71 million local reviews.
Yelp makes money by selling ads to local businesses.
5
Yelp’s Business Structure
Customers, Local Business
6
Problems
For Yelp: Which business categories do users review most and least? When is the peak time of business?
For Local Business: If I plan to open a business, who are the most valuable customers? Where should I open my business?
7
Datasets
Research Datasets
Business: each row is a local business. Examples: 11,537. Attributes: 13.
Check In: each row is check-in info for a local business. Examples: 8,282. Attributes: 3.
Review: each row is one customer's review of a business. Examples: 229,907. Attributes: 10.
User: each row is a user. Examples: 43,873. Attributes: 8.
8
Process (Need to confirm)
Data Preparation:
Filter examples: local businesses in California (outliers)
Delete non-value attributes: Neighborhood attribute (all “0”) in business dataset
Replace missing values: no review text in review dataset
Data Mining and Data Analysis: Clustering, Classification, Regression, Mapping
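The three data preparation steps on this slide can be sketched in plain Python. This is only an illustration under stated assumptions, not the group's actual workflow: the rows are made up, and the column names `state`, `neighborhoods`, and `text` as well as the placeholder string are hypothetical.

```python
# Hypothetical rows standing in for the business and review datasets.
businesses = [
    {"business_id": "b1", "state": "AZ", "neighborhoods": 0, "stars": 4.0},
    {"business_id": "b2", "state": "CA", "neighborhoods": 0, "stars": 3.5},
    {"business_id": "b3", "state": "NV", "neighborhoods": 0, "stars": 5.0},
]
reviews = [
    {"business_id": "b1", "text": "Great tacos"},
    {"business_id": "b3", "text": None},
]


def filter_examples(rows, column, excluded):
    """Keep only rows whose value in `column` is not in `excluded`."""
    return [row for row in rows if row[column] not in excluded]


def drop_constant_attributes(rows):
    """Remove attributes that hold the same value in every row."""
    if not rows:
        return rows
    constant = {key for key in rows[0] if len({row[key] for row in rows}) == 1}
    return [{k: v for k, v in row.items() if k not in constant} for row in rows]


def replace_missing(rows, column, default):
    """Replace None or empty values in `column` with `default`."""
    return [
        {**row, column: default if row[column] in (None, "") else row[column]}
        for row in rows
    ]


# Step 1: filter out the California outliers.
# Step 2: drop attributes with no information (here, the all-zero column).
prepared = drop_constant_attributes(filter_examples(businesses, "state", {"CA"}))
# Step 3: fill in missing review text with a placeholder (an assumed choice).
cleaned_reviews = replace_missing(reviews, "text", "(no text)")
```

Dropping every constant column is a broader rule than the slide states; it is used here because an all-“0” attribute is one instance of it.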
9
Business Question #1 Who are the most valuable customers?
10
Most Valuable Customers Methodology: Clustering
11
Most Valuable Customers
However, do they really matter to local businesses? Do their reviews affect the business star rating?
12
Regression Results
13
Business Question #2 Location, Location, Location?
14
Business Location Selection Methodology: Clustering
15
Data Mining Process
Find the local businesses in Phoenix, AZ.
Cluster the data using K-Means (150 clusters), based on “review_count”, “stars”, “latitude”, and “longitude”.
Sample Size: 3372
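As a rough illustration of the clustering step, here is a minimal K-Means (Lloyd's algorithm) sketch in plain Python over the four attributes named on the slide. The six rows are invented, k is set to 2 instead of 150 to keep the example readable, and the min-max scaling is an assumption added so that "review_count" does not dominate the distance; the slides do not say which tool or preprocessing the group used.

```python
import random


def min_max_scale(rows):
    """Scale every column to [0, 1] (an assumed preprocessing step)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        tuple((v - lo) / span for v, lo, span in zip(row, lows, spans))
        for row in rows
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iter):
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the result is stable
        assignments = new_assignments
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments


# Hypothetical businesses: (review_count, stars, latitude, longitude).
rows = [
    (12, 3.5, 33.45, -112.07),
    (15, 4.0, 33.46, -112.08),
    (10, 3.0, 33.44, -112.06),
    (240, 4.5, 33.60, -111.90),
    (260, 4.0, 33.61, -111.91),
    (250, 5.0, 33.62, -111.89),
]
centroids, labels = kmeans(min_max_scale(rows), k=2)
```

With this toy data the first three rows and the last three rows end up in separate clusters; on the real 3372-row sample the same function would be called with `k=150`.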
16
Business Question #3 When is the peak time of business?
17
Method Used: Clustering
Number of records: 8283
Tested Attributes: 150+
Attribute Name / Note / Example:
Business Category: type of business (e.g., Food, Bank, Restaurant)
Stars: rating given by customer
Business ID
Check-in Info: day and time when the user reviews the business
Business Name
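Before clustering, the check-in attribute can be inspected with a simple aggregation. The sketch below is not the clustering method named on this slide; it only totals check-ins per hour to show how the day-and-time information could be read. It assumes each record stores counts under keys of the form "hour-day" (for example "12-3") in a field called `checkin_info`; both the field name and the sample counts are assumptions for illustration.

```python
from collections import Counter

# Hypothetical check-in records; keys are assumed to be "hour-day" strings.
checkins = [
    {"business_id": "b1", "checkin_info": {"11-3": 4, "12-3": 9, "18-5": 2}},
    {"business_id": "b2", "checkin_info": {"12-3": 5, "18-5": 7, "19-5": 1}},
]


def checkins_by_hour(rows):
    """Sum check-in counts per hour of day across all businesses."""
    totals = Counter()
    for row in rows:
        for slot, count in row["checkin_info"].items():
            hour, _day = slot.split("-")
            totals[int(hour)] += count
    return totals


hourly = checkins_by_hour(checkins)
peak_hour, peak_count = hourly.most_common(1)[0]
```

Grouping the same totals by business category instead of pooling all businesses would give a per-category view of the peak time.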
18
Process 1: Preprocessing
Process 2: Cleaning and Selecting Business Categories (using Excel)
Process 3: Clustering
19
Thank You!