Getting Started on Hadoop Part 3: Visualize with Datameer


1 Getting Started on Hadoop Part 3: Visualize with Datameer
June 18th, 2015

2 Agenda
Time               Description
5:30pm – 6:00pm    Software Install & Data download / preparation
6:00pm – 6:20pm    Dinner, Welcome and 2015 Roadmap
6:20pm – 6:30pm    Datameer Intro / Overview
6:30pm – 6:45pm    Part II Review and Demo
6:45pm – 7:15pm    Datameer Activity & Collaboration
7:15pm – 7:45pm    Q/A & Networking

3 2015 Roadmap
April Meetup Recap:
- Dataset identification; London Data Store
- Real-time query execution in Hive and Pig
- Visualization via Excel Power View
2015 Agenda:
- Academic Discussion
- Proof of Concept (“Real World” Example)
- Vendor Demonstration

4 Datameer’s Vision

5 Big Data Ecosystem

6 Why Analytics?
Data-driven companies worldwide improve their marketing ROI by 15-20%, which adds $150-$200 billion in additional value. - McKinsey

7 The Analytics Problem

8 Fastest Time to Insights

9 End-to-End Big Data Analytics for Hadoop

10 Datameer Solution Stack

11 Enterprise Integration

12 Why Datameer?

13 Dataset Selection
- At least one dataset with approximately 1 million rows
- Additional datasets with at least 1 common column
- Analysis: Relationship, Correlation, Descriptive/Diagnostic
Bob Puliam – thanks for the data source

14 Selected Datasets – London Data Store
- Home Sales Prices
- Cost of Business Space
- Job Density
- Crime Rate
Common Column: Borough
Range: London Data Store: 583 total – Demographics, Employment/Skills, Transport, Housing, Health, Transparency, Education, Environment, Business and Economy, Planning
London Greenbelt Landmarks: Windsor, Chelsea, Westminster, City, Hackney = Silicon Roundabout, Tower Hamlets = Canary Wharf
Use Case Ideas:
- Business & Crime: cost of business space v. crime rate (as crime rate increases, cost of space decreases)
- Business & Job Density: higher job density = higher price per square meter
- Job Density & Crime: inverse relationship; when job density is up, crime is down, and when job density is down, crime is up
Key fields:
- Home Prices: Price, New Build, Freehold
- Business: price per square meter
- Job Density: number of jobs
- Crime: crime rate
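Each use-case idea on this slide comes down to joining two datasets on the shared Borough column and checking how two measures move together. The following is a minimal Python sketch of that idea, not the method used in the session. The borough names and values are made-up placeholders (not London Data Store figures), and `crime_rate`, `space_cost`, `join_on_key`, and `pearson` are hypothetical names introduced for illustration.

```python
from math import sqrt

# Hypothetical placeholder values keyed by borough; NOT real London Data Store figures.
crime_rate = {"Borough A": 120.0, "Borough B": 95.0, "Borough C": 70.0, "Borough D": 50.0}
space_cost = {"Borough A": 200.0, "Borough B": 260.0, "Borough C": 310.0, "Borough E": 400.0}


def join_on_key(left, right):
    """Inner join of two dicts on their shared keys (the common column)."""
    common = sorted(left.keys() & right.keys())
    return [(key, left[key], right[key]) for key in common]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Only boroughs present in both datasets survive the join.
joined = join_on_key(crime_rate, space_cost)
r = pearson([row[1] for row in joined], [row[2] for row in joined])
print(joined)
print(round(r, 3))
```

A coefficient near -1 would be consistent with the "as crime rate increases, cost of space decreases" idea, and a value near +1 with the job density and price idea. With real data, a correlation on its own would not show which factor drives the other.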

15 Housing Turnover
London Housing Prices
- Look at house sales to determine whether new build or not
- Break it by year to see trends
- Visualization
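The breakdown described on this slide (new build or not, split by year) can be sketched in a few lines of Python. This is an illustration under assumptions: the sample records are invented, and the "Y"/"N" encoding of the new-build flag and the name `new_build_share` are hypothetical rather than taken from the dataset.

```python
from collections import Counter

# Hypothetical sample of sales records as (year, newbuild flag); NOT real data.
sales = [
    (2012, "Y"), (2012, "N"), (2012, "N"),
    (2013, "Y"), (2013, "Y"), (2013, "N"), (2013, "N"),
]

# Count sales per (year, newbuild) pair, then total sales per year.
by_year_and_flag = Counter(sales)
by_year = Counter(year for year, _ in sales)

# Share of new builds per year, the series one would chart to see the trend.
new_build_share = {
    year: by_year_and_flag[(year, "Y")] / total
    for year, total in sorted(by_year.items())
}
print(new_build_share)
```

Charting the per-year share rather than the raw counts keeps years with very different sales volumes comparable.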

16 HDP v. Datameer
Discuss: Possible datasets, and why London House Prices and Crime Rates are a good place to start. If we look at the trends of London House Prices and Crime Rates, we can find where they intersect and then look at everything 30% greater than and less than that point.
HORTONWORKS HDP SANDBOX
1. Upload datasets to HDP
2. Define schema, create table in HCatalog
3. Hive Query: select newbuild, count(newbuild) newbuildcount, year from lhouseprices group by year, newbuild
4. Export results to CSV from HDP
5. Import CSV to Excel
6. Create Pivot Table with data
7. Create graph in Excel
DATAMEER
1. File Upload to Datameer
2. Create Workbook
3. Add “londonhome” data to Workbook
4. Add sheet: GROUPBY year, GROUPBY whether newbuilt, GROUPCOUNT
5. Run Workbook
6. Create Infographic
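To try the aggregation from this slide without a Hadoop sandbox, the Hive query can be run almost unchanged against an in-memory SQLite database. This is a sketch under assumptions: SQLite stands in for Hive, the table and column names follow the slide, the rows are hypothetical placeholders, and an ORDER BY has been added so the output order is predictable.

```python
import sqlite3

# In-memory SQLite stands in for Hive here; table name follows the slide,
# the rows are hypothetical placeholders, NOT real house sales.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lhouseprices (year INTEGER, newbuild TEXT)")
conn.executemany(
    "INSERT INTO lhouseprices (year, newbuild) VALUES (?, ?)",
    [(2012, "Y"), (2012, "N"), (2012, "N"), (2013, "Y"), (2013, "Y"), (2013, "N")],
)

# Same aggregation as the slide's Hive query, with an ORDER BY added so the
# output order is deterministic.
rows = conn.execute(
    """
    SELECT newbuild, COUNT(newbuild) AS newbuildcount, year
    FROM lhouseprices
    GROUP BY year, newbuild
    ORDER BY year, newbuild
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Each output row is one (newbuild, count, year) group. Going by the slide, this is the same grouping the Datameer sheet expresses with GROUPBY year, GROUPBY newbuild, and GROUPCOUNT.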

17 Hands-on Activity
Your turn!
- Reference the handout for guidance
- Collaborate with those around you

18 Q&A Questions Answers

