The BOP (Billion Object Platform) and WorldMap / Dataverse Integration Harvard Center for Geographic Analysis Tuesday, July 12, 2016 Ben Lewis, Mercè Crosas,


1 The BOP (Billion Object Platform) and WorldMap / Dataverse Integration Harvard Center for Geographic Analysis Tuesday, July 12, 2016 Ben Lewis, Mercè Crosas, Raman Prasad

2 Billion Object Platform - funded by Sloan
–General purpose, open source, streaming, big spatio-temporal data exploration and extraction
–Performs basic sentiment analysis
–Runs on commodity hardware and software
–Built on Spatial Lucene and Solr
–Exposes all functions through an API

3 Other geospatial visualization work (funded by the Boston Area Research Initiative)
1. Spatial stamping in Billion Object Platform
2. Table visualization
–Tables with well defined area columns (Census codes)
–Tables with lat/longs
3. Geospatial data visualization
–Shapefiles

4 The “Billion Streaming Geo-tweets” dataset
A new dataset type in Dataverse which supports real-time streaming and visual, interactive exploration.
The content is geo-tweets (tweets containing GPS coordinates from the originating device).
Currently 1-2% of tweets are geo-tweets, about 8 million per day.
The CGA has been harvesting geo-tweets since 2012.
Main components:
–1) Geo-tweet harvesting and archiving system
–2) Software and hardware platform to support interactive exploration of a billion spatio-temporal objects
–3) API to provide query access to the archive from Dataverse
–4) Client-side tools for querying/visualizing the contents of the archive, extracting subsets, and pushing them to Dataverse
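The definition of a geo-tweet above can be sketched as a small filter. This is a minimal illustration, not the CGA harvester: it assumes each tweet arrives as a dict with a GeoJSON-style `coordinates` field in `[longitude, latitude]` order, as in Twitter's classic streaming payloads. The function names `extract_point` and `geo_tweets` are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple


def extract_point(tweet: dict) -> Optional[Tuple[float, float]]:
    """Return (lon, lat) if the tweet carries a valid point coordinate, else None.

    Assumption: the payload has a GeoJSON-like "coordinates" object,
    e.g. {"type": "Point", "coordinates": [lon, lat]}.
    """
    geo = tweet.get("coordinates")
    if not isinstance(geo, dict) or geo.get("type") != "Point":
        return None
    coords = geo.get("coordinates")
    if not isinstance(coords, (list, tuple)) or len(coords) != 2:
        return None
    if not all(isinstance(v, (int, float)) for v in coords):
        return None
    lon, lat = coords
    # Reject values outside the valid longitude/latitude ranges.
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        return None
    return float(lon), float(lat)


def geo_tweets(tweets: Iterable[dict]) -> Iterator[Tuple[dict, Tuple[float, float]]]:
    """Yield (tweet, (lon, lat)) for every tweet that has a usable point."""
    for tweet in tweets:
        point = extract_point(tweet)
        if point is not None:
            yield tweet, point
```

Tweets without a point, or with a malformed one, are skipped rather than raising, which suits a long-running harvesting loop.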

5 The “Billion Streaming Geo-tweets” dataset
What does a landing page look like when…
–Data source is external to Dataverse
–The data source is continuously being updated
–The data does not consist of “files” in the traditional Dataverse sense

6 The BOP: streaming big data… A closer look at the Billion Streaming Geotweets

7 API to streaming geo-tweets Built on Solr
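As an illustration of what a Solr-backed query for geo-tweets might look like, the sketch below builds a request URL that filters by keyword, time range, and bounding box, and asks Solr for a heatmap facet. The `facet.heatmap` parameters and the `[lat,lon TO lat,lon]` range syntax are standard Solr spatial features; the core URL, the field names (`text`, `created_at`, `coord_rpt`), and the function name are assumptions, and this is not the BOP API itself.

```python
from urllib.parse import urlencode


def build_heatmap_query(base_url, keyword, start, end, bbox, grid_level=None,
                        text_field="text", time_field="created_at",
                        geo_field="coord_rpt"):
    """Build a Solr /select URL returning a heatmap instead of documents.

    bbox is (min_lon, min_lat, max_lon, max_lat). All field names are
    hypothetical and must match the actual Solr schema.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    params = [
        # Single-term keyword search; match everything when no keyword is given.
        ("q", f"{text_field}:{keyword}" if keyword else "*:*"),
        # Time window, as ISO-8601 instants understood by Solr date fields.
        ("fq", f"{time_field}:[{start} TO {end}]"),
        # Bounding box filter; Solr range syntax uses "lat,lon" order.
        ("fq", f"{geo_field}:[{min_lat},{min_lon} TO {max_lat},{max_lon}]"),
        ("rows", "0"),  # only the facet counts are needed
        ("wt", "json"),
        ("facet", "true"),
        ("facet.heatmap", geo_field),
        # Heatmap extent uses "x y" (lon lat) order.
        ("facet.heatmap.geom",
         f'["{min_lon} {min_lat}" TO "{max_lon} {max_lat}"]'),
    ]
    if grid_level is not None:
        params.append(("facet.heatmap.gridLevel", str(grid_level)))
    return f"{base_url}/select?{urlencode(params)}"
```

For example, `build_heatmap_query("http://localhost:8983/solr/tweets", "brexit", "2016-06-01T00:00:00Z", "2016-07-01T00:00:00Z", (-10.0, 49.0, 2.0, 61.0))` yields a URL whose response would contain a grid of counts rather than individual tweets, which is what keeps the exploration interactive at large scale.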

8 A dataset landing page which enables data exploration and extraction
A client which enables interactive exploration in multiple dimensions

9 Demo of Big Data exploration using predecessors of BOP: Japan Data Archive and HHypermap
Japan Data Archive http://jdarchive.org/en/search#view_type=event&media_type=&sort=relevant&
HHypermap Distributed Archive http://hypersearch.cga.terranodo.io/maps/new

10 2) Table Geocoding
Work funded by NSF. Goal is to enable Dataverse tables with well-known geographic encodings to be easily visualized as maps.

11 Pick the “Geospatial Data Type”

12 Choose (a) WorldMap “Join Layer” & (b) File column to join
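The join step can be pictured as matching each table row to a layer feature on a shared code. The sketch below is a simplified in-memory version under stated assumptions: rows and features are plain dicts, and the function name, the `pad_to` option, and the sample column names are all hypothetical. It does not reproduce how WorldMap performs the join.

```python
def join_table_to_layer(rows, join_column, layer_features, layer_key, pad_to=None):
    """Attach layer attributes (e.g. geometry) to table rows sharing a code.

    pad_to optionally left-pads keys with zeros, since census codes often
    lose leading zeros when a table passes through a spreadsheet.
    Returns (joined_rows, unmatched_rows).
    """
    def normalize(value):
        key = str(value).strip()
        return key.zfill(pad_to) if pad_to else key

    # Index the layer once so each row lookup is a dictionary access.
    index = {normalize(f[layer_key]): f for f in layer_features}

    joined, unmatched = [], []
    for row in rows:
        feature = index.get(normalize(row.get(join_column, "")))
        if feature is None:
            unmatched.append(row)
        else:
            # Row values win if a column name appears in both inputs.
            joined.append({**feature, **row})
    return joined, unmatched
```

Reporting the unmatched rows separately makes it easy to tell the user which codes in the uploaded file did not correspond to any feature in the chosen join layer.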

13 Table visualized

14 Apply cartographic classification
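The slide does not say which classification methods are offered. As one common example, the sketch below computes equal-interval class breaks and assigns a value to a class; the function names are hypothetical, and other schemes (quantiles, natural breaks) would slot in the same way.

```python
from bisect import bisect_left


def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class for an equal-interval scheme."""
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_classes
    # The last break is set to the maximum exactly to avoid float drift.
    return [lo + step * i for i in range(1, n_classes)] + [hi]


def classify(value, breaks):
    """Return the 0-based class index whose upper bound first covers value."""
    return min(bisect_left(breaks, value), len(breaks) - 1)
```

Each class index can then be mapped to a color in a ramp to symbolize the joined polygons.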

15 Map symbolized

16 Map saved back to Dataverse

17 Thank You Ben Lewis blewis@cga.harvard.edu

18 Phase II? Use Polygons to Symbolize Big Data
Perform big data query. Find 10 million tweets mentioning Brexit.

19 (Geographic region and sentiment stamping)
Geographic stamping: As tweets stream in, they will be stamped with census block, census tract, and Admin 2 codes.
–To support aggregations by census or admin as well as by heatmap grid.
Sentiment stamping: As tweets stream in, a basic attempt will be made to determine sentiment.
–To support heatmaps representing average sentiment values as well as count values.
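The two stamping steps, and the aggregation they enable, can be sketched as follows. Everything here is an assumption made for illustration: regions are single polygon rings tested with a ray-casting point-in-polygon check, sentiment comes from a tiny word list, and the names (`stamp`, `aggregate`, `POSITIVE`, `NEGATIVE`) are hypothetical. The slides do not describe the actual BOP method.

```python
from collections import defaultdict

# Toy lexicon; a real system would use a far larger resource.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def point_in_polygon(lon, lat, ring):
    """Ray-casting test: count edge crossings of a ray going east from the point."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge straddles the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def stamp(tweet, regions):
    """Add a region code and a sentiment value to a tweet as it streams in."""
    code = next((c for c, ring in regions.items()
                 if point_in_polygon(tweet["lon"], tweet["lat"], ring)), None)
    return {**tweet, "region": code, "sentiment": sentiment_score(tweet["text"])}


def aggregate(stamped):
    """Per-region tweet count and average sentiment."""
    acc = defaultdict(lambda: [0, 0.0])
    for t in stamped:
        if t["region"] is not None:
            acc[t["region"]][0] += 1
            acc[t["region"]][1] += t["sentiment"]
    return {c: {"count": n, "avg_sentiment": s / n} for c, (n, s) in acc.items()}
```

Because each tweet is stamped once on arrival, later aggregations by census or admin unit reduce to a group-by on the stored code rather than a spatial test per query.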

20 Geo-tweet Dataverse https://dataverse.harvard.edu/dataverse/geo-tweets

