Published by Robert Gilbert. Modified over 7 years ago.
1
Setting the stage: Outcomes of Location Powers Big Geo Data, September 2016
George Percivall, Delft, The Netherlands, 22 March 2017
2
Use Cases for Big Geo Data
High Velocity Ingest Geospatial Databases Entity-oriented Spatial-temporal analytics Grid-oriented Spatial-temporal analytics Feature Fusion GeoAnalytics, Machine Learning Remote sensed data processing Machine Learning Spatial Modeling IoT Message Streaming Built environment models Array databases Users and consuming apps Social Media Message Processing Observation Sources NoSQL databases Integrated environmental models ETL Stream processing using RDF Graph databases Modeling and simulation Wide Area Motion Imagery SQL databases
3
Location Powers: Big Data - 20 Sep 2016
“The Location Powers: Big Data workshop advances discussion of loosely-coupled petabyte scale archives based on open standards results in rapid geospatial information product creation at any scale.” - Dr. Walter Scott, OGC Board of Directors, and DigitalGlobe CTO
4
Location Powers: Big Data, 20 Sep 2016
Analyzing Big Data: Dan Getman, DigitalGlobe; Peter Baumann, rasdaman; Akinori Sahara, Hitachi; Rose Winterton, Pitney Bowes
Using Big Data: Applications: Shaowen Wang, NCSA/UIUC; Charlie Greenbacker, In-Q-Tel; Lea Shanley, South Big Data Hub at RENCI; Jeff de La Beaujardière, NOAA
Keynote: Jibo Sanyal, ORNL
Obtaining Big Data: Geoffrey Fox, Indiana Univ; Jeff Walter, NASA
Maintain and Access Data: Keith Hare, JTC1 SC32 SQL; Glenn Guempel, USGS; Rob Emanuele, Azavea
Copyright © 2017 Open Geospatial Consortium
5
Big Data: Driving Forces
09/20/2016 Copyright 2016, JCC Consulting, Inc.
Inexpensive storage of large volumes of data
Inexpensive compute power
Next Generation Analytics
  Moving from off-line to in-line embedded analytics
  Explaining what happened
  Predicting what will happen
Operating on
  Data at rest – stored someplace
  Data in motion – streaming
Multiple disparate data sources
Look at available data and wonder what answers are hidden there
Slide Source: Keith W. Hare, JCC Consulting, Inc. & Convenor ISO/IEC JTC1 SC32 WG3 Database Languages
6
“Big Data” Data Types
Traditional Data Types
  Character
  Numerical
  Date/Time/Timestamp
  Large Objects – LOB/BLOB/CLOB
“Big Data” Data Types
  Multi-dimensional arrays
  Images/video
  Documents
  Loosely formatted data
  Objects
  Spatial
Slide Source: Keith W. Hare, JCC Consulting, Inc. & Convenor ISO/IEC JTC1 SC32 WG3 Database Languages
7
Big Data, Take One
100 petabytes of satellite imagery
~30 gigabytes per satellite image
Billion dollar satellites
Covering the globe annually
Expensive… Slow… Heavy… Big… Data

100 PB is a lot; big tech companies like Facebook have an order of magnitude more data. But our data is HEAVY: 30 gigabytes per image. We're not talking about tweets that you can map and reduce all over the place with traditional big data systems. We'll need new approaches. This is the opposite of realtime, fast, scalable, and cheap. This is old school DG.
Slide Source: Dan Getman, DigitalGlobe #LPBigData
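A quick back-of-the-envelope check of the figures above shows how few, and how large, the individual items in such an archive are. The sketch assumes decimal units (1 PB = 10^6 GB) and a uniform image size of 30 GB; both are simplifications made here, not statements from the slide.

```python
# Rough scale check for the archive figures quoted on the slide.
# Assumptions (not from the slide): decimal units, uniform image size.
ARCHIVE_PB = 100        # archive size quoted on the slide
IMAGE_GB = 30           # approximate size of one satellite image
GB_PER_PB = 1_000_000   # decimal convention: 1 PB = 10^6 GB


def estimated_image_count(archive_pb: float, image_gb: float) -> float:
    """Return how many images of size image_gb fit into archive_pb."""
    return archive_pb * GB_PER_PB / image_gb


if __name__ == "__main__":
    count = estimated_image_count(ARCHIVE_PB, IMAGE_GB)
    print(f"about {count / 1e6:.1f} million images")
```

Under these assumptions the archive holds on the order of a few million very large objects, not billions of small records.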
8
Big Data, Take Two
All the Buildings, Planes, Cars, Boats, Roads, and Things
All the Changes
What does it all Mean…
It means coordinating between Heavy/Slow Big Data and Light/Fast Big Data
It means making image analysis faster, more dynamic, and less intrusive to Knowledge Discovery

All that being said, we need to be ready to support the rest of the big data folks: folks who don't really care about imagery and just want the knowledge. This is new school DG. It is important to note that we are dealing with many of the problems that plague big data analysis.
Slide Source: Dan Getman, DigitalGlobe #LPBigData
9
Earth Server: Datacubes At Your Fingertips
Intercontinental initiative: EU + US + AUS, started 2011
Agile Analytics on 3D, 4D Earth & Planetary datacubes
Rigorously standards-based: OGC WMS + WCS + WCPS
EU rasdaman + US NASA WorldWind
100s of TB sites now, next: 1+ PB
Uni Jacobs PML NCI Australia ESA MEEO MWF EC
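To illustrate what standards-based access to a datacube can look like, the sketch below builds an OGC WCS 2.0 GetCoverage request in key-value-pair form that trims a coverage to a bounding box. The endpoint URL, the coverage name, and the axis labels `Lat` and `Long` are hypothetical placeholders; real axis labels depend on the coverage being served. The helper `get_coverage_url` is likewise an illustration, not part of any EarthServer or rasdaman API.

```python
from urllib.parse import urlencode


def get_coverage_url(endpoint, coverage_id, subsets, fmt="image/tiff"):
    """Build a WCS 2.0 GetCoverage KVP URL.

    subsets maps an axis label to a (low, high) pair. The axis labels
    are assumptions here; they depend on the coverage being queried.
    """
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
    ]
    # One "subset" parameter per trimmed axis, e.g. subset=Lat(50.0,55.0)
    for axis, (low, high) in subsets.items():
        params.append(("subset", f"{axis}({low},{high})"))
    params.append(("format", fmt))
    # Keep parentheses and commas readable; everything else is percent-encoded.
    return f"{endpoint}?{urlencode(params, safe='(),')}"


if __name__ == "__main__":
    # Hypothetical endpoint and coverage name, for illustration only.
    print(get_coverage_url(
        "https://example.org/ows",
        "ExampleCoverage",
        {"Lat": (50.0, 55.0), "Long": (3.0, 8.0)},
    ))
```

A request of this kind returns only the requested window of the coverage, so the client does not need to fetch the full dataset.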
10
Science & GIS Tool Interfacing
General-purpose scientific tools: Java, C++, Python, R (in progress)
Geo tools: MapServer, GDAL, QGIS, OpenLayers, Leaflet, NASA WorldWind, ...
OGC WCS Core & INSPIRE WCS Reference Implementation
Can interface to all tools supporting OGC's "Big Geo Data" standards suite
11
Standards: ISO Array SQL
[SSDBM 2014]
create table LandsatScenes(
    id: integer not null,
    acquired: date,
    scene: row( band1: integer, ..., band7: integer ) mdarray [ 0:4999, 0:4999 ]
)

select id, encode( (scene.band1-scene.band2)/(scene.band1+scene.band2), "image/tiff" )
from LandsatScenes
where acquired between " " and " "
  and avg( (scene.band3-scene.band4)/(scene.band3+scene.band4) ) > 0
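The array query above can be read as two cell-wise normalized differences: one over bands 3 and 4 used as a filter on its average, and one over bands 1 and 2 that is returned. The following pure-Python sketch mimics that logic on small nested lists. It leaves out the date predicate and the TIFF encoding, and the handling of cells where the two bands sum to zero is an assumption made here, since the query does not say what happens in that case.

```python
def normalized_difference(a, b):
    """Cell-wise (a - b) / (a + b) for two equally sized 2-D grids.

    Cells where a + b == 0 yield 0.0; this guard is an assumption of the
    sketch and is not specified by the query on the slide.
    """
    return [
        [(x - y) / (x + y) if (x + y) != 0 else 0.0 for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def grid_mean(grid):
    """Average over all cells of a 2-D grid (the role of avg() in the query)."""
    cells = [value for row in grid for value in row]
    return sum(cells) / len(cells)


def select_scenes(scenes):
    """Keep scenes whose mean (band3-band4)/(band3+band4) is positive and
    return (id, normalized difference of band1 and band2) for each."""
    result = []
    for scene in scenes:
        if grid_mean(normalized_difference(scene["band3"], scene["band4"])) > 0:
            result.append(
                (scene["id"], normalized_difference(scene["band1"], scene["band2"]))
            )
    return result


if __name__ == "__main__":
    # Tiny made-up 1x2 "scenes" in place of 5000x5000 arrays.
    scenes = [
        {"id": 1, "band1": [[3, 1]], "band2": [[1, 1]], "band3": [[4, 4]], "band4": [[2, 2]]},
        {"id": 2, "band1": [[3, 1]], "band2": [[1, 1]], "band3": [[1, 1]], "band4": [[3, 3]]},
    ]
    print(select_scenes(scenes))
```

In an array database the same expression is evaluated inside the engine, next to the data; the sketch only shows what is being computed, not how such a system executes it.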
12
Apache Projects Geospatially Enabled*
*not exhaustive
Slide Source: Rob Emanuele, Azavea #LPBigData
13
Slide Source: Rob Emanuele, Azavea #LPBigData
14
The Land Change Monitoring Assessment and Projection (LCMAP) information system
Slide Source: Glenn Guempel, USGS #LPBigData
15
Where We Want To Be: Download as Last Resort Mentality
Store data in unzipped, optimal formats ready for direct processing by standard services or custom processes.
Provide basic visualization, analysis and extraction functions through services on an open platform.
The platform additionally provides the potential processing capacity for building unforeseen custom workflows and processes against big data.

Analysis Ready Data: We believe that over the years the download mentality will diminish. Storing data in a ready-to-be-used format will allow users to access data without downloading.
Service Functions will be available for basic visualization, analysis and extraction of data. Only download what is needed, perhaps the results rather than all the raw data.
Virtual Platforms, like current commercial clouds, will mature and provide cost-effective, on-demand capacity to process big data. Custom processes and workflows can be supported by allowing users to spin up large infrastructure components, process the data, and shut down without ongoing costs.
Slide Source: Glenn Guempel, USGS #LPBigData
16
Emergent Themes from LPBigData
Loosely-coupled PB archives for rapid geospatial information product creation at any scale based on open standards
Inter-cloud communications; portability across clouds
Workflow based on interfaces exposed by containers
Analysis Ready Data
  We live in a download mentality. How do we move to answering questions?
  Focus shifting from understanding what happened last week to being able to predict what will happen next week
Data Models
  Global Grids, Space Filling Curves, Indices
  Arrays, Tiles, Point Clouds
Take advantage of broad developments in Big Data
Accelerate Big Data innovation in Testbed 13
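The themes name space-filling curves and indices without saying which ones. As one common example, the sketch below computes a Z-order (Morton) index for a longitude/latitude pair on a simple equal-angle global grid. The grid definition, the default of 16 bits per axis, and the function names are all assumptions chosen for illustration; the workshop did not prescribe a particular curve or grid.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order code.

    Bit i of x goes to position 2*i, bit i of y to position 2*i + 1.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def morton_index(lon: float, lat: float, bits: int = 16) -> int:
    """Map a lon/lat pair to a Z-order index on a 2^bits x 2^bits grid.

    The equal-angle grid over [-180, 180] x [-90, 90] is an assumption
    of this sketch, not a grid defined in the presentation.
    """
    cells = 1 << bits
    # Clamp so that lon == 180 or lat == 90 falls into the last cell.
    col = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    row = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    return interleave_bits(col, row, bits)


if __name__ == "__main__":
    # Example coordinates only; nearby points tend to share index prefixes.
    print(morton_index(4.36, 52.01))
    print(morton_index(4.37, 52.00))
```

Because cells that are close in space often receive numerically close codes, such an index lets a one-dimensional key-value store or sorted file serve two-dimensional range queries reasonably well.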
17
Use Cases for Big Geo Data
High Velocity Ingest Geospatial Databases Entity-oriented Spatial-temporal analytics Grid-oriented Spatial-temporal analytics Feature Fusion GeoAnalytics, Machine Learning Remote sensed data processing Machine Learning Spatial Modeling IoT Message Streaming Built environment models Array databases Users and consuming apps Social Media Message Processing Observation Sources NoSQL databases Integrated environmental models ETL Stream processing using RDF Graph databases Modeling and simulation Wide Area Motion Imagery SQL databases
18
Location Powers: Big Linked Geodata
How can we make sense of big data?
Every two days the human race is generating as much data as was generated from the dawn of humanity through the year 2003.
Most of that data has a location component.
Developments in Semantic Web make it possible to link data based on geographic information in a way that provides more insight.
Databases now holding over 1 trillion RDF triples provide unique, standards-based, big data capabilities.
Investigate scaling effective exploitation of linked geodata by using big data approaches.
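Linking data by location can be pictured with a handful of RDF-style triples. The sketch below stores triples as plain Python tuples, uses the GeoSPARQL property names `geo:hasGeometry` and `geo:asWKT` as simple strings, and finds every resource whose point geometry falls inside a bounding box. All `ex:` identifiers and coordinates are made up, the WKT handling covers only bare `POINT(lon lat)` literals, and the linear scan is for illustration; a real triple store would use a query language and spatial indexes.

```python
import re

# Made-up example triples: (subject, predicate, object).
TRIPLES = [
    ("ex:station1", "geo:hasGeometry", "ex:geom1"),
    ("ex:geom1", "geo:asWKT", "POINT(4.36 52.01)"),
    ("ex:station1", "ex:observes", "ex:airTemperature"),
    ("ex:photo7", "geo:hasGeometry", "ex:geom2"),
    ("ex:geom2", "geo:asWKT", "POINT(4.37 52.00)"),
    ("ex:photo9", "geo:hasGeometry", "ex:geom3"),
    ("ex:geom3", "geo:asWKT", "POINT(-77.04 38.90)"),
]

_POINT = re.compile(r"POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)")


def objects(triples, subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def parse_point(wkt):
    """Parse a bare 'POINT(lon lat)' literal into a (lon, lat) tuple."""
    match = _POINT.fullmatch(wkt.strip())
    if match is None:
        raise ValueError(f"unsupported WKT literal: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


def features_in_bbox(triples, min_lon, min_lat, max_lon, max_lat):
    """List subjects whose point geometry lies inside the bounding box."""
    hits = []
    for subject, predicate, geometry in triples:
        if predicate != "geo:hasGeometry":
            continue
        for wkt in objects(triples, geometry, "geo:asWKT"):
            lon, lat = parse_point(wkt)
            if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                hits.append(subject)
    return hits


if __name__ == "__main__":
    # Resources from different datasets end up linked by shared location.
    print(features_in_bbox(TRIPLES, 4.0, 51.9, 4.5, 52.1))
```

Here a sensor station and a photo from otherwise unrelated datasets are associated only because their geometries fall in the same area, which is the kind of location-based link the slide refers to.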
19
Some questions for the day
How can we use location linked data to make sense of big data?
What are the challenges when scaling linked data to big data spatial analytics?
Is Linked Data a way to achieve Analysis Ready Geospatial Data?
20
Location Powers: Big Linked Geodata