Exploring Strategies For Optimizing Knowledge Derivation From Imagery


1 Exploring Strategies For Optimizing Knowledge Derivation From Imagery
Dan Getman, Geospatial Big Data Solutions, DigitalGlobe
Location Powers, 9/20/2016

2 Big Data, Take One
70 petabytes of satellite imagery
~30 gigabytes per satellite image
Billion-dollar satellites covering the globe annually
Expensive… Slow… Heavy… Big… Data
70 PB is a lot, though big tech companies like Facebook have an order of magnitude more data. But our data is HEAVY: at 30 gigs per image, we're not talking tweets that you can map and reduce all over the place with traditional big data systems. We'll need new approaches. This is the opposite of realtime, fast, scalable, and cheap. This is old-school DG.

3 Big Data, Take Two
All the buildings, planes, cars, boats, roads, and things. All the changes. What does it all mean?
It means coordinating between heavy/slow big data and light/fast big data.
It means making image analysis faster, more dynamic, and less intrusive to knowledge discovery.
All that being said, we need to be ready to support the rest of the big data folks: people who don't really care about imagery, they just want the knowledge. This is new-school DG. It is important to note that we are dealing with many of the same problems that plague big data analysis in general.

4 Analysis Use Cases
"I care about spectral integrity" vs. "I would be just as happy with a JPEG"
"I need the whole image strip" vs. "Why would I want more than a chip?"
"I want to have the image forever and ever" vs. "I don't want to see the image at all, just the answer, please"

5 All Trying to Accomplish the Same Thing:
Make it fast. Make it flexible. Make it integrate with data science.

6 Analysis Paradigms (a few anyway)
Data provider options:
- Cloud-based organized image store + cloud-based scalable compute
- Cloud-based WMS or tile store + cloud-based scalable compute

7 Analysis Paradigms at DigitalGlobe
- Cloud-based image processing framework: image catalog, provider-defined processing, user-defined processing, scalable compute
- Cloud-based dynamic tile creation: object-based image store, provider-defined processing, user-defined processing, cloud-based scalable compute

8 Analysis Paradigms: Whole Image
Cloud-based image processing framework: image catalog, provider-defined processing, user-defined processing, scalable compute.
- The analyst never touches or purchases imagery, just information
- The analyst can run their own algorithms or anyone else's
- Leverages the compute size needed for each process
- Parallelized on nodes and through data distribution across nodes
- Configured for processing at the state, national, or continental scale
- Configured for processing all imagery that meets certain specifications as it is collected
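The fan-out described above can be sketched as follows. This is a minimal illustration, not a real DigitalGlobe API: `catalog_search` and `user_defined_process` are hypothetical stand-ins for a catalog query and an analyst-supplied, server-side algorithm whose results (not pixels) are returned.

```python
from concurrent.futures import ThreadPoolExecutor

def catalog_search(bbox, max_cloud_cover):
    # Stand-in for a catalog query; returns fake image IDs.
    return [f"image-{i}" for i in range(4)]

def user_defined_process(image_id):
    # The analyst's own algorithm runs server-side; only the
    # derived information comes back, never the imagery itself.
    return {"image": image_id, "buildings_detected": len(image_id)}

# Fan the user-defined process out across every matching image in parallel.
images = catalog_search(bbox=(-105.1, 39.7, -104.9, 39.8), max_cloud_cover=0.1)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(user_defined_process, images))
```

In the real platform the same pattern would run across nodes rather than threads, triggered as new imagery meeting the specification is collected.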

9 Analysis Paradigms: Whole Image
Processing starts with a "raw" image and chains standard steps: orthorectify, compensate for atmosphere, pan sharpen. A user-defined process, specified through the API, can mix user-defined functions with other standard functions. A final user-defined function creates the output, exposed through a REST endpoint; the output can be imagery, vector, or tabular data.
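The chain above can be sketched as a simple function pipeline. The step names mirror the slide; their bodies are placeholders, since the actual implementations live in the provider's framework.

```python
# Each standard step takes a tile and returns a transformed tile.
def orthorectify(tile):      return {**tile, "ortho": True}
def atmos_compensate(tile):  return {**tile, "atmos": True}
def pan_sharpen(tile):       return {**tile, "sharpened": True}

def run_pipeline(raw_tile, user_defined_steps=()):
    # Standard steps first, then any user-defined functions, in order.
    steps = [orthorectify, atmos_compensate, pan_sharpen, *user_defined_steps]
    tile = raw_tile
    for step in steps:
        tile = step(tile)
    return tile  # output could equally be vector or tabular data

result = run_pipeline({"id": "raw-001"},
                      user_defined_steps=[lambda t: {**t, "ndvi": 0.42}])
```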

11 We Get Captured at Different Times

13 We All Run In Parallel As Data Arrives

24 Analysis Paradigms: Parts and Pieces
Cloud-based dynamic tile creation: object-based image store, provider-defined processing, user-defined processing, cloud-based scalable compute.
- Chips are sized to significantly reduce compute costs
- No attached storage, just memory
- Object store => highly parallel data access (no compute needed)
- Deferred processing => highly flexible analysis and significantly reduced storage
- RESTful => easily integrated into data science/big data paradigms
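A minimal sketch of the deferred, chip-level pattern: nothing is computed until a chip is requested, each request touches one small object, and the work fits in memory on a small instance. The dict here is a stand-in for an S3-style object store; the key scheme and function names are illustrative only.

```python
# Fake object store: small chips addressed by z/x/y tile keys.
OBJECT_STORE = {f"tile/{z}/{x}/{y}": [[x + y] * 4] * 4
                for z in (12,) for x in range(2) for y in range(2)}

def get_chip(key):
    # Direct key lookup: highly parallel access, no compute in front.
    return OBJECT_STORE[key]

def process_chip(chip):
    # Deferred user-defined step, run entirely in memory on a tiny chip.
    flat = [v for row in chip for v in row]
    return sum(flat) / len(flat)

mean = process_chip(get_chip("tile/12/1/1"))
```

Because the chip is fetched and processed only when asked for, there is no need to pre-compute or store derived products for every tile.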

25 Analysis Paradigms: Parts and Pieces
Processing starts with a "raw" tile object and chains standard steps: orthorectify, compensate for atmosphere, pan sharpen. A user-defined process, specified through the API, can mix user-defined functions with other standard functions. A final user-defined function creates the output, exposed through a REST endpoint; the output can be imagery, vector, or tabular data.

26 Analysis Paradigms: Parts and Pieces
The raw tile object store is:
- Highly available, with super-high bandwidth and super-high parallelization
- Accessible at rest, without compute or indexing in front
- Cost effective: low/no storage, processing on very small instances
- Random access to full-fidelity data
Standard steps (orthorectify, compensate for atmosphere, pan sharpen, other standard functions) and user-defined functions create output through a REST endpoint.
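The access pattern those properties enable looks like this in miniature: many small chips fetched in parallel straight from the store, with no database or compute layer in between. `fetch` is a stand-in for an HTTP GET against an object key.

```python
from concurrent.futures import ThreadPoolExecutor

# Fake store of full-fidelity chip bytes, addressed by key.
STORE = {f"chip-{i}": bytes([i]) * 8 for i in range(64)}

def fetch(key):
    # Random access by key; no index or compute needed in front.
    return STORE[key]

# Highly parallel fan-out of independent fetches.
keys = [f"chip-{i}" for i in range(64)]
with ThreadPoolExecutor(max_workers=16) as pool:
    chips = list(pool.map(fetch, keys))
```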

27 Analysis Paradigms: Parts and Pieces and ML
Components: vector store, model training, API for streaming compute, object store (imagery).

28 Analysis Paradigms: Parts and Pieces and ML
Visualize imagery in a format optimized for interpretation, connected to the vector store, model training, API for streaming compute, and object store (imagery).

29 Analysis Paradigms: Parts and Pieces and ML
Select training areas, stored in the vector store; model training, API for streaming compute, object store (imagery).

30 Analysis Paradigms: Parts and Pieces and ML
Train samples with a user-defined model, accessing imagery in a format optimized for model training via the API for streaming compute and the object store (imagery).
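The training step above can be sketched as a generator that streams chips for the user-selected areas into a training loop. Everything here is illustrative: the "model" is a toy per-class mean, and the streaming function fakes pixel data rather than calling a real API.

```python
import random

random.seed(0)

def stream_training_chips(training_areas):
    # Yields (pixels, label) pairs for the user-selected areas,
    # standing in for chips streamed in a training-optimized format.
    for area, label in training_areas:
        pixels = [random.random() for _ in range(16)]
        yield pixels, label

def train(samples):
    # Toy user-defined model: learn the mean pixel value per class.
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for pixels, label in samples:
        sums[label] += sum(pixels) / len(pixels)
        counts[label] += 1
    return {c: sums[c] / counts[c] for c in sums if counts[c]}

model = train(stream_training_chips([("area-a", 0), ("area-b", 1), ("area-c", 0)]))
```

The point of the sketch is the shape of the data flow: samples stream from the imagery store on demand, so nothing is downloaded wholesale before training can begin.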

31 Analysis Paradigms: Parts and Pieces and ML
Compute and visualize detections in selected imagery. Repeat. (Vector store, model training, API for streaming compute, object store.)

32 Analysis Paradigms: Parts and Pieces and ML
This paradigm allows feature extraction and machine learning on imagery to be more easily integrated into existing big data and data science methodologies. Now that the data is smaller and more targeted, we can map and reduce it all we want. We can call imagery, and knowledge derived from imagery, directly as a service, rather than determining which strips to order, calling a sales rep, and so on.
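"Knowledge as a service" might look like the sketch below: the analyst posts a query and receives derived vectors rather than imagery. The endpoint, request fields, and response shape are hypothetical, chosen only to illustrate the pattern.

```python
import json

def detection_service(payload):
    # Stand-in for an HTTP POST to a hypothetical detection endpoint.
    request = json.loads(payload)
    return json.dumps({
        "type": "FeatureCollection",
        "query": request["object_type"],
        "features": [  # derived knowledge only; no imagery is returned
            {"type": "Feature",
             "geometry": {"type": "Point", "coordinates": [-104.99, 39.74]},
             "properties": {"class": request["object_type"], "score": 0.93}},
        ],
    })

response = json.loads(detection_service(json.dumps(
    {"object_type": "building", "bbox": [-105.1, 39.7, -104.9, 39.8]})))
```

Returning GeoJSON-style vectors keeps the result small enough to flow straight into conventional big data tooling.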

33 Summary of Investigations:
- Looking at the future of data science and big data analysis to determine where we can meet it halfway
- Finding ways to ensure that getting data out of imagery is a fundamental tool of big data analysis, without slowing that science down
- Balancing traditional image science and big data science so each can feed the other; SpaceNet is an excellent example of this

35 Any Questions So Far?
