
1 Section 4: see other Slide Show
Redondo File as of December 30, 2017
Great Dismal Swamp, Virginia
Data Analysis and Intelligent Systems Lab

2 Meeting September 13, 2017
Implementation Platform: Azure Platform
Storyline of the paper (Romita)
Datasets: Hand,…,…
Submission Targets (Yongli)
Presentations during the 9/15 UH-DAIS Meeting

3 Paper(s) Romita, Yongli, and Dr. Eick are working on
Contributions:
Focus: Contour Risk Map Generation for Point-based and Gridded Measurement Datasets
Methods considered: 1. gridding→interpolation→contouring applied to point datasets and 2. upscaling→contouring algorithms applied to gridded datasets
Understanding the impact of grid size, interpolation method, bandwidth, …(?) on the obtained maps
Similarity Assessment for Contour Maps---continue the work of the Redondo paper
Evaluation Measures for Contour Maps: if we have different contour maps created by the approaches we investigate, how can we say which one is better? Are there any differences between gridded and point-based datasets with respect to map quality?
Experimental evaluation using a real-world case involving flood risk mapping
Correspondence Contouring for a "new" method, considering the contour maps generated by Method A as "ground truth". Remark: This could also be a new contour map generation approach.
Question: Can we split the themes into 2 papers in some intelligent way?

4 Questions for the Team Paper to be Addressed Soon
We will need to decide on an implementation platform soon; do we re-implement the distance functions in ArcGIS? What platform will we be using to implement the contour map quality assessment functions?
We need to finalize a flood risk dataset for the paper soon…; moreover, Dr. Shah (the visitor from last Thursday) seems to have a lot of other contour maps we might also use as a benchmark; we need to follow up on this…
Breaking the paper into 2 papers---if this makes sense---and determining an order of writing the papers and coming up with a plan…
Finding submission targets for the new paper(s) ?!?

5 Topics: Planned Paper on Quality of Contour Maps
a. We develop procedures that create (risk) contour maps from gridded datasets (maybe upscaling the used fine-grained grid before contouring it) and from point datasets; the latter approach also involves using interpolation functions---and we use zonal analysis on the obtained contour polygons to obtain hotspot summaries. We use ArcGIS packages at the moment for this task, but when we later want to publish this work we have to explain in the paper what contouring (or other) algorithm was used to create the results we publish; ArcGIS does not seem to tell us which algorithm is under the hood of the used approach---e.g. the two methods we are developing employ a contouring algorithm, but we cannot say which one... Is this a common problem with ArcGIS, or are my students not looking in the right place? Before, we used R for implementing similar software, and the R libraries always provide a proper reference for the algorithm that is implemented in the particular package. (A small sketch of such a pipeline with a documented contouring algorithm follows below.)
b. We also plan to investigate the impact of grid size and interpolation method (including the bandwidth used) on the quality of the obtained contour maps.
c. We already have some initial methods to assess the similarity between pairs of contour maps.
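As a point of comparison with the ArcGIS issue raised in (a), here is a minimal sketch (synthetic data and hypothetical parameter choices, not our ArcGIS workflow) of a gridding→interpolation→contouring pipeline in which the contouring algorithm is documented: matplotlib delegates contour extraction to ContourPy, a marching-squares-style implementation, and the extracted line segments can be turned into polygons for zonal analysis.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata

# Synthetic point dataset standing in for a real measurement dataset.
rng = np.random.default_rng(0)
x, y = rng.uniform(0, 10, 300), rng.uniform(0, 10, 300)
z = np.sin(x / 2) + np.cos(y / 3) + rng.normal(0, 0.05, 300)

# Gridding + interpolation step: 100 x 100 grid, linear interpolation
# (assumed settings; bandwidth-based interpolation would be analogous).
gx, gy = np.meshgrid(np.linspace(0, 10, 100), np.linspace(0, 10, 100))
gz = griddata((x, y), z, (gx, gy), method="linear")

# Contouring step with three example thresholds; the algorithm behind
# plt.contour is documented, unlike the ArcGIS black box.
cs = plt.contour(gx, gy, gz, levels=[0.0, 0.5, 1.0])

# cs.allsegs[i] holds the vertex arrays of all contour lines for level i;
# closed lines can be converted into polygons for zonal analysis.
for level, segs in zip(cs.levels, cs.allsegs):
    print(level, [seg.shape for seg in segs])
```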

6 Topics: Planned Paper on QCM, Continued
d. Currently, we investigate different methods to create (flood risk or crime risk) contour maps, and we therefore want to assess and compare their quality. There is also the issue of how many and which contour values one should choose for a contour map, but we are not so much interested in this issue—it has been investigated by other papers. Our scenario looks a little simpler: we have a dataset, e.g. the average amount of daily rainfall in the month of July 2017 measured at 3000 different locations, and we use Method1 and Method2 to create contour maps with 0.5, 1, 1.5, 2, ..., 6 inches of rainfall contour lines; moreover, in most of our work we assume that contour lines are closed polygons that could have holes (there are some issues with creating contour lines near the boundary of the data collection area we might need to address…). How do we assess which of the two maps is "better"? Approaches:
One approach is to assess the robustness of a particular contour map: we could expose the dataset it was created from to some noise, e.g. Gaussian noise drawn from N(0, σ') where σ' could be set to σ, 0.5*σ, or 0.25*σ (σ being the standard deviation of the data), or uniform noise taking values in [-σ, σ], [-σ/2, σ/2], or [-σ/4, σ/4], and then we measure the agreement of the contour maps that were created for the original and the modified dataset.
Moreover, we could use cross-validation to assess the predictive capabilities of the generated contour map; e.g. we create a contour map using a subset of the dataset (e.g. 80% of the samples), and then we use the remaining 20% of the observations and see if they are put into the correct range by the contour map, computing overestimation, underestimation, and some numerical errors (the numerical error assesses how far a value is out of the bounds set by the contour lines; if it is within the bounds the error is 0); unfortunately, I do not see how to use this approach in conjunction with the gridded approach. More importantly, the same statistics can also be computed for the contour map that was created using the complete dataset; therefore, this can be used as one evaluation measure to address the issue of map quality. (A sketch of this idea follows below.)
By analyzing geometric properties of the contour lines (smoothness, the largest area with smallest length(?), the evenness of the contour distribution, …).
Another approach is to sample points near the contour line and assess (e.g. by computing the squared error) how much they deviate from the contour value; this approach can be generalized to work for gridded datasets. For that, blankets need to be created for contour lines, which is not trivial, as the presence of other close-by contour lines needs to be considered when determining the size of the blanket. Perhaps we can find some software to shrink/enlarge polygons to create such blankets.
Supervised assessment: e.g. use the rainfall contour map generated for July 2016 to predict the rainfall in July 2017; we could use many of the previously mentioned approaches, such as comparing the similarity of the two maps.
Another approach that could be used for a lot of the methods mentioned above is to treat contour maps as prediction models; e.g. we could then assess the similarity of two contour maps by just comparing their predicted values for a set of randomly selected locations in the observation area.
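A minimal sketch of the cross-validation idea above, under simplifying assumptions: synthetic stand-in data, scipy's griddata as a stand-in for the interpolation method, and the range a location is "put into" approximated by the band of its interpolated value. Overestimation and underestimation counts and the numerical (out-of-band) error are computed on the 20% hold-out set.

```python
import numpy as np
from scipy.interpolate import griddata

def band_error(true_value, predicted_value, levels):
    # 0 if the true value lies in the contour band the map predicts for this
    # location; otherwise the distance to the nearest bound of that band.
    band = np.searchsorted(levels, predicted_value)
    lo = -np.inf if band == 0 else levels[band - 1]
    hi = np.inf if band == len(levels) else levels[band]
    if lo <= true_value <= hi:
        return 0.0
    return min(abs(true_value - lo), abs(true_value - hi))

rng = np.random.default_rng(1)
x, y = rng.uniform(0, 10, 3000), rng.uniform(0, 10, 3000)
z = 3 + 2 * np.sin(x / 2) * np.cos(y / 3)            # stand-in for rainfall values
levels = np.arange(0.5, 6.01, 0.5)                   # 0.5, 1.0, ..., 6.0 inch lines

idx = rng.permutation(len(x))
train, test = idx[: int(0.8 * len(x))], idx[int(0.8 * len(x)):]

# Build the "map" from 80% of the samples, then score the remaining 20%.
pred = griddata((x[train], y[train]), z[train], (x[test], y[test]), method="linear")
mask = ~np.isnan(pred)                               # drop points outside the hull
over = int(np.sum(pred[mask] > z[test][mask]))
under = int(np.sum(pred[mask] < z[test][mask]))
errors = [band_error(t, p, levels) for t, p in zip(z[test][mask], pred[mask])]
print(over, under, float(np.mean(errors)))
```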

7 Point-Based Experiments
1. Develop/Use/Extend Zonal Analysis techniques to analyze the contour polygons
2. Use 3 thresholds to obtain Contour Maps
3. Run the interpolation+contouring approach for 3 bandwidth values and 5 grid sizes (use the same as in the other experiment), obtaining 15 contour maps (a sketch of this step follows below)
4. Analyze the obtained Contour Maps with the methods developed in step 1
5. Interpret the results, also comparing them to the grid-based approaches
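A hedged sketch of step 3 (the actual experiments use ArcGIS; the data, the bandwidth values, and the grid sizes below are assumptions): Gaussian-kernel interpolation of the point values for 3 bandwidths and 5 grid sizes, followed by contouring with 3 thresholds, yielding 15 contour maps.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
px, py = rng.uniform(0, 10, 500), rng.uniform(0, 10, 500)
pz = np.sin(px / 2) + np.cos(py / 3)                 # stand-in point measurements

def kernel_interpolate(gx, gy, bandwidth):
    # Nadaraya-Watson style interpolation: Gaussian-weighted mean of the
    # point values at every grid node (an assumption, not the ArcGIS method).
    d2 = (gx[..., None] - px) ** 2 + (gy[..., None] - py) ** 2
    w = np.exp(-d2 / (2 * bandwidth ** 2))
    return (w * pz).sum(axis=-1) / w.sum(axis=-1)

thresholds = [0.0, 0.5, 1.0]                         # the 3 contour thresholds
contour_maps = {}
for bw in (0.3, 0.6, 1.2):                           # 3 bandwidths (assumed values)
    for n in (20, 40, 60, 80, 100):                  # 5 grid sizes (assumed values)
        gx, gy = np.meshgrid(np.linspace(0, 10, n), np.linspace(0, 10, n))
        gz = kernel_interpolate(gx, gy, bw)
        cs = plt.contour(gx, gy, gz, levels=thresholds)
        contour_maps[(bw, n)] = cs.allsegs           # contour lines per threshold
print(len(contour_maps))                             # 15 contour maps
```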

8 Grid-based Experiments
1. Develop/Use/Extend Zonal Analysis techniques to analyze the contour polygons
2. Use different pre-scaling factors, e.g. 1, 4, 9, 25, 100, to obtain 5 gridded datasets (a sketch of this upscaling step follows below)
3. Use 3 thresholds to obtain Contour Maps
4. Create 5 Contour Maps using the 5 datasets created in step 2
5. Analyze the Contour Maps with the methods developed in step 1
6. Interpret the results, also comparing them to the point-based approaches
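A minimal sketch of the upscaling in step 2, under the assumption that a pre-scaling factor of k² means averaging k×k blocks of fine cells into one coarse cell (so 1, 4, 9, 25, 100 correspond to k = 1, 2, 3, 5, 10); the fine grid is a synthetic stand-in.

```python
import numpy as np

def upscale(grid, k):
    # Block-average a 2-D grid by a factor of k in each direction.
    # The grid is cropped so its dimensions are divisible by k.
    rows, cols = (grid.shape[0] // k) * k, (grid.shape[1] // k) * k
    g = grid[:rows, :cols]
    return g.reshape(rows // k, k, cols // k, k).mean(axis=(1, 3))

fine = np.random.default_rng(3).random((300, 300))    # stand-in fine-grained grid
coarse_grids = {k * k: upscale(fine, k) for k in (1, 2, 3, 5, 10)}
for factor, g in coarse_grids.items():
    print(factor, g.shape)                            # 1:(300,300), 4:(150,150), ...
```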

9 An Alternative Paper: Paper Contributions
Focus: Contour Risk Map Generation for Point-based and Gridded Measurement Datasets
Methods considered: 1. gridding→interpolation→contouring applied to point datasets and 2. upscaling→contouring algorithms applied to gridded datasets
Understanding the impact of grid size, interpolation method, bandwidth, …(?) on the obtained maps
Similarity Assessment and Evaluation Measures for Contour Maps: if we have different contour maps created by the approaches we investigate, how can we say which one is better, and how can we assess how similar the created maps are? Are there any differences between gridded and point-based datasets with respect to map quality?
Experimental evaluation using a real-world case involving flood risk mapping
Correspondence Contouring for a "new" method, considering the contour maps generated by Method A as "ground truth" (optional contribution). Remark: This could also be a new contour map generation approach.

10 Gridding InterpolationContouring Approach
This is a commonly used approach to a problem that has been investigated for the last 50 years, which creates some challenges in coming up with something that is novel and enhances the state of the art...
What is unique about the methods we investigate?
What about developing methods that select "optimal" parameters for existing methods; e.g. how do we determine the optimal grid size for the contouring approach?
What about evaluation measures? How do we evaluate and compare different contour maps? Is there any novelty in developing such evaluation measures?
How precise are the contour polygons we get? How do we measure precision?
How do we deal, in quality assessment, with contour lines that should have been created but have not been created?
What about the robustness of the created contour lines?
How do we incorporate density into our approach?
How can the approach benefit from ground-truth background knowledge in the form of contour polygon trees?
The same questions can be asked for the upscaling→contouring approach; however, even if only one approach has some novelty this should be okay, as we can use the other approach as a baseline for comparison.

11 Paper Contributions
Framework for Similarity Assessment of Contour Polygon Forests
Framework for Correspondence Contouring for a "new" method, considering the Contour Polygon Forests generated by Method A as "Ground Truth"
Fast Algorithms that can cope with large numbers of polygons in the two frameworks
Agreement Mapping Approaches---creating spatial and spatio-temporal maps of agreement
Real-world case involving flood risk mapping

12 …POLYGON Trees as Ground Truth for Other Methods…
Examples of Polygon Trees as Ground Truth: FEMA maps and other flood risk maps, elevation maps, … (need more); crime maps; reference the two crime papers!
We need to use cheaper, parameterized, more automated methods for cost reasons and to cope with outdated/low-quality data; e.g. FEMA updates the flood risk maps of a region about every 10(???) years.
Paradigm: Train the cheaper, parameterized methods using polygon trees as ground truth for regions where ground truth is available, to obtain the best parameter settings; then apply the parameterized method with the learnt parameter settings to unprocessed, underserved regions to produce contour polygon trees.
Applications include: flood risk mapping; elevation maps produced with cheaper methods; gridded datasets based on the number of crimes committed.

13 Creating DEMs
Mappers may prepare digital elevation models in a number of ways, but they frequently use remote sensing rather than direct survey data. One powerful technique for generating digital elevation models is interferometric synthetic aperture radar, where two passes of a radar satellite (such as RADARSAT-1, TerraSAR-X, or Cosmo SkyMed), or a single pass if the satellite is equipped with two antennas (like the SRTM instrumentation), collect sufficient data to generate a digital elevation map tens of kilometers on a side with a resolution of around ten meters. Other kinds of stereoscopic pairs can be employed using the digital image correlation method, where two optical images are acquired at different angles from the same pass of an airplane or an Earth observation satellite (such as the HRS instrument of SPOT5 or the VNIR band of ASTER).[12]
The SPOT 1 satellite (1986) provided the first usable elevation data for a sizeable portion of the planet's landmass, using two-pass stereoscopic correlation. Later, further data were provided by the European Remote-Sensing Satellite (ERS, 1991) using the same method, the Shuttle Radar Topography Mission (SRTM, 2000) using single-pass SAR, and the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER, 2000) instrumentation on the Terra satellite using double-pass stereo pairs.[12] The HRS instrument on SPOT 5 has acquired over 100 million square kilometers of stereo pairs.
The quality of a DEM is a measure of how accurate the elevation is at each pixel (absolute accuracy) and how accurately the morphology is presented (relative accuracy). Several factors play an important role in the quality of DEM-derived products: terrain roughness; sampling density (elevation data collection method); grid resolution or pixel size; interpolation algorithm; vertical resolution; terrain analysis algorithm. Reference 3D products include quality masks that give information on the coastline, lakes, snow, clouds, correlation, etc.
Methods for obtaining elevation data used to create DEMs: Lidar; stereo photogrammetry from aerial surveys; structure from motion / multi-view stereo applied to aerial photography;[13] block adjustment from optical satellite imagery; interferometry from radar data; real-time kinematic GPS; topographic maps; theodolite or total station; Doppler radar; focus variation; inertial surveys; surveying and mapping drones (e.g. the Gatewing X100 unmanned aerial vehicle); range imaging.

14 Research Themes: Polygon Trees as the Ground Truth
Correspondence Contouringmaking results better comparable with the ground truth Idea: Post-process results based on the ground truth; e.g. intersect results with Fema Polygons Dealing with multiple forms of ground truth Idea: Using weighted set of cases in training, where cases were there is more agreement with respect to a single or multiple ground truth(s) How do we deal with granularity/smoothening/grid-size in creating local and regional summaries/risk polygons for risk maps Side Research Topic: Find Correspondence and Discrepancies between point-based and gridded datasets; case study: analyze the (address-)point-based hand datasets and gridded hand datasets to obtain a better understanding about the relationship of the two knowledge representation schemes.

15 Multi-Threshold Finding Problem to Maximize Match
Problem Specification: Given a polygon forest PF with k levels that has been obtained for a dataset D, and a method M that, if applied with thresholds θ1,…,θk to a dataset D', obtains a polygon forest PF'; moreover, we have a polygon forest distance function dpf. We are trying to find PF' such that dpf(PF, PF') is minimal, with PF' = M(D', (θ1,…,θk)).
In other words, given D', M, and PF, we would like to find values θ1,…,θk such that dpf(PF, M(D', (θ1,…,θk))) is minimal.

16 Algorithm to Find the Best Match w.r.t. dpf
Level-based Algorithm:
For levels i = 1 to i = k do { find θi such that dpf(M(D', θi), level(i, PF)) is minimal }
Return (θ1,…,θk)
where level(i, PF) is a function that returns all level-i polygons of PF; that is, it returns a set of polygons. (A minimal sketch of this level-based search appears below.)
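A minimal sketch of the level-based algorithm, with stand-in pieces: contour_method(D2, theta) plays the role of M(D', θ), d_pf compares two sets of polygons, and the candidate thresholds are searched over a fixed grid (an assumption; the slide leaves the search strategy open).

```python
from typing import Callable, List, Sequence

def best_thresholds(
    d_pf: Callable[[object, object], float],            # polygon-set distance function
    contour_method: Callable[[object, float], object],  # M(D', theta) -> set of polygons
    D2: object,                                          # the dataset D'
    ground_truth_levels: Sequence[object],               # level(i, PF) for i = 1..k
    candidate_thetas: Sequence[float],                   # grid of thresholds to try
) -> List[float]:
    # For each level i, pick the threshold whose polygons are closest to the
    # level-i polygons of the ground-truth forest PF.
    chosen = []
    for level_polygons in ground_truth_levels:           # i = 1..k
        best = min(
            candidate_thetas,
            key=lambda theta: d_pf(contour_method(D2, theta), level_polygons),
        )
        chosen.append(best)
    return chosen
```

Note that each level is optimized independently, so the returned thresholds are not guaranteed to be monotone; enforcing θ1 < … < θk would require a constrained search.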

17 Problems Gridding InterpolationContouring Framework
It looks to me like there is something fundamentally wrong with the results we get for the original hand datasets, and we should try to understand what the problem is:
We need to develop a framework soon (visualization and analysis) that automatically allows us to debug and evaluate different frameworks/versions/parameters, such as grid size, in the approaches we investigate.
Why do the results disagree so much with those for the FEMA-intersected hand dataset? Why do we miss obvious hotspots?
We also get hotspots outside FEMA risk zones, and we should check if those make any sense…
We should add a statistical hotspot summary (hand mean, std, and #address points in the hotspot) to the results to facilitate interpretation (a small sketch follows below).
We should get a better understanding of the impact of the grid size on the results.
We should better understand the impact of empty grid cells on the results---if there is any; in general, using smaller grid sizes should enhance the precision of the obtained hotspot boundaries, but as a by-product we get empty cells.
The obtained hotspot polygons seem to be too wide and might need some tightening, e.g. by computing the convex (concave??) hull.
We should also check for bugs in the algorithms (and maybe even the dataset).
Remark: We need some debugging technology to investigate such questions.
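A small sketch of the statistical hotspot summary mentioned above (hand mean, std, and number of address points per hotspot polygon), using shapely with a synthetic polygon and synthetic points as stand-ins for the hand dataset.

```python
import numpy as np
from shapely.geometry import Point, Polygon

hotspot = Polygon([(2, 2), (8, 2), (8, 8), (2, 8)])    # stand-in hotspot polygon

rng = np.random.default_rng(4)
xs, ys = rng.uniform(0, 10, 1000), rng.uniform(0, 10, 1000)
values = rng.normal(5.0, 1.5, 1000)                    # stand-in hand values

# Which address points fall inside the hotspot polygon?
inside = np.array([hotspot.contains(Point(x, y)) for x, y in zip(xs, ys)])
summary = {
    "hand_mean": values[inside].mean(),
    "hand_std": values[inside].std(),
    "num_address_points": int(inside.sum()),
}
print(summary)
```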

18 Inverse Distance Weighting
Remark: Likely we should find a package that does the same thing or something similar! Is this the approach we are currently investigating?!?
Make a grid (or multiple grids using different grid sizes) and fill it with values using two approaches:
1. Interpolate the value at the center of each grid cell using only the points in that grid cell, using inverse distance weighting. If there are no points, report null as the value of the cell.
2. Interpolate the value at the grid intersection points using inverse distance weighting, using just the points in the 4 cells (3 at boundaries, 2 at corners) of which the intersection point is a corner. If there are no points, report null as the value of the intersection point.
Create summaries using the annotated grids: using grid-based heat maps; using contouring algorithms; using ?!?
(A minimal sketch of the cell-center variant follows below.)
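A minimal sketch of approach 1, under assumptions: synthetic points, a 20×20 grid, and an IDW exponent of p = 2 (weights 1/d^p); cells containing no points are reported as NaN ("null").

```python
import numpy as np

rng = np.random.default_rng(5)
px, py = rng.uniform(0, 10, 400), rng.uniform(0, 10, 400)
pz = np.sin(px / 2) + np.cos(py / 3)                    # stand-in point values

n, p = 20, 2                                            # 20 x 20 grid, IDW exponent
edges = np.linspace(0, 10, n + 1)
centers = (edges[:-1] + edges[1:]) / 2
grid = np.full((n, n), np.nan)

ix = np.clip(np.digitize(px, edges) - 1, 0, n - 1)      # cell index of each point
iy = np.clip(np.digitize(py, edges) - 1, 0, n - 1)

for i in range(n):
    for j in range(n):
        in_cell = (ix == i) & (iy == j)
        if not in_cell.any():
            continue                                    # empty cell -> stays NaN
        d = np.hypot(px[in_cell] - centers[i], py[in_cell] - centers[j])
        w = 1.0 / np.maximum(d, 1e-9) ** p              # avoid division by zero
        grid[j, i] = np.sum(w * pz[in_cell]) / np.sum(w)

print(np.isnan(grid).sum(), "empty cells out of", grid.size)
```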

19 UH-DAIS Research to Address these Problems
Computational Methods to create flood risk maps from point-wise or grid-based flood risk assessments (e.g. hand value maps or elevation maps). We investigated in the past and are currently investigating: graph-based approaches and continuous-function-based approaches.
Similarity Assessment Methods to find agreement between different polygonal flood risk maps.
Correspondence Contouring Methods (e.g. find a sequence of elevation thresholds so that the obtained contour polygons best match the FEMA flood risk zones).
Computing Agreement Maps between different methods.
Creating "better" flood risk maps by combining information from different sources.
Assessing the Quality of Contour Maps.

20 Other Flooding-Related Priorities
Need to be able to extract different types of polygons based on their attribute values; distinguish at least three levels of flood risk maps for Austin Fire, and the A and X-shaded/B flood risk zones in FEMA maps.
Find and extract the flow data for AE FEMA flood risk zones.
Learn how to use HEC-RAS.
Romita/Qian should be able to run/modify Yongli's program.
Is there any software available in ArcGIS (or R) that uses an interpolation→contouring approach? Are there any interpolation-based heat maps? Are there any good papers on the interpolation→contouring theme? Are there any other approaches that are useful for what we have in mind?
Should look at some work on upscaling, particularly for gridded datasets.
Get our hands on digital elevation maps (DEMs) and use them in a similar fashion as the Austin Fire, FEMA, or hand-based risk maps!

21 Other Things to Check Out
According to rumors, the US Corps of Engineers is working on a "new" HEC-RAS.
Check out Situation Awareness for Everyone (SAFE); non-HEC-RAS approaches; very interesting company that does business with flood control districts…
Jared Allen (very interesting slide show)

22 Other Research Themes
How can past knowledge from floods be used for flood risk assessment? What data do we actually have available from past floods?
How can we make flood risk assessment approaches sensitive to current water levels and the anticipated amount of rainfall in the near future (e.g. in the next six hours)?
Can the National Water Model be used for Flood Risk Assessment? If yes, how? How is it different from HEC-RAS?
What could we do concerning creating flood risk maps for Wharton County?

23 Popular Interpolation Methods for Spatial Data
Here, we're basically talking more or less about interpolation methods. Methods include:
IDW: Depending on the implementation this can be global (using all available points in the set) or local (limited by the number of points or the maximum distance between the points and the interpolated position). Tools: QGIS interpolation plugin (global), GRASS v.surf.idw or r.surf.idw (local).
Splines: Again, a huge number of possible implementations; B-splines are popular. Tools: GRASS v.surf.bspline.
Kriging: Statistical method with various sub-types. Tools: GRASS v.krige (thanks to om_henners for the tip) or using R.
Triangulated Irregular Networks (TIN) (added).
Remark: As there is an abundance of interpolation approaches, I suggest focusing just on the most popular ones. (A small sketch comparing a few of these families follows below.)
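For illustration only (the tools named above are QGIS/GRASS/R), a small Python sketch comparing a few of these families: scipy's griddata with method="linear" or "cubic" interpolates over a Delaunay triangulation, i.e. a TIN-style approach, "nearest" is the simplest baseline, and a hand-rolled IDW is included for comparison.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(6)
x, y = rng.uniform(0, 10, 200), rng.uniform(0, 10, 200)
z = np.sin(x / 2) + np.cos(y / 3)                    # stand-in measurements

gx, gy = np.meshgrid(np.linspace(0.5, 9.5, 50), np.linspace(0.5, 9.5, 50))

tin_linear = griddata((x, y), z, (gx, gy), method="linear")   # TIN-style
tin_cubic = griddata((x, y), z, (gx, gy), method="cubic")
nearest = griddata((x, y), z, (gx, gy), method="nearest")

def idw(gx, gy, p=2):
    # Global inverse distance weighting with exponent p (assumed p = 2).
    d = np.hypot(gx[..., None] - x, gy[..., None] - y)
    w = 1.0 / np.maximum(d, 1e-9) ** p
    return (w * z).sum(axis=-1) / w.sum(axis=-1)

# How much do the interpolation families disagree on this grid?
print(np.nanmax(np.abs(tin_linear - idw(gx, gy))))
```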

