C-squares - a new simple, XML friendly, display/ query/ exchange format for representing spatial data extents at the metadata level System concept and.

Presentation transcript:

c-squares - a new, simple, XML-friendly display/query/exchange format for representing spatial data extents at the metadata level

System concept and development by: Tony Rees, Divisional Data Centre, CSIRO Marine Research, Australia

Topics to be covered...
- Characteristics of metadata, and metadata spatial searches
- Problems with “bounding rectangles” as representations of dataset extents
- The c-squares concept
- c-squares in practice
- Future possibilities

Metadata, and spatial searching of metadata records

The Metadata concept...
(Data level) Data Store 1: databases / data warehouses. Data Store 2: offline digital data, offline nondigital data.
(Metadata level) Metadata records (structured dataset descriptions in standard format) - as text files, database, or XML format - used for metadata query and/or exchange.

Some example Metadatabases (Data Directories)... + many others (< 1000?)
- Metadata records exist independently of the datasets they describe and may not necessarily have an on-line connection to the actual data, i.e. they act as surrogates for the data
- Spatial searching (where implemented) is typically by bounding rectangles (N, S, W, E limits) or sometimes by defined regions (R1 yes/no, R2 yes/no, etc.)

“Bounding rectangles”
Current “first pass” representation of spatial data coverage is by bounding coordinates - example: Franklin Voyage FR 10/87 CTD Data, CSIRO Marine Research (etc. etc.)
- concept introduced in FGDC draft metadata standard, 1994
- used for distributed spatial searching, 1995 onwards
- still the primary tool for conducting metadata spatial searches; integral to ISO draft metadata standard, 2002
- polygons are also enterable, but seldom used for searching owing to the arithmetic overhead involved
Test: if the search rectangle (blue) overlaps the data rectangle (red), a supposed “hit” is returned. (Figure panels: hit; no hit; false hit.)

Bounding coordinates - pluses and minuses
Pluses...
- Metadata elements are concise
- User-entry is simple
- Spatial searching is a simple arithmetic operation (looks for overlap between a “search” rectangle and available “data” rectangles)
- Useful as a “first pass” -- rapidly filters out many datasets not close to the region of interest
Minuses...
- A rectangular shape does not correspond to the actual shape of many datasets
- Data distribution may be aligned along other than N-S or E-W axes
- Data distribution may be patchy or incomplete within the designated boundary
Corollary...
- Apparent “hits” are never 100% reliable (unless the data are always rectangular, e.g. mapsheets)
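The overlap test and the "false hit" problem described above can be sketched in a few lines of Python. This is an illustration only: the `Rect` type, the function name and the diagonal survey track are hypothetical, and date-line wrapping is ignored.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Bounding rectangle in decimal degrees (south and west are negative)."""
    north: float
    south: float
    west: float
    east: float


def rectangles_overlap(a: Rect, b: Rect) -> bool:
    """True if the two boxes share any area or edge (no date-line handling)."""
    return (a.south <= b.north and b.south <= a.north
            and a.west <= b.east and b.west <= a.east)


# Hypothetical survey track running obliquely from 45 S, 140 E to 40 S, 150 E.
track = [(-45.0 + 0.5 * i, 140.0 + 1.0 * i) for i in range(11)]

# The bounding rectangle of the track, as it would be stored in a metadata record.
data_box = Rect(
    north=max(lat for lat, _ in track),
    south=min(lat for lat, _ in track),
    west=min(lon for _, lon in track),
    east=max(lon for _, lon in track),
)

# A search rectangle in a corner of the data rectangle that the track never visits.
search_box = Rect(north=-40.5, south=-41.5, west=141.0, east=142.0)

hit = rectangles_overlap(search_box, data_box)
points_inside = [
    (lat, lon) for lat, lon in track
    if search_box.south <= lat <= search_box.north
    and search_box.west <= lon <= search_box.east
]
```

Here `hit` is true although `points_inside` is empty, which is the "false hit" case: the rectangles overlap, but no data lie in the search area.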

Some real-world examples (other agencies’ data)...

our agency’s data (marine surveys) - examples... NB, “bounding rectangle” searches result in many false or misleading hits, since large portions of the “dataset” rectangles contain no data - particularly where surveys wrap around a feature or land area, or are oriented obliquely with respect to N-S, or E-W directions.

Germ of c-squares concept... from Ken Walker’s Bioinformatics search interface, Museum Victoria (Australia)
- state divided into 0.5 x 0.5º squares (numbered as per relevant mapsheets)
- search interface has direct connection to base data (>100,000 point data records)
- each base data record is tagged with its relevant mapsheet number, so spatial searching is by simple numeric/text match (no arithmetic required)
- user can request list of hits (species) from one or multiple search squares (e.g. blue hatched examples)
(Map scale bar: 700 km)

Modifications which would be required for use with metadata
- multiple square IDs could be stored in a single metadata record (harvested from base data) - removes the requirement to access the base data to answer search queries
- numbering system should be expanded to become globally applicable
- geographic scale (size of squares) should be variable up or down to suit a variety of user needs
- metadata records become storage vehicles for dataset “footprints” (simple spatial objects)

The “c-squares” concept c-squares: Concise Spatial Query and Representation System

“c-squares” principle
The “c-squares string” holds the IDs of all the tiles (e.g. 1 x 1, 0.5 x 0.5 degree squares) which are intersected by the dataset spatial extent (footprint).
Figure panels: actual survey location - “Franklin” cruise 10/87; data “footprint” using bounding rectangle; data “footprint” using 1 x 1 degree c-squares; same using 0.5 x 0.5 degree c-squares.

“c-squares” numbering system
Each square is numbered according to a globally applicable system based on recursive divisions of WMO (World Meteorological Organization) 10-degree squares, e.g.:
10 degree square: 3414 (= WMO number)
5 degree square: 3414:2
1 degree square: 3414:227
0.5 degree square: 3414:227:4
0.1 degree square: 3414:227:466
(etc.)
Strings of codes represent an individual dataset extent, e.g. the following string encodes the extent shown in the example:
3013:497|3111:468|3111:478|3111:479|3111:488|3111:489|3111:499|3112:122|3112:123|
3112:131|3112:132|3112:134|3112:141|3112:142|3112:143|3112:217|3112:218|3112:219|
3112:226|3112:235|3112:350|3112:351|3112:352|3112:353|3112:360|3112:361|3112:362|
3112:363|3112:370|3112:371|3112:380|3112:381|3112:390|3113:100|3113:101|3113:102|
3113:103|3113:104|3113:205|3113:206|3113:207|3113:216|3113:217|3113:228|3113:238|
3113:239
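As a rough illustration of the numbering system, the sketch below converts a latitude/longitude pair to a 10-, 5- or 1-degree code and builds a footprint string from a set of points. The function names are hypothetical, and the quadrant rule (1 = low latitude and low longitude digits, 4 = high and high) is an assumption that agrees with the 3414:227 example on the slides; the c-squares specification remains the authority, and finer resolutions are not covered here.

```python
def csquare(lat: float, lon: float, resolution: int = 1) -> str:
    """Return the c-square code containing (lat, lon); resolution is 10, 5 or 1 degrees."""
    if resolution not in (10, 5, 1):
        raise ValueError("this sketch handles 10, 5 and 1 degree squares only")

    # Global sector digit: 1 = NE, 3 = SE, 5 = SW, 7 = NW.
    if lat >= 0:
        sector = 1 if lon >= 0 else 7
    else:
        sector = 3 if lon >= 0 else 5

    # Work with absolute values; nudge the poles and the 180 meridian into the grid.
    alat = min(abs(lat), 89.999999)
    alon = min(abs(lon), 179.999999)
    lat_tens, lat_units = divmod(int(alat), 10)
    lon_tens, lon_units = divmod(int(alon), 10)

    code = f"{sector}{lat_tens}{lon_tens:02d}"       # WMO 10-degree square
    if resolution == 10:
        return code

    # Assumed quadrant rule: 1 is nearest the global origin, 4 is furthest away.
    quadrant = 2 * (lat_units // 5) + (lon_units // 5) + 1
    code += f":{quadrant}"                           # 5-degree square
    if resolution == 5:
        return code

    return code + f"{lat_units}{lon_units}"          # 1-degree square


def footprint(points, resolution: int = 1) -> str:
    """Join the distinct squares occupied by (lat, lon) points into a c-squares string."""
    return "|".join(sorted({csquare(lat, lon, resolution) for lat, lon in points}))


example = footprint([(-42.5, 147.5), (-42.2, 147.9), (-43.5, 147.5)])
```

Using a set means that several points falling in the same square contribute one code only, so the string length depends on the area covered rather than on the number of data points.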

WMO 10-degree squares notation (part) (Available via the web in NODC, 1998: World Ocean Database 1998 Documentation)

WMO 10-degree squares notation principle: NE sector (1xxx), SE sector (3xxx), SW sector (5xxx), NW sector (7xxx)

Nomenclature for 5-degree squares - e.g. in SE sector:
Follows “Blue Pages” (1996) extension of WMO numbering, using 4 quadrants (1, 2, 3, 4) for 5-degree squares within a 10-degree square, e.g. WMO 10-degree square 3414 (grey), 5-degree square 3414:2 (light blue).
(1 is always closest to global origin, 4 is always furthest away. For full specification refer to the c-squares website.)

Nomenclature for 1-degree squares - e.g. in SE sector:
Follows “Blue Pages” (1996) extension of WMO numbering, using 4 quadrants (1, 2, 3, 4) for 5-degree squares, plus 2 digits for 1-degree squares within a 10-degree square, e.g. WMO 10-degree square 3414 (grey), 5-degree square 3414:2 (light blue), 1-degree square 3414:227 (green).
(100 is always closest to global origin, 499 is always furthest away. For full specification refer to the c-squares website.)

Codes have a straightforward relationship with lats/longs, mapsheets, etc.... e.g.: 3414:227 (1-degree square with origin at 42º S, 147º E). Reading the code from left to right:
- 3: global sector (1=NE, 3=SE, 5=SW, 7=NW)
- 4: tens of degrees S (i.e., 40)
- 14: tens of degrees E (i.e., 140)
- 2: 5-degree quadrant
- 2: additional degrees S [40+2] = 42
- 7: additional degrees E [140+7] = 147
(Map scale bar: 70 km)
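The relationship can also be read in the other direction. The following sketch, with a hypothetical function name, recovers the corner nearest the global origin and the square size from a 10-, 5- or 1-degree code; the quadrant handling assumes quadrants 2 and 4 lie in the higher-longitude half and quadrants 3 and 4 in the higher-latitude half, which should be checked against the c-squares specification.

```python
def csquare_origin(code: str):
    """Return (lat, lon, size_deg) for a 10-, 5- or 1-degree c-square code.

    lat/lon give the corner of the square nearest the global origin (0, 0),
    as signed whole degrees (south and west negative).
    """
    parts = code.split(":")
    if len(parts) > 2:
        raise ValueError("this sketch does not decode squares finer than 1 degree")

    head = parts[0]
    sector = int(head[0])            # 1 = NE, 3 = SE, 5 = SW, 7 = NW
    lat = int(head[1]) * 10          # tens of degrees of latitude
    lon = int(head[2:4]) * 10        # tens of degrees of longitude
    size = 10

    if len(parts) == 2:
        triplet = parts[1]
        if len(triplet) == 1:
            # 5-degree square: the quadrant digit selects a 5-degree offset.
            quadrant = int(triplet) - 1
            lat += 5 * (quadrant // 2)
            lon += 5 * (quadrant % 2)
            size = 5
        else:
            # 1-degree square: the last two digits are the additional degrees.
            lat += int(triplet[1])
            lon += int(triplet[2])
            size = 1

    lat_sign = 1 if sector in (1, 7) else -1
    lon_sign = 1 if sector in (1, 3) else -1
    return lat_sign * lat, lon_sign * lon, size


origin = csquare_origin("3414:227")
```

For the slide's example, `origin` holds 42 degrees south and 147 degrees east as signed values, with a square size of 1 degree.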

“Quad tree”-type approach used where numerous adjacent squares are occupied
Example: 3212:*** can be used instead of specifying every 1-degree square within the 10-degree square. This leads to corresponding data reduction, e.g. Australia (at 1-degree resolution) can be described in 343 squares rather than 800.
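A minimal sketch of this data reduction, assuming the input is a list of valid 1-degree codes and handling only the 10-degree wildcard shown on the slide. The function names are hypothetical, and any further wildcard forms defined by the specification are not attempted.

```python
from collections import defaultdict


def all_one_degree(ten_degree_code: str):
    """List the 100 one-degree codes inside a 10-degree square (assumed quadrant rule)."""
    return [
        f"{ten_degree_code}:{2 * (la // 5) + (lo // 5) + 1}{la}{lo}"
        for la in range(10)
        for lo in range(10)
    ]


def compress(codes):
    """Replace each fully occupied 10-degree square by a single 'xxxx:***' entry."""
    by_ten = defaultdict(set)
    for code in codes:
        by_ten[code.split(":")[0]].add(code)

    compressed = []
    for ten, members in sorted(by_ten.items()):
        if len(members) == 100:          # every 1-degree square is present
            compressed.append(f"{ten}:***")
        else:
            compressed.extend(sorted(members))
    return compressed


# One fully occupied 10-degree square plus a single square from its neighbour.
codes = all_one_degree("3212") + ["3213:100"]
reduced = compress(codes)
```

In this example 101 codes collapse to two entries. A search routine would then need to treat `3212:***` as matching any square inside 3212.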

Example database-level implementation of c-squares for metadata records (e.g. at 1 degree resolution) (etc.)

Options for c-squares data entry
- automated conversion of lat/long data to c-squares (ignoring multiple hits)
- automated conversion of GIS polygon data to c-squares extents
- clickable map interface for region(s) of immediate interest
- manual entry, with reference to marked-up mapsheet/s
- on-line lat/long - to - c-square converter
- custom digitising system (graphics tablet data input or similar)
Figures: mapsheet marked with 0.5 degree squares, for manual entry (squares labelled :130:1, 3315:130:2, 3315:131:1, 3315:130:4, 3315:131:3, 3315:130:3); clickable map interface (generalised example).

Process invoked for web mapping (1)
c-squares strings can be transformed into coordinate pairs (centre point of squares) and square size by an appropriate function, and then sent to the Xerox PARC Map Viewer or similar, e.g.:

Process invoked for web mapping (2)
c-squares strings can be sent directly to the CMR c-squares mapper (accessible via the web), e.g.:

(Base maps are automatically chosen to fit the data range, or can be selected manually) Further examples (CMR oceanographic/biological data x 0.5 deg. squares):

Mechanism for spatial queries using c-squares
c-squares spatial queries simply test whether a text string representing the search box (ideally one or several c-squares) is matched anywhere in the c-squares string.
Example: search square 3113:2 will match any c-squares string which includes 3113:2 within it, e.g.:
3013:497|3111:468|3111:478|3111:479|3111:488|3111:489|3111:499|3112:122|3112:123|
3112:131|3112:132|3112:134|3112:141|3112:142|3112:143|3112:217|3112:218|3112:219|
3112:226|3112:235|3112:350|3112:351|3112:352|3112:353|3112:360|3112:361|3112:362|
3112:363|3112:370|3112:371|3112:380|3112:381|3112:390|3113:100|3113:101|3113:102|
3113:103|3113:104|3113:205|3113:206|3113:207|3113:216|3113:217|3113:228|3113:238|
3113:239
(NB, this is a simple text search and involves no arithmetic - cf. querying of bounding rectangles, polygons, or more complex spatial objects.)
The hierarchical naming system for c-squares means that finer resolution squares are automatically picked up in any “coarser resolution” search.
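The text-matching query can be sketched as follows. The function name is hypothetical, the footprint is an excerpt of the example string above, and wildcard entries such as `3212:***` are not handled.

```python
def csquares_match(search_square: str, csquares_string: str) -> bool:
    """True if any code in the '|'-separated string starts with the search square.

    Because the codes are hierarchical, a coarse search square such as '3113:2'
    is a prefix of every finer square inside it, e.g. '3113:205'.
    """
    return any(
        code.strip().startswith(search_square)
        for code in csquares_string.split("|")
    )


# Excerpt of the example dataset extent.
footprint = (
    "3013:497|3111:468|3111:478|3112:122|3112:390|"
    "3113:100|3113:205|3113:206|3113:239"
)

found_5_degree = csquares_match("3113:2", footprint)   # 5-degree search square
found_10_degree = csquares_match("3113", footprint)    # 10-degree search square
found_elsewhere = csquares_match("3114", footprint)    # square with no data
```

Testing each code's prefix gives the same answer here as a plain substring test (`"3113:2" in footprint`); it is used in this sketch only because it makes the hierarchical matching explicit.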

Implementable as a simple “click on a square” interface, e.g.:

… system does the search - checks for c-squares match if available (provides reliable matches), otherwise uses overlapping rectangles test (“possible match”)...
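The two-tier behaviour described above might look like the sketch below. The record layout (`csquares` and `bbox` keys), the function name and the sample records are all hypothetical and are not taken from the MarLIN implementation.

```python
def classify(record: dict, search_square: str, search_box: tuple) -> str:
    """Return 'match', 'possible match' or 'no match' for one metadata record.

    record["csquares"]: optional '|'-separated c-squares string
    record["bbox"]:     (north, south, west, east) in decimal degrees
    search_box:         (north, south, west, east) covering the search square
    """
    csquares = record.get("csquares")
    if csquares:
        # Reliable test: simple text match against the stored footprint.
        hit = any(code.startswith(search_square) for code in csquares.split("|"))
        return "match" if hit else "no match"

    # Fallback: overlapping rectangles, which can only suggest a possible match.
    north, south, west, east = record["bbox"]
    s_north, s_south, s_west, s_east = search_box
    overlap = (south <= s_north and s_south <= north
               and west <= s_east and s_west <= east)
    return "possible match" if overlap else "no match"


# Search square 3113:2 (SE sector), i.e. 10-15 S, 135-140 E.
square = "3113:2"
box = (-10.0, -15.0, 135.0, 140.0)

records = {
    "A": {"csquares": "3113:205|3113:206", "bbox": (-10, -11, 135, 137)},
    "B": {"csquares": "3112:122", "bbox": (-12, -13, 122, 123)},
    "C": {"bbox": (-8, -20, 120, 150)},     # no c-squares stored
    "D": {"bbox": (-30, -40, 100, 110)},    # no c-squares stored
}
results = {name: classify(rec, square, box) for name, rec in records.items()}
```

Keeping the two outcomes distinct lets an interface flag rectangle-only hits as less reliable than c-squares hits.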

produces... (etc.)

Viewing the full metadata record produces... (etc.) with clickable link to show dataset extent using c-squares:

Base maps for displayed data can be changed at will by the user, e.g.: (numerous other maps available, sample only shown)

c-squares strings are suitable for inclusion as a new XML metadata element, for example...
Franklin Voyage FR 10/87 CTD Data
CSIRO Marine Research
(etc. etc.)
:499:2|3112:390:1|3111:489:3|3112:380:3|3112:380:4|3112:381:1|3111:488:2|3112:381:2|
3112:371:3|3111:478:4|3112:370:4|3112:370:1|3111:478:1|3111:479:2|3111:479:1|3112:361:4|
3111:468:4|3112:363:3|3112:361:3|3111:467:2|3112:360:2|3112:363:1|3112:362:2|3112:360:1|
3112:352:4|3112:352:3|3112:350:4|3112:352:1|3112:351:2|3112:352:2|3112:353:2|3112:353:1
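The XML markup on the original slide did not survive in this transcript, so the sketch below only illustrates the general idea with Python's standard `xml.etree.ElementTree` module. All element names (`metadata`, `title`, `custodian`, `csquares`) are hypothetical and do not come from the c-squares specification or from any metadata standard; the three codes are taken from the string above.

```python
import xml.etree.ElementTree as ET

# Build a minimal metadata record; every element name here is hypothetical.
record = ET.Element("metadata")
ET.SubElement(record, "title").text = "Franklin Voyage FR 10/87 CTD Data"
ET.SubElement(record, "custodian").text = "CSIRO Marine Research"
ET.SubElement(record, "csquares").text = "3112:390:1|3111:489:3|3112:380:3"

# Serialise to text, as it might be exchanged between catalogues.
xml_text = ET.tostring(record, encoding="unicode")

# A receiving system can recover the individual squares with a plain split.
parsed = ET.fromstring(xml_text)
codes = parsed.findtext("csquares").split("|")
```

Since the codes use only digits, colons and vertical bars, no XML escaping is needed, which is part of what makes the notation XML friendly.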

Actual size of c-squares, e.g. compared to U.K.
- 10 x 10 deg. (WMO Square 7500): [...] x 600 km
- 5 x 5 deg. (7500:1): 500 x 300 km
- 1 x 1 deg. (7500:[...]): [...] x 60 km
- 0.5 x 0.5 deg. (7500:123:4): 50 x 30 km
- 0.1 x 0.1 deg. (7500:123:[...]): [...] x 6 km
(NB, “real” shape and dimensions vary according to position on globe.)
1 x 1 degree squares are suggested as a possible minimum standard of spatial encoding for global interoperability of metadata systems (finer resolution available to users on an as-needs basis).
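The dependence of square dimensions on latitude can be approximated with a spherical-Earth calculation. This is a back-of-envelope sketch: the constant of about 111.32 km per degree and the function name are assumptions, and the slide's own figures appear to be rounded.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude on a sphere


def square_size_km(centre_lat_deg: float, size_deg: float):
    """Approximate (north-south, east-west) extent in km of a lat/long square."""
    north_south = KM_PER_DEGREE * size_deg
    # East-west extent shrinks with the cosine of the latitude.
    east_west = KM_PER_DEGREE * size_deg * math.cos(math.radians(centre_lat_deg))
    return north_south, east_west


at_equator = square_size_km(0.0, 1.0)     # close to a true square
at_uk = square_size_km(52.5, 1.0)         # 1-degree square near 7500:123
at_60_north = square_size_km(60.0, 1.0)   # east-west extent about half of north-south
```

At U.K. latitudes the east-west extent comes out at roughly 60% of the north-south extent, in line with the proportions quoted on the slide.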

Summary - strengths and weaknesses of c-squares
Strengths...
- “c-squares” metadata element is a concise and flexible way of encoding a wide variety of different spatial objects - including nonlinear and incomplete (patchy) coverages
- automated or manual code entry (and maintenance) is possible, and relatively simple
- spatial searching is a simple text string matching operation -- no supporting GIS system is required (i.e., zero technological overhead)
- “c-squares mapper” utility provides rapid and flexible data extent visualisations, and can be called from anywhere via the web
- can be implemented progressively into any metadata system as an adjunct to bounding coordinates (a search can be configured to work with whatever is available)
Weaknesses...
- may not be the only numbering convention available (Marsden Squares and Maidenhead Locators are alternatives to WMO squares, however less suitable in this application)
- c-squares are not of uniform shape/size across the earth’s surface (true squares only at the equator); some local/national grids do not transform easily to lat/long squares
- may be cumbersome to encode very large, complex regions (e.g. “Pacific Ocean”) by this method - works best at continental scales and below

Other comments...
- “c-squares” notation is language-independent - can be equally used in English, French, Japanese …
- also discipline-independent (suitable for physical, biological, geological, topographical, plus any other data type)
- downwards-scalability of the c-squares notation means that it can be applied to any size region (e.g. local level)
- equally applicable to terrestrial and marine data
- no equivalent in GML notation at this time (GML only supports vector data). Even if there were a GML equivalent, c-squares would still be significantly more concise.

c-squares future...
- c-squares is being implemented progressively in CSIRO Marine Research’s “MarLIN” metadata system (c. 500 records to date, more continuously added) and in the CMR “CAAB” marine species dictionary (c. [...] records). The MarLIN c-squares search interface is already operational
- c-squares is freely available for implementation in any other agencies’ metadata systems. Possibly small “islands of interoperability” could be created, or the system could simply be implemented for within-agency use
- c-squares could be offered to relevant user community/national bodies as an optional metadata element - possibly as a user-defined extension to a recognised metadata standard (e.g. ANZLIC, ISO)
- the current CMR c-squares mapper is already accessible for general use. Global and selected regional mapping options are already available and can be developed further. External systems already linking to the c-squares mapper include OBIS (Ocean Biogeographic Information System, USA) and FishBase (ICLARM/FAO), as well as CMR’s MarLIN and CAAB databases
- the c-squares website is a focal point for all c-squares related materials - including specification, background information, sample code, on-line lat/long converter, sample c-squares-enabled metadata records, and more

Potential implementation across multiple systems (diagram)
- catalogues 1, 2 and 3: some non c-squares enabled, some c-squares enabled (whole or part)
- metadata query and/or exchange with bounding rectangles
- metadata query and/or exchange with c-squares + bounding rectangles
- single or multi catalogue query with c-squares
- single or multi catalogue query with bounding rectangles

Acknowledgements/Inspiration...
- Ken Walker (Museum Victoria) for showing me his Museum Victoria Bioinformatics search interface, based on 0.5 degree squares
- “Blue Pages” Marine and Coastal Data Directory (MCDD) for the notation for subdividing WMO squares, also for pointers to software for drawing rectangles on GIF images (as used in the c-squares mapper) and for point-and-click map searching
- CMR Data Centre staff for useful feedback
- Miroslaw Ryba (CMR) for programming assistance with the c-squares mapper
- John Hockaday (Geoscience Australia) and Doug Nebert (FGDC, USA) for helpful comments on prototype versions of the system
- NOAA “GLOBE” Project and Martin Dix, CSIRO Atmospheric Research for provision of backdrop images used in the c-squares mapper

Questions, comments?