
System concept and development by: Tony Rees Divisional Data Centre CSIRO Marine Research, Australia c-squares - a new method for representing, querying,




1 System concept and development by: Tony Rees Divisional Data Centre CSIRO Marine Research, Australia c-squares - a new method for representing, querying, displaying and exchanging dataset spatial extents

2 Topics to be covered...
- Characteristics of metadata, and metadata spatial searches
- Problems with “bounding rectangles” as representations of dataset extents
- The c-squares concept
- c-squares in practice
- Future possibilities

3 Metadata, and spatial searching of metadata records

4 The Metadata concept...
- (Data level) Data Store 1, Data Store 2: databases / data warehouses, offline digital data, offline nondigital data
- (Metadata level) Metadata records (structured dataset descriptions) - as text files, database, or XML format
[Diagram labels: dataset descriptions in standard format; metadata query and/or exchange]

5 some example Metadatabases (Data Directories) + many others -- 100 < 1000?...
- Metadata records exist independently of the datasets they describe, i.e., they act as surrogates for the data for search purposes
- Spatial searching (where implemented) is typically by bounding rectangles (N, S, W, E limits) or sometimes by defined regions (R1 yes/no, R2 yes/no, etc.)

6 current “base level” representation of spatial data coverage is by bounding coordinates - example:
  Franklin Voyage FR 10/87 CTD Data CSIRO Marine Research (etc. etc.) -9.0 -19.0 117.0 145.8 (etc. etc.)
- concept introduced in FGDC draft metadata standard, 1994
- used for distributed spatial searching, 1995 onwards
- still the primary tool for conducting metadata spatial searches; integral to ISO 19115 draft metadata standard, 2002
- polygons are also enterable, but seldom used for searching owing to the arithmetic overhead involved

7 Bounding coordinates - pluses and minuses
Pluses...
- Metadata elements are concise
- User-entry is simple
- Spatial searching is a simple arithmetic operation (looks for overlap between a “search” rectangle and available “data” rectangles)
- Useful as a “first pass” -- rapidly filters out many datasets not close to the region of interest
Minuses …
- A rectangular shape does not correspond to the actual shape of many datasets
- Data distribution may be aligned along other than N-S or E-W axes
- Data distribution may be patchy or incomplete within the designated boundary

8 Problems with “bounding rectangles” searches...
- “Overlapping rectangles” test: if search rectangle (blue) overlaps data rectangle (red), a supposed “hit” is returned.
- “False hits” can result if data does not fill its bounding rectangle.
[Figure: three cases labelled hit, no hit, false hit; legend: data, data bounding rectangle, search rectangle]
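The false-hit problem can be illustrated with a minimal Python sketch. The `Rect` type, the helper names and the diagonal survey track below are hypothetical and chosen only for illustration; the sketch also ignores rectangles that cross the 180º meridian.

```python
from typing import NamedTuple, Sequence, Tuple


class Rect(NamedTuple):
    """Bounding rectangle in decimal degrees (hypothetical helper type)."""
    north: float
    south: float
    west: float
    east: float


def rects_overlap(a: Rect, b: Rect) -> bool:
    """The "overlapping rectangles" test: pure arithmetic on the four limits."""
    return (a.south <= b.north and b.south <= a.north
            and a.west <= b.east and b.west <= a.east)


def bounding_rect(points: Sequence[Tuple[float, float]]) -> Rect:
    """Smallest N/S/W/E rectangle enclosing a list of (lat, lon) points."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return Rect(max(lats), min(lats), min(lons), max(lons))


def contains_any(rect: Rect, points: Sequence[Tuple[float, float]]) -> bool:
    """True only if at least one real data point lies inside the rectangle."""
    return any(rect.south <= lat <= rect.north and rect.west <= lon <= rect.east
               for lat, lon in points)


# Made-up survey track running diagonally, so its bounding rectangle is mostly empty.
track = [(-10.0, 118.0), (-12.0, 124.0), (-15.0, 133.0), (-18.0, 145.0)]
data_rect = bounding_rect(track)

# Search rectangle placed in an empty corner of the data rectangle.
search = Rect(north=-16.0, south=-18.0, west=118.0, east=122.0)

print(rects_overlap(search, data_rect))  # rectangle test reports a hit
print(contains_any(search, track))       # but no data actually falls in the search area
```

Here the rectangle test returns a hit while no data point lies in the search area, which is the "false hit" case on the slide.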

9 Some real-world examples (other agencies’ data)...

10 our agency’s data (marine surveys) - examples... (NB, in our datasets, large portions of the data bounding rectangles may be empty)

11 Precursor to c-squares concept... from Ken Walker’s Bioinformatics search interface, Museum Victoria (Australia)
- state divided into 0.5 x 0.5º squares (numbered as per relevant mapsheets)
- search interface has direct connection to base data (>100,000 point data records)
- each base data record is tagged with its relevant mapsheet number, so spatial searching is by simple numeric/text match (no arithmetic required)
- user can request a reliable list of hits (species) from one or multiple search squares (e.g. blue hatched examples)
[Map scale bar: 700 km]

12 modifications which would facilitate use with metadata...
- multiple square IDs could be stored in a single metadata record, harvested from base data - removes requirement to access the raw data to answer search queries
- numbering system should be replaced with something globally applicable
- geographic scale (size of squares) should be variable up or down to suit variety of user needs
- metadata records become storage vehicles for dataset “footprints” (simple spatial objects)
[Map scale bar: 700 km]

13 The “c-squares” concept c-squares: Concise Spatial Query and Representation System

14 “c-squares” principle
- “c-squares string” holds IDs of all the tiles (e.g. 1 x 1, 0.5 x 0.5 degree squares) which are intersected by the dataset spatial extent (footprint), e.g.:
[Figure panels: actual survey location - “Franklin” cruise 10/87; data “footprint” using bounding rectangle; data “footprint” using 1 x 1 degree c-squares; same using 0.5 x 0.5 degree c-squares]
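The tiling principle can be sketched in Python without reference to any particular numbering system. In this sketch tiles are identified simply by integer (row, column) indices, the function names are hypothetical, and the track coordinates are made up. Only tiles that contain a sample point are recorded, so a real implementation working from lines or polygons would also need the tiles crossed between points.

```python
import math
from typing import Iterable, Set, Tuple

Point = Tuple[float, float]   # (lat, lon) in decimal degrees
Tile = Tuple[int, int]        # (row, column) index of a square tile


def tile_of(lat: float, lon: float, size: float = 1.0) -> Tile:
    """Index of the size-degree tile that contains the point."""
    return (math.floor(lat / size), math.floor(lon / size))


def footprint(points: Iterable[Point], size: float = 1.0) -> Set[Tile]:
    """Set of tiles actually occupied by the data points."""
    return {tile_of(lat, lon, size) for lat, lon in points}


def bounding_tiles(points: Iterable[Point], size: float = 1.0) -> Set[Tile]:
    """Every tile inside the bounding rectangle of the same points."""
    tiles = footprint(points, size)
    rows = [r for r, _ in tiles]
    cols = [c for _, c in tiles]
    return {(r, c)
            for r in range(min(rows), max(rows) + 1)
            for c in range(min(cols), max(cols) + 1)}


# Made-up diagonal track, for illustration only.
track = [(-9.5, 145.2), (-12.3, 138.7), (-15.8, 128.4), (-18.6, 117.3)]

print(len(footprint(track)))       # tiles that really hold data
print(len(bounding_tiles(track)))  # tiles implied by the bounding rectangle
```

The gap between the two counts is the empty area that a bounding rectangle would wrongly advertise as data coverage.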

15 “c-squares” numbering system
- each square is numbered according to a globally applicable system based on recursive divisions of WMO (World Meteorological Organization) 10-degree squares, e.g.:
  10 degree square: 3414 (= WMO number)
  5 degree square: 3414:2
  1 degree square: 3414:227
  0.5 degree square: 3414:227:4
  0.1 degree square: 3414:227:466
  (etc.)
- strings of codes represent an individual dataset extent, e.g. the following string encodes the extent shown in the example:
  3013:497|3111:468|3111:478|3111:479|3111:488|3111:489|3111:499|3112:122|3112:123|
  3112:131|3112:132|3112:134|3112:141|3112:142|3112:143|3112:217|3112:218|3112:219|
  3112:226|3112:235|3112:350|3112:351|3112:352|3112:353|3112:360|3112:361|3112:362|
  3112:363|3112:370|3112:371|3112:380|3112:381|3112:390|3113:100|3113:101|3113:102|
  3113:103|3113:104|3113:205|3113:206|3113:207|3113:216|3113:217|3113:228|3113:238|
  3113:239
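A minimal Python sketch of how a latitude/longitude pair might be turned into a code of this form is given below. The function name `csquare` is hypothetical, and the handling of points exactly on the poles or the 180º meridian (clamped into the last square) is an assumption of this sketch; the c-squares specification should be consulted for the authoritative rules. The quadrant digit follows the pattern shown in the slides: 1 or 2 when the latitude digit is 0-4, 3 or 4 when it is 5-9, with the even values used when the longitude digit is 5-9.

```python
def csquare(lat: float, lon: float, resolution: float = 1.0) -> str:
    """Code of the square containing (lat, lon); resolution is in degrees."""
    if resolution not in (10, 5, 1, 0.5, 0.1):
        raise ValueError("resolution must be 10, 5, 1, 0.5 or 0.1 degrees")

    # Global sector digit: 1=NE, 3=SE, 5=SW, 7=NW.
    if lat >= 0:
        sector = 1 if lon >= 0 else 7
    else:
        sector = 3 if lon >= 0 else 5

    # Work in whole tenths of a degree. Clamping +/-90 and +/-180 into the
    # last square is an assumption made for this sketch.
    lat_t = min(int(abs(lat) * 10), 899)
    lon_t = min(int(abs(lon) * 10), 1799)

    # WMO 10-degree square: sector, tens of degrees latitude, tens of degrees longitude.
    code = f"{sector}{lat_t // 100}{lon_t // 100:02d}"
    if resolution == 10:
        return code

    def cycle(lat_digit: int, lon_digit: int):
        """Quadrant digit alone, and the full three-digit cycle."""
        quadrant = 1 + 2 * (lat_digit >= 5) + (lon_digit >= 5)
        return str(quadrant), f"{quadrant}{lat_digit}{lon_digit}"

    quadrant, full = cycle(lat_t // 10 % 10, lon_t // 10 % 10)
    if resolution == 5:
        return f"{code}:{quadrant}"
    code = f"{code}:{full}"
    if resolution == 1:
        return code

    quadrant, full = cycle(lat_t % 10, lon_t % 10)
    return f"{code}:{quadrant}" if resolution == 0.5 else f"{code}:{full}"


# A point inside the squares used as examples on the slide.
for res in (10, 5, 1, 0.5, 0.1):
    print(res, csquare(-42.625, 147.625, res))
```

For the sample point the five calls reproduce the five example codes listed on the slide, from 3414 down to 3414:227:466.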

16 WMO 10-degree squares notation (part) (Available via the web in NODC, 1998: World Ocean Database 1998 Documentation)

17 WMO 10-degree squares notation principle
- NE sector (1xxx), SE sector (3xxx), SW sector (5xxx), NW sector (7xxx)
[Figure: world map with example squares labelled 1017, 3000, 3017, 3800, 5000, 5017, 5800, 7000, 7017, 7800]

18 nomenclature for 5-degree squares - e.g. in SE sector:
- follows “Blue Pages” (1996) extension of WMO numbering, using 4 quadrants (1, 2, 3, 4) for 5-degree squares - e.g. within 10-degree square 3414...
- (1 is always closest to global origin, 4 is always furthest away. For full specification refer to the c-squares website)
[Figure: WMO 10-degree square 3414 (grey), with axes marked -40, -45, -50 and 140, 145, 150, divided into quadrants 1 2 3 4; 5-degree square 3414:2 (light blue)]

19 nomenclature for 1-degree squares - e.g. in SE sector, within WMO 10-degree square 3414 (rows: latitude of square origin; columns: longitude of square origin):

         140  141  142  143  144  145  146  147  148  149
  -40    100  101  102  103  104  205  206  207  208  209
  -41    110  111  112  113  114  215  216  217  218  219
  -42    120  121  122  123  124  225  226  227  228  229
  -43    130  131  132  133  134  235  236  237  238  239
  -44    140  141  142  143  144  245  246  247  248  249
  -45    350  351  352  353  354  455  456  457  458  459
  -46    360  361  362  363  364  465  466  467  468  469
  -47    370  371  372  373  374  475  476  477  478  479
  -48    380  381  382  383  384  485  486  487  488  489
  -49    390  391  392  393  394  495  496  497  498  499

[Figure legend: WMO 10-degree square 3414 (grey); 5-degree square 3414:2 (light blue); 1-degree square 3414:227 (green)]
(100 is always closest to global origin, 499 is always furthest away. For full specification refer to the c-squares website)

20 Codes have straightforward relationship with lats/longs, mapsheets, etc.... e.g.: 3414:227 (1-degree square with origin at 42º S, 147º E)
  3     global sector (1=NE, 3=SE, 5=SW, 7=NW)
  4     tens of degrees S (i.e., 40)
  14    tens of degrees E (i.e., 140)
  2     5-degree quadrant, i.e. 1 2 3 4
  2     additional degrees S [40+2] = 42
  7     additional degrees E [140+7] = 147
[Map scale bar: 110 km]
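The digit-by-digit reading above can be written as a short Python sketch. The function name is hypothetical, the sketch handles 1-degree codes only, and it returns the origin as signed decimal degrees (south and west negative), where "origin" means the corner of the square nearest the global origin.

```python
from typing import Tuple


def origin_of_one_degree_square(code: str) -> Tuple[int, int]:
    """Signed (lat, lon) of the origin corner of a 1-degree code such as '3414:227'."""
    wmo, cycle = code.split(":")
    if len(wmo) != 4 or len(cycle) != 3:
        raise ValueError("expected a 1-degree code of the form NNNN:NNN")

    sector = int(wmo[0])                      # 1=NE, 3=SE, 5=SW, 7=NW
    lat = 10 * int(wmo[1]) + int(cycle[1])    # tens + additional degrees of latitude
    lon = 10 * int(wmo[2:]) + int(cycle[2])   # tens + additional degrees of longitude

    lat_sign = 1 if sector in (1, 7) else -1  # northern sectors are positive
    lon_sign = 1 if sector in (1, 3) else -1  # eastern sectors are positive
    return lat_sign * lat, lon_sign * lon


print(origin_of_one_degree_square("3414:227"))  # 42 S, 147 E as signed degrees
```

The first digit of the three-digit cycle (the 5-degree quadrant) is not needed for the arithmetic, since it is implied by the two digits that follow it.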

21 “quad tree”-type approach used where numerous adjacent squares are occupied
- example: 3212:*** can be used instead of specifying every 1-degree square within 10-degree square 3212.
- This leads to corresponding data reduction, e.g. Australia (at 1-degree resolution) can be described in 343 squares rather than 800:
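One way to apply this reduction is sketched below in Python. The function name `compress` is hypothetical, the input is assumed to be a list of valid 1-degree codes, and only the 10-degree wildcard form shown on the slide (`NNNN:***`) is produced; any rules for wildcards at other levels should be taken from the c-squares specification rather than from this sketch.

```python
from collections import defaultdict
from typing import Iterable, List


def compress(codes: Iterable[str]) -> List[str]:
    """Replace fully occupied 10-degree squares by a single 'NNNN:***' entry."""
    by_wmo = defaultdict(set)
    for code in codes:
        wmo, cycle = code.split(":")
        by_wmo[wmo].add(cycle)

    result = []
    for wmo in sorted(by_wmo):
        cycles = by_wmo[wmo]
        if len(cycles) == 100:  # all one hundred 1-degree squares are present
            result.append(f"{wmo}:***")
        else:
            result.extend(f"{wmo}:{c}" for c in sorted(cycles))
    return result


# All one hundred valid 1-degree cycles (quadrant digit, latitude digit, longitude digit).
all_cycles = [f"{1 + 2 * (a >= 5) + (b >= 5)}{a}{b}"
              for a in range(10) for b in range(10)]

codes = [f"3212:{c}" for c in all_cycles] + ["3213:100", "3213:101"]
print(len(codes), "->", compress(codes))
```

In this made-up input, 102 codes collapse to three entries, because square 3212 is completely filled.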

22 Example database-level implementation of c-squares for metadata records (e.g. at 1 degree resolution) (etc.)

23 Options for c-squares data entry...
- automated conversion of lat/long data to c-squares (ignoring multiple hits)
- automated conversion of GIS polygon data to c-squares extents
- clickable map interface for region(s) of immediate interest
- manual entry, with reference to marked-up mapsheet/s
- on-line lat/long - to - c-square converter
- custom digitising system (graphics tablet data input or similar)
[Figures: mapsheet marked with 0.5 degree squares - for manual entry (squares labelled 3315:130:1, 3315:130:2, 3315:130:3, 3315:130:4, 3315:131:1, 3315:131:3); clickable map interface (generalised example)]

24 Process invoked for web mapping (1): c-squares strings can be transformed into coordinate pairs (centre point of squares) and square size, by an appropriate function, and then sent to Xerox PARC Map Viewer or similar, e.g.:
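The "appropriate function" is not shown in the slides; the following Python sketch is one possible version, written under the assumption that codes follow the structure described earlier (a four-digit WMO square, then three-digit cycles, optionally ending in a single quadrant digit). The name `centre_and_size` is hypothetical, and wildcard entries such as `3212:***` are not handled.

```python
from typing import List, Tuple


def centre_and_size(code: str) -> Tuple[float, float, float]:
    """Return (centre_lat, centre_lon, size_in_degrees) for one c-square code."""
    wmo, *cycles = code.split(":")
    sector = int(wmo[0])
    lat, lon, size = 10.0 * int(wmo[1]), 10.0 * int(wmo[2:]), 10.0

    for i, cycle in enumerate(cycles):
        if len(cycle) == 3:
            # Full cycle: a tenfold finer square, offset by the two digits.
            size /= 10
            lat += int(cycle[1]) * size
            lon += int(cycle[2]) * size
        elif len(cycle) == 1 and i == len(cycles) - 1:
            # Trailing quadrant digit: half-size square (5, 0.5, ... degrees).
            size /= 2
            quadrant = int(cycle)
            lat += size if quadrant in (3, 4) else 0.0
            lon += size if quadrant in (2, 4) else 0.0
        else:
            raise ValueError(f"unexpected cycle {cycle!r} in {code!r}")

    lat_sign = 1 if sector in (1, 7) else -1   # northern sectors positive
    lon_sign = 1 if sector in (1, 3) else -1   # eastern sectors positive
    return lat_sign * (lat + size / 2), lon_sign * (lon + size / 2), size


def string_to_squares(csquares_string: str) -> List[Tuple[float, float, float]]:
    """Convert a '|'-separated c-squares string into plottable squares."""
    return [centre_and_size(code) for code in csquares_string.split("|")]


print(string_to_squares("3414:227|3414:227:4|7500"))
```

Each tuple holds what a map viewer would need to draw one square: a centre point and an edge length in degrees.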

25 Process invoked for web mapping (2): c-squares strings can be sent directly to the CMR c-squares mapper (accessible via the web), e.g.:

26 Further examples (0.5 deg. and 0.1 deg. squares): (Base maps are automatically chosen to fit the data range, or can be selected manually)

27 Spatial queries using c-squares
- c-squares spatial queries simply test whether a text string representing the search box (ideally one or several c-squares) is matched anywhere in the c-squares string …
- example: search square 3113:2 will match any c-squares string which includes 3113:2 within it, e.g.:
  3112:363|3112:370|3112:371|3112:380|3112:381|3112:390|3113:100|3113:101|3113:102|
  3113:103|3113:104|3113:205|3113:206|3113:207|3113:216|3113:217|3113:228|3113:238|
  3113:239
  (NB, this is a simple text search and involves no arithmetic - cf. querying of bounding rectangles, polygons, or more complex spatial objects)
- hierarchical naming system for c-squares means that finer resolution squares are automatically picked up in any “coarser resolution” search
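The text-matching query can be sketched in a few lines of Python. The function name `matches` is hypothetical; the test string is the example from the slide. The sketch checks each code for the search square as a prefix, which gives the same result as a plain substring test for well-formed codes while making the hierarchical behaviour explicit.

```python
def matches(search_square: str, csquares_string: str) -> bool:
    """True if any code in the string is the search square or lies inside it."""
    return any(code.startswith(search_square)
               for code in csquares_string.split("|"))


extent = ("3112:363|3112:370|3112:371|3112:380|3112:381|3112:390|3113:100|"
          "3113:101|3113:102|3113:103|3113:104|3113:205|3113:206|3113:207|"
          "3113:216|3113:217|3113:228|3113:238|3113:239")

print(matches("3113:2", extent))   # 5-degree search square, picked up via 3113:205 etc.
print(matches("3113:4", extent))   # no code in this quadrant
print("3113:2" in extent)          # the plain substring form of the same test
```

Because every 1-degree code starts with the code of its enclosing 5-degree and 10-degree squares, a coarse search square matches finer squares without any coordinate arithmetic.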

28 …implementable as a simple “click on a square” interface, e.g.:

29 … system does the search - checks for c-squares match if available (gives confirmed “hits”), otherwise uses overlapping rectangles test (=“possible match”)... searching...
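The two-tier logic described here might look like the following Python sketch. The record layout (dictionaries with `bbox` and optional `csquares` keys), the function name and all sample records are hypothetical; the rectangle limits are in N, S, W, E order.

```python
from typing import Optional, Tuple

BBox = Tuple[float, float, float, float]  # (north, south, west, east)


def classify(record: dict, search_square: str, search_bbox: BBox) -> Optional[str]:
    """Return 'confirmed', 'possible', or None (no hit) for one metadata record."""
    csquares = record.get("csquares")
    if csquares:
        # c-squares available: a text match gives a confirmed hit, otherwise no hit.
        hit = any(code.startswith(search_square) for code in csquares.split("|"))
        return "confirmed" if hit else None

    # No c-squares: fall back to the overlapping rectangles test.
    north, south, west, east = record["bbox"]
    s_north, s_south, s_west, s_east = search_bbox
    overlap = (south <= s_north and s_south <= north
               and west <= s_east and s_west <= east)
    return "possible" if overlap else None


# Search square 3113:2 and the rectangle it covers (10-15 S, 135-140 E).
square, box = "3113:2", (-10.0, -15.0, 135.0, 140.0)

records = [
    {"title": "A", "bbox": (-9.0, -19.0, 117.0, 145.8), "csquares": "3113:205|3113:206"},
    {"title": "B", "bbox": (-9.0, -19.0, 117.0, 145.8), "csquares": "3111:468|3112:350"},
    {"title": "C", "bbox": (-12.0, -14.0, 136.0, 138.0)},
    {"title": "D", "bbox": (-30.0, -35.0, 150.0, 155.0)},
]

for rec in records:
    print(rec["title"], classify(rec, square, box))
```

Record B shows the benefit: its bounding rectangle overlaps the search area, but its c-squares string shows no data there, so it is not returned.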

30 produces... (etc.)

31 Viewing the full metadata record produces... (etc.) with clickable link to show dataset extent using c-squares:

32 Base maps for displayed data can be changed at will by the user, e.g.: (numerous other maps available, sample only shown)

33 c-squares implementation in CMR’s “CAAB” Taxon dictionary (c. 15,000 marine species) - 3,000 now with maps using c-squares: - maps currently intranet only, will be publicly available after additional data loading and validation

34 c-squares strings are suitable for inclusion as a new metadata element alongside “bounding box”, for example...
  Franklin Voyage FR 10/87 CTD Data CSIRO Marine Research (etc. etc.) -9.0 -19.0 117.0 145.8
  3111:499:2|3112:390:1|3111:489:3|3112:380:3|3112:380:4|3112:381:1|3111:488:2|3112:381:2|
  3112:371:3|3111:478:4|3112:370:4|3112:370:1|3111:478:1|3111:479:2|3111:479:1|3112:361:4|
  3111:468:4|3112:363:3|3112:361:3|3111:467:2|3112:360:2|3112:363:1|3112:362:2|3112:360:1|
  3112:352:4|3112:352:3|3112:350:4|3112:352:1|3112:351:2|3112:352:2|3112:353:2|3112:353:1
… would permit interoperability with both enabled and non-enabled systems

35 Actual size of c-squares, e.g. compared to U.K. (WMO Square 7500):

  Resolution        Example code    Size
  10 x 10 deg.      7500            1100 x 700 km (approx.)
  5 x 5 deg.        7500:1          550 x 350 km
  1 x 1 deg.        7500:123        110 x 70 km
  0.5 x 0.5 deg.    7500:123:4      55 x 35 km
  0.1 x 0.1 deg.    7500:123:455    11 x 7 km

(NB, “real” shape and dimensions vary according to position on globe)
- 1-degree squares are suggested as a potential standard for global interoperability of metadata systems (finer resolution available to users on as-needs basis) - but can bulk up to 5- or 10-degree squares where appropriate to infill large areas
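The dependence of square size on position can be estimated with a short Python sketch. It assumes a spherical Earth of radius 6371 km, which is a simplification introduced here and not part of the slides; the function name is hypothetical. The north-south extent of a square is then roughly constant, while the east-west extent shrinks with the cosine of the latitude.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation (assumption)


def approx_size_km(centre_lat: float, size_deg: float) -> Tuple[float, float]:
    """Approximate (north_south_km, east_west_km) of a square of size_deg degrees."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = size_deg * km_per_degree
    east_west = north_south * math.cos(math.radians(centre_lat))
    return north_south, east_west


# 1-degree square 7500:123 is centred near 52.5 N.
ns, ew = approx_size_km(52.5, 1.0)
print(f"{ns:.0f} x {ew:.0f} km")

# The same 1-degree square at the equator is close to square in kilometres.
ns0, ew0 = approx_size_km(0.0, 1.0)
print(f"{ns0:.0f} x {ew0:.0f} km")
```

The figures this produces for U.K. latitudes are of the same order as the rounded values in the table above.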

36 Summary - strengths and weaknesses of c-squares
Strengths...
- “c-squares” is a concise and flexible method of encoding simple to moderately complex forms
- automated or manual code entry (and maintenance) is straightforward
- spatial searching is a simple text string matching operation (no GIS involved)
- “c-squares mapper” utility available via simple web call
- can be used as adjunct to bounding coordinates searches
Weaknesses …
- some other numbering systems in use (Marsden Squares, Maidenhead Locators) - needs willingness to standardise on proposed nomenclature for interoperability
- c-squares are not a fixed multiple of kilometres, miles, etc.
- may be cumbersome to encode very large, complex regions (e.g. “Pacific Ocean”) by this method - works best at continental scales and below.

37 other comments...
- “c-squares” notation is language-independent - can be equally used in English, French, Japanese …
- also discipline-independent (suitable for physical, biological, geological, topographical, plus any other data type)
- downwards-scalability of the c-squares notation means that it can be applied to any size region (e.g. local level)
- equally applicable to both terrestrial and marine data
- no equivalent in GML notation at this time (GML only supports vector data). Even if there were a GML equivalent, c-squares would still be a significantly more concise way to describe the equivalent spatial objects.

38 c-squares current and future status...
- already implemented progressively in CMR “MarLIN” metadata system (c. 500 records to date, more continuously added) and in the CMR “CAAB” taxon management system (c. 3000 records). MarLIN c-squares search interface operational since February 2002
- c-squares is freely available for implementation in any other agencies’ metadata systems without cost or technology overhead
- c-squares has potential to be recognised as a formal metadata element by relevant user communities / national bodies
- current CMR c-squares mapper is already accessible for general use. External systems already linking to the c-squares mapper include OBIS (Ocean Biogeographic Information System, USA) and FishBase (ICLARM/FAO), as well as CMR’s MarLIN and CAAB databases
- c-squares website (www.marine.csiro.au/csquares/) is a focal point for all c-squares related materials - including specification, background information, sample code, on-line lat/long converter, example c-squares-enabled metadata records, and more

39 Acknowledgements/Inspiration...
- Ken Walker (Museum Victoria) for introducing me to his Museum Victoria Bioinformatics search interface, based on 0.5 degree squares
- “Blue Pages” Marine and Coastal Data Directory (MCDD) for the notation for subdividing WMO squares, also for pointers to software for drawing rectangles on GIF images (as used in the c-squares mapper) and for point-and-click map searching
- CMR Data Centre staff for useful discussions
- Miroslaw Ryba (CMR) for programming assistance with the c-squares mapper
- John Hockaday (Geoscience Australia) and Doug Nebert (FGDC, USA) for feedback on prototype versions of the system
- NOAA “GLOBE” Project, and Martin Dix (CSIRO Atmospheric Research) for provision of backdrop images used in the c-squares mapper.

40 Questions, comments?

