The IPT user interface and data quality tools
Alberto GONZÁLEZ-TALAVÁN, Programme Officer for Training, GBIF Secretariat
GBIF NODES Committee Meeting, Copenhagen, Denmark, 4th October 2009
Author: Arnaud Réveillon, Technical Writer. Image by G. Marcus, obtained in the stock.xchng site (
SUMMARY
The GBIF Integrated Publishing Toolkit
- IPT General Features: basics about the IPT.
- IPT Requirements: what do you need to have an IPT instance running?
- The data interfaces: how data is exposed when connected to the IPT.
- The user interface: how data is displayed to visitors.
- Customisation and internationalisation.
- Expanding the IPT: are you interested in your IPT handling other kinds of data?
- Discussion, questions and answers.

- The home page
- Exploring the datasets (metadata repository)
- The resources' description
- Accessing the contents
- The data quality tools and annotation system
INTRODUCTION
This is an overview of the public side of the GBIF IPT, including an introduction to user annotations (not yet implemented). The GBIF IPT differs from previous biodiversity data-providing tools in its public interface, which immediately exposes the published data on the internet for public access in an attractive way. At the same time, the GBIF IPT performs a series of analyses that enable the data curator to identify inconsistencies in the datasets and perform basic data quality checks. This topic was presented by Alberto González-Talaván, Programme Officer for Training at the GBIF Secretariat.
HOME PAGE
The Home page of an IPT instance has different sections:
1. The main menu *
2. The search box *
3. The title given to the instance.
4. Its description (HTML based).
5. The external image.
6. A list of hosted resources, with direct links to these resources' description pages.
7. The footer, including contact information *
Points 3, 4, 5 and 7 are defined by the administrator in the settings section.
* Common to all pages
EXPLORE
Clicking on 'Explore' in the menu leads to the 'METADATA REPOSITORY' section. It includes:
- A global interactive map with boxes delimiting the geographical area covered by each of the resources.
- A keyword cloud and index. Clicking on an item performs a search for that term.
- A list of resources, which can be filtered by type of resource (occurrence / checklist).
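The box drawn for a resource can be thought of as the bounding box of its coordinates. The IPT may well take these boxes from the resource metadata instead; the Python sketch below only illustrates how such a box could be derived from the records themselves. The function name `bounding_box` and the (latitude, longitude) tuple layout are assumptions made for this example, not part of the IPT.

```python
from typing import Iterable, Optional, Tuple

# (min_lat, min_lon, max_lat, max_lon); layout chosen for this example only.
BBox = Tuple[float, float, float, float]


def bounding_box(points: Iterable[Tuple[float, float]]) -> Optional[BBox]:
    """Return the smallest lat/lon box containing all points, or None if empty.

    Hypothetical helper: it ignores datasets that span the antimeridian.
    """
    pts = list(points)
    if not pts:
        return None
    lats = [lat for lat, _ in pts]
    lons = [lon for _, lon in pts]
    return (min(lats), min(lons), max(lats), max(lons))
```

A box that is much larger than the area the dataset is supposed to cover is an early sign that some coordinates deserve a closer look.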
RESOURCE DESCRIPTION
Clicking on a resource name takes you to that resource's information page. It includes:
- The description box
- Two trees to browse the taxonomic and geographic contents
- The general statistics box
- The interactive map
- A number of interactive charts and maps showing a brief analysis of the contents of the dataset.
A resource page summarises all the information that is available for the resource. The description box shows the description text and photo as defined in the metadata of the dataset, together with the total number of records and the date of publication. It includes buttons to access annotations, full metadata and web services.
The taxonomic and geographical trees allow you to browse the contents of the dataset directly.
The general statistics box gives some geographic and taxonomic statistics. These are the first hints about data quality: did I map my fields correctly?
An interactive map shows points for the geographical occurrences. This is a second opportunity to check data quality: do the points fall in the correct area? Are there any outliers?
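The two questions asked of the map can also be put to the data directly. The following Python sketch is not the IPT's own check; it assumes each record is a dictionary with hypothetical `id`, `lat` and `lon` keys, and that the curator supplies the box the dataset is expected to fall in.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_lat, min_lon, max_lat, max_lon)


def flag_coordinate_issues(records: List[Dict], expected_box: BBox) -> List[Tuple[str, str]]:
    """Return (record id, problem) pairs for suspicious coordinates."""
    min_lat, min_lon, max_lat, max_lon = expected_box
    issues = []
    for rec in records:
        lat, lon = rec["lat"], rec["lon"]
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            # Not a valid position anywhere on Earth.
            issues.append((rec["id"], "invalid coordinates"))
        elif not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            # Valid position, but outside the area the dataset should cover.
            issues.append((rec["id"], "outside expected area"))
    return issues
```

A latitude with a flipped sign, for example, is still a valid coordinate and is only caught by the second test, which is why both checks are kept.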
A number of pie charts show analyses by the following criteria: geographical, taxonomic, basis of record and hosting body. The pie chart views can be customised via drop-down menus.
Two extra charts give information about the year of collection and the country of collection. They offer more opportunities for data cleaning and finding outliers.
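Outliers in the year-of-collection chart are typically years that cannot be right. A minimal Python sketch of the same idea follows; the function names and the default lower bound of 1700 are assumptions chosen for illustration and should be adapted to the collection at hand.

```python
from collections import Counter
from datetime import date
from typing import Iterable, List, Optional


def year_histogram(years: Iterable[int]) -> Counter:
    """Count records per collection year, as a year chart would."""
    return Counter(years)


def implausible_years(years: Iterable[int], earliest: int = 1700,
                      latest: Optional[int] = None) -> List[int]:
    """Return the distinct years outside [earliest, latest], sorted.

    `latest` defaults to the current year; `earliest` is an assumed cut-off.
    """
    if latest is None:
        latest = date.today().year
    return sorted({y for y in years if y < earliest or y > latest})
```

Values such as 205 or 2999 usually point to typing mistakes rather than to real collection events.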
All the charts can be clicked to open a page with more precise details on the topic:
- It is still possible to analyse the data by other criteria.
- Record counts for each category are displayed.
- Export options are available (CSV, MS Excel, XML and PDF).
These statistics and graphics can be very useful for data publishers and administrators to easily prepare reports and summaries.
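The combination of record counts per category and a CSV export can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the IPT's export code: records are assumed to be dictionaries, the function name `counts_to_csv` is hypothetical, and the output layout (one row per category) is chosen for the example.

```python
import csv
import io
from collections import Counter
from typing import Dict, List


def counts_to_csv(records: List[Dict], field: str) -> str:
    """Count records per value of `field` and return the result as CSV text."""
    # Records without a value are grouped under an explicit label.
    counts = Counter(rec.get(field) or "(missing)" for rec in records)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([field, "records"])
    for value, n in counts.most_common():  # largest category first
        writer.writerow([value, n])
    return buf.getvalue()
```

Calling `counts_to_csv(records, "basisOfRecord")` would give a small table that can be pasted into a report, with the "(missing)" row showing how many records lack the field.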
ACCESSING CONTENTS
There are different ways to access the contents of the datasets:
- Browse the taxonomy and geography trees.
- Click on record counts in the charts.
- Use the search box.
Navigating through resources allows you to drill down to the requested information, down to the record level. Clicking on a taxon name or a count number leads to a summary page with the records matching that criterion.
The search result page contains:
- Information about the taxon concept or the geographical area (with links).
- An interactive map.
- Taxonomic and geographic trees.
- A list of individual resources displaying different details. From the list we can access the individual records.
The taxon concept summary page includes basic information about the taxon and the number of occurrences related to that taxon and branch.
The record page displays all the indexed fields for the record, plus an interactive map showing the location for those records that have coordinates.
DATA QUALITY TOOLS
From the very moment of publication, all the summary charts and analyses give data curators and publishers hints to identify typing mistakes, coordinate errors (misplaced points, outliers), incorrectly mapped fields and similar problems.
The ANNOTATIONS are a feedback and data quality control mechanism for both the administrators and the data managers. They can be produced:
- Automatically by the system: wrong data types, wrong references, etc.
- Manually by authenticated users: data quality, incomplete data (not yet implemented).
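An automatic annotation of the "wrong data type" kind can be pictured as a per-field check that leaves a note on the offending record. The Python sketch below is hypothetical throughout: the `Annotation` class, the `annotate` function and the field-to-check table are assumptions for illustration (the field names follow Darwin Core style), and none of it reflects the IPT's internal data model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Annotation:
    record_id: str
    field: str
    message: str


def is_float(value) -> bool:
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def is_int(value) -> bool:
    try:
        int(value)
        return True
    except (TypeError, ValueError):
        return False


# Assumed mapping of fields to type checks, chosen for this example.
CHECKS: Dict[str, Callable[[object], bool]] = {
    "decimalLatitude": is_float,
    "decimalLongitude": is_float,
    "year": is_int,
}


def annotate(records: List[dict]) -> List[Annotation]:
    """Return one annotation per field whose value fails its type check."""
    annotations = []
    for rec in records:
        for name, check in CHECKS.items():
            value = rec.get(name)
            # Absent fields are skipped; completeness is a separate question.
            if value is not None and not check(value):
                annotations.append(
                    Annotation(rec["id"], name, f"wrong data type: {value!r}")
                )
    return annotations
```

Keeping the annotation as a separate object tied to a record id, rather than altering the record, mirrors the idea of annotations as feedback for the data manager to act on.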
CONCLUSIONS
- The GBIF IPT provides a public web interface to the data that exposes the contents immediately in an interactive way.
- When new datasets are mapped to an IPT instance, a number of analyses are performed that reveal inconsistencies and data quality issues in the dataset.
- The results of these analyses can be used to easily produce dynamic reports and summaries of the data.
- The IPT includes an annotation system that allows the system and the users to provide feedback and data quality hints on specific records.
Alberto GONZÁLEZ-TALAVÁN
Programme Officer for Training
GBIF Secretariat
Universitetsparken 15
DK-2100 Copenhagen, Denmark
Tel: +45 3532 1483
Fax: +45 3532 1480
Email:
Web: