Published by Oswald Bruce. Modified over 9 years ago.
1
Google Refine for Data Quality / Integrity
2
Context
BioVeL Data Refinement Workflow
- Synonym Expansion / Occurrence Retrieval
- Data Selection
- Data Quality / Integrity
4
In Google’s Own Words
“Google Refine is a power tool for
- working with messy data,
- cleaning it up,
- transforming it from one format into another,
- extending it with web services,
- and linking it to databases”
5
In Google’s Own Words (continued)
... and it can be run in isolation
6
Installation
- Download the zip file from http://code.google.com/p/google-refine/wiki/Downloads
- Extract the file
- Run google-refine.exe
7
Features: Clustering / Grouping
Use case: group taxon names and merge similar groups
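To make the clustering idea concrete, here is a minimal Python sketch of key-collision grouping: names that reduce to the same normalised key are treated as one group. It is a simplified illustration and not Google Refine's own implementation; the function names and the sample taxon names are assumptions made for this example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Simplified key: strip accents and punctuation, lowercase, sort unique tokens."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase, then replace anything that is not a letter, digit or space.
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(sorted(set(text.split())))


def cluster_names(names):
    """Group names by key; keep only groups with more than one distinct spelling."""
    groups = defaultdict(set)
    for name in names:
        groups[fingerprint(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical taxon names with inconsistent case, spacing and punctuation.
names = ["Mytilus edulis", "mytilus  edulis", "Mytilus edulis.", "Crangon crangon"]
clusters = cluster_names(names)
```

Each resulting group is a candidate for merging into one agreed spelling; a person would normally review the groups before merging.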
8
Features: Filtering
Use case: filter out records whose data provider name does not contain ‘museum’, ‘university’ or ‘marine’
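The filtering use case can be sketched outside Refine as a case-insensitive keyword match on the provider name. This is only an illustration: the field name dataProvider and the sample rows are assumptions, not taken from the BioVeL data.

```python
KEYWORDS = ("museum", "university", "marine")


def keep_record(record, keywords=KEYWORDS):
    """True when the data provider name contains any keyword (case-insensitive)."""
    # A missing or empty provider name never matches.
    provider = (record.get("dataProvider") or "").lower()
    return any(word in provider for word in keywords)


# Hypothetical occurrence records.
records = [
    {"taxon": "Mytilus edulis", "dataProvider": "Natural History Museum"},
    {"taxon": "Mytilus edulis", "dataProvider": "Citizen Science Portal"},
    {"taxon": "Crangon crangon", "dataProvider": "Marine Biological Association"},
    {"taxon": "Crangon crangon", "dataProvider": None},
]
kept = [r for r in records if keep_record(r)]
```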
9
Features: Data Exclusion
Use case: exclude the records selected by a facet or filter
10
Features: Extending Data
Use case: add an ISO country code column
Use case: add column(s) by parsing the taxon name
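Both use cases amount to deriving new columns from existing ones. The Python sketch below shows the idea under stated assumptions: the column names (taxonName, country, countryCode) are hypothetical, the lookup table is a three-entry excerpt of ISO 3166-1 alpha-2 codes, and the name parser is a naive whitespace split rather than a real taxonomic name parser.

```python
# Tiny illustrative lookup; a real workflow would use a full ISO 3166-1 table.
ISO_CODES = {"united kingdom": "GB", "germany": "DE", "netherlands": "NL"}


def parse_taxon_name(name):
    """Naive split: first token = genus, second = species epithet, rest = authorship."""
    parts = name.split()
    return {
        "genus": parts[0] if parts else "",
        "species": parts[1] if len(parts) > 1 else "",
        "authorship": " ".join(parts[2:]),
    }


def extend_record(record):
    """Return a copy of the record with the derived columns added."""
    extended = dict(record)
    country = record.get("country", "").strip().lower()
    # Unknown countries get an empty code rather than a guess.
    extended["countryCode"] = ISO_CODES.get(country, "")
    extended.update(parse_taxon_name(record.get("taxonName", "")))
    return extended


row = {"taxonName": "Mytilus edulis Linnaeus, 1758", "country": "United Kingdom"}
extended = extend_record(row)
```

The naive split breaks on names with subgenera, hybrids or infraspecific ranks, which is one reason a dedicated name parser is worth having.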
11
Features: Reconciling Data
Use case: retrieve associated names from ‘WoRMS’
12
Features: Save / Replay User Actions
Use case: extract scientific names from name labels
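The save/replay idea is that user actions are stored as data and can be applied again to a new data set. The Python sketch below illustrates this with a made-up operation format; it is not Refine's own history export schema, and the operation names, column names and regular expression are assumptions chosen for the example.

```python
import json
import re

# Hypothetical operation format, not Refine's own export schema.
HISTORY_JSON = json.dumps([
    {"op": "trim", "column": "nameLabel"},
    {"op": "regex_extract", "column": "nameLabel", "new_column": "scientificName",
     "pattern": r"^([A-Z][a-z]+ [a-z]+)"},
])


def apply_operation(rows, operation):
    """Apply one recorded operation to every row, returning new rows."""
    result = []
    for row in rows:
        row = dict(row)
        value = row.get(operation["column"], "")
        if operation["op"] == "trim":
            row[operation["column"]] = value.strip()
        elif operation["op"] == "regex_extract":
            match = re.search(operation["pattern"], value)
            row[operation["new_column"]] = match.group(1) if match else ""
        else:
            raise ValueError(f"unknown operation: {operation['op']}")
        result.append(row)
    return result


def replay(rows, history_json):
    """Replay a saved history, in order, against a fresh data set."""
    for operation in json.loads(history_json):
        rows = apply_operation(rows, operation)
    return rows


# Hypothetical name labels.
rows = [{"nameLabel": "  Mytilus edulis Linnaeus, 1758 "}, {"nameLabel": "unidentified sp."}]
replayed = replay(rows, HISTORY_JSON)
```

Because the history is plain JSON, it can be saved to a file and shared, then replayed on another data set that has the same columns.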
13
Features: Build Extensions
Use case: BioVeL Extension
- interaction with Taverna
- add additional functionality specific to the BioVeL context (e.g. ECAT Name Parser)
14
Future Possibilities: remote server
Google Refine could be deployed as a remote server, with the possibility of using shared resources (extensions, data, action histories)
15
Future Possibilities: integration
Integration with existing applications, either as a module or using REST API calls
16
Future Possibilities: central application
A central application which can be used to run scripts, call web services and even interact with software applications
17
Thanks
Questions / Suggestions / Comments