Data Management for Geoinformatics A short course on good data management for taught postgraduate students in geoinformatics and related data sciences.


Data Management for Geoinformatics A short course on good data management for taught postgraduate students in geoinformatics and related data sciences. John Murtagh, UEL

Data Collection

Data sources

1. Finding data: searching for and locating data that has already been released.
2. Getting hold of more data: asking for ‘new’ data from official sources, e.g. through Freedom of Information requests.
3. Collecting data yourself: gathering data and entering it into a database or a spreadsheet, whether you work alone or collaboratively.
Sometimes data is public on a website but there is no download link to get hold of it in bulk. Don’t give up! This data can be liberated with what data wranglers call scraping. More on scraping later…
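To make the idea of scraping concrete, here is a minimal sketch that pulls the cells out of an HTML table and writes them as CSV, using only the Python standard library. The sample HTML, the station names and the helper names (TableScraper, table_to_csv) are invented for illustration; a real page would first have to be downloaded, and its terms of use checked, before anything like this is run against it.

```python
import csv
import io
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    scraper = TableScraper()
    scraper.feed(html)
    scraper.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(scraper.rows)
    return out.getvalue()


# Hypothetical page fragment standing in for a real website.
SAMPLE_HTML = """
<table>
  <tr><th>Station</th><th>Rainfall (mm)</th></tr>
  <tr><td>Station A</td><td>12.5</td></tr>
  <tr><td>Station B</td><td>7.0</td></tr>
</table>
"""

print(table_to_csv(SAMPLE_HTML))
```

The resulting CSV text can be saved to a file and opened in a spreadsheet or loaded into a database.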

Finding already released data or…. open data

Open Data – a definition “A piece of data is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.”

Where can I find open data?

Lots of places!

ScraperWiki …is an online tool to make the process of extracting "useful bits of data easier so they can be reused in other apps, or rummaged through by journalists and researchers." Most of the scrapers and their databases are public and can be re-used.

Geospatial data… A group for open geospatial data with an emphasis on use in teaching and research.

Data.ac.uk “a landmark site for academia providing a single point of contact for linked open data development.” It not only provides access to the know-how and tools to discuss and create linked data and data aggregation sites, but also enables access to, and the creation of, large aggregated data sets providing powerful and flexible collections of information.

Other developments

A number of startups are emerging that aim to build communities around data sharing and resale. These include Buzzdata and Figshare (places to share and collaborate on private and public datasets) and data shops such as Infochimps and DataMarket.
DataCouch is a place to upload, refine, share and visualize your data.
The World Bank and United Nations data portals provide high-level indicators for all countries, often for many years in the past.

An interesting Google subsidiary, Freebase, provides "an entity graph of people, places and things, built by a community that loves open data."
Research data: there are numerous national and disciplinary aggregators of research data, such as the UK Data Archive. While there will be lots of data that is free at the point of access, there will also be much data that requires a subscription, or which cannot be reused or redistributed without asking permission first.

While they may not always be easy to find, many databases on the web are indexed by search engines, whether the publisher intended this or not. Here are a few tips:

Tips for searching for data (from the Data Journalism Handbook) When searching for data, make sure that you include both search terms relating to the content of the data you’re trying to find as well as some information on the format or source that you would expect it to be in. Google and other search engines allow you to search by file type.

For example, you can look only for:
Spreadsheets (by appending ‘filetype:XLS filetype:CSV’ to your search)
Geodata (‘filetype:shp’)
Database extracts (‘filetype:MDB, filetype:SQL, filetype:DB’)
PDFs (‘filetype:pdf’)

You can also search by part of a URL. Googling for ‘inurl:downloads filetype:xls’ will try to find all Excel files that have “downloads” in their web address. Another popular trick is not to search for content directly, but for places where bulk data may be available. If you find a single download, it’s often worth checking what other results exist for the same folder on the web server. You can also limit your search to only those results on a single domain name, by searching for, e.g., ‘site:agency.gov’. For example, ‘site:agency.gov Directory Listing’ may give you some listings generated by the web server with easy access to raw files, while ‘site:agency.gov Database Download’ will look for intentionally created listings.
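The search tips above can be turned into reusable query strings. The following is a small sketch, not part of the original course material: the helper names build_query and queries_by_filetype are made up, and the code only assembles text to paste into a search engine; it does not contact any search service.

```python
def build_query(terms="", filetype=None, site=None, inurl=None):
    """Assemble a search string from plain terms and optional operators."""
    parts = []
    if site:
        parts.append(f"site:{site}")        # restrict to one domain
    if inurl:
        parts.append(f"inurl:{inurl}")      # require text in the web address
    if terms:
        parts.append(terms)                 # what the data is about
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict to one file format
    return " ".join(parts)


def queries_by_filetype(terms, filetypes, site=None):
    """Build one query per file type, so each format can be searched in turn."""
    return [build_query(terms, filetype=ft, site=site) for ft in filetypes]


print(build_query(filetype="xls", inurl="downloads"))
print(build_query("Directory Listing", site="agency.gov"))
for query in queries_by_filetype("rainfall", ["xls", "csv", "shp"]):
    print(query)
```

Generating one query per file type keeps each search simple and avoids relying on how a particular search engine combines several operators.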

Getting hold of new data

Freedom of Information (FoI) and Environmental Information Regulations (EIR) legislation provide the public with a right to access information (including research data) held by a UK public authority, which includes most universities, colleges and publicly funded research institutions. The information requested must be provided unless an exemption or exception allows the institution not to disclose it. A request can be addressed to anyone in the University, and the institution has only 20 working days to respond.
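As a rough illustration of the 20-working-day limit, the sketch below counts forward from the date a request is received. It assumes that working days are Monday to Friday, that counting starts on the day after receipt, and that any public holidays are supplied by the caller; these assumptions and the function name response_deadline are illustrative and should be checked against official guidance.

```python
from datetime import date, timedelta


def response_deadline(received, working_days=20, holidays=frozenset()):
    """Return the date that falls `working_days` working days after `received`.

    Assumption: weekends and any dates in `holidays` do not count.
    """
    current = received
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


# A request received on Monday 2 March 2015, with no holidays supplied.
print(response_deadline(date(2015, 3, 2)))  # 2015-03-30
```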

You can make an FOI request through the website whatdotheyknow.com.

Statistics and Registration Service Act 2007
The Act is mainly concerned with the UK Statistics Authority and applies only to data designated as Official Statistics. It defines how 'personal information' can be disclosed to an 'Approved Researcher', i.e. an individual to whom the Statistics Authority has granted access, for the purposes of statistical research, to personal information held by it. Although the Act does not apply to individual researchers managing confidential research data not designated as Official Statistics, such researchers might wish to adapt the Approved Researcher model for access to confidential data.

Environmental Information Regulations 2004
These Regulations give the public access rights to environmental information held by a public authority (including universities) in response to requests (similar to the Freedom of Information Act). Freedom of access does not imply free access. There are circumstances under which requests may or must be refused, for example if the data contain personal information.

Collecting data yourself – This means gathering data and entering it into a database or a spreadsheet – whether you work alone or collaboratively.

Getting data in the format you need
Finding more data using Google
You can search for CSV files on Google by typing +filetype:csv in the search bar. Searching for "South Africa +filetype:csv" will return CSV files mentioning South Africa. You can try other file types as well (such as "xls" for Excel spreadsheets or "pdf").

Permissions and Licensing data

Licensing of Open Data - reuse
Public domain dedications, which also serve as maximally permissive licenses: there are no conditions put upon using the work.
Permissive or attribution-only licenses: giving credit is the only substantial condition.
Copyleft, reciprocal, or share-alike licenses: these also require that modified works, if published, be shared under the same license.
As defined by the Open Knowledge Foundation.

Using data

Ethics of carrying out research with data
The University's Research Ethics Committee (UREC) has specific responsibility for institutional oversight of matters relating to ethics and governance in research, undertaken by both staff and postgraduate research students, that involves human participation, personal or sensitive data, or human material. Further information is available from the Quality Assurance and Enhancement officer.

Collecting personal or sensitive data
The Data Protection Act (1998) applies to personal or sensitive personal data, but not to research data in general, nor to anonymised data or to data about participants who are no longer living.

Questions to ask yourself in your data creation:
1. Is personal data needed? Names and addresses, for example? Store this data, if required, separately.
2. Inform your participants about the use of personal data.
3. Not all research data obtained from participants constitute personal data, for example if the data are anonymised.
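One way to store personal data separately is to split each record into a research table and a key table linked by a random identifier. The sketch below is an illustration under assumptions, not a prescribed method: the field names, the sample record and the function split_personal_data are hypothetical. Data split this way is pseudonymised, not anonymised, for as long as the key table exists, so the key table needs to be kept securely and apart from the research data.

```python
import secrets


def split_personal_data(records, personal_fields):
    """Split records into (research_rows, key_rows) linked by a random ID.

    research_rows hold the non-identifying fields; key_rows hold the
    identifying fields and should be stored separately and securely.
    """
    research_rows = []
    key_rows = []
    for record in records:
        participant_id = secrets.token_hex(8)  # random, not derived from the data
        key_row = {"participant_id": participant_id}
        research_row = {"participant_id": participant_id}
        for field, value in record.items():
            if field in personal_fields:
                key_row[field] = value
            else:
                research_row[field] = value
        key_rows.append(key_row)
        research_rows.append(research_row)
    return research_rows, key_rows


# Hypothetical survey responses, invented for this example.
responses = [
    {"name": "Participant One", "address": "1 Example Street", "commute_km": 4.2},
    {"name": "Participant Two", "address": "2 Example Street", "commute_km": 11.0},
]

research, key = split_personal_data(responses, {"name", "address"})
print(sorted(research[0].keys()))  # ['commute_km', 'participant_id']
```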

Lastly: To gain access to police data or records you may be subject to a Disclosure or a DBS check (previously CRB check), which provides details of any criminal record data held on you.

Other sessions as part of Data Management in Geoinformatics: Data Integration, Data Management, Data Sharing.
Data Management for Geoinformatics by John Murtagh, as part of the Jisc-funded project TraD (University of East London), is licensed under a Creative Commons Attribution Share Alike Licence.