How is data generated?


1 How is data generated?
How do researchers get data?
• Recap ‘What is data’
• Methods of data collection, setting down, storage
• Varies across disciplines
• Sliding scale of accessibility and formality
• How-to guide: example cheat sheet

2 Defining research data
Examples:
• Numbers
• Words/texts
• Survey results
• Interviews
• Machine readings
• Voice recordings
• Voice transcripts
• Images
• Video
• Sound
• Artifacts
• Specimens
• Samples (medical, paleo, geo, …)
• …
Collection method: → Quantitative → Qualitative (& quant) → Quant/Qual → Quantitative → Qualitative → ? → Qualitative → ?
Other terms: Observational? Mixed methods? Secondary? Case study? Cross-sectional? Longitudinal? ‘Big data’

3 Defining research data
Different data formats (should be documented!)
• Raw
• Transcribed
• Converted (in format, by analysis)
• Derived (e.g., confidentialised, de-sensitised)
• Physical or digitised
• Single, multiple, combined datasets
• Same ‘research input’ may have multiple data outputs (e.g., ancient/historical scripture: image, digital image, transcription, interpretation)

4 Common features of data
• ‘Building blocks of information’
• As information varies with discipline, so do the main kinds of data and methods of collection
  http://www.dcc.ac.uk/sites/default/files/documents/publications/DCC_Howto_Discover_Requirements.pdf
• E.g., Medical science: bloods + readings = disease presence
• E.g., Anthropology: recorded interviews + observations = cultural practices

5 What is data, recap
• Formats: can be physical/analog (e.g., paper) or digital (e.g., papyrology can be both)
• Original or transcribed/described/representative
• Methodology: cross-sectional vs. longitudinal, survey vs. administrative
• Can be created by and for a range of people and services
Data questions?

6 How and where data is stored
• Data storage vs. metadata
• Continuums of data storage
• *Does not necessarily relate to accessibility
Formal (conventions around capture, vocab): stores/repositories
Informal (much variability): individual researcher
Screenshot from: ada.edu.au [Accessed 28/04/2014].

7 Who manages stored data?
• Cultural institutions
• Researchers
• On institutional file storage networks or portable media
• Captured by third parties: storage or social media service providers, e.g. Dropbox or Flickr, Figshare, or data repositories, e.g. Australian Data Archive (NCI, RDSI), VicNode (RDSI)
• More examples of databases/repositories after lunch

8 Continuums of metadata storage
Formal: registries/commons
Informal: project website
Screenshot from: researchdata.ands.org.au [Accessed 28/04/2014].
Screenshot from: rsha.anu.edu.au [Accessed 28/04/2014].

9 Accessibility & quality of metadata and data don’t align
Public-Public (Open access)
• Metadata is fully discoverable
• Data are accessible and immediately downloadable
• Preferred option for non-sensitive data from completed projects
Public-Private (Mediated open access)
• Metadata is fully discoverable
• Mediated access to data via data custodian
• Good option for sensitive or confidential data
Private-Private (Closed access)
• Metadata is not publicly available
• Data not discoverable or available to third parties
• Safest option for highly-sensitive data
http://libguides.library.curtin.edu.au/
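The three access tiers on this slide can be written down as a small lookup structure. This is only an illustrative sketch: the class name `AccessTier`, the dictionary keys and the wording of the returned text are assumptions made for the example, not part of the Curtin guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessTier:
    """One access tier; all field names here are hypothetical."""
    name: str
    metadata_public: bool  # can a third party discover the description?
    data_access: str       # "open", "mediated" or "closed"


# The three tiers named on the slide.
TIERS = {
    "public-public": AccessTier("Open access", True, "open"),
    "public-private": AccessTier("Mediated open access", True, "mediated"),
    "private-private": AccessTier("Closed access", False, "closed"),
}


def describe(tier_key: str) -> str:
    """Summarise what a third party can see and do for a given tier."""
    tier = TIERS[tier_key]
    found = "discoverable" if tier.metadata_public else "not discoverable"
    how = {
        "open": "download directly",
        "mediated": "request access from the data custodian",
        "closed": "not available to third parties",
    }[tier.data_access]
    return f"{tier.name}: metadata {found}; data: {how}"


print(describe("public-private"))
```

Keeping metadata visibility and data access as separate fields mirrors the slide's point that the two do not have to align.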


11 Accessing data
When this might be harder: sharing and accessing sensitive data

12 Getting data
How do people find/discover data?
• Movable feast / changing beast
• No established methods like other scholarly outputs
• No standard practice or vocab
• Databases are non-exhaustive
• Methods for searching and terms are driven by why people are looking (e.g., may start with direct contact from a project website) and by subject matter as well as methodology, accessibility etc.

13 Finding data
1. Have you already identified the data, or are you exploring?
2. Search formal databases (public/private mix):
   • Research Data Australia (RDA), Australian Bureau of Statistics (ABS), Australian Data Archive (ADA), Figshare, Trove
   • data.gov.au, data.gov, data.gov.uk
   • http://databib.org/index.php
   • Think about search terms by data topics AND characteristics
3. Informal searching:
   • ‘Googling’
   • From publications
   • Peer networks
   • Cold calling
Why metadata counts!
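Searching by data topics AND characteristics can be sketched as a filter over metadata records. This is a minimal illustration, not the search interface of any real repository: the record fields (`keywords`, `method`, `access`, `holder`), their values and the second record are all hypothetical, and the first record is only loosely modelled on the survey named in the case study.

```python
def search(records, topic_terms, characteristics):
    """Return records matching ANY topic term AND ALL requested characteristics."""
    hits = []
    for rec in records:
        # Topic match: look for each term in the title and keywords.
        text = (rec["title"] + " " + " ".join(rec["keywords"])).lower()
        topic_ok = any(term.lower() in text for term in topic_terms)
        # Characteristic match: every requested field must equal the given value.
        traits_ok = all(rec.get(k) == v for k, v in characteristics.items())
        if topic_ok and traits_ok:
            hits.append(rec)
    return hits


# Illustrative metadata records (field names and values are assumptions).
records = [
    {"title": "National Survey of Mental Health and Wellbeing: "
              "Child and Adolescent Component",
     "keywords": ["mental health", "children", "adolescents"],
     "year": 1998, "method": "survey", "access": "mediated", "holder": "ADA"},
    {"title": "Hypothetical rainfall sensor readings",
     "keywords": ["climate"],
     "year": 2012, "method": "machine readings", "access": "open",
     "holder": "example repository"},
]

hits = search(records, ["mental health"], {"method": "survey"})
for rec in hits:
    print(rec["title"], "held by", rec["holder"])
```

The filter can only match what the metadata records contain, which is one way to read "Why metadata counts!".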

14 Case study
• Student approaches ANU library staff to access the Child and Adolescent Component (1998) of the National Survey of Mental Health and Wellbeing after reading a study that uses the data
• Google locates researcher in WA…
• …who says data is in Australian Data Archive… in Canberra
• (but have to know to look there! Not found via Google search)
• Link to request permission for licence (once registered with ADA)

15 Accessing data
• So you’ve found an interesting dataset. How do you GET it?
• Repository catalogue entries (derived from metadata) will typically provide info about how to obtain the data
• …or at least a contact…
• Access varies depending on access policy of the owner
Open Access (public/public): download from website
Conditional/Mediated (public/sort-of private): may need to pay fee and/or sign contract
No access: highly sensitive data (e.g., not de-identified medical records)
Why metadata counts!

16 Conditional or mediated access to data
May be held by:
• Custodian of data
• Login or approval required (e.g., ADA)
• Licensed = reuse is (legally) conditional
• AusGOAL
• Organisational licences (or repository or data manager)
What is a licence?

17 AusGOAL licences
• Australian Government Open Access and Licensing Framework
• Ready-made licences with legal surety. Endorsed by CAUL.
• Apply least restrictive
• 6 levels of Creative Commons licence
• Least restrictive = CC BY (default licence for Aust Govt)
• Most restrictive = CC BY-NC-ND
• Restricted Licence (template): for data that contains personal or other confidential information
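The "apply least restrictive" rule can be sketched as a small selection function. This is an illustration under stated assumptions, not AusGOAL's own procedure: the function name `suggest_licence` is hypothetical, and counting extra conditions (NC, ND, SA) is used here as a simple stand-in for "restrictiveness".

```python
# Conditions each Creative Commons licence adds on top of attribution (BY).
CC_LICENCES = {
    "CC BY": frozenset(),
    "CC BY-SA": frozenset({"SA"}),
    "CC BY-NC": frozenset({"NC"}),
    "CC BY-ND": frozenset({"ND"}),
    "CC BY-NC-SA": frozenset({"NC", "SA"}),
    "CC BY-NC-ND": frozenset({"NC", "ND"}),
}


def suggest_licence(required=frozenset(), has_confidential_info=False):
    """Pick the licence with the fewest extra conditions that covers `required`.

    Assumption: fewer conditions means less restrictive. Data with personal
    or other confidential information goes to the restricted template.
    """
    if has_confidential_info:
        return "Restricted Licence (template)"
    required = frozenset(required)
    candidates = [name for name, conds in CC_LICENCES.items()
                  if required <= conds]
    if not candidates:
        # For example SA and ND together: no CC licence combines them.
        raise ValueError(f"no CC licence carries all of {sorted(required)}")
    return min(candidates, key=lambda name: len(CC_LICENCES[name]))


print(suggest_licence())                            # no extra conditions needed
print(suggest_licence({"NC"}))                      # non-commercial reuse only
print(suggest_licence(has_confidential_info=True))  # personal/confidential data
```

Starting from no extra conditions and adding them only when required is what makes CC BY the default outcome.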

18 Sensitive data
• Sensitive data is data that can be used to identify an individual or object and so place them at risk of discrimination, harm or unwanted attention
• Invokes law (Privacy Act) and research ethics
• Examples:
   • Survey data including names and criminal records
   • Hospital records
   • Location of endangered species
• * sensitive by context

19 Can sensitive data be shared?
• Typically, yes!
• But how? When?
• When consent is explicitly given, and/or
• When data is de-sensitised (‘de-identified’)
• When data is modified
• When an appropriate licence is applied
• Different issues when data is new vs. existing
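One common way to de-sensitise a survey record is to drop direct identifiers and replace date of birth with age in whole years. The sketch below shows that idea only; the field names (`name`, `school`, `dob`, `activity_hours`) and the sample record are hypothetical, and removing these fields does not by itself guarantee that a person cannot be re-identified from the remaining data.

```python
from datetime import date

# Hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "school"}


def age_on(dob: date, on: date) -> int:
    """Age in completed years on a given date."""
    years = on.year - dob.year
    if (on.month, on.day) < (dob.month, dob.day):
        years -= 1  # birthday has not yet occurred this year
    return years


def de_identify(record: dict, survey_date: date) -> dict:
    """Drop direct identifiers and coarsen date of birth to age in years."""
    out = {k: v for k, v in record.items()
           if k not in DIRECT_IDENTIFIERS and k != "dob"}
    if "dob" in record:
        out["age_years"] = age_on(record["dob"], survey_date)
    return out


# Made-up record for illustration only.
record = {
    "name": "Example Person",
    "dob": date(1996, 7, 15),
    "school": "Example High",
    "activity_hours": 5,
}
clean = de_identify(record, date(2014, 4, 28))
print(clean)
```

Whether the result is safe to share still depends on context, consent and the licence applied, as the slide lists.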

20 Stay tuned…
ANDS Guide to Sharing Sensitive Data Safely is on the way

21 Case Study
A group of researchers at University of Timbuktoo were interested in the links between mental health, activity, and internet use in young people. They surveyed 986 young people aged 16-20 years. The survey asked about their age (DOB), school, physical and mental health, eating habits, physical activity, computer/internet use, educational achievement, family structure and parents’ cultural background. Paper surveys were used and then destroyed when the data was entered into an electronic database. The researchers would like to make their data available to other researchers, particularly to form new collaborations and link with similar datasets on young people.

22
1. Is the data sensitive?
2. Barriers to sharing/publishing
3. What can be done now towards sharing?

23
Barriers/issues | Solutions | To look into
1.
2.
3.
…

