Amanda Whitmire Maura Valentino OSU Libraries OPP Workshop Series 5 December 2012.


1 Amanda Whitmire Maura Valentino OSU Libraries OPP Workshop Series 5 December 2012

2 Why is a Librarian asking? We are curious. We manage information. Data are a kind of information.

3 TAKING CARE OF YOUR DATA What’s your plan?

4 GOAL: Achievable habits for implementing data management best practices into your workflow

5 Research data is: “…the recorded factual material commonly accepted in the scientific community as necessary to validate research findings.” U.S. Office of Management and Budget, Circular A-110

6 Data curation is: “…management activities required to maintain research data long-term such that it is available for reuse and preservation.” Wikipedia CURATION ≠ ARCHIVAL

7 “It is obvious that making data widely available is an essential element of scientific research.” Science editorial, “Making Data Maximally Available,” 11 Feb 2011

8 The case for data management stewardship curation etc. $

9 Common missteps “Why can’t I open this WordPerfect document?” “I think those data are on a ZipDisk somewhere…” “Oh, that dataset is on our group server…” “I never actually gave my advisor the final dataset…” “My laptop got stolen, so I lost the data…” “It was so long ago, I can’t remember …”

10 Research data lifecycle New research question posed Research planning & design Data collection & description Data processing & analysis Dissemination & publication of findings Data archiving Accessible data located Data transformed / repurposed Research Cycle

11 How can we help? New research question posed Research planning & design Data collection & description Data processing & analysis Dissemination & publication of findings Data archiving Accessible data located Data transformed / repurposed Research Cycle

12 Where to start? Make a plan. Consider: how much data, resources needed, roles & responsibilities, metadata, data formats, data storage, ethics & consent, copyright (open data), sharing.

13 A few tidbits

14 Data storage & curation Anticipate: Volume/File type(s) Raw data vs. processed/analyzed data File Naming Conventions Privacy Concerns Storage practice Backup plans (LOCKSS, checksums)
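The "checksums" item in the backup plan can be illustrated with a short sketch. It assumes SHA-256 as the hash and uses hypothetical helper names (`file_checksum`, `backup_matches`); the slides do not prescribe either.

```python
import hashlib


def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original, backup):
    """Return True if both files have the same checksum (assumed helper)."""
    return file_checksum(original) == file_checksum(backup)
```

Recording the checksum next to the raw data when it is first stored, and recomputing it after each copy, is one way to notice a corrupted or incomplete backup.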

15 File naming conventions
1. Be consistent. Have conventions for naming: (1) Directory structure (2) Folder names (3) File names. Always include the same information (e.g. date and time). Retain the order of information (e.g. YYYYMMDD, not MMDDYYYY).
2. Be descriptive. Try to keep file and folder names under 32 characters.
Example: Project_instrument_location_YYYYMMDDhhmmss_extra.ext
SG157_20100426_001.raw (raw data)
SG157_20100426_001.mat (working data)
ESPOMZ_SG157_20100426_001.txt (shareable)
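The naming pattern on this slide can be sketched as a small helper. This is a minimal illustration, not a prescribed tool: the function names `build_filename` and `is_short_enough` are hypothetical, and empty parts are simply skipped.

```python
from datetime import datetime


def build_filename(project, instrument, location, timestamp, extra, ext):
    """Join the parts in a fixed order, following the slide's pattern:
    Project_instrument_location_YYYYMMDDhhmmss_extra.ext
    """
    stamp = timestamp.strftime("%Y%m%d%H%M%S")  # year first, so names sort by date
    parts = [project, instrument, location, stamp, extra]
    # Skip empty parts so a missing field does not leave a double underscore.
    return "_".join(part for part in parts if part) + "." + ext


def is_short_enough(name, limit=32):
    """Check the slide's guideline of keeping names under 32 characters."""
    return len(name) < limit
```

Because the date is written year first, an alphabetical listing of such files is also a chronological one, which is the practical reason for preferring YYYYMMDD over MMDDYYYY.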

16 Legal and ethical considerations Intellectual property Office for Commercialization & Corporate Development (OCCD) Copyright Licensing Charging for data? Data attribution & citation Human subjects? Informed consent & anonymization prior to publishing Resources @ OSU: Office of Research Integrity, Institutional Review Board (IRB), Responsible Conduct of Research (RCR) Program

17 Archiving and preservation Policies Preservation options Types of repositories Costs and benefits

18 A word about backups… University of Southampton School of Electronics & Computer Science, Southampton, UK, 2005

19 Metadata “The metadata accompanying your data should be written for a user 20 years into the future -- what does that person need to know to use your data properly? Prepare the metadata for a user who is unfamiliar with your project, methods, or observations.” Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics (ORNL DAAC)

20 What is Metadata? Metadata is “data about data” WHO created the data? WHAT is the content of the data? WHEN were the data created? WHERE is it geographically? HOW were the data developed? WHY were the data developed?
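The six questions above map naturally onto a small record stored alongside a data file. The sketch below is one possible shape, assuming a JSON "sidecar" file; the class name, field names, and all example values are hypothetical and do not follow any particular metadata standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class MetadataRecord:
    """One field per question on the slide (hypothetical structure)."""
    who: str    # WHO created the data?
    what: str   # WHAT is the content of the data?
    when: str   # WHEN were the data created?
    where: str  # WHERE is it geographically?
    how: str    # HOW were the data developed?
    why: str    # WHY were the data developed?


def to_sidecar_json(record):
    """Serialize the record as JSON text to save next to the data file."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Placeholder values for illustration only.
example = MetadataRecord(
    who="A. Researcher",
    what="Hourly air temperature readings",
    when="2010-02-12",
    where="Anchorage, Alaska",
    how="Fixed sensor, logged automatically",
    why="Example dataset for a workshop",
)
```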

21 Metadata schemes Dublin Core (DC), Darwin Core (DwC), EML, DDI, NBII, FGDC/CSDGM, ISO 19139, ISO 19115, DIF, LDIF, e-GMS, AGLS, METS, MODS, PREMIS, OAI-PMH, MARC, CDWA, CIDOC/CRM, DACS, DIG35, GILS, GML, ISBD, LCSH, KML, MARCXML, MEI, MODS, MIX, OAIS, ANSI/NISO Z39.88, PB Core, PRISM, QDC, RDF, SGML, VSO, XML, XMP

22 Metadata schemes “Metadata schemes are like toothbrushes – everybody agrees that you should use one, but nobody wants to use someone else’s.”

23 You already use metadata… -23 87 48

24 Metadata in use
State | City | Location | Date | Time | Temperature (F)
Alaska | Anchorage | City Hall | 2/12/2010 | 1400 | -23
Florida | Miami | Weather Center | 2/12/2010 | 1400 | 87
New York | | Empire State Building | 2/12/2010 | 1400 | 48
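The three bare numbers on slide 23 (-23, 87, 48) only become usable once column names and context are attached. A minimal sketch of that idea follows; the CSV layout, the column names, and the helper names are assumptions, and only a subset of the slide's columns is used.

```python
import csv
import io

# Values transcribed from the slide; the CSV layout itself is an assumption.
TABLE = """state,location,date,time,temperature_f
Alaska,City Hall,2/12/2010,1400,-23
Florida,Weather Center,2/12/2010,1400,87
New York,Empire State Building,2/12/2010,1400,48
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def describe(row):
    """Turn one row into a sentence that a bare number could not provide."""
    return (
        f"{row['temperature_f']} degrees F at {row['location']}, "
        f"{row['state']} on {row['date']} at {row['time']}"
    )
```

The header row is the metadata here: without it, `-23` could be a temperature, a depth, or an error code.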

25 Metadata in real life You use it all the time…

26 Major metadata standards
Darwin Core | biological diversity, taxonomy
Dublin Core | general
DDI (Data Documentation Initiative) | social and behavioral sciences data
DIF (Directory Interchange Format) | environmental sciences
EML (Ecological Metadata Language) | ecology
FGDC/CSDGM (Federal Geographic Data Committee/Content Standard for Digital Geospatial Metadata) | geographic data
NBII (National Biological Information Infrastructure) | biology
http://sbc.lternet.edu/cgi-bin/showDataset.cgi?docid=knb-lter-sbc.10

27 Metadata activity! Take it away, Maura…

28 Let’s Describe this Dataset Bright orange Garibaldi fish Hypsypops rubicundus California, USA Ornate Butterfly fish Chaetodon ornatissimus Indo-Pacific

29 Scenario 1 Research for preschoolers to see if they learn colors and patterns better from real life examples

30 Scenario 2 Research on what fish are local to a particular area. The photos are the data

31 Scenario 3 Research into specific details of specific types of fish

32 File/Folder Organization You have monitors attached to 18 athletes (6 tennis players, 6 golfers, 6 rowers) for 7 days. Each day you get 2 readouts for each athlete, 1 for heart rate and 1 for body temperature. You transfer the data to Excel. Name and organize the files for this experiment.
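One possible answer to this exercise, offered as a sketch rather than the workshop's solution: a folder per sport, a folder per athlete, and one file per athlete, measurement, and day. The athlete IDs, measurement codes, `.xlsx` extension, and start date are all assumptions made for illustration.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath

SPORTS = ["tennis", "golf", "rowing"]   # 6 athletes in each group
ATHLETES_PER_SPORT = 6
DAYS = 7
MEASUREMENTS = ["hr", "temp"]           # heart rate, body temperature


def planned_paths(start_date):
    """List every expected file as sport/athlete/athlete_measure_YYYYMMDD.xlsx."""
    paths = []
    for sport in SPORTS:
        for number in range(1, ATHLETES_PER_SPORT + 1):
            athlete_id = f"{sport}{number:02d}"  # zero-padded so IDs sort correctly
            for offset in range(DAYS):
                day = start_date + timedelta(days=offset)
                for measure in MEASUREMENTS:
                    name = f"{athlete_id}_{measure}_{day:%Y%m%d}.xlsx"
                    paths.append(PurePosixPath(sport) / athlete_id / name)
    return paths
```

Generating the list up front gives a checklist of the 18 × 7 × 2 = 252 files the experiment should produce, so a missing readout is easy to spot.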

33 Think about your own data
– What types of data need to be described?
– What are the relationships between them?
– What descriptive metadata can you find?
– What metadata is being captured automatically?
– What other descriptive metadata do you need to help users find your data?
– What metadata do you need to help other scientists reproduce your data or use it for comparison?
– What events have the data undergone, and what events will they undergo?
– For how long do you want to retain the data?
– How intensive are your preservation needs?
– How diverse is your user base? Does this influence your preservation needs?

34 Data Management Plans

35 The types of data Data & metadata standards | format and content Policies for access and sharing Policies and provisions for re-use Plans for archiving data {Budget} $ $ $

36 Use available resources http://www.dataone.org/data-management-planning https://dmp.cdlib.org/

37 Contact information Amanda Whitmire | Data Management Specialist amanda.whitmire@oregonstate.edu Maura Valentino | Metadata Librarian maura.valentino@oregonstate.edu

38 fin

