Published by Laura Fields. Modified over 8 years ago.
DATA MANAGEMENT: WHAT IT MEANS FOR YOUR RESEARCH
Maggie Howell
02 June 2016

Data Management: what is it?

It covers more than you think…
Any data standards and regulations
Data management planning
Data storage & retrieval
Data transfer
Data science
–Analytics
–Visualization
–Machine learning
–High performance computing
–Data mining
Governance & stewardship*
*What is a data steward? Who is a data steward?

Why does it matter?
Research 101: “reproducibility”
Decreased time and effort spent on menial/routine tasks
Cost (employees and facilities)
Contextualized information makes research more informed and encourages innovation
It’s professional
Open access
The exponential growth of information (the “information explosion” and “Big Data”)
–NASA’s amount of data: think about how much is generated from one test alone
It’s more secure
Our most important resources: 1) Workforce; 2) Data.
–What does NASA produce and work with every day…?
Bottom line: we all need to be able to 1) find the thing we’re looking for; 2) understand it; and 3) use it.

When you don’t practice data management…
Data are hard to find
Data are inaccessible, due to permissions or, more likely, because they are not in a place you can actually retrieve them from
Data are hard to understand (no context)
Data are in “difficult” formats
Data are inconsistent; perhaps no naming convention has been applied
Data are organized seemingly incoherently
…am I missing anything?
To demonstrate and solve these problems, we need the support of researchers and concrete examples of serious problems.

The current situation at Glenn:
Unregistered, unprotected personal devices
Little to no data management planning
Ad-hoc versioning
Disparate data
Data with no context
Access and sharing difficulties
Some case studies (problems and solutions):
–From the Materials division
–From the Acoustics branch

Why is this happening and why doesn’t it get better?
A study from Blazent, Inc.

The current situation elsewhere:
The good
Data management investment
Data management requirements
–NSF, NIH, other grants
–Contracts
The bad/ugly
Sony, Target, Health Net: legal consequences
Time, money, resources wasted

A top NSF priority
“Big Ideas,” by Director France Córdova.
Document here: http://www.sciencemag.org/sites/default/files/documents/Big%20Ideas%20compiled.pdf

Efforts up to this point:
NIAM
iRODS
Individual divisions’ (and simply individual employees’) efforts
–Materials, Granta

HOW CAN THIS MAKE YOUR LIFE EASIER?
What do you do now?

Data Management Planning:
Sit down with your team before a project (if you’re past that, then during), and make some decisions about:
–Organization & naming;
–Data types & formats;
–Access permissions;
–Versioning & replicas;
–Metadata & having an index;
–Disposition;
–Storage.
This may sound tedious, but it will save you and your team a great deal of work (which could add up to weeks of time previously lost), guaranteed. You can only gain from this.

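Several of the planning decisions above (naming, versioning, access permissions, and a metadata index) can be written down as a small script so the whole team applies them the same way. The following is only a minimal sketch in Python. The naming pattern (project_description_date_version), the `DataFileRecord` fields, and the access labels are all hypothetical choices made for illustration; they do not come from NASA or NIAM guidance, and a real team would agree on its own.

```python
import csv
import re
import sys
from dataclasses import asdict, dataclass, fields
from datetime import date


def slug(text: str) -> str:
    """Lowercase the text and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


@dataclass
class DataFileRecord:
    """One row of a (hypothetical) team metadata index."""
    project: str
    description: str
    created: date
    version: int
    extension: str
    access: str  # hypothetical labels, e.g. "team", "division", "public"

    def filename(self) -> str:
        # Assumed convention: project_description_YYYYMMDD_vNN.ext
        return (
            f"{slug(self.project)}_{slug(self.description)}_"
            f"{self.created:%Y%m%d}_v{self.version:02d}.{self.extension}"
        )


def write_index(records, handle) -> None:
    """Write the records as a CSV index, one line per data file."""
    columns = ["filename"] + [f.name for f in fields(DataFileRecord)]
    writer = csv.DictWriter(handle, fieldnames=columns)
    writer.writeheader()
    for record in records:
        row = asdict(record)
        row["filename"] = record.filename()
        writer.writerow(row)


if __name__ == "__main__":
    example = DataFileRecord(
        project="Acoustics Test",
        description="Fan Noise Run 3",
        created=date(2016, 6, 2),
        version=1,
        extension="csv",
        access="team",
    )
    # Produces: acoustics-test_fan-noise-run-3_20160602_v01.csv
    print(example.filename())
    write_index([example], sys.stdout)
```

Putting the date and a zero-padded version number in the name keeps files sortable in a plain directory listing, and the CSV index gives each file the context (project, description, access level) that the name alone cannot carry.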
Beyond the basics: tools & resources
Ready-made ontologies* and controlled vocabularies, like:
NASA Thesaurus: http://www.sti.nasa.gov/sti-tools/#.VzDmp4QrJhE
HIVE, “Helping Interdisciplinary Vocabulary Engineering”: http://hive.cci.drexel.edu:8080/home.html
–Vocabularies for many domains
Vocab/ontology builder: Protégé, http://webprotege.stanford.edu/
–Make sure to download it; don’t use the web version!
–http://protegewiki.stanford.edu/wiki/Engineering_ontologies
Scientific workflow* applications, like:
Kepler, https://kepler-project.org
Taverna, www.taverna.org.uk
Pegasus, https://pegasus.isi.edu
Cleaning/processing tools:
OpenRefine, http://openrefine.org/
Data management planning tools:
DMPTool, a collaborative effort between UC3, DataONE, Digital Curation Centre, Smithsonian, UCLA, UCSD, University of Illinois, UVA. https://dmptool.org/
*What is an ontology?
*What is a scientific workflow?

Beyond the basics, part II: Regulations
https://niam.nasa.gov/strategy/mandates-and-procedural-requirements/
NPR 2200.2C Requirements for Documentation, Approval, and Dissemination of NASA Scientific and Technical Information
NASA Records Management Info & document NPR 1441.1E
NPR 2800.1B Managing Information Technology
NPR 2210.1C Release of NASA Software
NPR 8735.1C Procedures for Exchanging Parts, Materials, Software, and Safety Problem Data Utilizing the Government-Industry Data Exchange Program (GIDEP) and NASA Advisories
NID 7120.99 NASA Information Technology and Institutional Infrastructure Program and Project Management Requirements
NPR 1080.1A Requirements for the Conduct of NASA Research and Technology (R&T)
FOIA
Executive Order EO 13642: “Making Open and Machine Readable the New Default for Government Information”

QUESTIONS?
What can data management address that would help you with your work?
I will make these slides available to you.

Contact:
Maggie Howell
mary.m.howell@nasa.gov
Phone: 3-8991
B142, Rm 223

Resources & presentation source list:
NIAM site: https://niam.nasa.gov/
NASA Data Strategy, white paper: https://niam.nasa.gov/wp-content/uploads/2015/10/Data-Strategy-2015-03-17-White-Paper-Small.pdf
MIT Libraries, “Data Management,” http://libraries.mit.edu/data-management/
Cornell University Research Data Management Service Group, “Best Practices,” http://data.research.cornell.edu/content/best-practices
NOAA, “Scientific Data Stewardship,” http://www.ncddc.noaa.gov/activities/science-technology/data-management/
DataONE, “Best Practices,” https://www.dataone.org/best-practices