Data management: A cornerstone for responsible conduct of research
8/8/2015 RCR - data management

Core areas – Research integrity
– Data management
– Animal subjects
– Human subjects
– Conflicts of interest
– Peer review
– Collaboration
– Publication/authorship
– Mentorship
– Research misconduct

Outcomes
– Understand rules of data management relative to responsible conduct of research
– Understand roles and responsibilities of research staff
– Develop a communication plan for dealing with data management issues
– Implement a system for responsible data management

DATA MANAGEMENT OVERSIGHT
Requires time and effort from the project PI (principal investigator).
The PI ensures that the research team plans, implements, and maintains data management policies and procedures.

DEFINITIONS/TERMS

Key concepts
– Data Ownership: who has legal rights to the data and who retains those rights; whether the PI can transfer data to another institution
– Data Collection: reliable collection of project data in a consistent, systematic manner, plus an ongoing system for recording changes (validity)
– Data Storage: enough data should be stored to allow results to be reconstructed
– Data Protection: protection of written and electronic data from physical damage; protection of data integrity, including against tampering and theft

Key concepts (continued)
– Data Retention: how long project data are kept, plus secure destruction of data
– Data Analysis: how raw data are chosen, evaluated, and interpreted for other researchers and the public
– Data Sharing: how data and results are disseminated to other researchers and the public
– Data Reporting: publication of conclusive findings (positive and negative) after project completion

Defining data
– Any information or observations associated with a specific project, including experimental samples, technologies, and products
– Any factual information used as a basis for reasoning, discussion, or calculation
– Examples: instrument scans, survey responses, measurements, observations, and data sources

DATA OWNERSHIP
Who controls and has rights to the data. Who manages and uses the data.
‘Ownership’ is complex and involves the PI, the project team, the sponsoring agency, the research institution, and any participants.

Data ownership entities
– Sponsoring institution (university, research firm): usually has ownership (it employs the PI), controls funding, and is responsible for ensuring that the funded research is conducted ethically. The PI has stewardship over project data and controls research directions, publications, copyrighting, and patenting, subject to institutional review.
– Funding agency (NSF, NIH, …): federal agencies, foundations, private industry. Agencies may influence publication and marketing of data.
– Principal investigator: generator and steward of the project data; often retains rights to ownership. In industry settings, industry usually owns the data. In academic settings, the PI may be able to take research and data along when moving; data transfer policies apply.
– Research subjects: may have partial ownership of data, or some control over the research results.

DATA COLLECTION
Data collection provides the information necessary to develop and justify research.
A successful project collects reliable and valid data.

Data collection
Data collection should:
– Enable researchers to analyze and assess their work
– Allow independent researchers to replicate the process and evaluate results
– Enable team members to practice good data management
– Detail the rationale behind the project design
– Support decisions on expenditures and project directions
– Yield reliable data and valid results that test the hypothesis or research questions
Data collection includes the recorded information, how it is recorded, and the research design.

Data collection objectives
Team members should know:
– Research purpose
– Methodologies chosen
– Implementation of methodologies
– How the data were collected and analyzed
– What expected/unexpected results occurred
– What expected/unexpected errors occurred
– Significance of results and future directions
Data collection guidelines and methodologies are part of the research design.
Data collection should be consistent and systematic.

Collecting valid data
– Diligent record keeping is essential
– Some projects keep both written and electronic records
– Record keeping: bound notebooks (errors are marked and dated, not erased). Large projects require other methods
– Electronic records: security can be an issue
– Policy/procedure: know the project’s design, guidelines, and standards for data

How to keep records
– Notes: you must be able to reconstruct the work and justify your findings. Record what worked, what didn’t work, observations, and commentary. Communicate your notes.
– Personal notebooks: entries should be chronological and consistent. Start a new page for each day; leave no blank lines between entries.
– Error notation: entries should be indelible. Mark and date changes.
– What to record: anything that seems relevant to the project, e.g., date/time, team members who did the work, materials, instruments, software, data, and observations
– Transferring data: verify that data have been entered correctly, particularly in electronic media
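
One common way to check that data were transferred correctly is double entry: two people key in the same paper records independently, and the two passes are compared field by field. The sketch below is only an illustration of that idea; the function name, the field names, and the sample values are hypothetical and do not come from any particular project.

```python
def find_entry_mismatches(first_pass, second_pass):
    """Compare two independent transcriptions of the same records.

    Each pass is a list of dicts (one dict per record, in the same order).
    Returns a list of (row_index, field, first_value, second_value) tuples,
    one for every field where the two passes disagree.
    """
    if len(first_pass) != len(second_pass):
        raise ValueError("the two passes contain different numbers of records")

    mismatches = []
    for index, (a, b) in enumerate(zip(first_pass, second_pass)):
        # Check every field that appears in either transcription.
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((index, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical example: the second pass transposed two digits in one weight.
first = [
    {"subject": "S01", "weight_kg": "71.2"},
    {"subject": "S02", "weight_kg": "68.0"},
]
second = [
    {"subject": "S01", "weight_kg": "71.2"},
    {"subject": "S02", "weight_kg": "86.0"},
]
print(find_entry_mismatches(first, second))
```

Each reported mismatch should be resolved against the original record, and the correction marked and dated rather than silently overwritten.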

DATA STORAGE
Storing data safeguards your research and your research investment.
You must be able to recreate findings, augment subsequent research, or establish a precedent.
You must be able to reconstruct a project and its findings.

Data storage
– Type/amount of data to retain: enough to reconstruct the project. This should include raw data, statistics/analyses, notes, observations, and products or specimens.
– Electronic data: thorough documentation to allow future use, an easy-to-use storage format, and a backup system.

DATA PROTECTION
The data storage plan should address security issues.
Protection usually includes limiting access to the data.
Electronic data storage requires additional safeguards.

Data protection
Limited access: who is authorized to access and manage stored data, questionnaire privacy, …
Electronic data:
– Access: IDs, passwords, centralized process, wireless access?
– System protection: anti-virus, up-to-date software, firewall
– Data integrity: record original creation date, use watermarking or encryption, back up files, ensure destruction when desired
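
One way to make tampering or accidental changes to electronic data detectable is to record a cryptographic checksum for each stored file and re-check it later. This is a swapped-in technique, not one named on the slide, and the sketch below is a minimal illustration using Python's standard `hashlib`; the function names `sha256_of`, `build_manifest`, and `changed_files` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file under data_dir (by relative path) to its checksum."""
    root = Path(data_dir)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(data_dir, manifest):
    """List files that were added, removed, or modified since the manifest."""
    current = build_manifest(data_dir)
    names = set(manifest) | set(current)
    return sorted(n for n in names if manifest.get(n) != current.get(n))
```

A team might call `build_manifest` when data collection closes, store the result separately from the data (for example, with the backup), and run `changed_files` before each analysis or after restoring from backup. A checksum only detects change; it does not limit access or replace backups.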

DATA RETENTION
Sponsor institutions and funding agencies often have specific requirements for how long data should be retained.
The PI decides when to end storage.

Data retention
– How long should data be kept: DHHS requires that data be kept for 3 years after the end of the funding period.
– Continued storage: cost of storage vs. future need.
– Data destruction: shredding for paper records; for electronic data, secure erasure tools (Erase, CyberScrub).
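
The retention rule above is simple date arithmetic. The sketch below assumes the 3-year period stated on the slide and a hypothetical helper name, `earliest_destruction_date`; sponsor or institutional policies may require a longer period or a different start date, so treat the default as a placeholder to check against the actual award terms.

```python
from datetime import date


def earliest_destruction_date(funding_end, retention_years=3):
    """Return the first date on which the retention period has elapsed.

    funding_end is the last day of the funding period. The default of
    3 years follows the slide; it is an assumption, not a universal rule.
    """
    target_year = funding_end.year + retention_years
    try:
        return funding_end.replace(year=target_year)
    except ValueError:
        # funding_end is Feb 29 and the target year is not a leap year:
        # fall forward to March 1 so data are never destroyed early.
        return date(target_year, 3, 1)


print(earliest_destruction_date(date(2015, 8, 8)))  # 2018-08-08
```

Reaching this date only means destruction is permitted under the stated rule; whether to keep the data longer remains the cost-versus-future-need decision described above.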

DATA ANALYSIS
Data analysis methods must be appropriate for the project’s needs and objectives.
Team members should know the data analysis methods.

Data analysis
– Methods: select methods appropriate to the research setting, type of research, and research objectives. Data analysis methods are usually an essential part of the project design.
– Team member responsibilities: all members should understand the data analysis plan and be able to interpret the results in the context of their study.
– Converting raw data to meaningful content: choosing, evaluating, and expressing your data.

Data analysis design
Analysis methods: accepted standards in the field of study (data form, assumptions, …); link significance to causation (NIH)
Data use:
– Handling of outliers, missing or incomplete data sets, data alteration, data organization
– Practices to avoid: forging (inventing data), cooking (retaining only those results that support the hypothesis), trimming (unreasonable smoothing)
– Amending: instrument malfunctions, changes in samples, deviations in the procedures
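
One way to keep the handling of outliers and missing values from sliding into cooking or trimming is to apply a rule fixed before analysis and to log every excluded value with its reason, leaving the raw data untouched. The sketch below is a minimal illustration under that assumption; the function name, the bounds, and the sample values are hypothetical.

```python
def apply_exclusion_rule(values, lower, upper):
    """Apply a pre-specified validity range to raw measurements.

    Returns (kept, log). `kept` holds the values used for analysis;
    `log` records (index, raw_value, reason) for every exclusion so the
    decision can be reviewed and reported. The input list is not modified.
    """
    kept, log = [], []
    for index, value in enumerate(values):
        if value is None:
            log.append((index, value, "missing"))
        elif not (lower <= value <= upper):
            log.append((index, value, "outside pre-specified range"))
        else:
            kept.append(value)
    return kept, log


# Hypothetical readings with one missing value and one implausible value.
raw = [4.8, None, 5.1, 19.7]
kept, log = apply_exclusion_rule(raw, lower=0.0, upper=10.0)
print(kept)
print(log)
```

The exclusion log belongs with the stored project data, so that an independent researcher can see which values were set aside and why.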

DATA SHARING
Data sharing is the way in which research is accurately represented to the scientific community and the general public.
Data sharing during the project should be done with care, as the interpretations may be impacted by later findings.
Some sponsor institutions and some funding agencies have their own data sharing requirements.

Data sharing and reporting
– Data sharing prior to publication: implications of a data set may not be known; shared results might be used for individual gain; there may be immediate benefits
– Data sharing after publication: the data are now open; sharing of raw data is reviewable
– Obligation to report: some funding agencies have stipulations on reporting; Patriot Act, Freedom of Information Act
Data are shared to acknowledge a project’s implications, contribute to a field of study, and stimulate new ideas.
NIH endorses sharing of final research data; it expects and supports timely release and sharing of research data from NIH-sponsored studies for use by other researchers.

TEAM RESPONSIBILITIES
Each member of a research team has a different role and different responsibilities. All team members should understand these roles and responsibilities.
Teams often include: PI (enables the project), director (controls the project), research associate (coordinates the project), research assistant (carries out the work), and support (statistician, …)

WHAT ARE DATA?
DATA OWNERSHIP
DATA COLLECTION
COLLECTING VALID DATA
DATA PROTECTION
DATA SHARING
RESEARCH TEAM RESPONSIBILITIES