Data Management Scope and Strategies


Data Management Scope and Strategies
K.L. Sender and J.L. Pappas, Information and Technical Services, National Marine Fisheries Service, Southwest Fisheries Science Center, Honolulu Laboratory
For those of you who haven’t met me before, my name is Karen Sender. I have been working at the Honolulu Lab as a JIMAR employee for a little over one year as an application developer in the Information and Technical Services group. For most of the past 25 years, I worked in marine geology and geophysics at the Hawaii Institute of Geophysics, first as the scientific lab officer onboard UH’s research vessel and the data manager for their ocean-bottom seismometer program. Subsequently, I was a principal in the development of the University’s seafloor mapping project, serving as data manager and data processing systems developer. My responsibilities included all aspects of acquisition, processing, analysis, imaging, archiving and documentation of large, complex data sets.
Over the past several months, Jan Pappas and I have spent much of our time working with the Longline Observer data, trying primarily to resolve issues with data transfer from PIAO to HL, but in the process also to assess data integrity. Our findings were presented to PIAO and interested HL data users last week. In a broad sense, most of the issues we were concerned with seemed to be the result of a lack of good data management practices. These issues were not uncommon with other data sets at the Lab, and eventually we were asking ourselves, with increasing frustration, why these mistakes get repeated.
When news came our way about the new Bottom Fish Observer Program, we started talking about ways to design data sets that would result in maximum data quality with the least amount of wasted effort. These talks resulted in our drafting a Data System Design Protocol that outlined procedures, roles and responsibilities for designing and maintaining a quality data set.
While Jan and I were, we think, justifiably pleased with this document, further discussion led us to the conclusion that we have no way of implementing it. No structure or mechanism exists wherein it can be proposed or adopted as a laboratory standard. Clearly, what is needed within the Laboratory is a better understanding of the scope and importance of good data management and, just as importantly, a cohesive set of policies and guidelines on how data are collected and managed within our organization. This presentation is our attempt to address those issues. Let us begin by reviewing why we care about data management.

Data Is the Foundation on Which We Build Success. Within NMFS we are working in an environment in which our success is measured not only by our ability to provide informed decisions on managing our natural marine resources, but also by our ability to defend the science on which we base those decisions. Successful management of data is critical to our mission to make informed decisions.

The quality of the science can be only as good as the data it was based upon.

What Is the Goal of Data Management? The most important goal of data management is to ensure that quality data are accessible and disseminated to appropriate end users in a timely manner.

Quality Data Is the Result of Good Data Management From data program conception through the lifecycle of the data, good data management requires consistent, well thought-out and universally supported procedures and guidelines for its collection, maintenance and dissemination.

Why Do We Want to Manage Data? To fulfill the agency’s mission to conserve and manage the nation’s coastal and marine resources to ensure sustainable economic opportunities. Why do we want to manage data? Our gut answer should unanimously be, “because we want the best data.” But even if that were not true, we are obligated to manage our data to fulfill the agency’s mission.

What Is Data Management? Data Management is the vast array of tasks that begin before data collection and continue throughout the lifecycle of the data. Before the rise of the modern scientific enterprise, data would commonly be collected and processed by an individual or a project, and various levels of raw, processed or final data were possibly turned over to a data repository for inventory and archiving. Lack of data management standards might cause headaches for those generating and processing the data, but often the only noticeable results would be delays in the final product. In today’s scientific enterprise, data must be shared, sometimes in near real time, in order to maximize the value of the data. Management of data in this type of environment requires a broader understanding of the processes through which data are collected and maintained within the centralized data resource. Documentation, or metadata, must be complete and accessible. Users must be able to rely on the quality of the data they obtain from sources not their own. It’s Not Just Archiving!

Data Management Includes Data Resource Development. The data resource is the centralized data repository used for storage and access of scientific enterprise data. Data management tasks for developing and maintaining the data resource include…

Identification of data to be included in the data resource
· What data need to be shared?
· Will sharing a data set increase its value and the value of an existing data set?
Defining business rules for the data elements
· What are the valid ranges of a given data element?
· What does a null value mean versus a zero in a data field?
Designing the data models
· Is the data model compatible with other data sets with which it needs to be integrated?
Designing data element naming/format standards
· Can users easily understand what the data fields represent?
· (6 names for temperature and multiple units of measure)
Complying with hardware/software standards
· Are we using agency-approved hardware and software tools?
· Are our current tools maximizing data quality and facilitating data access?
Performing routine and disaster recovery system administration
· Are we performing both preventive maintenance and planning for catastrophic failures?
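The questions about valid ranges and about null versus zero can be made concrete with a small sketch. This is only an illustration, not a Laboratory standard: the `ElementRule` class, the `check` function, the element names and the numeric limits are all hypothetical, and it assumes that a missing value (`None`) means "not measured" while zero is a real observation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ElementRule:
    """Hypothetical business rule for one data element."""
    name: str
    unit: str
    minimum: float
    maximum: float
    nullable: bool = True  # assumption: None means "not measured", never zero


def check(rule: ElementRule, value: Optional[float]) -> List[str]:
    """Return a list of rule violations (empty when the value is acceptable)."""
    if value is None:
        # A null is only a problem when the element is mandatory.
        return [] if rule.nullable else [f"{rule.name}: value is required"]
    if not (rule.minimum <= value <= rule.maximum):
        return [
            f"{rule.name}: {value} {rule.unit} outside "
            f"[{rule.minimum}, {rule.maximum}]"
        ]
    return []


# Illustrative rules only; the limits are invented for this example.
SEA_SURFACE_TEMP = ElementRule("sea_surface_temp", "degC", -2.0, 40.0)
HOOKS_SET = ElementRule("hooks_set", "count", 0, 5000, nullable=False)

if __name__ == "__main__":
    print(check(SEA_SURFACE_TEMP, None))   # temperature not measured: allowed
    print(check(HOOKS_SET, 0))             # zero hooks is a real value: allowed
    print(check(HOOKS_SET, None))          # mandatory element missing
    print(check(SEA_SURFACE_TEMP, 55.0))   # out of range
```

Writing the rule down once, with one name and one unit per element, is also a simple way to avoid the "6 names for temperature" problem noted above.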

Data Management Includes Data Collection. Data management includes all aspects of data design and collection…

Identifying not only the data elements desired but also any additional information that will be required to make use of those data
· Do data records require unique identifiers?
· Do any reference tables need to be designed or can existing ones be used?
Developing clear, concise manuals and instructions for data collection
· Are the instructions for data fields unambiguous?
Designing user-friendly paper and electronic forms for data capture and entry
· Is data quality being compromised at data collection because the forms are too complicated or difficult to fill out?
Building data validation schemes in data entry applications
· Are unreasonable data excluded from the data set at data entry?
· Do you ever want to see north longitudes in your data?
Maintaining data set documentation and history
· Can you reproduce the path of a piece of data from data collection through data extraction should you be required by some legal review process?
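A validation scheme of the kind described above might look like the following sketch. It assumes, purely for illustration, that positions are entered as unsigned decimal degrees plus a hemisphere letter; the function name `validate_position` and its parameters are hypothetical and do not describe any existing data entry application.

```python
from typing import List


def validate_position(lat: float, lat_hem: str, lon: float, lon_hem: str) -> List[str]:
    """Reject impossible positions at data entry instead of after loading.

    Assumes unsigned decimal degrees with a separate hemisphere letter.
    Returns a list of error messages; an empty list means the entry passes.
    """
    errors = []
    if lat_hem not in ("N", "S"):
        errors.append(f"latitude hemisphere must be N or S, got {lat_hem!r}")
    if lon_hem not in ("E", "W"):
        # This is the check that keeps "north longitudes" out of the data set.
        errors.append(f"longitude hemisphere must be E or W, got {lon_hem!r}")
    if not 0 <= lat <= 90:
        errors.append(f"latitude {lat} is outside 0 to 90 degrees")
    if not 0 <= lon <= 180:
        errors.append(f"longitude {lon} is outside 0 to 180 degrees")
    return errors


if __name__ == "__main__":
    print(validate_position(21.3, "N", 157.9, "W"))  # a plausible position
    print(validate_position(21.3, "N", 157.9, "N"))  # a "north longitude"
```

The entry form would refuse to save the record until the returned list is empty, so the error is corrected by the person who can still check the original observation.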

Data Management Includes Data Maintenance. Data Maintenance is an ongoing process that ensures the highest quality data are available at all times. It includes…

Storing data in the centralized data resource
· Are data-transfer routines automated and fully documented?
· Have they been proved not to introduce errors or modify the data?
Detecting and reporting errors via data monitoring applications
· Do you want to have your users catch the errors before you do?
Maintaining a history of when and by whom data are added, modified or deleted
· Not only is this required for data security considerations, it provides information that can be used for process improvement.
· How quickly is the data set growing?
· What are the most common errors that are edited post data entry and can we eliminate them?
Periodically auditing data by tracking data from collection to dissemination via the data path
· Are we ensuring that data are not lost or contaminated?
· The first time your data are audited, do you really want it to be by an outside auditor?
Developing and following formal change control procedures
· Are all changes documented?
· Have the changes been tested in a development environment?
· Has proper notice been provided to all role groups?
Developing and documenting data processing procedures
· Can you fully document the path of your data?
Developing and testing applications before release
· Can data downtime and user frustration be minimized?
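The history of when and by whom data are added, modified or deleted can be sketched as a simple change log. The example below is an in-memory illustration only; the class names `ChangeRecord` and `AuditedDataSet`, the user names and the field names are hypothetical, and a real data resource would more likely record this history inside the database itself.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class ChangeRecord:
    """One entry in the change history: when, by whom, and what happened."""
    when: datetime
    who: str
    action: str               # "add", "modify" or "delete"
    record_id: str
    field_name: Optional[str] = None


class AuditedDataSet:
    """Hypothetical data set that logs every add, modify and delete."""

    def __init__(self) -> None:
        self.records: Dict[str, Dict[str, Any]] = {}
        self.history: List[ChangeRecord] = []

    def _log(self, who: str, action: str, record_id: str,
             field_name: Optional[str] = None) -> None:
        now = datetime.now(timezone.utc)
        self.history.append(ChangeRecord(now, who, action, record_id, field_name))

    def add(self, who: str, record_id: str, values: Dict[str, Any]) -> None:
        self.records[record_id] = dict(values)
        self._log(who, "add", record_id)

    def modify(self, who: str, record_id: str, field_name: str, value: Any) -> None:
        self.records[record_id][field_name] = value
        self._log(who, "modify", record_id, field_name)

    def delete(self, who: str, record_id: str) -> None:
        del self.records[record_id]
        self._log(who, "delete", record_id)

    def most_edited_fields(self, n: int = 3) -> List[Tuple[str, int]]:
        """Fields most often corrected after entry: candidates for better validation."""
        counts = Counter(c.field_name for c in self.history if c.action == "modify")
        return counts.most_common(n)


if __name__ == "__main__":
    ds = AuditedDataSet()
    ds.add("observer1", "r1", {"depth_m": 100})
    ds.modify("editor1", "r1", "depth_m", 110)
    print(ds.most_edited_fields())
```

Counting additions over time answers "how quickly is the data set growing?", and `most_edited_fields` points at the errors that are most often fixed after data entry.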

Data Management Includes Data Dissemination. Data management includes those tasks related to data dissemination…

Providing user support via training and problem resolution
· Are the users provided with proper tools and information to allow access to the data and metadata?
Complying with data accessibility issues
· Have data access tools been reviewed and tested by an outside, independent reviewer?
Ensuring data availability and ease of integration with other data sets, as needed
· (Whether this is within the lab or across the agency.)
Establishing data security
· Are all appropriate users granted access in a timely manner?
· Is access revoked promptly when necessary?
Complying with data publication formats and deadlines
· Do final data products such as maps, reports and web pages include standardized logs, scales and necessary metadata and disclaimers?
· Do dates clearly indicate the point in time when data were extracted?

Data Management Is…

…a lot more than we tend to think!

What Are the Costs of Poor Data Management?
· Misinterpretation of the data
· Lost data
· Inaccessible data
· Indefensible data
· Wasted time and money
· Missed deadlines
· Lost user confidence
Any one of these can mean failure to a project!

What Are the Benefits of Good Data Management?
· Optimum data quality
· Improved user confidence
· Efficient and timely access to data
· Improved knowledge and understanding of the agency’s data holdings
All of which should be our goal.

How Is Good Data Management Achieved? Through the development and implementation of well-conceived data management policy and data administration guidelines. Data management policy is a short, clearly written statement or outline of the organization’s philosophy, vision and goals for management of data. This policy applies to the entire organization, not just the IT/IM role groups and as such must be written in clear non-technical language. It should be inspiring, not threatening. The policy defines what you want everyone in the enterprise to accomplish. The actual methods for fulfilling the data management policy are defined in the Data Administration Guidelines. These guidelines define how you are going to implement the Data Management Policy. Where the Data Management policy might not change for a number of years, the Data Administration Guidelines will most likely be revised and refined on a regular basis.

Data Management Policy
Policy for the management and protection of agency data. Set of broad, high-level principles forming a framework in which data management can operate efficiently and effectively.
Data management policies exist for NOAA and NMFS and we are mandated to work within them. Why do we require a separate data management policy? Because creating our own defines and dictates our own attitudes and philosophies toward our data.

Data Management Policy Serves To…
· Ensure availability of stable, reliable and accessible collections of data in electronic form to all appropriate parties.
· Ensure compliance with all agency-wide mandates and directives.
· Improve direct access to data by the public and across the agency.

Data Management Policy Allows…
· Good fisheries science.
· Good fisheries management.
· Data users to be confident in their interpretations of quality data.
· The agency to properly defend its data in a court of law.

Data Management Policy What might work… After reviewing NOAA and NMFS data management policies, along with innumerable others from both federal and state research institutions, Jan and I came up with what we think might be taken as a framework or at least a starting point for developing data management policy for the Laboratory.

A Draft Data Management Policy
· Programs that generate data will adhere to data management policies and guidelines.
· Data and metadata will be managed and stored in a centralized data resource.
· The data resource will be safeguarded and protected.
and…
1. All functional units within the agency will comply with the data management policy. All outside organizations that collaborate with the agency on data will conform to the established data management policies and guidelines. (It is not enough to follow good data management practices ourselves if we then do not hold our data collaborators and contractors to the same standards.)
2. Database organization and structure will be planned on functional and agency levels. Data will be managed through the data stewardship principles of administering and controlling data quality and standards in support of agency goals and objectives. (Data stewardship implies a caretaker rather than an owner of the data.)
3. Data will be protected from deliberate, unintentional or unauthorized alteration, destruction and/or inappropriate disclosure or use in accordance with agency policies and practices and federal and state laws. (An obvious and necessary consideration.)

A Draft Data Management Policy
· Data will be shared based on agency policies.
· Agency data will be cataloged and documented.
· Information quality will be actively managed throughout the life cycle of the data.
4. A particular individual, unit or group does not own agency data. The data will be made accessible to all authorized users in a timely manner, per agency policy and state and federal laws.
5. Standards will be developed for the representation of agency data and its metadata in the database. Business processes will be defined and documented. Controls will be established to assure the completeness and validity of the data and to manage redundancy. The agency data resource will be the officially recognized source for data reporting purposes.
6. Explicit criteria for data validity, availability, accessibility, understanding and ease of use will be established and promoted through data administration guidelines. An active program of process improvement will be applied to all data management policies, guidelines and protocols.

Who Sets Data Management Policy and Guidelines?
An information technology/information management steering committee, composed of one or more representatives from the key data management role groups:
· Administration
· Data generators
· Data users
· Information technology
· Management
Think of this as much more a “working group” than a committee.

Data Management Steering Committee
· Ensures that data management policies are in line with those of NMFS, NOAA and DOC.
· Directs development, implementation and maintenance of detailed data policies, standards, procedures and guidelines across the agency.
· Reports progress to the director on the performance achieved against the targets for improvement of data quality and the value gained from effective data management.

Data Management Roles and Responsibilities Clearly defined data management roles and responsibilities are required to execute the policies and guidelines of the agency.

All data management role groups must live within the policies and guidelines of DOC, NOAA and NMFS.

Data Ownership
The agency is the owner of the data. The organizational unit or group that commissions the collection of a data set assigns a data steward to that data set.
Agency data are not owned by a particular individual, unit or project but by the agency itself. Data stewardship implies formal responsibility and accountability for management and quality of the data assigned to this role, in accordance with the defined data management policy. The buck stops here!

Data Stewards set policy for the collection, management and accessibility of the data set and its metadata to ensure compliance with agency data management policies, mandates and relevant state/federal laws.
Data Stewards
· Have planning and policy-level responsibility in their functional area.
· Document policies and procedures for access and use of the data set.
· Ensure that the data set is stored, managed and accessed in the enterprise database per the agreed upon data management policy.
· Periodically review costs and benefits of continuing to maintain the data set.

Data Managers
· Perform and supervise operational management of their data sets per the data management policy, data administration guidelines and the data set policies set by the data stewards.
· Have operational-level responsibility for data management activities related to data capture, maintenance and dissemination.

Database Administrator
Has responsibility for the physical data resource:
· Generating physical database schema
· Performing database tuning
· Creating database backups
· Planning for database capacity
· Implementing data security requirements

Data Users are individuals requiring access to agency data in the course of meeting the requirements of their position and anyone in the public who wishes access to public information held in the data resource.

The Data Administrator, along with the IT/IM steering committee, develops and maintains data administration guidelines and procedures. The Data Administrator facilitates the coordination between the data management role groups and provides guidance and training to comply with data policies and guidelines.
Data Administrator
· Ensures that all data management role groups comply with data management policies and guidelines.
· Periodically reports to the director on status of compliance with data management policies and guidelines.

Data Administration Guidelines
Required for all data management task areas:
· Data resource development
· Data collection
· Data maintenance
· Data dissemination
The Data Administration Guidelines define how we accomplish our goals set in the Data Management Policy. Each task within the four data management areas must have clearly defined guidelines that, when followed, will ensure that the final information produced is of the highest quality. The guidelines will typically outline the roles and responsibilities of each member of the data management team. They most likely will state specifics on what hardware and software tools are acceptable to use. They should define what types of data should be stored in the enterprise database and specify the time frame in which that should happen. Some guidelines will probably refer to the development and implementation of very detailed procedures and protocols that must be followed, for example, the Data System Design Protocol that we referred to earlier. It is important to note again that where the Data Management Policy outlines high level principles and philosophy on managing data within our organization, the Data Administration Guidelines provide us with the specifics on how to achieve those goals.

The task areas in Data Management may seem daunting, and we do not mean to imply that most of these are not getting accomplished here at the Lab. Unfortunately, some tasks are slipping through the cracks, some are being performed redundantly or by the wrong role groups, and others are only partially addressed. Here at the Lab we have great people with a lot of talent working in these areas, but we seem not to have completely adopted the concept of the scientific enterprise. What we really need to do is assess how we can redirect our skills and energies to accomplish all these tasks in an effective and efficient manner with the common goal of maximizing the value of our data. Changes do need to be made, but in the field of Information Technology and Information Management, and especially in science and research, change is a way of life. It is important to note that in today’s scientific enterprise, none of these tasks can be executed in isolation from the rest. A data form cannot be designed without input from the database designer or even a publication editor. The data steward cannot decide to add fields to the data set without consulting the person training the data collectors, the person writing the data collection manuals and probably every role group down the data chain.

Good data management is fundamental to our success. For some, it will require a new way of thinking: Enterprise Thinking. In the past, collecting an additional data set would, of course, be of great value to the lab. Now, adding that same data set to the data resource has the potential to significantly increase the value of multiple existing data sets if the data are properly managed.

References
Brohan, M., 2001, The Need for a Formal Data Management Policy, DM Review, v. May. http://www.dmreview.com/.
Data Administration Forum, 1999, Data Management Roles and Responsibilities Guidelines, Ver. 1.3, Advisory Council for Information Management, British Columbia.
Fisheries Information Technology for the 21st Century (FITS21).
Flanagan, T., et al, 1998, A Practical Guide to Achieving Enterprise Data Quality. http://www.techquide.com.
Imhoff, C., 1998, Ensuring Data Quality Through Data Stewardship, DM Review, v. Apr. http://www.dmreview.com/.

References (cont.)
Imhoff, C., 1997, Data Stewardship: Finally a Process for Achieving Data Integrity, The Data Administration Newsletter. http://www.tdan.com.
Information Resources Management Staff, Information Systems Office, Office of Finance and Administration, 2000, National Oceanic and Atmospheric Administration Strategic Information Technology Plan, FY 2001 – FY 2005. http://www.rdc.noaa.gov/~irm/index.html.
Intra-governmental Group on Geographic Information, 2000, The Principles of Good Data Management. http://www.detr.gov.uk/.
Sargent, J., Bistodeau, R. and Seem, D., 2000, NOAA Fisheries Information Technology (FIT) Architecture White Paper, Systems Development Methodologies.
U.S. Fish and Wildlife Service, 2000, Data Standards. http://www.fws.gov/stand/.
University of Michigan, 1994, Institutional Data Resource Management Policy.