Published by Stephan Fried. Modified over 6 years ago.
1
Module P4: Identify Data Products and Views So Their Requirements and Attributes Can Be Controlled

Learning Objectives:
Understand the value of data.
Understand the importance of consistent data description.
Establish relevant attributes and unique identification.

Learning Outcomes – Students will:
Be able to explain the value of data.
Be introduced to various tools that ensure consistent data descriptions.
Have a list of relevant attributes for unique identification.

Presentation: display all content on entering screen.

Explanatory material (voiceover, popup, etc.):
Data is of value to the enterprise when it can be located or accessed by users. Metadata, or data about data, is essential for data managers and others to identify, catalog, store, search for, locate, and retrieve data. Creating standard processes for selecting metadata makes metadata selection consistent, uniform, and repeatable, while still allowing it to be tailored to specific business requirements.

Unique identifiers, usually called “keys” in database terminology, are the attributes (e.g., document number and version number) that make it possible to unambiguously distinguish one product from another.

Not all data is delivered as a data product, though; if anything, the trend is away from delivery and toward access as needed. When access is provided for, an authorized user can retrieve data that has been grouped or organized to meet specific needs, which this standard refers to as a “data view.”

The purpose of this principle is to ensure that metadata is selected to enable effective identification, storage, and retrieval of data.

References: GEIA 859, Principle 4
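The idea of a unique identifier built from document number and version number can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MetadataRecord` class, its field names, and the `build_catalog` helper are hypothetical and do not come from GEIA 859.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    document_number: str
    version: str
    title: str

    @property
    def key(self) -> tuple:
        # Composite key: document number plus version number.
        return (self.document_number, self.version)


def build_catalog(records):
    """Index records by key, rejecting any two records that share a key."""
    catalog = {}
    for record in records:
        if record.key in catalog:
            raise ValueError(f"ambiguous identifier: {record.key}")
        catalog[record.key] = record
    return catalog
```

Two versions of the same document share a title here but remain distinguishable by key, which is the property the slide describes.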
2
Process for Consistently Describing Data
Enabler 4.1: Develop Consistent Methods for Describing Data

Presentation: display all content on entering screen.

Explanatory material (voiceover, popup, etc.):
The process for selecting metadata should be coordinated with users or other enterprises to ensure compatibility and interoperability among those who will exchange data.

Attributes are the properties that uniquely characterize the data, such as document number, title, date, and data type. A metadata record consists of the set of attributes necessary to describe the data in question. Although attributes are first identified during the early stages of planning, identification should be seen as an iterative process that continues throughout the data life cycle.

Business rules are needed to consistently describe data throughout the life cycle. These rules include an attribute glossary and an enterprise-level controlled vocabulary that is not project specific.

Processes should be developed to map the flow of data throughout the life cycle. The use of a template provides a consistent, repeatable method to identify data products and the flow of data among users.

When developing a process for describing data, it is important to apply a methodology for characterizing data and data products to ensure adequacy and consistency.

References: GEIA 859, Principle 4
3
Process for Consistently Describing Data
Enabler 4.1: Develop Consistent Methods for Describing Data

Presentation: display all content on entering screen.

Explanatory material (voiceover, popup, etc.; the following paragraphs follow the boxes that are highlighted):
To ensure a consistent method for describing data, the enterprise must develop business rules to be used to catalog data products, views, and other data to be supplied. This can be done by leveraging the products of the DM architecture development process; review of the workflows, applications, and performers will identify some of the users of the metadata. For example, a workflow for program management will show which data products will support which management processes, including decision processes. In addition, the processes used to develop the DM architecture provide a template for development of metadata.

Developing effective metadata requires a team with diverse skills to create a glossary:
Personnel with library science skills will have knowledge of data and information retrieval techniques. The library staff will be especially knowledgeable about information-seeking behavior and about indexing and retrieval techniques for information (vs. structured data). They will also help with the development of cataloging rules and a taxonomy, if one is needed.
Information systems personnel will understand the technical capabilities of systems to store, index, retrieve, and link data products.
Other stakeholders, such as the configuration management staff, will provide a different perspective on working data, released data, submitted data, and approved data.

When the required metadata attributes have been defined, the feasibility (both technical and financial) of capturing and maintaining all the metadata should be assessed, and the metadata solution defined. Then the specific processes for metadata capture and maintenance can be designed.

For example, determining what constitutes the “right” content for an attribute may reveal a need for a controlled vocabulary for consistent cataloging of data products. A cataloging process must be designed that incorporates use of the controlled vocabulary in the step where data products are tagged. The process to maintain the vocabulary must also be designed; it may mirror the cross-functional team and be facilitated by the library team members.

The supplier environment may be multitiered (a prime contractor and often several tiers of contractors). The customer/supplier relationship then exists at several levels in this hierarchy of suppliers. In this case, the team developing metadata processes, with associated rules (such as a controlled vocabulary), must work with a larger perspective, and the cross-functional team will include representatives from the trading partners’ program staff, IT staff, etc. It may be necessary, for example, to map between two controlled vocabularies or to develop an agreed-upon process for metadata assignment.

References: GEIA 859, Principle 4
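As a rough illustration of tagging against a controlled vocabulary and of mapping between two vocabularies, consider the sketch below. The vocabulary terms, the partner abbreviations, and both function names are invented for this example; a real enterprise vocabulary and mapping would be agreed on by the cross-functional team.

```python
# Hypothetical enterprise-level controlled vocabulary for a "data type" attribute.
ENTERPRISE_VOCABULARY = {"drawing", "specification", "test report"}

# Hypothetical agreed mapping from a trading partner's terms to enterprise terms.
PARTNER_TO_ENTERPRISE = {
    "dwg": "drawing",
    "spec": "specification",
    "test rpt": "test report",
}


def tag_with_controlled_term(term, vocabulary=ENTERPRISE_VOCABULARY):
    """Return the normalized term, or fail if it is not in the vocabulary."""
    # Normalize case and collapse whitespace before checking the vocabulary.
    normalized = " ".join(term.lower().split())
    if normalized not in vocabulary:
        raise ValueError(f"{term!r} is not in the controlled vocabulary")
    return normalized


def map_partner_term(term, mapping=PARTNER_TO_ENTERPRISE):
    """Translate a partner term, then validate the result as usual."""
    normalized = " ".join(term.lower().split())
    if normalized not in mapping:
        raise KeyError(f"no agreed mapping for partner term {term!r}")
    return tag_with_controlled_term(mapping[normalized])
```

Rejecting unknown terms at the tagging step, rather than storing them as free text, is one way to keep cataloging consistent; the rejected terms can then feed the vocabulary maintenance process.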
4
Process for Consistently Describing Data
Enabler 4.1: Develop Consistent Methods for Describing Data

Presentation: display all content on entering screen.

Explanatory material (voiceover, popup, etc.; the following paragraphs follow the boxes that are highlighted):

No popup for “Contact Stakeholders.”
Although it is desirable to standardize attributes, it may be expensive to do so if existing data systems must be modified. An alternative is for each team member to map to a neutral standard. In any event, standards invoked by a customer should be flowed down to team members and understood by all parties. Use of standards, such as EIA-836, Configuration Management Data Exchange and Interoperability, and the Universal Data Element Framework (UDEF), enhances the ability to exchange data.
Ref ANSI-859 section 4.1.1

No popup for “Select Compatible Attributes.”
Processes should be developed to map the flow of data throughout the life cycle. The use of a template provides a consistent, repeatable method to identify data products and the flow of data among users. Use of templates helps ensure consistency across the enterprise in defining data products. Data owners and users are identified in the process, along with any requirements associated with metadata. A template, for instance, could help identify commonly needed fields for any product, the associated metadata, and valid entries for the data.
Ref ANSI-859 section 4.1.2

References: GEIA 859, Principle 4
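A template that lists commonly needed fields and their valid entries could look something like the following sketch. The field names, the valid entries, and the `check_against_template` function are assumptions made for illustration; they are not taken from ANSI-859 or EIA-836.

```python
# Hypothetical template: each field states whether it is required and,
# where applicable, the set of valid entries.
DATA_PRODUCT_TEMPLATE = {
    "document_number": {"required": True, "valid_entries": None},
    "title": {"required": True, "valid_entries": None},
    "data_type": {"required": True, "valid_entries": {"drawing", "specification"}},
    "owner": {"required": False, "valid_entries": None},
}


def check_against_template(record, template=DATA_PRODUCT_TEMPLATE):
    """Return a list of problems found; an empty list means the record conforms."""
    problems = []
    for field, rule in template.items():
        value = record.get(field)
        if value in (None, ""):
            if rule["required"]:
                problems.append(f"missing required field: {field}")
            continue
        allowed = rule["valid_entries"]
        if allowed is not None and value not in allowed:
            problems.append(f"invalid entry for {field}: {value!r}")
    # Fields outside the template are reported so that tailoring is explicit.
    for field in record:
        if field not in template:
            problems.append(f"field not in template: {field}")
    return problems
```

Returning a list of problems, instead of stopping at the first one, lets a user correct a whole record in one pass.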
5
Process for Consistently Describing Data
Enabler 4.1: Develop Consistent Methods for Describing Data

Presentation: display all content on entering screen.

Explanatory material (voiceover, popup, etc.; the following paragraphs follow the boxes that are highlighted):

Once processes are developed and tested, users should be trained in using the templates to identify the data products. Users should be provided with the templates along with instructions for use and possible tailoring. The purpose, expected results, and any ground rules should be identified to assist users with accomplishing their goals. Consistent use of the templates helps in the exchange of data among users.
Ref ANSI-859 section 4.1.2

When selecting metadata attributes, the enterprise should identify team members who potentially create data, update data, exchange data, enter data into a repository, or search for data. Team members should be contacted to obtain input and coordinate requirements. Although it is desirable to standardize attributes, it may be expensive to do so if existing data systems must be modified. An alternative is for each team member to map to a neutral standard.
Ref ANSI-859 section 4.1.1

Standards invoked by a customer should be flowed down to team members and understood by all parties. Use of standards, such as EIA-836, Configuration Management Data Exchange and Interoperability, and the Universal Data Element Framework (UDEF), enhances the ability to exchange data.
Ref ANSI-859 section 4.1.1

References: GEIA 859, Principle 4
6
Enabler 4.2: Establish Relevant Attributes to Refer to and Define Data
Develop a Process for Selecting Attributes

Presentation: Each box in the flowchart corresponds in order to the bullets in the explanatory material.

Explanatory material (voiceover, popup, etc.):
Cataloging, storing, and retrieving data depend on understanding the format of the data to be managed. Electronic files are managed differently than hard-copy paper or microfilm, so the physical characteristics should be considered when establishing attributes.
The storage medium and file formats influence readability and reproducibility of the content.
Access to data is restricted based on proprietary issues, security issues, or other limits in data rights. Thus, part of what is involved in selecting attributes is determining what attributes are needed to identify data that requires special handling or limited access.
Requirements for tracking and reporting metrics also should be considered when selecting attributes. Metrics are typically used to monitor throughput and ensure that the process is operating as intended, or to ensure that resources are properly allocated.
The enterprise should identify relationships and their importance relative to other data elements in order to efficiently identify and manage related objects.
Metadata attributes change over time due to evolving requirements throughout the life cycle.
Potential attributes should be evaluated based on whether there is value added in tracking and locating data. It is important to weigh the cost of creating and entering metadata attributes, as well as the potential benefits.

References: GEIA 859, Principle 4, Enabler 4.2
7
Enabler 4.2: Establish Relevant Attributes to Refer to and Define Data
Develop a Process for Selecting Attributes

Presentation: Each box in the flowchart corresponds in order to the bullets in the explanatory material.

Explanatory material (voiceover, popup, etc.; the following paragraphs follow the boxes that are highlighted):

Characteristics of Data
Electronic files
DVD
CD
Tapes
Hard-copy paper
Microfilm
Vellum
Others particular to the project

Identify storage and retrieval methods
What is needed to read the data?
Are data views required?
What is needed to reproduce the data?
What attributes are specific for each format type?
How often does this specific format need to be assessed due to its technology?

Data must be marked with appropriate classification, distribution statements, and other required markings, regardless of format or method of access. Metadata regarding access rights and controls must be defined, and processes must be in place to ensure accurate tagging of data products to prevent inadvertent inappropriate disclosure.

Identify requirements for access restrictions (information and media security are of the highest importance)
Are the data proprietary, limited, or otherwise identified with security markings?
Do the data require special handling or restrictive data views?
Are the security requirements of both the senders and receivers known and met?

Specify the parameters placed on the metadata attributes
Are there contractual requirements, such as a unique identifier on any data product that documents performance, financial status, etc.?
Are there enterprise requirements?
Are there configuration management requirements, such as configuration identification and control?
Are there field size, format, or other restrictions due to database compatibility?

References: GEIA 859, Principle 4, Enabler 4.2
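One way to picture how access-rights metadata supports a restrictive data view is the sketch below. Everything in it is hypothetical: the `access_markings` attribute, the marking names, and the rule that untagged data is withheld are assumptions for illustration, not requirements stated in the source.

```python
def build_data_view(catalog, user_authorizations):
    """Return only the records whose access markings the user is authorized for."""
    authorized = set(user_authorizations)
    view = []
    for record in catalog:
        markings = record.get("access_markings")
        if markings is None:
            # Untagged data is withheld rather than assumed to be unrestricted.
            continue
        # Every marking on the record must be covered by the user's authorizations.
        if set(markings) <= authorized:
            view.append(record)
    return view
```

Withholding untagged records is a deliberately conservative choice in this sketch: a missing tag is treated as a tagging error, which is the failure the slide warns can lead to inadvertent disclosure.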
8
Enabler 4.2: Establish Relevant Attributes to Refer to and Define Data
Develop a Process for Selecting Attributes

Presentation: Each box in the flowchart corresponds in order to the bullets in the explanatory material.

Explanatory material (voiceover, popup, etc.; the following paragraphs follow the boxes that are highlighted):

Identify the metrics to be tracked
What information do the decision makers require?
Should throughput of the data process be monitored?
How will the success of the process be analyzed?
What information will aid in proper resource allocation?

Identify the relationships among the data elements
Is there a hierarchy?
Are there dependencies among attributes?
Would a taxonomy structure be helpful?

It is important to weigh the cost of creating and entering metadata attributes, as well as the potential benefits. If users are required to complete numerous metadata entries when placing a document in a repository, it is likely that documents will be entered with missing or erroneous entries, or that documents will not be entered into the repository at all. Potential attributes should be evaluated based on whether there is value added in tracking and locating data. The set of required attributes should be kept as small and simple as possible to enable a user to create simple descriptive records and provide for effective retrieval. Any existing metadata standards should be tailored to meet needs.
Ref. ANSI

Metadata attributes change over time due to evolving requirements throughout the life cycle. These changes include changes to the data repository (e.g., facility or system upgrades) as well as obsolescence. Part of the overall DM process includes periodic reviews of metadata attributes. When making changes to attributes, the enterprise should consider the impact on legacy data. In a large repository, it may not be feasible to update the metadata of existing data, and it may be necessary to develop translation tables or similar mechanisms.
Ref. ANSI

References: GEIA 859, Principle 4, Enabler 4.2
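A translation table for legacy metadata can be as simple as a lookup applied when a record is read, leaving the stored legacy data untouched. The attribute names and the `translate_legacy_record` function below are hypothetical and serve only to illustrate the mechanism.

```python
# Hypothetical translation table: legacy attribute name -> current attribute name.
LEGACY_TO_CURRENT = {
    "doc_no": "document_number",
    "rev": "revision",
}


def translate_legacy_record(legacy_record, table=LEGACY_TO_CURRENT):
    """Return a new record with legacy attribute names translated.

    Attributes not listed in the table are carried over unchanged, and the
    stored legacy record itself is not modified.
    """
    translated = {}
    for name, value in legacy_record.items():
        current_name = table.get(name, name)
        if current_name in translated:
            raise ValueError(f"two legacy attributes map to {current_name!r}")
        translated[current_name] = value
    return translated
```

Translating at read time avoids rewriting a large repository, at the cost of keeping the table under the same periodic review as the attributes themselves.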
9
Enabler 4.3: Assign Identifying Information to Distinguish Similar or Related Data Products from Each Other

Assign Identifying Information to Distinguish Among Similar Data Products

Presentation: Each box in the flowchart corresponds in order to the bullets in the explanatory material.

Explanatory material (voiceover, popup, etc.):
Data must be assigned unique identifying information, which commonly consists of a title, unique identifier (e.g., document number), the source of the document, date, and revision.
The enterprise should first confirm that a unique identifier is needed. Unique identifiers are assigned only to the data that needs to be tracked and controlled to meet ongoing needs for the data.
The identifier provides a method for differentiating among similar documents and enables consumers to identify the information they need to perform their assigned tasks. It also helps to minimize the delay in retrieving the desired information, and the problems caused by the use of incorrect information.

Application: item unique identification (IUID) example.

References: GEIA 859, Principle 4, Enabler 4.3
10
Identify Source for Identifiers
Data must be assigned unique identifying information, which commonly consists of a title, unique identifier (e.g., document number), the source of the document, date, and revision.

Voiceover or popup (could be combined with previous slide):
The requirements for document identification are discussed in EIA-649, National Consensus Standard for Configuration Management, and EIA-836, Configuration Management Data Exchange and Interoperability.
The enterprise should first confirm that a unique identifier is needed. Unique identifiers are assigned only to the data that needs to be tracked and controlled to meet ongoing needs for the data. The identifier provides a method for differentiating among similar documents and enables consumers to identify the information they need to perform their assigned tasks. It also helps to minimize the delay in retrieving the desired information, and the problems caused by the use of incorrect information.
Ref ANSI-859 section 4.3

Slide boxes: Internally Developed; Externally Assigned
11
1. What is Metadata?
a. Large Data
b. Data about data
c. Industry computing solution
d. b. and c.

2. ________ are needed to consistently describe data throughout the life cycle.
a. complex computer systems
b. business rules
c. flow charts
d. project specific vocabulary
e. all of the above

3. True/False: All data, regardless of physical attributes, can be catalogued, stored, and retrieved in the same manner.
(False: cataloguing, storing, and retrieving data depend on understanding the format of the data to be managed, e.g., hard copy vs. electronic.)

4. Data must be assigned unique identifying information, which commonly consists of ___________
a. title
b. unique identifier (e.g., document number)
c. the source of the document
d. date, and revision
e. All the above.
12
(True or False) Potential attributes should be evaluated based on whether there is value added in tracking and locating data.

When selecting metadata attributes, the enterprise should identify team members who potentially _____________.
Create data
Update data
Exchange data
Enter data into a repository
Search for data
All the above.