Data Governance
Data & Metadata Standards
Antonio Amorin
© 2011

Abstract
This data governance presentation focuses on data and metadata standards. The intention of the presentation is to identify new standards or modernize existing standards for both data and metadata.

Biography
Antonio Amorin
President, Data Innovations, Inc.
–Nineteen years of data modeling experience
–Eleven years of data profiling experience
–Delivered data modeling and data profiling solutions to numerous clients in the Midwest and East Coast
–Presented at national and international conferences, user groups, webcasts, and at client sites
–Founded Data Innovations, Inc. in 2002

Data Innovations, Inc.
Established in 2002
Based in northwest suburbs
Professional Services:
–Data Modeling
–Data Profiling
–Data Architecture
–Metadata
–Database Administration
–ETL
CA Service Partner in 2004
CA Commercial Reseller in 2006
CA Enterprise Solution Provider in 2007

Agenda
Data Standards
Metadata Standards
Recommendations
Summary

Data Standards
Documented agreements on representations, formats, and definitions of business data

Data Standards
Benefits
–Improved data quality
–Improved data compatibility
–Improved consistency and efficiency of data collection, use, and sharing
–Reduced data redundancy

Data Standards
Data Stewards
–Role or position
–Responsible for overseeing stewardship of the data and metadata
–Likely to be on both the business and IT sides of the organization
–Gatekeepers

Data Standards
Council or Board
–Data stewards and representatives of the various business areas
–Responsible and/or accountable for specific data for the organization

Data Standards
Types of Standards
–Data definitions
–Data rules
–Data values
–Data quality
–Data standardization
–Data security

Data Standards
Data Definitions and Rules
–Provide a consistent, clear understanding of what data content is expected
–Centralize or publish across the organization
–Enterprise data dictionary or metadata repository

Data Standards
Data Values
–Valid values lists
  Static or rarely changed data
  Codes
  Indicators
–Master reference data
  Customer
  Product
  Etc.
–Centralize

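A centralized valid values list can be checked against observed data with very little code. The sketch below is illustrative only: the `VALID_ORDER_STATUS` codes and the sample column are hypothetical and do not come from the presentation.

```python
from collections import Counter

# Hypothetical valid values list for an order-status code column.
VALID_ORDER_STATUS = {"N": "New", "S": "Shipped", "C": "Cancelled"}


def find_invalid_values(values, valid_values):
    """Return each distinct value missing from the valid values list, with its count."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if value not in valid_values}


# Sample column content, including a lowercase code, an unknown code, and a null.
observed = ["N", "S", "s", "C", "X", "X", None]
invalid = find_invalid_values(observed, VALID_ORDER_STATUS)
print(invalid)
```

Keeping the list in one place (here a single dictionary) is what lets every consumer apply the same check.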
Data Standards
Data Quality
–Leverage data profiling
  Column/Field
    –Value analysis
    –Pattern analysis
    –Data type analysis
  Table/File
    –Validate key structure
    –Determine dependencies
  Cross-table
    –Validate foreign keys
    –Valid values
  Cross-system

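As one illustration of pattern analysis at the column level, the sketch below reduces each value to a character pattern and counts how often each pattern occurs. The notation (`9` for a digit, `A` for a letter) and the sample ZIP codes are assumptions chosen for the example, not a notation defined in the presentation.

```python
from collections import Counter


def value_pattern(value):
    """Reduce a value to a pattern: digits become 9, letters become A, other characters are kept."""
    out = []
    for ch in str(value):
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def pattern_frequencies(values):
    """Count how many values share each pattern."""
    return Counter(value_pattern(v) for v in values)


# Hypothetical ZIP code column; the third value contains the letter O instead of a zero.
zip_codes = ["60067", "60004-1234", "6O067", "60173"]
patterns = pattern_frequencies(zip_codes)
print(patterns.most_common())
```

Rare patterns, such as `9A999` here, are usually the first place to look for data quality problems.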
Data Standards
Data Quality Assessments
–Standardize the process through detailed analysis procedures
–Identify the different data quality problems using standardized notation
–Summarize the analysis in reports to communicate to others
–Create detailed examples to coincide with the analysis procedures

Data Standards
Data Standardization
–Address
  Leverage address standardization software
–Phone and
  Leverage data quality software to standardize
–Business data
  Leverage valid values and master reference data to standardize data across the organization

Data Standards
Data Security
–Identify sensitive data
–Clearly define and publish procedure for requesting access
–Identify and maintain lists of users with access rights
–Validate regularly that the user still needs access

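The regular validation of access rights can be supported by a simple report of grants that have not been reviewed recently. This is a minimal sketch under stated assumptions: the 90-day review interval, the user IDs, and the `grants` structure are all hypothetical.

```python
from datetime import date, timedelta


def access_due_for_review(grants, today, max_age_days=90):
    """Return user IDs whose access was last validated more than max_age_days ago.

    grants maps a user ID to the date the access was last validated.
    The 90-day default is an assumed policy, not a recommendation from the source.
    """
    cutoff = today - timedelta(days=max_age_days)
    return sorted(user for user, last_validated in grants.items() if last_validated < cutoff)


# Hypothetical access list for one sensitive table.
grants = {"jdoe": date(2011, 1, 10), "asmith": date(2011, 5, 2)}
due = access_due_for_review(grants, today=date(2011, 6, 1))
print(due)
```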
Metadata Standards
Documented agreements on representations, formats, and definitions of metadata

Metadata Standards
Metadata Stewards
–Generally IT resources fill this role or position
–Responsible for overseeing stewardship of the metadata
–Standards are generally integrated into the SDLC

Metadata Categories

Model Metadata
Business metadata
  –Business requirements
  –Functional requirements
  –Data requirements
Data profiling metadata
  –Column profiling
  –Table profiling
  –Cross-table profiling
  –Cross-system profiling
Data quality metadata
  –Data quality statistics
Data modeling metadata
  –Enterprise data models
  –Logical models
  –Physical models
Mapping metadata
  –Source-to-target mapping
  –Data Flow Diagrams
Database metadata
  –Data Definition Language

Metadata Standards
Data Requirements
–Align with the business requirements
–Each business requirement is likely to have matching data requirements
–Clearly define the data content to be captured
–Profile existing data sources

Metadata Standards
Data Profiling
–Identify standards for utilization
  Create a step-by-step process for preparing the data, profiling the data, and analyzing the results
  Identify and document the communication method to the business and IT

Metadata Standards
Data Profiling
–Column Profiling
  Identify both valid and invalid
    –Values
    –Patterns
    –Data types
    –Lengths
  Standardize notation
    –Descriptions
    –Problems

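The column profiling items above (values, data types, lengths) can be captured as a small, repeatable summary per column. The sketch below assumes values arrive as strings, as they would from a delimited file; the type names and the sample column are hypothetical.

```python
from collections import Counter


def infer_type(value):
    """Classify a raw string value as null, integer, decimal, or text."""
    if value is None or value == "":
        return "null"
    for name, cast in (("integer", int), ("decimal", float)):
        try:
            cast(value)
            return name
        except ValueError:
            continue
    return "text"


def profile_column(values):
    """Summarize counts, inferred data types, and lengths for one column."""
    non_null = [v for v in values if infer_type(v) != "null"]
    lengths = [len(v) for v in non_null]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "types": dict(Counter(infer_type(v) for v in non_null)),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
    }


# Hypothetical amount column with a decimal, a text placeholder, and two nulls.
amounts = ["100", "250", "12.5", "n/a", "", None, "100"]
profile = profile_column(amounts)
print(profile)
```

Storing the result as a plain dictionary makes it straightforward to load into a metadata repository later.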
Metadata Standards
Data Profiling
–Table Profiling
  Validate key structure
  Identify candidate keys
  Identify natural keys
  Identify and document exceptions or violations
–Cross-Table Profiling
  Identify redundant data
  Validate foreign keys
  Identify orphaned rows

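Key validation and orphan detection both reduce to set and count operations. The following sketch uses in-memory rows for clarity; the table and column names are hypothetical, and a real profiling tool or SQL query would do the same work against the database.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return key values occurring more than once.

    An empty result means the columns behave as a candidate key in this data.
    """
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def orphaned_rows(child_rows, fk_column, parent_rows, pk_column):
    """Return child rows whose non-null foreign key has no matching parent key."""
    parent_keys = {row[pk_column] for row in parent_rows}
    return [
        row for row in child_rows
        if row[fk_column] is not None and row[fk_column] not in parent_keys
    ]


# Hypothetical parent and child tables.
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Beta"},
    {"customer_id": 3, "name": "Acme"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 4},
    {"order_id": 12, "customer_id": None},
]

print(duplicate_keys(customers, ["customer_id"]))
print(duplicate_keys(customers, ["name"]))
print(orphaned_rows(orders, "customer_id", customers, "customer_id"))
```

Null foreign keys are deliberately not reported as orphans here; whether they count as violations is a standard each organization would need to define.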
Metadata Standards
Data Profiling
–Cross-system Profiling
  Identify redundant data
  Identify inconsistent data
  Identify common matching criteria

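Once a common matching criterion has been agreed, redundant and inconsistent data across two systems can be listed directly. The sketch below assumes a single shared key column; the system names, key, and attributes are hypothetical.

```python
def compare_systems(system_a, system_b, match_key, compare_columns):
    """Match rows from two systems on a shared key and report overlaps and differences."""
    a_by_key = {row[match_key]: row for row in system_a}
    b_by_key = {row[match_key]: row for row in system_b}
    shared = sorted(a_by_key.keys() & b_by_key.keys())

    inconsistent = []
    for key in shared:
        for col in compare_columns:
            a_val, b_val = a_by_key[key][col], b_by_key[key][col]
            if a_val != b_val:
                inconsistent.append((key, col, a_val, b_val))

    return {
        "redundant_keys": shared,  # stored in both systems
        "inconsistent": inconsistent,  # same key, different attribute value
        "only_in_a": sorted(a_by_key.keys() - b_by_key.keys()),
        "only_in_b": sorted(b_by_key.keys() - a_by_key.keys()),
    }


# Hypothetical customer data held in a CRM system and a billing system.
crm = [
    {"customer_id": "C1", "state": "IL"},
    {"customer_id": "C2", "state": "WI"},
    {"customer_id": "C3", "state": "IN"},
]
billing = [
    {"customer_id": "C1", "state": "IL"},
    {"customer_id": "C2", "state": "MN"},
    {"customer_id": "C4", "state": "OH"},
]
report = compare_systems(crm, billing, "customer_id", ["state"])
print(report)
```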
Metadata Standards
Data Quality
–Consider requiring as part of all profiling initiatives
–Capture and store in metadata repository
–Establish thresholds
–Trend monitoring

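Thresholds and trend monitoring become concrete once data quality statistics are stored per profiling run. The sketch below assumes one statistic (percentage of valid rows) and a hypothetical 96 percent threshold; the run labels and counts are invented for illustration.

```python
def evaluate_quality(history, threshold):
    """Compare each profiling run against a threshold and against the previous run.

    history is a chronological list of (run_label, valid_rows, total_rows).
    """
    results = []
    previous = None
    for label, valid, total in history:
        pct = round(100.0 * valid / total, 2) if total else 0.0
        change = None if previous is None else round(pct - previous, 2)
        results.append({
            "run": label,
            "valid_pct": pct,
            "meets_threshold": pct >= threshold,
            "change": change,
        })
        previous = pct
    return results


# Hypothetical monthly statistics for one column.
history = [("2011-01", 970, 1000), ("2011-02", 950, 1000), ("2011-03", 990, 1000)]
results = evaluate_quality(history, threshold=96.0)
for r in results:
    print(r)
```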
Metadata Standards
Data Modeling
–Enterprise Data Model
  Identify a high-level view of where the data lives across the enterprise
  Centralize to make accessible across the organization
  Consider identifying enterprise-level entities for important data

Metadata Standards
Data Modeling
–Model Standards
  Standardized development process
  Model naming convention
  Name standards
  Data type standards
  Clearly documented review process

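Name standards are easiest to enforce when they can be checked automatically during model review. The sketch below assumes a hypothetical convention (upper snake case ending in an approved class word); the presentation does not prescribe any particular convention, so both the pattern and the class word list are illustrative.

```python
import re

# Hypothetical class words that a column name must end with.
APPROVED_CLASS_WORDS = {"ID", "NM", "DT", "AMT", "CD"}

# Hypothetical rule: upper-case words separated by single underscores.
NAME_PATTERN = re.compile(r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)*")


def check_column_name(name):
    """Return a list of name standard violations; an empty list means the name passes."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("not upper snake case")
    if name.rsplit("_", 1)[-1].upper() not in APPROVED_CLASS_WORDS:
        problems.append("missing approved class word suffix")
    return problems


for column in ["CUSTOMER_ID", "ORDER_TOTAL", "customerName"]:
    print(column, check_column_name(column))
```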
Metadata Standards
Data Modeling
–Logical/Physical Models Standards
  Model or project narrative
  Subject area
  Entity
  Relationships
  Attribute
  Identifier
  Derived and BI Elements

Metadata Standards
Data Modeling
–Metadata Validation
  Column level
    –Values
    –Patterns
    –Data types
    –Lengths
  Table level
    –Key validation
  Cross-table level
    –Foreign key relationships

Metadata Standards
Mapping
–Standardize mapping process
–Standardize format of mapping document
–Require data profiling as part of the mapping process or to validate mapping

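A standardized mapping format can be validated against profiling results before any ETL is built. The sketch below is one possible shape for a source-to-target mapping row, checked against profiled maximum lengths; the field names, tables, and columns are hypothetical rather than a format defined in the presentation.

```python
from dataclasses import dataclass


@dataclass
class MappingRow:
    """One row of a hypothetical source-to-target mapping document."""
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    target_max_length: int
    transformation: str = "direct move"


def validate_mapping(mapping, profiled_max_lengths):
    """Flag mappings whose source is unprofiled or whose data would not fit the target."""
    issues = []
    for row in mapping:
        key = (row.source_table, row.source_column)
        observed = profiled_max_lengths.get(key)
        if observed is None:
            issues.append((key, "source column has not been profiled"))
        elif observed > row.target_max_length:
            issues.append(
                (key, f"observed length {observed} exceeds target length {row.target_max_length}")
            )
    return issues


mapping = [
    MappingRow("CUST", "NAME", "DIM_CUSTOMER", "CUSTOMER_NM", 40),
    MappingRow("CUST", "PHONE", "DIM_CUSTOMER", "PHONE_NBR", 10),
    MappingRow("CUST", "STATE", "DIM_CUSTOMER", "STATE_CD", 2),
]
# Maximum observed lengths from a hypothetical column profiling run.
profiled_max_lengths = {("CUST", "NAME"): 35, ("CUST", "PHONE"): 14}

issues = validate_mapping(mapping, profiled_max_lengths)
for issue in issues:
    print(issue)
```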
Recommendations
Publish or centralize data and metadata standards
Integrate data and metadata standards into the SDLC
Include standards review during onboarding
Identify and publish the list of stewards
Enforce standards with offshore teams

Summary
Data and metadata standards need to be developed and supported by both IT and the business
Well-defined standards will enhance the development of new applications and simplify the integration of data across the organization

Questions?

Thank You!
Antonio C. Amorin
–(847)
Data Innovations, Inc.
–
–(888)