Quality and Reliability of CRIS data A case for euroCRIS? euroCRIS Membership Meeting November 1 – 2, 2007, Vienna Maximilian Stempfhuber GESIS–IZ Social Science Information Centre Bonn, Germany
What to expect No Answers… …only Questions!
Current situation Data within a single CRIS is not up-to-date or correct Data harvested from different sources does not match Coupling of systems and data difficult because of different features, data structures / semantics, invalid references, … What more?
Data errors
Single data source
–Schema level
  Value out of range
  Referential integrity violated
  …
–Data level
  Missing value
  Typing errors
  Wrong values
  Duplicates
  …
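Some of the single-source errors listed above can be detected mechanically. The following is a minimal sketch, not a reference implementation: the record layout (`id`, `title`, `start_year`, `org_id`), the sample data, and the accepted year range are all hypothetical and chosen only for illustration. Typing errors and plausible but wrong values are deliberately not covered, since they usually need external reference data or human review.

```python
from collections import Counter

# Hypothetical project records; field names and values are illustrative only.
projects = [
    {"id": "P1", "title": "Data Quality in CRIS", "start_year": 2005, "org_id": "O1"},
    {"id": "P2", "title": "", "start_year": 2006, "org_id": "O1"},
    {"id": "P3", "title": "Metadata Survey", "start_year": 2105, "org_id": "O2"},
    {"id": "P4", "title": "Grid Services", "start_year": 2007, "org_id": "O9"},
    {"id": "P5", "title": "data quality in cris ", "start_year": 2005, "org_id": "O1"},
]
organisations = {"O1", "O2"}  # hypothetical set of valid organisation ids


def check_single_source(records, known_orgs, year_range=(1950, 2030)):
    """Return a list of (record id, error kind) pairs."""
    errors = []
    # Normalise titles so that case and whitespace variants count as duplicates.
    title_counts = Counter(
        r["title"].strip().lower() for r in records if r["title"].strip()
    )
    for r in records:
        title = r["title"].strip()
        if not title:
            errors.append((r["id"], "missing value"))            # data level
        elif title_counts[title.lower()] > 1:
            errors.append((r["id"], "possible duplicate"))       # data level
        if not year_range[0] <= r["start_year"] <= year_range[1]:
            errors.append((r["id"], "value out of range"))       # schema level
        if r["org_id"] not in known_orgs:
            errors.append((r["id"], "referential integrity violated"))  # schema level
    return errors


errors = check_single_source(projects, organisations)
for rec_id, kind in errors:
    print(rec_id, kind)
```

In a real CRIS the schema-level checks would normally be enforced by the database itself (constraints, foreign keys); the sketch only makes the categories concrete.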
Data errors (cont.)
Multiple data sources
–Schema level
  Structural heterogeneity
  Semantic heterogeneity
  …
–Data level
  Contradictory values
  Different representations
  Different level of aggregation
  Duplicates
  …
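For multiple sources, two of the data-level problems above (different representations and contradictory values) can be illustrated with a short sketch. Everything here is an assumption made for illustration: the two sample records, the field names, and the rule that a name written as "Last, First" is equivalent to "First Last". Real record linkage typically needs fuzzier matching than this.

```python
import re

# Two hypothetical sources describing the same (fictional) person.
source_a = {"name": "Doe, Jane", "city": "Bonn", "projects": 12}
source_b = {"name": "Jane  Doe", "city": "Vienna", "projects": 12}


def normalise_name(name):
    """Map 'Last, First' and 'First Last' to one canonical 'first last' form."""
    name = re.sub(r"\s+", " ", name.strip())  # collapse repeated whitespace
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return name.lower()


def compare_records(a, b):
    """Return the shared fields whose values still differ after normalisation."""
    conflicts = {}
    for field in sorted(a.keys() & b.keys()):
        va, vb = a[field], b[field]
        if field == "name":
            va, vb = normalise_name(va), normalise_name(vb)
        if va != vb:
            conflicts[field] = (a[field], b[field])
    return conflicts


# Different representations of the name are reconciled ...
same_person = normalise_name(source_a["name"]) == normalise_name(source_b["name"])
# ... but the contradictory city values remain and need a decision.
conflicts = compare_records(source_a, source_b)
print(same_person, conflicts)
```

The sketch detects the contradiction but does not resolve it; which source is correct is exactly the kind of question the following slides raise.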
Quality of data When is an error an error? Who decides what is correct? How can we correct existing errors? How can we prevent future errors? What is Quality? How can we guarantee it in a CRIS?
What is Quality? Degree to which a set of inherent characteristics fulfills requirements (ISO 9000) Conformance to requirements (Philip B. Crosby) "Fitness for use". Fitness is defined by the customer. (Joseph M. Juran) Quality has two dimensions: "must-be quality" and "attractive quality" (Noriaki Kano)
What is Quality? A quality is a characteristic that a product or service must have. For example, products must be reliable, useable, and repairable. These are some of the characteristics that a good quality product must have. Similarly, service should be courteous, efficient, and effective. These are some of the characteristics that a good quality service must have. In short, a quality is a desirable characteristic. …
What is Quality? (cont.) However, not all qualities are equal. Some are more important than others. The most important qualities are the ones that customers want. These are the qualities that products and services must have. …
What is Quality? (cont.) So providing quality products and services is all about meeting customer requirements. It's all about meeting the needs and expectations of customers. So a quality product or service is one that meets the needs and expectations of customers.
What is Quality? (cont.) The quality of a product or service refers to the perception of the degree to which the product or service meets the customer's expectations. Quality has no specific meaning unless related to a specific function and/or object. Quality is a perceptual, conditional and somewhat subjective attribute.
Information Quality Information quality (IQ), or data quality, denotes the degree of relevance of information in relation to a specific context and information need. –Requirements may be user-specific or very general –Total of all requirements placed on information or information products (information-process-oriented view) –Information that is fit for use by information consumers (user-oriented view)
Information Quality (cont.) Business oriented view: –Creating your own data and information: constructive information quality. –Getting data and information from external sources: receptive information quality
Criteria for IQ
Intrinsic value: correctness, objectivity, trustability, reputation
Information context: relevance, added value, timeliness, completeness, amount of information
View of information: interpretability, comprehensibility, freedom from manipulation, integrity, freedom from conflicts
Information access: access to the system, secure access
(Wang & Strong)
Criteria for IQ (cont.) User-specific view: Degree of confidence in the correctness of the information Trustability of information on the basis of previous experiences Verifiability of information Precision of information Timeliness of information (Heinrich)
Criteria for IQ (cont.)
For electronic media:
Internal quality: precision, objectivity, trustability
Quality of access: accessibility, security
Quality in context: meaning, added value, timeliness, completeness, information content
Quality of display: interpretability, comprehensibility, compactness
Quality of metadata (meta information): existence, adequacy
Quality of structure: existence, adequacy, traceability
(Königer & Reithmayer)
Quality and CRISs User’s view (determines categories for CRIS quality) Data producer’s view (initially creates information and (sometimes) has to maintain it) Data provider’s view (has to ensure information quality and quality of service)
Quality and CRISs (cont.) Roles: Data producers/researchers, CRIS/service providers, CRIS users IQ criteria: Precision, objectivity, trustability, timeliness, completeness, added value, accessibility, … Is it going beyond Code of Good Practice? Who is responsible for which quality criteria (in which phase)?
User’s view Do we know the users’ information needs (records, statistics, …)? Do we know of canonical needs (to specify pre-structured queries)? Do we know how information should be displayed, how it should be browsable, …? Do we know how information is used at the user’s site (preferred formats, additional processing)?
CRIS provider’s view What scope and content should the CRIS have (= users’ information needs)? How can we guarantee completeness? How can we guarantee sustainability? How should quality criteria be defined for local use of a CRIS? And how for federated CRISs?
Data producer’s view What support do I have in entering data? Who helps me in maintaining it? Can I reuse the data I entered in other contexts?
Questions to euroCRIS Do we have generally accepted use cases? Do we have a common set of information quality criteria (beyond what is supported by database mechanisms and the CERIF structure)? Do we need end-user testing? How can we establish IQ in the CRIS community? How can we share IQ with other actors?
Thank You! Dr. Maximilian Stempfhuber GESIS-IZ Social Science Information Centre Lennéstr. 30, Bonn, Germany