Data Quality Class 4. Goals Questions Review of SQL select Data Quality Rules.


1 Data Quality Class 4

2 Goals Questions Review of SQL select Data Quality Rules

3 SQL Structured Query Language Used to extract data from databases Used to insert data into a database

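4 The Select Statement select [all | distinct] select_list from [table_name | view_name] [, [table_name | view_name]...] [where search_conditions] [group by column_name [, column_name]...] [having search_conditions] [order by {column_name | select_list_number} [asc | desc] [, {column_name | select_list_number} [asc | desc]]...]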
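As a rough illustration of how the clauses of a select statement fit together, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and its rows are hypothetical and are not part of the slides.

```python
import sqlite3

# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10.0), ("ann", 25.0), ("bob", 5.0), ("cy", 40.0)],
)

# where filters rows, group by forms groups, having filters groups,
# order by sorts the final result.
rows = conn.execute(
    """
    SELECT customer, SUM(total) AS spent
    FROM orders
    WHERE total > 0
    GROUP BY customer
    HAVING SUM(total) >= 10
    ORDER BY spent DESC
    """
).fetchall()
print(rows)
```

Here bob is dropped by the having clause because his group total is below 10.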

5 Data Quality Rules Definitions Proscriptive Assertions Prescriptive Assertions Conditional Assertions Operational Assertions

6 Definitions Nulls Domains Mappings

7 Proscriptive Assertions Describe what is not allowed Used to figure out what is wrong with data Used for validation

8 Prescriptive Assertions Describe what is supposed to happen with data Can be used for data population, extraction, transformation Can also be used for validation

9 Conditional Assertions Define an assertion that must be true if a condition is true

10 Operational Assertions Define an action that must be taken if a condition is true

11 9 Classes of Rules 1. Null value rules 2. Value rules 3. Domain membership rules 4. Domain Mappings 5. Relation rules 6. Table, Cross-table, and Cross-message assertions 7. In-Process directives 8. Operational Directives 9. Other rules

12 Null Value Rules Null value specification – Define GETDATE for unavailable as “fill in date” Null values allowed – Attribute A allowed nulls {GETDATE, U, X} Null values not allowed – Attribute B nulls not allowed
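One possible way to check the "nulls not allowed" rule is sketched below. It assumes records are Python dictionaries and that the null representations {GETDATE, U, X} from the slide are stored as plain strings; the function name and the sample records are hypothetical.

```python
# Null representations named on the slide, assumed here to be plain strings.
NULL_REPRESENTATIONS = {"GETDATE", "U", "X"}


def null_violations(records):
    """Return the indexes of records whose attribute B is null.

    Attribute A may hold any null representation, so it is not checked.
    Attribute B may hold neither a real null nor a null representation.
    """
    violations = []
    for index, record in enumerate(records):
        b = record.get("B")
        if b is None or b == "" or b in NULL_REPRESENTATIONS:
            violations.append(index)
    return violations


sample = [
    {"A": "GETDATE", "B": "b1"},  # fine: A is allowed to be null
    {"A": "U", "B": None},        # violation: B is missing
    {"A": "a3", "B": "X"},        # violation: B holds a null representation
]
print(null_violations(sample))
```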

13 Value Rules Value restriction rule Restrict GRADE: value >= ‘A’ AND value <= ‘F’ AND value != ‘E’
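The GRADE restriction can be written almost literally as a predicate. This is a minimal sketch that assumes grades are single uppercase letters; the function name is hypothetical.

```python
def grade_is_valid(value):
    """Value restriction from the slide: 'A' <= value <= 'F' and value != 'E'.

    Assumes single-letter grades; Python compares strings lexicographically.
    """
    return "A" <= value <= "F" and value != "E"


print([g for g in "ABCDEFG" if grade_is_valid(g)])
```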

14 Domain Rules Domain Definition Domain Membership Domain Nonmembership Domain Assignment

15 Mapping Rules Mapping definition Mapping membership Mapping nonmembership Mapping Assignment

16 Relation Rules Completeness Exemption Consistency Derivation

17 Completeness Defines when a record is complete (i.e., which fields must be present): IF (Orders.Total > 0.0), Complete With {Orders.Billing_Street, Orders.Billing_City, Orders.Billing_State, Orders.Billing_ZIP}
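The completeness rule for Orders might be checked as follows. This sketch assumes each order is a dictionary keyed by the field names on the slide and treats None or an empty string as "not present"; both choices are assumptions.

```python
BILLING_FIELDS = ("Billing_Street", "Billing_City", "Billing_State", "Billing_ZIP")


def is_complete(order):
    """IF Total > 0.0, every billing field must be present."""
    if order.get("Total", 0.0) > 0.0:
        return all(order.get(field) not in (None, "") for field in BILLING_FIELDS)
    # The condition does not hold, so the rule places no requirement.
    return True


paid = {"Total": 19.99, "Billing_Street": "1 Main St", "Billing_City": "Troy",
        "Billing_State": "NY", "Billing_ZIP": ""}
free = {"Total": 0.0}
print(is_complete(paid), is_complete(free))
```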

18 Exemption Defines which fields may be missing IF (Orders.Item_Class != “CLOTHING”) Exempt {Orders.Color, Orders.Size }
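An exemption can be read as removing fields from the set that would otherwise be required. The sketch below assumes that Color and Size are required unless the exemption applies, which the slide does not state explicitly; the function name is hypothetical.

```python
def missing_required(order):
    """Return required fields that are missing, honoring the exemption.

    Assumption: Color and Size are required unless the exemption applies.
    """
    required = {"Color", "Size"}
    if order.get("Item_Class") != "CLOTHING":
        # IF (Item_Class != "CLOTHING") Exempt {Color, Size}
        required -= {"Color", "Size"}
    return sorted(f for f in required if order.get(f) in (None, ""))


print(missing_required({"Item_Class": "BOOK"}))
print(missing_required({"Item_Class": "CLOTHING", "Color": "red"}))
```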

19 Consistency Define a relationship between attributes based on field content – IF (Employees.title == “Staff Member”) Then (Employees.Salary >= 20000 AND Employees.Salary < 30000)
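A consistency rule can be validated with a select that returns the records breaking it. The following sketch uses sqlite3; the name column and the sample rows are hypothetical, while the title and Salary columns follow the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (name TEXT, title TEXT, Salary REAL)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        ("ann", "Staff Member", 25000),
        ("bob", "Staff Member", 31000),  # outside the allowed range
        ("cy", "Manager", 90000),        # rule does not apply
    ],
)

# Select records where the condition holds but the assertion fails.
violations = conn.execute(
    """
    SELECT name FROM Employees
    WHERE title = 'Staff Member'
      AND NOT (Salary >= 20000 AND Salary < 30000)
    """
).fetchall()
print(violations)
```

Note that a null Salary would not be reported by this query, so null handling would need a separate null value rule.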

20 Derivation Prescriptive form of consistency rule Details how one attribute’s value is determined based on other attributes IF (Orders.NumberOrdered > 0) Then { Orders.Total = (Orders.NumberOrdered * Orders.Price) * 1.05 }
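Because a derivation rule is prescriptive, it can be applied to populate the field rather than only to test it. A minimal sketch, assuming orders are dictionaries with the field names from the slide (the function name is hypothetical):

```python
def derive_total(order):
    """IF NumberOrdered > 0 THEN Total = (NumberOrdered * Price) * 1.05."""
    if order["NumberOrdered"] > 0:
        order["Total"] = (order["NumberOrdered"] * order["Price"]) * 1.05
    return order


order = derive_total({"NumberOrdered": 2, "Price": 10.0})
print(order["Total"])
```

The same expression can be used for validation by comparing a stored Total against the derived value within a small tolerance.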

21 Table and Cross-Table Rules Functional Dependence Primary Key Assertion Foreign Key Assertion (=referential integrity)

22 Functional Dependence Functional Dependence between columns X and Y: – For any two records R1 and R2 in a table, if field X of record R1 contains value x and field X of record R2 contains the same value x, then if field Y of record R1 contains the value y, then field Y of record R2 must contain the value y. In other words, attribute Y is said to be determined by attribute X.
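The definition suggests a direct test: remember the first Y value seen for each X value and flag any record that disagrees. This is one possible sketch; the function name and the zip/city sample data are hypothetical.

```python
def fd_violations(records, x, y):
    """Return records whose y value contradicts an earlier record with the same x."""
    first_seen = {}
    violations = []
    for record in records:
        x_value, y_value = record[x], record[y]
        if x_value in first_seen and first_seen[x_value] != y_value:
            violations.append(record)
        else:
            first_seen.setdefault(x_value, y_value)
    return violations


rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Boston"},  # same zip, different city
    {"zip": "02139", "city": "Cambridge"},
]
print(fd_violations(rows, "zip", "city"))
```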

23 Primary Key Assertion A set of attributes defined as a primary key must uniquely identify a record Enforcement = testing for duplicates across defined key set
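Testing for duplicates across a key set maps naturally onto group by and having. The sketch below uses sqlite3 with a hypothetical accounts table whose assumed key set is (branch, acct_no).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (branch TEXT, acct_no INTEGER, owner TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [("N", 1, "ann"), ("N", 2, "bob"), ("N", 1, "cy"), ("S", 1, "dee")],
)

# Any key value that occurs more than once violates the primary key assertion.
duplicates = conn.execute(
    """
    SELECT branch, acct_no, COUNT(*) AS occurrences
    FROM accounts
    GROUP BY branch, acct_no
    HAVING COUNT(*) > 1
    """
).fetchall()
print(duplicates)
```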

24 Foreign Key Assertion When the values in field f in table T are chosen from the key values in field g in table S, field T.f is said to be a foreign key referencing S.g. If f is a foreign key, each of its values must exist in table S, column g (= referential integrity)
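Referential integrity can be checked by looking for values of T.f that have no match in S.g. The sketch below uses sqlite3 and the table and field names from the slide; the sample values are hypothetical, and null values of f are assumed to be exempt from the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S (g INTEGER)")
conn.execute("CREATE TABLE T (f INTEGER)")
conn.executemany("INSERT INTO S VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO T VALUES (?)", [(1,), (3,), (4,), (None,)])

# A left join keeps every row of T; unmatched rows have S.g set to null.
orphans = conn.execute(
    """
    SELECT T.f
    FROM T LEFT JOIN S ON T.f = S.g
    WHERE T.f IS NOT NULL AND S.g IS NULL
    """
).fetchall()
print(orphans)
```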

25 In-process Directives Definition directives (labeling information chain members) Measurement directives Trigger directives

26 Operational Directives Transformation Update

27 Other Rules Approximate Searching rules Approximate Matching rules

