General Data Quality requirements (and metrics)

CLARITY (understandable): All data elements have a clear business meaning (name and definition).
COMPLETENESS (mandatory attributes): All required business attribute values exist at record level.
FORMAT CONFORMANCY (right business format of attributes): Each attribute must have the right data type, length, min/max value, ...
UNIQUE IDENTIFIABILITY (no duplicates at the technical or real-world level): Every record represents one unique real-world event, person, or thing.
REFERENTIAL INTEGRITY (relationships inside one DB, enterprise, world): All important relationships exist between related data records.
CONSISTENCY (no conflicts at record level or across records): Similar attribute values do not conflict with each other.
ACCURACY TO SOURCE: All loaded and transformed data is correct according to its source.
TIMELINESS: All data is available at the agreed time.
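As a rough illustration of how two of these requirements could be turned into metrics, the sketch below computes a completeness ratio and a uniqueness ratio over a list of records. The function names, the record layout, and the choice to express each metric as a share between 0 and 1 are assumptions made for this example, not definitions taken from the text above.

```python
from collections import Counter


def completeness(records, required):
    """Share of records in which every required attribute has a non-empty value."""
    if not records:
        return 1.0
    ok = sum(
        all(r.get(attr) not in (None, "") for attr in required)
        for r in records
    )
    return ok / len(records)


def uniqueness(records, key):
    """Share of records whose key value combination occurs exactly once."""
    if not records:
        return 1.0
    keys = [tuple(r.get(attr) for attr in key) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(records)


# Hypothetical sample data, for illustration only.
records = [
    {"id": 1, "name": "Alice", "country": "EE"},
    {"id": 2, "name": "", "country": "LV"},      # missing name
    {"id": 2, "name": "Bob", "country": "LT"},   # duplicate id
    {"id": 3, "name": "Carol", "country": "EE"},
]

print(completeness(records, ["id", "name", "country"]))  # 0.75
print(uniqueness(records, ["id"]))                       # 0.5
```

The other requirements (format, referential integrity, consistency, timeliness) could be measured the same way: count the records that pass a check and divide by the total.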
Example of Party DQ Rules and Metrics (set of rules)

Columns: DQ rule code name | Rule description | Identifiability | Completeness | Referential integrity | Consistency | Right format

System ID | Party must have a unique, non-changing, system-assigned ID that identifies the party internally within one database | x
Name | Party must have a country-based name in the original language
Type | Party must have a main type from the list (Private, Legal, Government, Financial institution)
Reg Code Privat | Private party must have a register code / social security number
Reg Code Privat Format | Private party social security number format must follow the country-based rules if Reg Country is EE, LT, LV | x
Reg Code Legal | Legal party must have a register code
Reg Country | Party must have a register country code from the list (EE, LV, LT, ...)
Reg Code Country code | Register code + register country code + party type must be unique and must identify the party globally / externally
Gender Reg code | Gender and register code of a private party must be consistent if Reg Country is EE, LT, LV | x
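A minimal sketch of how some of the party rules above might be checked in code. The field names (`system_id`, `name`, `type`, `reg_code`, `reg_country`), the spelling of the type values, and both function names are hypothetical choices for this example. The country-specific format and gender checks are left out because the source does not spell out those rules.

```python
# Allowed values taken from the rule descriptions; spellings are assumed.
PARTY_TYPES = {"Private", "Legal", "Government", "Financial institution"}
REG_COUNTRIES = {"EE", "LV", "LT"}  # the source list continues with "..."


def check_party(party):
    """Return the code names of the record-level rules that the party violates."""
    failed = []
    if party.get("system_id") is None:
        failed.append("System ID")
    if not party.get("name"):
        failed.append("Name")
    if party.get("type") not in PARTY_TYPES:
        failed.append("Type")
    if party.get("reg_country") not in REG_COUNTRIES:
        failed.append("Reg Country")
    if party.get("type") == "Private" and not party.get("reg_code"):
        failed.append("Reg Code Privat")
    if party.get("type") == "Legal" and not party.get("reg_code"):
        failed.append("Reg Code Legal")
    return failed


def duplicate_global_ids(parties):
    """Cross-record rule: register code + register country + type must be unique."""
    seen, duplicates = set(), set()
    for p in parties:
        key = (p.get("reg_code"), p.get("reg_country"), p.get("type"))
        if None in key:
            continue  # incomplete keys are reported by check_party instead
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return duplicates


# Hypothetical sample data, for illustration only.
parties = [
    {"system_id": 1, "name": "Mari Maasikas", "type": "Private",
     "reg_code": "A1", "reg_country": "EE"},
    {"system_id": 2, "name": "Example Ltd", "type": "Legal",
     "reg_code": None, "reg_country": "EE"},
    {"system_id": 3, "name": "M. Maasikas", "type": "Private",
     "reg_code": "A1", "reg_country": "EE"},
]

print(check_party(parties[0]))        # []
print(check_party(parties[1]))        # ['Reg Code Legal']
print(duplicate_global_ids(parties))  # {('A1', 'EE', 'Private')}
```

Keeping record-level checks separate from the cross-record uniqueness check mirrors the distinction the requirements draw between record level and cross-record level.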