Finding Information A337/A523
What are some of the possible problems with finding information?
Information often lacks STRUCTURE. The ASSOCIATION between the identifying information (i.e., labels) and the actual information is not always obvious. CONSISTENCY is not always present, e.g., (317). You may later need to MANIPULATE the data (filter, search, sort, etc.).
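To make the MANIPULATE point concrete, here is a minimal sketch of filtering and sorting once every value is attached to a label. The records, field names, and values are hypothetical and chosen only for illustration.

```python
# Hypothetical records: each dict pairs labels (keys) with values.
records = [
    {"name": "Avery", "city": "Indianapolis", "balance": 120.0},
    {"name": "Blake", "city": "Bloomington", "balance": 75.5},
    {"name": "Casey", "city": "Indianapolis", "balance": 310.25},
]

# Filter: keep only the records whose city is Indianapolis.
indy = [r for r in records if r["city"] == "Indianapolis"]

# Sort: order the filtered records by balance, largest first.
indy_sorted = sorted(indy, key=lambda r: r["balance"], reverse=True)

for r in indy_sorted:
    print(r["name"], r["balance"])
```

Both steps depend on the labels being present and used consistently; without them there is nothing reliable to filter or sort on.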
Typical “Office” Applications: Word Processing; Spreadsheet; Database Management System (DBMS)
Spreadsheets and DBMSes: Columns (labels); Rows (“instance” or record); Intersection (value)
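One way to picture the column, row, and intersection idea is a small table held in memory. This is only a sketch: the column labels, the rows, and the helper `cell` are hypothetical names, not part of any spreadsheet or DBMS product.

```python
# Hypothetical table: `columns` holds the labels, each entry in `rows` is one record.
columns = ["item", "quantity", "price"]
rows = [
    ["widget", 4, 2.50],
    ["gadget", 1, 10.00],
]

def cell(row_index, column_label):
    """Return the value where the given row meets the given column."""
    return rows[row_index][columns.index(column_label)]

print(cell(1, "price"))
```

The label identifies the column, the index identifies the record, and the value lives at their intersection.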
Spreadsheets: Tables in MS Excel
DBMSes: Tables in MS Access. A table is one of many objects in a database. Tables are easier to associate than in a spreadsheet (which relies on functions such as VLOOKUP). Tables have several unique properties we’ll discuss later.
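As a rough illustration of associating tables in a DBMS, the sketch below joins two tables on a shared key. It uses SQLite through Python's standard `sqlite3` module as a stand-in for MS Access, and the `customers` and `orders` tables and their contents are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for a desktop DBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Avery'), (2, 'Blake');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 1, 5.0);
""")

# Associate each order with its customer through the shared key.
joined = conn.execute("""
    SELECT orders.id, customers.name, orders.total
    FROM orders
    JOIN customers ON customers.id = orders.customer_id
    ORDER BY orders.id
""").fetchall()

for row in joined:
    print(row)
```

A single join describes the association for every row at once, whereas a spreadsheet needs a lookup formula copied into each row.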
ERP Systems: A centralized database eliminates the need to associate data located on separate systems.
Data Quality: What is Dirty Data? It happens when the UPC code on a package doesn't match the item. Causes? Vendor: unique product code and cost. Retailer: unique product code and price.
Data Quality: What is Dirty Data? Potential Problems? Inventory; Reorder; Profit per unit; Net profit; Customer Satisfaction; Repeat Business; Angry Bloggers. Solution: Same code for vendor and retailer. Data Integrity: Wal-Mart's Dirty Secret
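A minimal sketch of how mismatched codes might be detected, assuming the vendor and the retailer each keep a catalog keyed by product code. The catalogs, the codes, and the function `unmatched_codes` are hypothetical.

```python
# Hypothetical catalogs: product code -> item description.
vendor_catalog = {"012345678905": "cereal", "036000291452": "soap"}
retailer_catalog = {"012345678905": "cereal", "999000000017": "soap"}

def unmatched_codes(vendor, retailer):
    """Return the codes that appear in only one of the two catalogs, sorted."""
    return sorted(set(vendor) ^ set(retailer))

for code in unmatched_codes(vendor_catalog, retailer_catalog):
    print("no match for code", code)
```

If both parties used the same code for soap, the function would return an empty list, which is the state the proposed solution aims for.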
Extract, Transform, Load (ETL): From Computerworld QuickStudy
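The three ETL stages can be sketched in a few lines. This is an illustration under stated assumptions, not the process described in the Computerworld piece: the raw CSV text, the `products` table, and the function names `extract`, `transform`, and `load` are all hypothetical, and SQLite stands in for the target database.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; note the stray spaces around the first code.
RAW = "code,price\n 012345678905 ,2.50\n036000291452,1.25\n"

def extract(text):
    """Extract: read the raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace from codes and convert prices to floats."""
    return [(r["code"].strip(), float(r["price"])) for r in rows]

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products (code TEXT PRIMARY KEY, price REAL)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW)), conn)
loaded = conn.execute("SELECT code, price FROM products ORDER BY code").fetchall()
print(loaded)
```

Keeping the stages as separate functions means the cleaning rules in `transform` can change without touching how data is read or stored.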