DWH-Ahsan Abdullah 1 Data Warehousing Lab Lect-2 Lab Data Set Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics Research FAST National University of Computers & Emerging Sciences, Islamabad
Multi-Campus University
Degree Programs
Disciplines for BS
Disciplines for MS
The need: Head Office wants a central data repository for decision support, i.e. a DWH.
Students Record Keeping & Mgmt.
Data from Lahore Campus
Data from Lahore Campus: Sample
Lahore: Header of Student Table: SID, St_Name, Father_Name
Lahore: Header of Student Table: Gender, Address, [Date of Birth], [Reg Date]
Lahore: Header of Student Table: [Reg Status], [Degree Status], [Last Degree]
Lahore: Header of Course Reg. Table: SID, Degree, Semester, Course, Marks, Discipline
Lahore: Facts About Data
Data from Karachi Campus
Data from Karachi Campus: Sample
Karachi: Header of Student Table: St_ID, Name, Father, DoB, M/F, DoReg, RStatus, DStatus, Address, Qualification
Karachi: Header of Course Reg. Table: SID, Courses, Score, Sem, Disp. Note: Degree (BS/MS) is missing because separate books are maintained, but this becomes a critical issue while loading the data.
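One way to handle the missing Degree attribute at load time is to tag each row with the degree of the book it was read from. The following is a minimal sketch, assuming the separate books correspond to BS and MS; the function name `tag_with_degree` and the sample rows are hypothetical and are not taken from the lab data set.

```python
from typing import Dict, List


def tag_with_degree(rows_by_book: Dict[str, List[dict]]) -> List[dict]:
    """Merge rows from separate books, adding a Degree column to each row.

    rows_by_book maps a degree label ("BS" or "MS") to the rows read from
    the book kept for that degree (an assumption about how books are split).
    """
    combined = []
    for degree, rows in rows_by_book.items():
        for row in rows:
            tagged = dict(row)          # copy so the source rows stay untouched
            tagged["Degree"] = degree   # the attribute the Karachi table lacks
            combined.append(tagged)
    return combined


# Made-up sample rows using the Karachi course registration header.
bs_rows = [{"SID": "K-001", "Courses": "C-101", "Score": 80, "Sem": "Fall", "Disp": "CS"}]
ms_rows = [{"SID": "K-900", "Courses": "C-701", "Score": 75, "Sem": "Spring", "Disp": "CS"}]

loaded = tag_with_degree({"BS": bs_rows, "MS": ms_rows})
for row in loaded:
    print(row["SID"], row["Degree"])
```

Tagging during the load, rather than afterwards, matters because once rows from both books are combined the degree can no longer be recovered from the row itself.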
Karachi: Facts About Data
Data from Islamabad Campus
Data from Islamabad Campus: Sample
Islamabad: Header of Student Table: Roll Num, Name, Father, Reg Date, Reg Status, Degree Status, Date of Birth, Education, Gender, Address
Islamabad: Header of Course Reg. Table: Roll Num, Course, Marks, Discipline, Session. Note: Degree (BS/MS) is missing, although the same table contains records for both. The only way to differentiate them is through the Discipline attribute.
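Deriving the degree from the Discipline attribute could look like the sketch below. It assumes that every discipline belongs to exactly one degree. The mapping values are placeholders for illustration only; the real lists come from the "Disciplines for BS" and "Disciplines for MS" slides, and the helper name `add_degree` is hypothetical.

```python
from typing import Dict, List, Optional

# Placeholder mapping (NOT the actual discipline lists of the university).
DISCIPLINE_TO_DEGREE: Dict[str, str] = {"CS": "BS", "SPM": "MS"}


def add_degree(rows: List[dict], mapping: Dict[str, str]) -> List[dict]:
    """Return copies of rows with a Degree column derived from Discipline.

    Rows whose discipline is not in the mapping get Degree=None so they
    can be reviewed by hand instead of being loaded with a guessed value.
    """
    enriched_rows = []
    for row in rows:
        enriched = dict(row)
        key = str(row["Discipline"]).strip().upper()
        degree: Optional[str] = mapping.get(key)
        enriched["Degree"] = degree
        enriched_rows.append(enriched)
    return enriched_rows


# Made-up rows using the Islamabad course registration header.
sample = [
    {"Roll Num": "I-01", "Course": "C-101", "Marks": 70, "Discipline": "cs", "Session": "fall04"},
    {"Roll Num": "I-02", "Course": "C-701", "Marks": 82, "Discipline": "SPM", "Session": "fall04"},
    {"Roll Num": "I-03", "Course": "C-301", "Marks": 65, "Discipline": "XYZ", "Session": "fall04"},
]
result = add_degree(sample, DISCIPLINE_TO_DEGREE)
needs_review = [r["Roll Num"] for r in result if r["Degree"] is None]
print(needs_review)
```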
Islamabad: Facts About Data
Exercise
Problems with Adhoc Approach
Problem-1: Non-Standard data sources
  - Lahore: Text Files
  - Karachi: Excel Book
  - Islamabad: MS-ACCESS
  - Peshawar: Text Files
Problem-2: Non-standard attributes
Problem-3: Non Normalized database
Notepad: Issues
MS-Excel: Issues
MS-Access: Issues
Problem Statement
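The student table headers listed for Lahore, Karachi and Islamabad illustrate this problem: the same attribute appears under a different name at each campus. A minimal sketch of one possible fix is a per-campus rename table. The standard names on the right-hand side are assumptions made for this example, as is the pairing of Qualification and Education with the last degree; the helper `standardize` is hypothetical.

```python
from typing import Dict

# Source column name -> assumed standard name, per campus.
STANDARD_NAMES: Dict[str, Dict[str, str]] = {
    "Lahore": {
        "SID": "student_id", "St_Name": "name", "Father_Name": "father_name",
        "Gender": "gender", "Address": "address", "Date of Birth": "date_of_birth",
        "Reg Date": "reg_date", "Reg Status": "reg_status",
        "Degree Status": "degree_status", "Last Degree": "last_degree",
    },
    "Karachi": {
        "St_ID": "student_id", "Name": "name", "Father": "father_name",
        "M/F": "gender", "Address": "address", "DoB": "date_of_birth",
        "DoReg": "reg_date", "RStatus": "reg_status",
        "DStatus": "degree_status", "Qualification": "last_degree",
    },
    "Islamabad": {
        "Roll Num": "student_id", "Name": "name", "Father": "father_name",
        "Gender": "gender", "Address": "address", "Date of Birth": "date_of_birth",
        "Reg Date": "reg_date", "Reg Status": "reg_status",
        "Degree Status": "degree_status", "Education": "last_degree",
    },
}


def standardize(campus: str, record: dict) -> dict:
    """Rename the keys of one student record to the assumed standard names."""
    mapping = STANDARD_NAMES[campus]
    unknown = set(record) - set(mapping)
    if unknown:
        # Fail loudly rather than silently dropping an unmapped attribute.
        raise KeyError(f"unmapped columns for {campus}: {sorted(unknown)}")
    return {mapping[key]: value for key, value in record.items()}


# Made-up record using the Karachi header.
karachi_record = {"St_ID": "K-001", "M/F": "M", "DoReg": "2004-09-01"}
print(standardize("Karachi", karachi_record))
```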
Data from Peshawar Campus
  - Data at the Peshawar campus is stored in text files.
  - Two text files are used to store the data for one complete batch:
      - Lhr_Student_batch (student record)
      - Lhr_Detail_batch (course registration record)
  - 22 text files for 11 BS batches
  - 8 text files for 4 MS batches
Data from Peshawar Campus: Sample
Peshawar: Header of Student Table
  - Reg#: Student identity
  - Name: Student name
  - Father: Father name
  - Address: Permanent address
  - Date of Birth: Date of birth
  - lastDeg: Last degree achieved
  - Reg Date: Date of enrollment
  - Reg Status: Status of enrollment (A/T)
  - Degree Status: Status of degree (C/I)
Peshawar: Header of Course Reg. Table
  - Reg#
  - Courses: Course code
  - Score: Out of 100
  - Program: CS/TC/SE/CE
  - Sem: Fall/Spring
  - Year: YYYY
  Note: the semester session (e.g. fall04) must be identified by combining Sem and Year.
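Combining Sem and Year into a session label can be sketched as follows. The function name `session_label` is hypothetical, and the lower-case name plus two-digit year format is inferred from the single example "fall04" in the slide.

```python
def session_label(sem: str, year: int) -> str:
    """Combine Sem (Fall/Spring) and Year (YYYY) into a label such as 'fall04'."""
    sem_clean = sem.strip().lower()
    if sem_clean not in {"fall", "spring"}:
        # The slide lists only Fall and Spring; anything else is a data error.
        raise ValueError(f"unexpected Sem value: {sem!r}")
    # Keep the last two digits of the year, zero-padded (2004 -> "04").
    return f"{sem_clean}{int(year) % 100:02d}"


labels = [session_label("Fall", 2004), session_label(" spring ", 2005)]
print(labels)
```

Rejecting unexpected Sem values at this step keeps malformed rows out of the warehouse instead of producing a session label that matches no other campus.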