Construction of a database. Per Weidenman, PAR AB


1

2 Construction of a database Per Weidenman PAR AB

3 Database: a collection of data that belongs together and models the ”world”. Database management system (DBMS): the database (a collection of interrelated data) plus software to manage and access the data.

4 DBMS. Input: transactions. Organised data: ”database”, data warehouse, etc. User: searching, reporting, statistical analysis. DBMS requirements.

5 Database management systems (DBMS): Microsoft Access, Microsoft SQL Server, DB2, Oracle, MySQL, FirebirdSQL, etc. SQL (Structured Query Language): a computer language to define and search data.

6 Relational databases: tables containing data, organised in rows and columns. Keys are used for linking data in different tables.

7 Example: a simple database for collecting and organising statistical papers, created in Microsoft Access.

8 Paper name and details. Link to document (PDF file). Authors.

9 A database with four tables Keys

10 One of the tables, containing paper name and details. One paper on each row; the rows contain the paper name and other details. Key.

11 The keys are used to link data in the four tables

12 Diagram of three linked tables: table ”artiklar” (papers; key: artikel_id), table ”författare2” (authors; keys: artikel_id and person_id), table ”personer2” (persons; key: person_id). One paper having 3 authors. One person being the author of 2 papers.

13 A query: the result of asking the database about papers and authors. One paper and the corresponding 3 authors. One author and the corresponding 2 papers.
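The linking of tables through keys can be sketched with Python's built-in sqlite3 module. This is an illustration only, not the Access database from the slides: the table and key names (artiklar, författare2, personer2, artikel_id, person_id) come from the slides, while the columns title and name and all rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names 'title' and 'name' are hypothetical; the keys are from the slides.
    CREATE TABLE artiklar  (artikel_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE personer2 (person_id  INTEGER PRIMARY KEY, name  TEXT);
    CREATE TABLE "författare2" (
        artikel_id INTEGER REFERENCES artiklar(artikel_id),
        person_id  INTEGER REFERENCES personer2(person_id),
        PRIMARY KEY (artikel_id, person_id)
    );
    INSERT INTO artiklar  VALUES (1, 'Paper one'), (2, 'Paper two');
    INSERT INTO personer2 VALUES (1, 'Aaaa'), (2, 'Bbbb'), (3, 'Cccc');
    -- Paper one has three authors; Aaaa is an author of both papers.
    INSERT INTO "författare2" VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")

# Follow the keys from papers, through the linking table, to persons.
rows = conn.execute("""
    SELECT a.title, p.name
    FROM artiklar AS a
    JOIN "författare2" AS f ON f.artikel_id = a.artikel_id
    JOIN personer2     AS p ON p.person_id  = f.person_id
    ORDER BY a.artikel_id, p.person_id
""").fetchall()

for title, name in rows:
    print(title, "-", name)
```

The linking table holds one row per (paper, person) pair, which is what lets one paper have several authors and one person appear on several papers.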

14 DBMS. Input: transactions. Organised data: ”database”, data warehouse, etc. User: searching, reporting, statistical analysis. DBMS requirements. IT department. ”Business” users.

15 DBMS requirements from a statistical / analytical viewpoint: data quality, data types, performance, maximum information, historical data, regulation and secrecy.

16 DBMS requirements from a statistical / analytical viewpoint: data quality. Instead of entering text/data by typing (Sales System X: ”Enter customer name:”), use, if possible, selection from a list of valid values (Sales System X: ”Choose customer name:” Volvo Personvagnar AB, Volvo Lastvagnar AB, Volvo Construction AB, Volvo Bussar AB, Volvo Logistics AB, …).
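One way a database can back up "selection from a list of valid values" is a lookup table plus a foreign key, so that only listed customers can be referenced. The sketch below uses Python's sqlite3 module; the table layout, the orders table and the order values are assumptions made for illustration, not a description of Sales System X.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Lookup table holding the list of valid customer names (layout is hypothetical).
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        order_value REAL
    )
""")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)",
    [("Volvo Personvagnar AB",), ("Volvo Lastvagnar AB",)],
)

# The input form would show this list instead of a free-text field.
choices = conn.execute(
    "SELECT customer_id, name FROM customers ORDER BY customer_id"
).fetchall()

# An order for a listed customer is accepted ...
conn.execute("INSERT INTO orders (customer_id, order_value) VALUES (?, ?)", (1, 1000.0))

# ... while an order pointing at an unknown customer is rejected by the database.
try:
    conn.execute("INSERT INTO orders (customer_id, order_value) VALUES (?, ?)", (99, 500.0))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(choices, rejected)
```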

17 DBMS requirements from a statistical / analytical viewpoint: data quality. Sales System X: ”Enter customer age:”. Define rules for valid input (values, intervals, etc.). We don't want: negative values, ”40+”, ”1982”.
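A rule for valid input can be declared in the database itself, for example as a CHECK constraint. The sketch below uses Python's sqlite3 module; the upper limit of 120 and the table layout are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Age must be a whole number in a plausible interval (the limit 120 is an assumption).
# NULL stays allowed so that a missing age can be recorded as missing.
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        age INTEGER CHECK (
            age IS NULL OR (typeof(age) = 'integer' AND age BETWEEN 0 AND 120)
        )
    )
""")

def try_insert(age):
    """Return True if the database accepts the value, False if the rule rejects it."""
    try:
        conn.execute("INSERT INTO customers (age) VALUES (?)", (age,))
        return True
    except sqlite3.IntegrityError:
        return False

# 34 is a valid age; the other three are the unwanted inputs named on the slide.
results = {value: try_insert(value) for value in (34, -5, "40+", 1982)}
print(results)
```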

18 DBMS requirements from a statistical / analytical viewpoint: data quality. Handling of missing values: missing values should be stored as ”null” in the database, not as 0 (digit zero).
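Why null matters for statistics can be shown with a small sketch using Python's sqlite3 module: SQL aggregate functions skip NULL, whereas a zero is counted as a real observation. The two tables and their values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers_null (age INTEGER)")
conn.execute("CREATE TABLE customers_zero (age INTEGER)")

# Same three customers; the third age is unknown.
conn.executemany("INSERT INTO customers_null VALUES (?)", [(30,), (40,), (None,)])
conn.executemany("INSERT INTO customers_zero VALUES (?)", [(30,), (40,), (0,)])

# AVG and COUNT(column) ignore NULL, so the missing age does not distort the mean.
mean_null, n_null = conn.execute(
    "SELECT AVG(age), COUNT(age) FROM customers_null"
).fetchone()

# A zero is treated as a genuine age and pulls the mean down.
mean_zero, n_zero = conn.execute(
    "SELECT AVG(age), COUNT(age) FROM customers_zero"
).fetchone()

print(mean_null, n_null)
print(mean_zero, n_zero)
```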

19 DBMS requirements from a statistical / analytical viewpoint: data types. Text, numeric.

20 DBMS requirements from a statistical / analytical viewpoint: performance. Searching and reporting: searching for individual records; creating ”prepared” reports by counting or summing. Statistical analysis: large datasets, multivariate methods, iterative estimation, etc.

21 DBMS requirements from a statistical / analytical viewpoint: maximum information. Sales System X: ”Enter customer age: 34”. We need to report on age groups: 20-29, 30-39, 40-49, … Thus we store age as an interval, not as a value! The fallacy of being too user oriented!
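Reading the slide as a warning against storing only the interval, the alternative is to store the exact age and derive the age groups when reporting. The sketch below uses Python's sqlite3 module; the table layout and the ages are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany("INSERT INTO customers (age) VALUES (?)", [(34,), (27,), (39,), (41,)])

# The report on age groups is derived from the exact value at query time.
# Integer division by 10 gives the lower bound of each ten-year group.
groups = conn.execute("""
    SELECT (age / 10) * 10 AS lower_bound, COUNT(*) AS n
    FROM customers
    GROUP BY lower_bound
    ORDER BY lower_bound
""").fetchall()
report = {f"{low}-{low + 9}": n for low, n in groups}

# Statistics that need the exact value remain available, which would not be
# the case if only the interval had been stored.
mean_age = conn.execute("SELECT AVG(age) FROM customers").fetchone()[0]

print(report, mean_age)
```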

22 DBMS requirements from a statistical / analytical viewpoint: historical data. Sales System X input: customer name, customer address, order date, order value. Table Orders: customer ID, order date, order value. Each new order for a specific customer will be added to table Orders and stored as a ”new row”.

23 DBMS requirements from a statistical / analytical viewpoint: historical data. Sales System X input: customer name, customer address, order date, order value. Table Customers: customer ID, customer name, customer address. But a new address will probably UPDATE the existing record (row) for the specific customer. Thus, the old value of ”customer address” will be deleted and replaced with the new value. But this will do fine for users focusing on searching / reporting!

24 DBMS requirements from a statistical / analytical viewpoint: historical data. Table Customers: customer ID, customer name, customer address. Table Customers_history: customer ID, customer name, customer address, From, To. Create a new table to contain historic records. Each time a value is UPDATED for a certain customer, the complete (previous) record is transferred to the table Customers_history.
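One possible way to transfer the previous record automatically is a database trigger. The sketch below uses Python's sqlite3 module and is only one of several designs. It assumes an extra column valid_from on the Customers table (not shown on the slide) so that the From and To dates of the history record can be filled in; the customer address values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        address     TEXT,
        valid_from  TEXT  -- assumption: date from which the current values apply
    )
""")
conn.execute("""
    CREATE TABLE customers_history (
        customer_id INTEGER,
        name        TEXT,
        address     TEXT,
        valid_from  TEXT,  -- "From" on the slide
        valid_to    TEXT   -- "To" on the slide
    )
""")

# On every UPDATE, copy the complete previous record to the history table.
conn.execute("""
    CREATE TRIGGER customers_keep_history
    AFTER UPDATE ON customers
    BEGIN
        INSERT INTO customers_history (customer_id, name, address, valid_from, valid_to)
        VALUES (OLD.customer_id, OLD.name, OLD.address, OLD.valid_from, NEW.valid_from);
    END
""")

conn.execute(
    "INSERT INTO customers VALUES (1, 'Volvo Bussar AB', 'Old Street 1', '2005-01-01')"
)
# A change of address overwrites the current row but leaves a trace in the history.
conn.execute("""
    UPDATE customers
    SET address = 'New Street 2', valid_from = '2007-06-01'
    WHERE customer_id = 1
""")

current = conn.execute("SELECT address FROM customers WHERE customer_id = 1").fetchone()
history = conn.execute(
    "SELECT address, valid_from, valid_to FROM customers_history"
).fetchall()
print(current, history)
```

Users who only search and report keep working against Customers as before, while analysts can combine Customers and Customers_history to reconstruct the state at any earlier date.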

25 DBMS requirements from a statistical / analytical viewpoint: historical data. Table Customers: customer ID, customer name, customer address. Table Customers_history: customer ID, customer name, customer address, From, To. This structure will make analysis of processes possible. But not easy!

26 DBMS requirements from a statistical / analytical viewpoint: regulation and secrecy.

27 DBMS requirements from a statistical / analytical viewpoint: current data versus current + historical data; operating on individual records versus operating on many records. Next on this channel…

28 DBMS. Input: transactions. Organised data. User: searching, reporting, statistical analysis. DBMS requirements. A database containing historic transactions.

29 PAR / Bisnode database. Tables: FTG: basic company data. One record per company. Contains name, address, startdate, enddate, line of business, etc. FTG_H: historic company data. Many records per company. Contains the accumulated historic records from table FTG. BOKSLUT: balance sheet data. One record per annual report (thus many records per company). Turnover, profit, key ratios, etc. FUNKTION_PERIOD: board member data. Many records per company and person. And many more tables! Serrano. Statistical analysis: how? Historic names etc. Sampling for time series statistics.

30 END

31 Basic company data One record per company. Contains name, address, startdate, enddate, line of business, etc.

32 Historic company data Many records per company. Contains the accumulated historic records from table FTG

33 Balance sheet data One record per annual report (thus many records per company). Turnover, profit, key ratios, etc.

34 Board member data Many records per company and person.

35 Serrano Balance sheet data from different periods transformed to yearly data records

36 Serrano Historic transactions from FTG_H transformed to yearly data records

37

38 Serrano: board member data from any mix of startdate, enddate and period length transformed to yearly data records

39 Summing up register data to annual figures. Example: register containing balance sheet data: number of employees, turnover, profit, tangible assets, etc. (Timeline diagram: company A; axis ”Year”: 3, 2, 1, now.)

40 Summing up register data to annual figures. (Timeline diagram: companies A and B; axis ”Year”: 3, 2, 1, now.) B: broken fiscal year (a fiscal year that does not follow the calendar year).

41 Summing up register data to annual figures. (Timeline diagram: companies A, B and C; axis ”Year”: 3, 2, 1, now.) C: changed fiscal years.

42 Summing up register data to annual figures. (Timeline diagram: companies A, B, C and D; axis ”Year”: 3, 2, 1, now.) D: missing data.

43 Summing up register data to annual figures. (Timeline diagram: company B; axis ”Year”: 3, 2, 1, now.) Proposal: break down the flow variables (turnover, profit, etc.) into monthly values …

44 Summing up register data to annual figures. (Timeline diagram: company B; axis ”Year”: 3, 2, 1, now.) Proposal: … and sum the monthly values to a ’notional’ calendar-year value. Proposal: … and impute for full coverage during the last year.
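The proposal can be sketched in Python as follows. The sketch assumes that a flow variable is spread evenly over the months of its reporting period, which the slides do not state; the function name, the report periods and the turnover figures are hypothetical. The imputation step for the last year is not implemented.

```python
from collections import defaultdict

def to_calendar_years(start_year, start_month, n_months, flow_value):
    """Split a flow value evenly over the months of a reporting period
    (an assumption) and sum the monthly values per calendar year."""
    monthly_value = flow_value / n_months
    totals = defaultdict(float)
    for i in range(n_months):
        months_from_january = start_month - 1 + i
        year = start_year + months_from_january // 12
        totals[year] += monthly_value
    return dict(totals)

# Two annual reports for a company with a broken fiscal year (July to June).
# Each tuple: start year, start month, period length in months, turnover.
reports = [
    (2004, 7, 12, 1200.0),
    (2005, 7, 12, 2400.0),
]

calendar_turnover = defaultdict(float)
for report in reports:
    for year, value in to_calendar_years(*report).items():
        calendar_turnover[year] += value

for year in sorted(calendar_turnover):
    print(year, calendar_turnover[year])
```

In this example only 2005 is fully covered by reported months. The first and last years hold six months each, which is the kind of gap the imputation proposal on the slide addresses.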

45 Summing up register data to annual figures. (Timeline diagram: company B; axis ”Year”: 3, 2, 1, now.) Database.

46 First example: register-based transport statistics for SIKA. Decreased response burden. Increased understanding of the transporting companies (as a complement to the ”usual” focus on type of goods). Time series describing economic status and change.

47 Objective: describing economic status and change in transporting companies during the last ten years. Total number of employees and turnover …

48 Objective: describing economic status and change in transporting companies during the last ten years. … or turnover growth compared to GDP

49 Objective: describing economic status and change in transporting companies during the last ten years. … or profit development for different types of freight companies

50 Objective: describing economic status and change in transporting companies during the last ten years. … or the number of employees in a cohort of new companies.

51 Tables based on balance sheet data from each company
Year | Active companies: total | Active companies: of which limited companies | Active limited companies: number of employees | Active limited companies: net turnover (SEK million) | GDP: current prices (SEK million)
1997 | 12912 | 10599 | 98259 | 120284 | 1927001
1998 | 12788 | 10626 | 100663 | 127745 | 2012091
1999 | 12547 | 10543 | 102531 | 133078 | 2123971
2000 | 12562 | 10704 | 106811 | 145496 | 2249987
2001 | 12383 | 10659 | 112685 | 163418 | 2326176
2002 | 12432 | 10741 | 114426 | 168214 | 2420761
2003 | 12616 | 10935 | 115135 | 178294 | 2515150
2004 | 12689 | 11067 | 118015 | 188913 | 2624964
2005 | 12709 | 11100 | 119387 | 209819 | 2735218
2006 | 12514 | 11012 | 121683 | 224225 | 2899653

52 (Same table as on slide 51.) What data is needed? Company data including micro-level history: exactly which companies were active in transport during each year? Balance sheet data from all transporting companies for each year.

53 (Same table as on slide 51.) What data is needed? Company data including micro-level history: exactly which companies were active in transport during each year? Balance sheet data from all transporting companies for each year. Faster access to ”last year's” data compared to taxation-based registers.

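54 Sampling companies for time series statistics. (Timeline diagram: companies A, B, C and D; axis ”Year”: 3, 2, 1, now.)

55 Sampling companies for time series statistics. (Timeline diagram: companies A, B, C and D; axis ”Year”: 3, 2, 1, now.)

56 Sampling companies for time series statistics. (Timeline diagram: companies A, B, C and D; axis ”Year”: 3, 2, 1, now.)

57 Sampling companies for time series statistics. (Timeline diagram: companies A, B, C and D; axis ”Year”: 3, 2, 1, now.) Sets of companies shown: ACD, ABCD, ABC.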
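The idea of a sampling frame that changes from year to year can be sketched in Python. The start and end years below are hypothetical and are not read off the diagram; the function name and the sample size are also assumptions. The point is only that historic start and end dates make it possible to say which companies were active in a given year before drawing a sample.

```python
import random

def active_in_year(companies, year):
    """Return the names of companies whose lifespan covers the given year.
    An end year of None means the company is still active."""
    return sorted(
        name
        for name, (start, end) in companies.items()
        if start <= year and (end is None or end >= year)
    )

# Hypothetical lifespans (start year, end year) for four companies.
companies = {
    "A": (2000, None),
    "B": (2003, 2005),
    "C": (2001, None),
    "D": (2005, None),
}

# One sampling frame per year: the population changes over time.
frames = {year: active_in_year(companies, year) for year in (2004, 2005, 2006)}

# Draw a simple random sample of two companies from each year's frame.
rng = random.Random(1)  # fixed seed so the sketch is repeatable
samples = {year: rng.sample(frame, k=2) for year, frame in frames.items()}

for year in sorted(frames):
    print(year, frames[year], samples[year])
```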

