Published by Quinn Fair. Modified over 9 years ago.
2
Construction of a database
Per Weidenman, PAR AB
3
Database:
  A collection of data
  It belongs together
  It models the "world"
Database management system (DBMS):
  The database (a collection of interrelated data)
  Software to manage and access the data
4
DBMS requirements
[Diagram]
  Input: transactions
  DBMS: organised data ("database", data warehouse, etc.)
  User: searching, reporting, statistical analysis
5
Database management systems (DBMS):
  Microsoft Access, Microsoft SQL Server, DB2, Oracle, MySQL, FirebirdSQL, etc.
SQL (Structured Query Language):
  A computer language to define and search data
6
Relational databases
  Tables containing data, organised in rows and columns
  Keys, used for linking data in different tables
7
Example
  Simple database for collecting and organising statistical papers
  Created in Microsoft Access
8
[Screenshot labels]
  Paper name and details
  Link to document (PDF file)
  Authors
9
A database with four tables Keys
10
One of the tables, containing paper name and details
  One paper on each row
  Rows containing paper name and other details
  Key
11
The keys are used to link data in the four tables
12
[Figure: three linked tables]
  Table "artiklar" (papers). Key: artikel_id
  Table "författare2" (authors). Keys: artikel_id, person_id
  Table "personer2" (persons). Key: person_id
  One paper having 3 authors
  One person being the author of 2 papers
13
A query: the result of asking the database about papers and authors
  One paper and the corresponding 3 authors
  One author and the corresponding 2 papers
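The linking of tables through keys can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the Access database from the slides: the table and key names (artiklar, författare2, personer2, artikel_id, person_id) come from the slides, while the columns title and name and all row values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names 'title' and 'name' are hypothetical; only the keys are from the slides.
    CREATE TABLE artiklar (artikel_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE personer2 (person_id INTEGER PRIMARY KEY, name TEXT);
    -- Link table: one row per (paper, author) pair.
    CREATE TABLE "författare2" (
        artikel_id INTEGER REFERENCES artiklar(artikel_id),
        person_id  INTEGER REFERENCES personer2(person_id)
    );
    INSERT INTO artiklar VALUES (1, 'Paper 1'), (2, 'Paper 2');
    INSERT INTO personer2 VALUES (1, 'Aaaa'), (2, 'Bbbb'), (3, 'Cccc'), (4, 'Dddd');
    -- Paper 1 has three authors; person 1 is also the author of paper 2.
    INSERT INTO "författare2" VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")

def authors_of(artikel_id):
    """Names of all authors of one paper, found by following the keys."""
    rows = conn.execute(
        'SELECT p.name FROM "författare2" f '
        "JOIN personer2 p ON p.person_id = f.person_id "
        "WHERE f.artikel_id = ? ORDER BY p.name",
        (artikel_id,),
    ).fetchall()
    return [r[0] for r in rows]

def papers_by(person_id):
    """Titles of all papers written by one person."""
    rows = conn.execute(
        'SELECT a.title FROM "författare2" f '
        "JOIN artiklar a ON a.artikel_id = f.artikel_id "
        "WHERE f.person_id = ? ORDER BY a.title",
        (person_id,),
    ).fetchall()
    return [r[0] for r in rows]

print(authors_of(1))  # one paper and its 3 authors
print(papers_by(1))   # one author and the corresponding 2 papers
```

The link table is what allows both directions of the question: one paper with several authors, and one author with several papers.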
14
DBMS requirements
[Diagram]
  Input: transactions
  DBMS: organised data ("database", data warehouse, etc.)
  User: searching, reporting, statistical analysis
  IT Department
  "Business" users
15
DBMS requirements from a statistical / analytical viewpoint
  Data quality
  Data types
  Performance
  Maximum information
  Historical data
  Regulation and secrecy
16
DBMS requirements from a statistical / analytical viewpoint: Data quality
Instead of entering text/data by typing …
  Sales System X. Enter customer name:
… use, if possible, selection from a list of valid values
  Sales System X. Choose customer name: Volvo Personvagnar AB, Volvo Lastvagnar AB, Volvo Construction AB, Volvo Bussar AB, Volvo Logistics AB, …
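One way a database can enforce "selection from a list of valid values" is a lookup table plus a foreign key. The sketch below uses Python's sqlite3 module; the table names customers and orders, the column names and the helper add_order are hypothetical, and the slides do not say how Sales System X is built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- The list of valid customer names.
    CREATE TABLE customers (customer_name TEXT PRIMARY KEY);
    -- Every order must point at a name that exists in the list.
    CREATE TABLE orders (
        order_id      INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL REFERENCES customers(customer_name)
    );
    INSERT INTO customers VALUES
        ('Volvo Personvagnar AB'), ('Volvo Lastvagnar AB'), ('Volvo Bussar AB');
""")

def add_order(customer_name):
    """Try to store an order; return False if the name is not in the list."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (customer_name) VALUES (?)", (customer_name,)
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(add_order("Volvo Personvagnar AB"))  # valid value from the list
print(add_order("Volvo Personvagnar"))     # free-text typo, rejected
```

With free-text entry the misspelt name would be stored as if it were a separate customer; with the foreign key it cannot be stored at all.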
17
DBMS requirements from a statistical / analytical viewpoint: Data quality
  Sales System X. Enter customer age:
Define rules for valid input (values, intervals, etc.)
We don't want:
  Negative values
  "40+"
  "1982"
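Rules for valid input can be declared in the database itself, for example as a CHECK constraint. The following is a minimal sketch with Python's sqlite3 module; the table layout, the helper try_insert and the upper limit of 120 years are assumptions chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        -- Age must be missing (NULL) or a whole number in a plausible interval.
        -- The limit 120 is an assumption for this example.
        age INTEGER CHECK (
            age IS NULL
            OR (typeof(age) = 'integer' AND age BETWEEN 0 AND 120)
        )
    )
""")

def try_insert(age):
    """Return True if the database accepts the value, False if the rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO customers (age) VALUES (?)", (age,))
        return True
    except sqlite3.IntegrityError:
        return False

for value in (34, -5, "40+", 1982, None):
    print(repr(value), try_insert(value))
```

The three unwanted inputs from the slide (a negative value, "40+" and 1982) are refused, while a normal age and a missing value are accepted.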
18
DBMS requirements from a statistical / analytical viewpoint: Data quality
Handling of missing values …
Missing values should be stored as "null" in the database, not as 0 (digit zero).
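The reason matters for statistics: aggregate functions skip null but treat 0 as a real observation. A small sketch with Python's sqlite3 module, using made-up ages and hypothetical column names, compares the two ways of storing one missing value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two columns holding the same three customers: the third age is missing.
# age_null stores the gap as NULL, age_zero stores it as 0.
conn.execute("CREATE TABLE customers (age_null INTEGER, age_zero INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(30, 30), (40, 40), (None, 0)],
)

avg_null, avg_zero, n_null, n_zero = conn.execute(
    "SELECT AVG(age_null), AVG(age_zero), COUNT(age_null), COUNT(age_zero) "
    "FROM customers"
).fetchone()

print(avg_null, n_null)  # mean and count over the two known ages only
print(avg_zero, n_zero)  # the 0 is counted as an age and pulls the mean down
```

With null the mean age is 35 based on two observations; with 0 it becomes about 23.3 based on three, which is simply wrong.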
19
DBMS requirements from a statistical / analytical viewpoint: Data types
  Text
  Numeric
20
DBMS requirements from a statistical / analytical viewpoint: Performance
  Searching: searching for individual records
  Reporting: creating "prepared" reports by counting or summing
  Statistical analysis: large datasets, multivariate methods, iterative estimation, etc.
21
DBMS requirements from a statistical / analytical viewpoint: Maximum information
  Sales System X. Enter customer age: 34
  "We need to report on age groups: 20-29, 30-39, 40-49, … Thus we store age as an interval, not as a value!"
The fallacy of being too user oriented!
Storing the exact value keeps the maximum information: any age grouping can be derived from it later, but the exact age cannot be recovered from an interval.
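A short Python sketch of why the exact value is the better thing to store: the same stored ages can serve today's ten-year report and a different grouping tomorrow. The function age_group and the sample ages are made up for the illustration.

```python
from collections import Counter

def age_group(age, width=10):
    """Label of the interval of the given width that contains the age."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

# Exact ages as they would be stored in the database (invented values).
ages = [34, 27, 41, 39, 22]

# The report that is needed today: ten-year groups.
by_decade = Counter(age_group(a) for a in ages)

# A report nobody asked for yet: five-year groups, derived from the same data.
by_five = Counter(age_group(a, width=5) for a in ages)

print(sorted(by_decade.items()))
print(sorted(by_five.items()))
```

Had only "30-39" been stored for the customer aged 34, the five-year table could not have been produced.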
22
DBMS requirements from a statistical / analytical viewpoint: Historical data
  Sales System X input: Customer name, Customer address, Order date, Order value
  Table: Orders (Customer ID, Order date, Order value)
Each new order for a specific customer will be added to table Orders and stored as a "new row".
23
DBMS requirements from a statistical / analytical viewpoint: Historical data
  Sales System X input: Customer name, Customer address, Order date, Order value
  Table: Customers (Customer ID, Customer name, Customer address)
But a new address will probably UPDATE the existing record (row) for the specific customer.
Thus, the old value of "customer address" will be deleted and replaced with the new value.
But this will do fine for users focusing on searching / reporting!
24
DBMS requirements from a statistical / analytical viewpoint: Historical data
  Table: Customers (Customer ID, Customer name, Customer address)
  Table: Customers_history (Customer ID, Customer name, Customer address, From, To)
Create a new table to contain historic records.
Each time a value is UPDATED for a certain customer, the complete (previous) record is transferred to the table Customers_history.
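One possible way to transfer the previous record automatically is a database trigger. The sketch below uses Python's sqlite3 module and is not taken from the PAR system. The table names Customers and Customers_history follow the slide; the column names valid_from and valid_to (standing in for "From" and "To"), the extra valid_from column on Customers, the trigger name and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        customer_id      INTEGER PRIMARY KEY,
        customer_name    TEXT,
        customer_address TEXT,
        valid_from       TEXT  -- assumption: date the current values took effect
    );
    CREATE TABLE Customers_history (
        customer_id      INTEGER,
        customer_name    TEXT,
        customer_address TEXT,
        valid_from       TEXT,
        valid_to         TEXT
    );
    -- On every UPDATE, copy the complete previous record to the history table.
    CREATE TRIGGER keep_history AFTER UPDATE ON Customers
    BEGIN
        INSERT INTO Customers_history VALUES (
            OLD.customer_id, OLD.customer_name, OLD.customer_address,
            OLD.valid_from, NEW.valid_from
        );
    END;
""")

def change_address(customer_id, new_address, change_date):
    """Ordinary UPDATE; the trigger saves the old row as a side effect."""
    with conn:
        conn.execute(
            "UPDATE Customers SET customer_address = ?, valid_from = ? "
            "WHERE customer_id = ?",
            (new_address, change_date, customer_id),
        )

with conn:
    conn.execute(
        "INSERT INTO Customers VALUES (1, 'Example AB', 'Old Street 1', '2005-01-01')"
    )
change_address(1, "New Street 2", "2008-06-01")

current = conn.execute(
    "SELECT customer_address FROM Customers WHERE customer_id = 1"
).fetchone()[0]
history = conn.execute(
    "SELECT customer_address, valid_from, valid_to FROM Customers_history"
).fetchall()
print(current)
print(history)
```

The sales system keeps working against Customers exactly as before, while the old address survives in Customers_history together with the period during which it was valid.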
25
DBMS requirements from a statistical / analytical viewpoint: Historical data
  Table: Customers (Customer ID, Customer name, Customer address)
  Table: Customers_history (Customer ID, Customer name, Customer address, From, To)
This structure will make analysis of processes possible.
But not easy!
26
DBMS requirements from a statistical / analytical viewpoint: Regulation and secrecy
27
DBMS requirements from a statistical / analytical viewpoint
  Current data
  Current + historical data
  Operating on individual records
  Operating on many records
Next on this channel …
28
DBMS requirements
[Diagram]
  Input: transactions
  DBMS: organised data
  User: searching, reporting, statistical analysis
A database containing historic transactions
29
PAR / Bisnode database
Tables:
  FTG: Basic company data. One record per company. Contains name, address, startdate, enddate, line of business, etc.
  FTG_H: Historic company data. Many records per company. Contains the accumulated historic records from table FTG.
  BOKSLUT: Balance sheet data. One record per annual report (thus many records per company). Turnover, profit, key ratios, etc.
  FUNKTION_PERIOD: Board member data (board data). Many records per company and person.
  And many more tables!
Serrano
Statistical analysis. How?
  Historic names etc.
  Sampling for time series statistics
30
END
31
Basic company data One record per company. Contains name, address, startdate, enddate, line of business, etc.
32
Historic company data Many records per company. Contains the accumulated historic records from table FTG
33
Balance sheet data One record per annual report (thus many records per company). Turnover, profit, key ratios, etc.
34
Board member data Many records per company and person.
35
Serrano Balance sheet data from different periods transformed to yearly data records
36
Serrano Historic transactions from FTG_H transformed to yearly data records
38
Serrano
Board member data from any mix of startdate, enddate and period length, transformed to yearly data records
39
Summing up register data to annual figures
[Timeline figure: company A; axis "Year" with ticks 3, 2, 1 and "Now"]
Example. Register containing balance sheet data: number of employees, turnover, profit, tangible assets, etc.
40
Summing up register data to annual figures
[Timeline figure: companies A and B; axis "Year" with ticks 3, 2, 1 and "Now"]
B: broken fiscal year (a fiscal year that does not follow the calendar year)
41
Summing up register data to annual figures
[Timeline figure: companies A, B and C; axis "Year" with ticks 3, 2, 1 and "Now"]
C: changed fiscal years
42
Summing up register data to annual figures
[Timeline figure: companies A, B, C and D; axis "Year" with ticks 3, 2, 1 and "Now"]
D: missing data
43
Summing up register data to annual figures
[Timeline figure: company B; axis "Year" with ticks 3, 2, 1 and "Now"]
Proposal: break down the flow variables (turnover, profit, etc.) into monthly values …
44
Summing up register data to annual figures
[Timeline figure: company B; axis "Year" with ticks 3, 2, 1 and "Now"]
Proposal: … and sum the monthly values to a "notional" calendar-year value.
Proposal: … and impute for full coverage during the last year.
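The proposal can be sketched in a few lines of Python. The slides do not specify how values are spread over months or how the last year is imputed, so two simplifying assumptions are made here: a flow variable is spread evenly over the months of its fiscal period, and the last calendar year is scaled up in proportion to the number of months covered. The function name and the figures for the example company are invented.

```python
from collections import defaultdict

def calendar_year_values(reports, impute_last=True):
    """Turn fiscal-period flow values into calendar-year values.

    reports: list of (start_year, start_month, n_months, value) tuples,
             one per annual report, for a flow variable such as turnover.
    """
    # Step 1: break each report down into equal monthly values (assumption).
    monthly = {}
    for start_year, start_month, n_months, value in reports:
        per_month = value / n_months
        for i in range(n_months):
            idx = (start_month - 1) + i
            monthly[(start_year + idx // 12, idx % 12 + 1)] = per_month

    # Step 2: sum the monthly values per calendar year.
    totals = defaultdict(float)
    months_covered = defaultdict(int)
    for (year, _month), v in monthly.items():
        totals[year] += v
        months_covered[year] += 1

    result = {year: totals[year] for year in sorted(totals)}

    # Step 3: impute full coverage for the last year by proportional
    # scaling (assumption; other imputation methods are possible).
    if impute_last and result:
        last = max(result)
        result[last] = totals[last] * 12 / months_covered[last]
    return result

# Hypothetical company with a broken fiscal year running July to June.
reports = [(2004, 7, 12, 1200.0), (2005, 7, 12, 2400.0)]
print(calendar_year_values(reports))
print(calendar_year_values(reports, impute_last=False))
```

In the example, calendar year 2005 is assembled from six months of the first report and six months of the second. The first year (2004) is covered by only six months and is left as it is, since the slide only mentions imputation for the last year.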
45
Summing up register data to annual figures
[Timeline figure: company B; axis "Year" with ticks 3, 2, 1 and "Now"]
Database
46
First example
Register based transport statistics for SIKA:
  Decreased response burden
  Increased understanding of the transporting companies (as a complement to the "usual" focus on type of goods)
  Time series describing economic status and change
47
Objective: Describing economic status and change in transporting companies during the last ten years.
Total number of employees and turnover …
48
Objective: Describing economic status and change in transporting companies during the last ten years.
… or turnover growth compared to GDP
49
Objective: Describing economic status and change in transporting companies during the last ten years.
… or profit development for different types of freight companies
50
Objective: Describing economic status and change in transporting companies during the last ten years.
… or the number of employees in a cohort of new companies.
51
Tables based on balance sheet data from each company

        Active companies         Active limited companies       GDP
Year    Total    Of which        Number of    Net turnover      Current prices
                 limited cos.    employees    (SEK million)     (SEK million)
1997    12912    10599            98259       120284            1927001
1998    12788    10626           100663       127745            2012091
1999    12547    10543           102531       133078            2123971
2000    12562    10704           106811       145496            2249987
2001    12383    10659           112685       163418            2326176
2002    12432    10741           114426       168214            2420761
2003    12616    10935           115135       178294            2515150
2004    12689    11067           118015       188913            2624964
2005    12709    11100           119387       209819            2735218
2006    12514    11012           121683       224225            2899653
52
[Table repeated from slide 51]
What data is needed?
  Company data including micro level history. Exactly which companies were active in transport during each year?
  Balance sheet data from all transporting companies for each year
53
[Table repeated from slide 51]
What data is needed?
  Company data including micro level history. Exactly which companies were active in transport during each year?
  Balance sheet data from all transporting companies for each year
  Faster access to "last year's" data compared to taxation based registers
54
Sampling companies for time series statistics
[Timeline figure: companies A, B, C and D; axis "Year" with ticks 3, 2, 1 and "Now"]
57
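Sampling companies for time series statistics
[Timeline figure: companies A, B, C and D; axis "Year" with ticks 3, 2, 1 and "Now". Companies included per year, as labelled in the figure: ACD, ABCD, ABC]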