Max Booleman Statistics Netherlands


1 Metadata models
Max Booleman, Statistics Netherlands

2 Content
Introduction
Functions of metadata
Kinds of metadata
Why do we need a metadata model?
Choosing a model
Brief overview of different models
Communities/platforms
Lessons learned

3 Introduction
The ‘old’ way | The ‘new’ way
Special dedicated surveys | Combined complex designs of registrations and samples (minimize administrative burden)
Stove-pipe statistics | Common input and output bases (sharing data)
Knowledge in the heads of employees | Knowledge in documents (metadata)
Tailor-made tables | Common structure

4 Functions of metadata (1)
Input data + transformation = output data
Describing data
Describing process
Describing quality (data and process)

5 Functions of metadata (2)
Information for users, producers inside and outside the office: What does it mean?
(Automatic) rules for producers inside the office
Ex ante vs ex post: What should you do? What did you do?

6 Kinds of metadata
Related to the functions:
Conceptual: describing text, relating elements
Process: methods, programs, sequence
Quality: norms and indicators (data and process)
Technical: the hardware

7 Why a model?
We want:
Re-use of definitions, classifications, …
Re-use of processes, rules, methods
Re-use of data
A model facilitates the conceptual level:
Structure (coherence)
Relations between (data consistency)
Meaning of (textual consistency)
Processes: metadata driven (machine readable)

8 Properties of a model
A good model should:
Meet the user needs
Be compact
Have a coherent set of metadata object types
Model: metadata of the metadata
There is no universal model, just as there is no universal car.

9 Example
Trade, Enterprises, 2001
 | Turnover (*1000 euro) | Costs | Profit
Size class 1 | 9 | |
Size class 2 | 12 | |
Total | 21 | |
What do I need to understand ‘21’? What should be the metadata of ‘21’?

10 Example (cont. 2)
Trade, Enterprises, 2001
 | Turnover (*1000 euro) | Costs | Profit
Size class 1 | 9 | |
Size class 2 | 12 | |
Total | 21 | |
What do I need to understand ‘Turnover’? What should be the properties of the variable ‘turnover’ (the metadata of ‘Turnover’)?

11 Example (cont. 3)
Model properties | Example
name | Turnover
description | Earnings of an enterprise
statistical unit | Enterprise
period | Year
relation | Turnover = costs + profit
measurement unit | Euro
type of aggregation | Sum
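
The properties listed above could be held in a small record type. The following is a minimal sketch, not the model used by Statistics Netherlands: the class name `VariableMetadata`, its field names and the helper `aggregate` are assumptions made for illustration, and the relation is stored as plain text rather than evaluated.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class VariableMetadata:
    """Properties of one statistical variable (field names are hypothetical)."""
    name: str
    description: str
    statistical_unit: str
    period: str
    relation: str            # kept as text, not evaluated
    measurement_unit: str
    aggregation: str


# The example values from the slide.
turnover = VariableMetadata(
    name="Turnover",
    description="Earnings of an enterprise",
    statistical_unit="Enterprise",
    period="Year",
    relation="Turnover = costs + profit",
    measurement_unit="Euro",
    aggregation="Sum",
)


def aggregate(variable: VariableMetadata, values: Iterable[float]) -> float:
    """Combine cell values the way the metadata prescribes (only 'Sum' is sketched)."""
    if variable.aggregation == "Sum":
        return sum(values)
    raise ValueError(f"unsupported aggregation: {variable.aggregation}")


# Size classes 1 and 2 from the example table add up to the total cell.
total = aggregate(turnover, [9, 12])
print(total)
```

Reading the aggregation type from the metadata, instead of hard-coding a sum, is one way to make a process "metadata driven" in the sense used earlier in the presentation.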

12 Remarks (1)
Part of the properties?
Period ‘Year’ / Name ‘Turnover’
Measurement unit ‘euro’
Versioning (lifecycle)
Homonyms/Synonyms

13 Remarks (2)
A model is like the decomposition of sentences:
The total turnover of enterprises in The Netherlands was in 2008 equal to … billion euro.
The total turnover of the enterprise Shell in The Netherlands was in 2008 equal to … euro.
A Population of Statistical units at or during a ‘time’ will be described by Variables
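
The decomposition described above can be sketched as a record whose fields are the parts of the sentence. This is an illustration only: the class `Statement` and its field names are hypothetical, and the value stays the elided "…" from the slide because no figure is given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One statistical sentence split into its parts (names are hypothetical)."""
    aggregation: str        # e.g. "total"
    variable: str           # e.g. "turnover"
    population: str         # the statistical units being described
    area: str
    period: str
    measurement_unit: str
    value: str = "…"        # left elided, as on the slide

    def to_sentence(self) -> str:
        """Reassemble the parts into the sentence form used on the slide."""
        return (
            f"The {self.aggregation} {self.variable} of {self.population} "
            f"in {self.area} was in {self.period} equal to "
            f"{self.value} {self.measurement_unit}."
        )


all_enterprises = Statement(
    "total", "turnover", "enterprises", "The Netherlands", "2008", "billion euro"
)
shell = Statement(
    "total", "turnover", "the enterprise Shell", "The Netherlands", "2008", "euro"
)

print(all_enterprises.to_sentence())
print(shell.to_sentence())
```

The two sentences differ only in the population and the measurement unit, so the variable, area and period can be defined once and shared.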

14 Remarks (3)
Definition of Age, Turnover etc.: in principle unit-independent, but formulated in a user-friendly way.
The concept of ‘age’ is the same for electrons, cars, buildings and human beings.

15 Remarks (4)
Relation between statistical units:
A student is a kind of person: inherits the properties of person, plus additional (useful) properties of its own
A household contains persons
An enterprise contains establishments
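
The two kinds of relation named above ("is a kind of" and "contains") map naturally onto inheritance and composition. The sketch below is only one possible reading; the properties `birth_year` and `school` and all example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    name: str
    birth_year: int  # hypothetical property, for illustration only


@dataclass
class Student(Person):
    # "Is a kind of": inherits name and birth_year from Person
    # and adds a property of its own (hypothetical).
    school: str = ""


@dataclass
class Household:
    # "Contains": a household is made up of persons.
    persons: List[Person] = field(default_factory=list)


@dataclass
class Establishment:
    name: str


@dataclass
class Enterprise:
    # "Contains": an enterprise is made up of establishments.
    name: str
    establishments: List[Establishment] = field(default_factory=list)


anna = Student(name="Anna", birth_year=1985, school="Example University")
bram = Person(name="Bram", birth_year=1950)
household = Household(persons=[anna, bram])

# A student can be used wherever a person is expected.
print(isinstance(anna, Person))
print([p.name for p in household.persons])
```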

16 Remarks (5)
Relation between populations:
Income of all persons of one household = Income of the household?
Income of all persons = Income of all households?
Turnover of all establishments = Turnover of all enterprises?
Consolidation?
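
One reason the totals asked about above can differ is that deliveries between establishments of the same enterprise count as turnover for the establishment but not for the enterprise as a whole. The sketch below assumes that this is what consolidation refers to here; the split into `external_sales` and `internal_sales` and all amounts are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Establishment:
    name: str
    external_sales: int  # sales to customers outside the enterprise
    internal_sales: int  # deliveries to other establishments of the same enterprise

    @property
    def turnover(self) -> int:
        """Turnover as seen from the single establishment."""
        return self.external_sales + self.internal_sales


def summed_turnover(establishments: Iterable[Establishment]) -> int:
    """Plain sum over establishments, internal deliveries included."""
    return sum(e.turnover for e in establishments)


def consolidated_turnover(establishments: Iterable[Establishment]) -> int:
    """Enterprise-level turnover: internal deliveries are left out."""
    return sum(e.external_sales for e in establishments)


# Hypothetical enterprise with two establishments.
factory = Establishment("factory", external_sales=40, internal_sales=60)
shop = Establishment("shop", external_sales=100, internal_sales=0)
enterprise = [factory, shop]

print(summed_turnover(enterprise))
print(consolidated_turnover(enterprise))
```

With these figures the plain sum over establishments is larger than the consolidated enterprise turnover, so the aggregation rule has to be recorded together with the statistical unit it applies to.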

17 [Figure: timeline with the labels Julius Caesar, Columbus, BC, AC, Present, statistics, forecast]

18 [Figure: the same timeline with the date 1-1-2006 added]

19 [Figure: the same timeline with ‘Dutch nationality’ added]

20 [Figure: the same timeline with ‘Dutch nationality’ and ‘Inhabitant of The Netherlands’ added]

21 Remarks
A population is a collection of statistical units limited in time, area, …
Could ‘student’ be a statistical unit?
‘Student’ is a kind of ‘person’, so ‘Students’ is formally a subpopulation of ‘persons’
Should we distinguish 5 or 1000 kinds of statistical units?
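
The definition above can be sketched as a filter over statistical units, with a subpopulation as a further filter on the result. Everything in the example is hypothetical (the persons, the field names and the country codes), and residence is treated as fixed over time to keep the sketch short.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Person:
    name: str
    born: date
    died: Optional[date]      # None while the person is alive
    country: str              # simplification: residence does not change
    is_student: bool = False


def population(units: List[Person], reference_date: date, area: str) -> List[Person]:
    """Units that exist on the reference date and belong to the area."""
    return [
        u for u in units
        if u.born <= reference_date
        and (u.died is None or u.died > reference_date)
        and u.country == area
    ]


def students(persons: List[Person]) -> List[Person]:
    """Subpopulation: the persons who are students."""
    return [p for p in persons if p.is_student]


# Hypothetical register of persons.
register = [
    Person("Anna", date(1985, 5, 1), None, "NL", is_student=True),
    Person("Bram", date(1950, 3, 2), date(2000, 1, 1), "NL"),  # died before the reference date
    Person("Carl", date(1970, 7, 7), None, "BE"),              # outside the area
    Person("Dina", date(2007, 1, 1), None, "NL"),              # born after the reference date
    Person("Emma", date(1960, 9, 9), None, "NL"),
]

persons_nl = population(register, date(2006, 1, 1), "NL")
students_nl = students(persons_nl)

print([p.name for p in persons_nl])
print([p.name for p in students_nl])
```

Treating students as a filtered view of persons, rather than as a separate kind of unit, keeps the number of statistical unit types small.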

22 ‘Choosing’ a model
Own wishes: logical, coherent description of input, output (files)
Checking existing models
Compile own model (compact!)
Map to/from existing models
Plan-Do-Check-Act

23 Overview (1)
XBRL: exchange of micro data (http://www.xbrl.org/Home/)
IMF (GDDS, SDDS)
SDMX: exchange of statistical data
Push / Pull
Neuchâtel group (classifications, variables)

24 Overview (2)
ISO
DDI 3.0
Dublin Core

25 Communities/platforms/conferences
Metanet
Metis
Working group Eurostat
Q2008/Q2006/Q2004/Q2001
SDMX and XBRL
CODACMOS

26 Lessons Learned (1)
The ultimate model does not exist (yet?)
Mapping from and to models
Start with your own wishes
Start with a standard model and adjust
80%-20% rule: don’t try to do everything at once
Store only what is in use

27 Lessons Learned (2)
Think broad, start small
Survival of the fitting: using standards should be efficient
Adjusting standards is often (very) expensive
Homonyms and synonyms
Formal description is difficult and takes time and effort

28 Questions?

