1
Update on EUROSTAT activities
A second-hand experience
Ekkehard Petri, GISCO, Eurostat
2
LUCAS Census 2010 SDMX 07 October 2010 EFGS Meeting 2010 Den Haag
3
LUCAS data collection process
First phase sample for stratification: orthophoto interpretation of points on a 2 km grid
Land cover classes
  1 Arable land
  2 Permanent crops
  3 Grassland
  4 Wooded areas and shrubland
  5 Bare land, rare vegetation
  6 Artificial land
  7 Water
Second phase sample: in-situ data collection (ground survey)
  Sample of around 260,000 points
  Parameters: land cover, land use, pictures, etc.
4
Second phase sampling design
Sampling strategy:
Definition of sample size by strata
  Optimal size by NUTS2 and strata, based on fixed precisions for a set of LC classes targeted by country
Points selection
  LUCAS 2006 sample points included as much as possible (so that land cover/use changes can be detected)
  Maximisation of the distance between points
  Exclusion of remote points and points above 1000 m
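The point selection rules above can be illustrated with a minimal sketch. It is not the actual LUCAS procedure: the `Point` fields, the `select_points` function and the greedy "farthest point first" rule are assumptions chosen only to show how the three criteria could interact (reuse of 2006 points, maximum distance, elevation cut-off). The exclusion of remote points is not modelled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    pid: str
    x: float            # easting in metres (hypothetical coordinates)
    y: float            # northing in metres
    elevation: float    # metres above sea level
    in_2006: bool = False  # True if the point was part of the 2006 sample


def select_points(candidates, n, max_elevation=1000.0):
    """Pick up to n points: drop high points, keep 2006 points, then spread out."""
    # Rule 3: exclude points above the elevation threshold.
    eligible = [p for p in candidates if p.elevation <= max_elevation]
    # Rule 1: reuse 2006 sample points as much as possible.
    chosen = [p for p in eligible if p.in_2006][:n]
    remaining = [p for p in eligible if not p.in_2006]
    # Rule 2: greedily add the point farthest from everything chosen so far.
    while len(chosen) < n and remaining:
        if chosen:
            best = max(
                remaining,
                key=lambda p: min(math.dist((p.x, p.y), (c.x, c.y)) for c in chosen),
            )
        else:
            best = remaining[0]
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Invented example points on a line, for illustration only.
candidates = [
    Point("a", 0, 0, 100, in_2006=True),
    Point("b", 2000, 0, 200),
    Point("c", 10000, 0, 300),
    Point("d", 20000, 0, 1500),  # above 1000 m, so excluded
    Point("e", 4000, 0, 50),
]
selected = [p.pid for p in select_points(candidates, 3)]
```

The greedy rule is only one of several ways to spread points; it is used here because it is short and deterministic.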
5
Land Cover nomenclature LUCAS 2009
Main classes
  A10 Artificial built-up areas
  A20 Artificial non built-up areas
  B10 Cereals (+ triticale)
  B20 Root crops
  B30 Non permanent industrial crops
  B40 Dry pulses, vegetables and flowers
  B50 Fodder crops
  B70 Fruit trees & berries
  B80 Other permanent crops
  C10 Broadleaved and evergreen woodland
  C20 Coniferous woodland
  C30 Mixed woodland
  D10 Shrubland with sparse tree cover
  D20 Shrubland without tree cover
  E10 Grassland with sparse tree/shrub cover
  E20 Grassland without tree cover
  E30 Spontaneous vegetation
  F00 Bare land
  G10 Inland water bodies
  G20 Inland running water
  G30 Coastal water bodies
  G50 Glacier, permanent snow
  H10 Inland marshes
  H20 Peatbogs
  H30 Salt-marshes
  H40 Salines
  H50 Intertidal flats
Detail for artificial land
  A10 Artificial built-up areas
    A11 Buildings with one to three floors
    A12 Buildings with more than three floors
    A13 Greenhouses
  A20 Artificial non built-up areas
    A21 Non built-up area features
    A22 Non built-up linear features
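The two-level coding of the artificial land classes can be sketched as a small lookup. This is only an illustration: it assumes that a code ending in 0 denotes the class and that any other third digit denotes a subclass of it, which matches the A10/A20 examples on the slide but is not stated there as a general rule. The function names are hypothetical.

```python
# Codes and labels taken from the slide (artificial land only).
LAND_COVER = {
    "A10": "Artificial built-up areas",
    "A11": "Buildings with one to three floors",
    "A12": "Buildings with more than three floors",
    "A13": "Greenhouses",
    "A20": "Artificial non built-up areas",
    "A21": "Non built-up area features",
    "A22": "Non built-up linear features",
}


def parent_code(code):
    """Return the class code for a 3-character code (assumed rule: last digit -> 0)."""
    if len(code) != 3:
        raise ValueError(f"expected a 3-character code, got {code!r}")
    return code[:2] + "0"


def describe(code):
    """Return 'class > subclass' for a subclass code, or just the class label."""
    parent = parent_code(code)
    if code == parent:
        return LAND_COVER[code]
    return f"{LAND_COVER[parent]} > {LAND_COVER[code]}"
```

For example, `describe("A12")` joins the label of A10 with the label of A12.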
6
Land Use nomenclature LUCAS 2009
U110 Agriculture (+ Kitchen garden + Fallow land)
U120 Forestry
U130 Fishing
U140 Mining, Quarrying
U150 Hunting
U210 Energy production
U220 Industry & Manufacturing
U310 Transport, communication, …
U320 Water & waste treatment
U330 Construction
U340 Commerce, Finance, Business
U350 Community Services
U360 Recreation, Leisure, Sport
U370 Residential
U400 Unused
7
Data availability
Types of data
  Tabular microdata on the first and second phase samples (land cover/use at the specific point, LC/LU change at the specific point, etc.)
  Pictures (four cardinal directions)
  Aggregated estimates (NUTS1/NUTS2 depending on LC classes)
Years
  2006
  2008/2009 (from March 2010 on)
Terms of use
  An agreement has to be signed between DG-ESTAT and users about:
  Confidentiality: only aggregated data can be disseminated
8
Data availability per country/year
9
Census
10
EU census goals
Comparability of census data on the EU level
  Same reference year (first time: 2011)
  Same ‘topics’ (variables)
  Use of harmonised definitions and technical specifications
  Use of identical breakdowns of the topics
  Unified dissemination programme (hypercubes)
  => Common baseline across countries
Transparent quality of census results
  Quality reports
  Detailed tables on quality of the data
  Metadata
11
EU census limits
What does the regulation not provide?
  No access to microdata
  No possibility to define geographical areas flexibly
  No harmonised confidentiality control
  No normative minimum quality requirements (quality thresholds)
  No consolidation of census results from different Member States
BUT Member States are free to do more!
12
What data for what geographical area?
NUTS2:
  Year of arrival in the country
  Educational attainment
  Location of place of work
  Current activity status
  Occupation
  Industry
  Status in employment
  Tenure status of households
  Housing arrangements
  Type of ownership (of dwellings)
  Water supply system, toilet facilities, bathing facilities, type of heating
13
What data for what geographical area?
LAU 2
Population topics
  Sex
  Age
  Legal marital status
  Country/place of birth
  Country of citizenship
  Place of usual residence one year prior to the census
  (Size of the) Locality
  Household status
  Type of private household
  Size of private household
  Family status
  Type of family nucleus
  Size of family nucleus
  Total population
  Place of usual residence
  Relationships between household members
Housing topics
  Occupancy status of conventional dwellings
  Number of occupants
  Useful floor space and/or number of rooms
  Density standard
  Dwellings by type of building
  Dwellings by period of construction
  Type of living quarters
  Location of living quarters
14
What we can NOT do for GISCO?
The municipalities (LAU 2 level) are fixed as the smallest geographical area for the census data to be transmitted to Eurostat.
No flexibility to define areas freely.
After long and detailed consultation with the census experts from the Member States, the foreseen obligatory statistical programme represents a balance between the desirable and the feasible.
Eurostat does not have access to census microdata.
Confidentiality control is done by the NSI.
15
What we can do for GISCO?
The use of common definitions, technical specifications and breakdowns makes census data more comparable at the European level.
The NSIs provide intensive description and quality reporting on the data sources and methodology they use for the population and housing census. This might help to develop small area reporting systems.
Key topics will be required for the LAU 2 level. It is likely that some of the data will also be available for even smaller areas in some Member States.
Eurostat is organising a task force on Census Data Disclosure Control, which aims to propose the best methodology and practice for protecting census data with minimum damage to disseminated results.
The Census Hub might be used to exchange and disseminate small area data from censuses.
16
Census Hub project: architecture
[Diagram: web services (WS), each connected to a database]
17
The Census Hub project
The Census Hub project aims to build a new IT infrastructure for data exchange between the National Statistical Institutes (NSI), Eurostat and the users of census data, using SDMX standards.
Data sharing architecture
  Based on the agreed hypercubes with harmonised data
  Confidentiality problems handled at national level
  A data user browses the hub to define a dataset of interest via structural metadata (dimensions, attributes, measures, code lists, etc.)
  Data are retrieved directly from the systems of the Member States concerned
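The hub idea can be sketched in a few lines. This is a toy model, not the Census Hub implementation: the `CensusHub` class, the `make_endpoint` helper, the code list and all figures are invented for illustration. It shows only the principle that the hub holds structural metadata, validates the user's selection against it, and fetches the data from each national system at query time instead of storing them.

```python
from typing import Callable, Dict, List, Set

Query = Dict[str, str]
Row = Dict[str, object]


class CensusHub:
    """Toy hub: knows the code lists, forwards queries, stores no data."""

    def __init__(self, code_lists: Dict[str, Set[str]]):
        self.code_lists = code_lists
        self.endpoints: Dict[str, Callable[[Query], List[Row]]] = {}

    def register(self, country: str, endpoint: Callable[[Query], List[Row]]) -> None:
        self.endpoints[country] = endpoint

    def query(self, countries: List[str], selection: Query) -> List[Row]:
        # Validate the selection against the shared structural metadata.
        for dimension, code in selection.items():
            if code not in self.code_lists.get(dimension, set()):
                raise ValueError(f"invalid code {code!r} for dimension {dimension!r}")
        # Retrieve the data directly from each national system.
        rows: List[Row] = []
        for country in countries:
            for row in self.endpoints[country](selection):
                rows.append({"country": country, **row})
        return rows


def make_endpoint(table: List[Row]) -> Callable[[Query], List[Row]]:
    """Stand-in for a national web service backed by a national database."""
    def endpoint(selection: Query) -> List[Row]:
        return [r for r in table if all(r.get(k) == v for k, v in selection.items())]
    return endpoint


# Invented figures, for illustration only.
hub = CensusHub({"sex": {"F", "M"}})
hub.register("DE", make_endpoint([{"sex": "F", "value": 10}, {"sex": "M", "value": 9}]))
hub.register("IE", make_endpoint([{"sex": "F", "value": 3}, {"sex": "M", "value": 4}]))
result = hub.query(["DE", "IE"], {"sex": "F"})
```

Because each endpoint answers from its own table, confidentiality treatment stays with the national system, as the slide describes.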
18
Present and ongoing activities
Pilot project in Germany, Ireland, Italy and Portugal
Guideline available explaining how Member States (MSs) can implement an SDMX architecture in the Census Hub context
19
SDMX
20
What is SDMX
“Statistical Data and Metadata Exchange”
SDMX is the preferred standard for exchange and sharing of data and metadata in the global statistical community
Sponsors include
  European Central Bank (ECB)
  Eurostat
  Organisation for Economic Co-operation and Development (OECD)
  United Nations Statistical Division (UNSD)
The initiative, started in 2001, is sponsored by 7 international organisations: BIS, ECB, Eurostat, IMF, OECD, UN, WB.
Among other things, SDMX standards are seen as facilitating the use of Internet-accessible databases, so that data can be retrieved as soon as they are released.
21
Benefits from SDMX standards
Potentially cover all statistical domains
Are open to all stakeholders
Are neutral in terms of underlying commercial technologies
Demography and the Census Hub already implemented
This means that investing in SDMX is worthwhile not only for improving data exchange with international organisations but also as something useful inside one's own institution.
22
SDMX components
  Information model for data and metadata
  Syntax for automatic exchange of data and metadata
  Guidelines to harmonise contents
  IT architectures for data exchange
  IT tools to support implementation and to disseminate SDMX data
Similarities with INSPIRE are substantial
SDMX is not just a data transmission format. It provides a complete solution to model data and metadata to be transmitted together, and a syntax to make data and metadata exchanges automatic. In addition, adopting the SDMX standard makes it possible to harmonise objects commonly used in the statistical world, such as code lists and concept descriptors. This is the aim of the content-oriented guidelines. To facilitate data exchanges, SDMX envisages three different IT architectures (push, pull and hub). Finally, SDMX implementation and dissemination are supported by a set of free IT tools available on the SDMX web site. Note that these tools are not part of the SDMX standards.
23
SDMX Components: Information Model
Statistical data
Metadata
  Structural
  Conceptual
  Quality
  Methodology
Data exchange process
24
SDMX Information Model
[Diagram: a Data Structure Definition (DSD) groups the structural metadata (dimensions, e.g. country, variable/topic, year; attributes, e.g. unit of measure; code lists) that describe the data]

SDMX provides a way of modelling statistical data, metadata and data exchange processes. The data (and related metadata) for a particular statistical domain are structured according to a "Data Structure Definition" (DSD, previously known as a "key family").

Structural metadata are those metadata acting as identifiers and descriptors of the data. If we consider a statistical multi-dimensional table, identifiers and descriptors are generally the names of variables or dimensions. Data must be associated with some structural metadata, otherwise it becomes impossible to properly identify, retrieve and browse the data.

Descriptors can be dimensions, when they identify the data, or attributes, when they only describe the data. Attributes are metadata about an individual value, a time series or a group of time series. Code lists are defined when dimensions or attributes are coded values.

The DSD describes the structure of a particular statistical data flow through a list of dimensions (for example: country, variable/topic, year), a list of "attributes" (for example, unit of measure) and their associated code lists.

A "data flow", in this context, is an abstract concept of the data sets, i.e. a structure without any data. While a data structure definition defines the dimensions, attributes, measures and associated representation that comprise the valid structure of data and related metadata contained in a data set, the dataflow definition associates a data structure definition with one or more categories. This gives a system the ability to state which data sets are to be reported for a given category and which data sets can be reported using the data structure definition.
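The role of a DSD can be illustrated with a minimal sketch. It is not the SDMX information model itself: the `DataStructureDefinition` class, its `validate` method and the codes used (such as `POP_TOTAL` and `PERSONS`) are hypothetical. It only shows how dimensions identify an observation, how attributes describe it, and how code lists restrict the values both may take.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class DataStructureDefinition:
    """Toy DSD: each dimension and attribute is tied to a code list."""
    dimensions: Dict[str, Set[str]]   # identify the data
    attributes: Dict[str, Set[str]]   # only describe the data

    def validate(self, key: Dict[str, str], attrs: Dict[str, str]) -> List[str]:
        """Return a list of problems; an empty list means the observation fits."""
        errors: List[str] = []
        # Every dimension must be present and use a code from its code list.
        for name, codes in self.dimensions.items():
            if name not in key:
                errors.append(f"missing dimension: {name}")
            elif key[name] not in codes:
                errors.append(f"invalid code for {name}: {key[name]}")
        for name in key:
            if name not in self.dimensions:
                errors.append(f"unknown dimension: {name}")
        # Attributes are optional here, but must be known and correctly coded.
        for name, value in attrs.items():
            if name not in self.attributes:
                errors.append(f"unknown attribute: {name}")
            elif value not in self.attributes[name]:
                errors.append(f"invalid code for {name}: {value}")
        return errors


# Hypothetical code lists, following the examples on the slide.
dsd = DataStructureDefinition(
    dimensions={
        "country": {"DE", "IE", "IT", "PT"},
        "topic": {"POP_TOTAL"},
        "year": {"2011"},
    },
    attributes={"unit_of_measure": {"PERSONS"}},
)
ok = dsd.validate(
    {"country": "DE", "topic": "POP_TOTAL", "year": "2011"},
    {"unit_of_measure": "PERSONS"},
)
bad = dsd.validate({"country": "XX", "topic": "POP_TOTAL", "year": "2011"}, {})
```

Treating attributes as optional is a simplification made for this sketch.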
25
SDMX Components: IT Tools
SDMX Registry
Tools to create data definitions and metadata
Tools to convert and validate data and metadata
Tools to visualise data and metadata
Training available from Eurostat
26
SDMX Registry
[Diagram: the SDMX Registry repository holds structural metadata and provision information (CodeLists, ConceptSchemes, DSDs, Dataflows, Provision agreements). It is reached through a graphical user interface (GUI) for user interaction over the Web, through DSW, a “standalone” Java GUI, and through a Web Service accepting SDMX-ML messages.]

Overall, the SDMX Registry is an application which is accessible to other programs over the Internet (or an Intranet or Extranet), to provide information needed to facilitate the reporting, collection and dissemination of statistics. In the same way that a human being might use a browser to go to Google and search for some information, the SDMX Registry can be accessed by computer programs to locate and access statistics. All communications are performed using SDMX-ML XML messages. The SDMX Registry is not a single, centralised resource for the entire world to use.

The SDMX Registry can be accessed by:
  Users with an Internet connection, using their CIRCA user ID
  Applications like the Data Structure Wizard (DSW)
  Any other web application using a web service accepting SDMX-ML messages
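The message-based access to a registry can be sketched as follows. The element names (`ToyQuery`, `ToyResponse`, `Ref`, `Code`) and the registry content are invented for this illustration and are not the SDMX-ML schema; the sketch only shows the pattern of a program sending an XML request and receiving structural metadata as XML, with no human in the loop.

```python
import xml.etree.ElementTree as ET

# Toy registry content: (artefact type, id) -> list of codes. Invented values.
REGISTRY = {
    ("CodeList", "CL_SEX"): ["F", "M"],
}


def build_query(artefact_type: str, artefact_id: str) -> str:
    """Client side: serialise a request for one structural artefact."""
    root = ET.Element("ToyQuery")
    ET.SubElement(root, "Ref", {"type": artefact_type, "id": artefact_id})
    return ET.tostring(root, encoding="unicode")


def handle_query(message: str) -> str:
    """Registry side: parse the request and answer with an XML message."""
    ref = ET.fromstring(message).find("Ref")
    key = (ref.get("type"), ref.get("id"))
    response = ET.Element("ToyResponse")
    if key in REGISTRY:
        response.set("status", "Success")
        for code in REGISTRY[key]:
            ET.SubElement(response, "Code", {"value": code})
    else:
        response.set("status", "NotFound")
    return ET.tostring(response, encoding="unicode")


def codes_from_response(message: str) -> list:
    """Client side: extract the codes from a response message."""
    return [c.get("value") for c in ET.fromstring(message).findall("Code")]


reply = handle_query(build_query("CodeList", "CL_SEX"))
codes = codes_from_response(reply)
missing = ET.fromstring(handle_query(build_query("CodeList", "CL_UNKNOWN")))
```

In a real deployment the two sides would be separated by a web service call; here they are plain functions so that the sketch stays self-contained.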