Published by Randell Grant. Modified over 9 years ago.
1
IMS 2.1.3 Proof of Concept for Data Capture using Metadata
Bryan Fitzpatrick
Rapanea Consulting Limited
June 2014
2
Data Capture using Metadata
  Aim is to demonstrate designing and running a survey questionnaire based entirely on metadata
  Aim is to use DDI metadata to
    design the questions
    organise the questions into a questionnaire
    present the questionnaire
    capture and save the responses
  all based entirely on the metadata
3
Using DDI Metadata for Questionnaires
  DDI has metadata for Questions
    a simple question goes in a Question Item
      – What is your age in years?
    a complex question goes in a Multiple Question Item
      – Did you do paid work last week?
        » Full Time or Part Time?
        » How many hours?
    o A Multiple Question Item can contain Question Items or other Multiple Question Items
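The nesting of Question Items inside Multiple Question Items can be sketched as a small recursive structure. This is only an illustration: the class names `QuestionItem` and `MultipleQuestionItem` and the helper `flatten` are hypothetical Python stand-ins, not DDI element definitions or code from the POC.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class QuestionItem:
    """A simple question (hypothetical stand-in for a DDI Question Item)."""
    text: str


@dataclass
class MultipleQuestionItem:
    """A complex question that contains Question Items or other
    Multiple Question Items (hypothetical stand-in)."""
    text: str
    children: List[Union["QuestionItem", "MultipleQuestionItem"]] = field(
        default_factory=list
    )


def flatten(question) -> List[str]:
    """Return the question texts in depth-first order."""
    if isinstance(question, QuestionItem):
        return [question.text]
    texts = [question.text]
    for child in question.children:
        texts.extend(flatten(child))
    return texts


# The example from the slide: one complex question with two sub-questions.
paid_work = MultipleQuestionItem(
    "Did you do paid work last week?",
    [QuestionItem("Full Time or Part Time?"), QuestionItem("How many hours?")],
)
```

Calling `flatten(paid_work)` lists the parent question followed by its two sub-questions.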
4
Using DDI Metadata for Questionnaires
  Questions can link to one or more Concepts to indicate what the question is seeking to cover
    o Age, Sex, Country, Income, Occupation, ...
    o perhaps to qualify what is being covered
      – eg Non-farm income, Tertiary qualifications
5
Using DDI Metadata for Questionnaires
  Questions have:
    Name – just a multi-lingual name, not used in questionnaires
    Text – the question that is asked
      – can be conditional, multi-lingual, formatted
        » can even have mixed language
    Question Intent – some elaboration about what is being sought
        » multi-lingual, formatted
  POC just uses simple unformatted multi-lingual Text
6
Using DDI Metadata for Questionnaires
  Questions have Response Domains
    what sort of answer is expected or valid
      o Numeric domain – can specify integer or decimal, valid formats and ranges, etc
      o Text domain – can specify format, length
      o Category Domain – valid list of multi-lingual values
        » not really very much use
      o Code Domain – valid list of multi-lingual values with codes
        » a classification
7
Using DDI Metadata for Questionnaires
  Questions have Response Domains
    what sort of answer is expected or valid
      o Date-Time Domain – can specify formats
      o Geographic – eg coordinates, other units
      o Structured Mixed Response Domain – a combination of all of the above
    all domain types can have labels and descriptions
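A Response Domain can be read as a validator for the raw answer. The sketch below covers only a numeric domain and a code domain; the class names, fields and example values are assumptions made for illustration and are not taken from the DDI schema or the POC.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NumericDomain:
    """Numeric response domain: integer or decimal, with an optional range."""
    integer_only: bool = True
    low: Optional[float] = None
    high: Optional[float] = None

    def is_valid(self, raw: str) -> bool:
        try:
            value = int(raw) if self.integer_only else float(raw)
        except ValueError:
            return False
        if self.low is not None and value < self.low:
            return False
        if self.high is not None and value > self.high:
            return False
        return True


@dataclass
class CodeDomain:
    """Code response domain: codes mapped to multi-lingual labels."""
    labels: Dict[str, Dict[str, str]]  # code -> {language: label}

    def is_valid(self, raw: str) -> bool:
        return raw in self.labels

    def label(self, code: str, language: str) -> str:
        return self.labels[code][language]


# Illustrative domains; the range and the codes are made up for the example.
age = NumericDomain(integer_only=True, low=0, high=120)
sex = CodeDomain({"1": {"en": "Male", "fr": "Homme"},
                  "2": {"en": "Female", "fr": "Femme"}})
```

Because the code, not the label, is what gets validated and stored, the same answer can be shown in any language the domain has labels for.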
8
Using DDI Metadata for Questionnaires
  Questions do not go directly into a questionnaire
    o DDI calls a questionnaire an Instrument
  questions constitute a library available for use
    o a “Question Bank”
  questions are selected and assembled into an Instrument
  the assembling of questions is done with Control Constructs
  an Instrument identifies a single Control Construct that builds the questionnaire
9
Control Constructs
  Control Constructs are the critical component in building a questionnaire
    they select the questions
    they control the flow of the questions – branching and looping
    they insert non-question text – “Now I want to ask you about other people in the household”
    they can compute values
    they link to Interviewer Instructions
      o structured DDI Interviewer Instructions
      o unstructured external interviewer instructions material
10
Control Constructs
  Several types of Control Constructs
    Question Construct – selects a Question Item or Multiple Question Item
    Sequence – selects a sequence of other Control Constructs of any type
    If-Then-Else – defines an If condition with optional ElseIf clauses (multiple) and optional Else clause
      » each condition selects a single Control Construct to include
11
Control Constructs
  Several types of Control Constructs
    Loop, Repeat-Until, Repeat-While – eg to loop over people in a household
    Statement Item – inserts non-question multi-lingual text (conditional, formatted)
    Computation Item – a calculation in some language that is assigned to a Variable
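The If-Then-Else behaviour described above (an If condition, optional ElseIf clauses, an optional Else clause, each selecting a single Control Construct) can be sketched as follows. The `IfThenElse` class, the use of Python callables as conditions, and the construct names in the example are all hypothetical; DDI expresses conditions as command code in some language, not as Python functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Answers = Dict[str, str]


@dataclass
class IfThenElse:
    """Hypothetical model of an If-Then-Else Control Construct."""
    # Ordered (condition, construct name) pairs: the If clause first,
    # followed by any ElseIf clauses.
    branches: List[Tuple[Callable[[Answers], bool], str]]
    # Construct selected when no condition holds; None means "select nothing".
    else_construct: Optional[str] = None

    def select(self, answers: Answers) -> Optional[str]:
        """Return the single construct chosen by the first true condition."""
        for condition, construct in self.branches:
            if condition(answers):
                return construct
        return self.else_construct


# Illustrative routing on a made-up "paid_work" answer.
work_routing = IfThenElse(
    branches=[
        (lambda a: a.get("paid_work") == "yes", "Seq_WorkDetails"),
        (lambda a: a.get("paid_work") == "no", "QC_LookingForWork"),
    ],
    else_construct="Stmt_SkipWorkSection",
)
```

Only the first matching clause is used, so the order of the ElseIf clauses matters.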
12
Instrument
  Identifies a single Control Construct to assemble the questionnaire
    o probably a Sequence construct
  Instruments can have a Type
    o a single value taken from some Controlled Vocabulary – a user-managed list of valid values
      » eg, Paper, Internet, CATI, ...
  Instruments can have multiple Software specifications
    o basically just identifying “software” used with instrument – not a great deal of use
13
Instrument
  Instruments do not have any place for useful layout metadata
    just the type of the layout
    a fairly serious limitation
  We need quite a lot of information to do the layout
    how to represent lists
      o tick boxes, list boxes, combo boxes, radio buttons
    how to show flow logic
    which questions to show at once, which to separate
    can the respondent backtrack
  We need additional Layout Metadata
    I have designed some
14
Interviewer Instructions
  A formal DDI metadata type
  Organised, structured instructions
    formatted multi-lingual text
      o may be conditional
  May link to external, non-DDI material
    eg, PDF, Word documents
  Not used in this Proof of Concept
15
Classifications
  DDI holds Classifications as linked Code Schemes and Category Schemes
    a Category Scheme is a list of Categories
      o flat list of multi-lingual names and descriptions
      o eg, Country names, Occupation names, etc
    a Code Scheme selects Categories from Category Schemes, assigns a Code (not multi-lingual), and may specify a hierarchy
      o a Code Scheme may select Categories from multiple Category Schemes
      o multiple Code Schemes may select the same Categories
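The split between Category Schemes (multi-lingual names) and Code Schemes (codes that point at Categories) can be sketched with two small record types. Everything named here (`Category`, `Code`, `label`, the ids `c1` and `c2`, and the two example schemes) is an illustrative assumption, not the DDI schema or the POC's data model.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Category:
    """An entry in a Category Scheme: multi-lingual names, no code."""
    id: str
    names: Dict[str, str]  # language -> name


@dataclass
class Code:
    """An entry in a Code Scheme: a code value attached to a Category."""
    value: str                    # the code itself is not multi-lingual
    category: Category
    parent: Optional[str] = None  # code value of the parent, if hierarchical


def label(scheme: Dict[str, Code], value: str, language: str) -> str:
    """Look up the multi-lingual name behind a code."""
    return scheme[value].category.names[language]


# One Category Scheme of country names (ids are arbitrary handles).
countries = {
    "c1": Category("c1", {"en": "France", "fr": "France"}),
    "c2": Category("c2", {"en": "Germany", "fr": "Allemagne"}),
}

# Two Code Schemes selecting the same Categories with different codes.
alpha_codes = {"FR": Code("FR", countries["c1"]),
               "DE": Code("DE", countries["c2"])}
numeric_codes = {"250": Code("250", countries["c1"]),
                 "276": Code("276", countries["c2"])}
```

Both schemes resolve to the same Category objects, so a change to a name in the Category Scheme is visible through every Code Scheme that selects it.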
16
Code Schemes and Category Schemes
  Used for
    Classifications – a Classification is a Code Scheme
    Controlled Vocabularies – lists of standardised terms
      » defined by DDI, an organisation, a local area
17
Code Schemes and Category Schemes
  Used for
    Response Domains for Questions
      – Code Domains and Category Domains
      – Category Domains are not much use in a multi-lingual environment
        » Categories have different names in different languages with no unique handle except a meaningless Id
    Representations for Variables
      – Code Representation
18
Variables
  A Variable is a container that will hold a data value
    has a Name and Description (both multi-lingual)
    can be linked to a single Concept
      – to indicate what the data represents
    can be linked to multiple Questions
      – to indicate where the data might come from
    can have a Representation
      – Code, Date/Time, Numeric, Text
        » with constraints on values
    can identify a Response Unit and an Analysis Unit
      – a population that it can apply to
19
Logical Record
  A Logical Record consists of a sequence of Variables
    o groups data values for a purpose
      – data from a questionnaire goes into one or more Logical Records
    o Logical Records can be linked
      – eg, Households and Persons
    o Logical Records are independent of any storage or stored format
20
Record Layouts and Physical Structures
  Map a Logical Record to a physical record and an actual stored file format
  Can support a very wide range of structures and storage formats
    CSV, Binary file, XML, database
    multiple record types, linkages of many kinds
  POC does not actually use this
    Simple CSV file maps directly from Logical Record
21
Physical Instance
  Holds information about actual data sets produced
    links to Physical Structures, Record Layouts, and Logical Records
    provides a central management of data from a collection
  POC uses Physical Instance to manage data
    o POC 2.3.3 builds on this POC to show how to use SDMX and DDI metadata together
      – produces tables from SDMX DSD using data collected with DDI
        » uses the Physical Instance information to find the datasets
22
What does the POC do?
  Collects survey data based entirely on metadata
    builds or imports all the metadata
    assembles a survey instrument (questionnaire)
    presents the questionnaire in a Windows Form
    collects data into Logical Records
    saves the data in CSV files
  POC 2.3.3 builds on this POC
    duplicates Concepts and Classifications to SDMX
      o tightly-coupled set of metadata
    uses SDMX DSD to produce tables from the collected data
23
How does the POC do This?
  Basically POC system is a metadata creator/editor
  Build, import Concepts
    o build in UI, import from CSV, SDMX V2.0
  Import Classifications
    o import CSV, SDMX V2.0
    o did not implement build in UI
  Construct Questions in UI
    o multi-lingual text with links to Concepts
    o POC almost supports Multiple Question Items
      – by the end of the week
24
How does the POC do This?
  Build Variables
    o links to Concepts and Questions
  Build Control Constructs
    o Question constructs
      » link to a question
    o Sequence constructs
      » define a sequence of other Control Constructs
    o If-Then-Else constructs
      » allows conditional questionnaire flow
    o POC does not support Loop, Repeat-Until, Repeat-While, Computation Item, and Statement Item constructs
25
How does the POC do This?
  Define Instruments
    o link to a single Control Construct
      – probably to a Sequence construct
    o has an instrument type
      – Windows Form, Web, Paper, CATI, ...
      – POC only supports Windows Form
  Define Logical Records
    o collection of Variables
  Map Logical Record to Instrument
    o map Variables to Questions
      – uses Concept links and question links if present
      – allows user override in UI
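One way to picture the Variable-to-Question mapping is as a lookup with three sources: a user override, an explicit question link on the Variable, and a shared Concept. The precedence used below (override, then question link, then Concept) is an assumption made for this sketch; the slides only say that Concept links and question links are used if present and that the user can override the result. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Question:
    name: str
    concept: Optional[str] = None


@dataclass
class Variable:
    name: str
    concept: Optional[str] = None
    question: Optional[str] = None  # explicit link to a question, if any


def map_variables(
    variables: List[Variable],
    questions: List[Question],
    overrides: Optional[Dict[str, str]] = None,
) -> Dict[str, Optional[str]]:
    """Return {variable name: question name or None}."""
    overrides = overrides or {}
    # First question found for each Concept (assumed tie-break rule).
    by_concept: Dict[str, str] = {}
    for q in questions:
        if q.concept is not None:
            by_concept.setdefault(q.concept, q.name)

    mapping: Dict[str, Optional[str]] = {}
    for v in variables:
        if v.name in overrides:          # user override wins
            mapping[v.name] = overrides[v.name]
        elif v.question is not None:     # then an explicit question link
            mapping[v.name] = v.question
        else:                            # then a shared Concept, if any
            mapping[v.name] = by_concept.get(v.concept)
    return mapping
```

A Variable with no override, no question link and no matching Concept maps to `None`, which a UI could flag for the user to resolve by hand.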
26
How does the POC do This?
  Present the Instrument
    o Render the instrument in a Windows Form
      – using some layout metadata I made up
    o Execute the Control Constructs to select questions and manage question flow
    o present questions in list box, combo box, radio buttons, text box
      – depending on Response Domain and some Layout Metadata
        » not DDI metadata, my design
    o capture the responses into a Logical Record
      – based on the Logical Record – Instrument mapping
    o present questions in language of choice
        » limited choice
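The choice of input control from the Response Domain plus a layout hint might look like the sketch below. The slides do not describe the author's Layout Metadata, so the rule here (an explicit hint wins, short code lists become radio buttons, longer ones a combo box, everything else a text box) and the threshold of four options are purely illustrative assumptions.

```python
from typing import Optional


def choose_widget(
    domain_type: str,
    option_count: int = 0,
    layout_hint: Optional[str] = None,
) -> str:
    """Pick an input control for a question.

    domain_type  -- e.g. "code", "numeric", "text" (hypothetical labels)
    option_count -- number of valid codes for a code domain
    layout_hint  -- an explicit choice from layout metadata, if provided
    """
    if layout_hint is not None:
        return layout_hint
    if domain_type == "code":
        # Assumed rule of thumb: few options fit on screen as radio buttons.
        return "radio buttons" if option_count <= 4 else "combo box"
    return "text box"
```

For example, a Sex question with two codes would get radio buttons, a Country question with a long code list would get a combo box unless the layout metadata asks for a list box, and an Age question would get a text box.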
27
How does the POC do This?
  Retrieve Logical Record set
    o run interview multiple times to get a set of Logical Records
    o results displayed on screen in tabular form
    o mapping as defined in Response Domains and Logical Record to Instrument map
  Save the Logical Records in data file
    o CSV file
    o no actual Record Layout and Physical Structure metadata
      – simplest Record Layout and Physical Structure metadata is almost empty anyway
        » it is just saving the Logical Records with Variables separated by commas
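Saving Logical Records as comma-separated Variables can be sketched with Python's standard `csv` module. The function name, the dictionary representation of a record, and the decision to write no header row are assumptions for the example; the slides do not say how the POC's CSV files are laid out beyond "Variables separated by commas".

```python
import csv
import io
from typing import Dict, List


def records_to_csv(variable_names: List[str], records: List[Dict[str, str]]) -> str:
    """Serialise logical records as CSV text, one record per line.

    Variables are written in the order given by variable_names; a variable
    with no captured value (for example a skipped question) is left empty.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for record in records:
        writer.writerow([record.get(name, "") for name in variable_names])
    return buffer.getvalue()


# Two illustrative interview results; the second skipped the Age question.
example = records_to_csv(
    ["Country", "Sex", "Age"],
    [{"Country": "FR", "Sex": "1", "Age": "34"},
     {"Country": "DE", "Sex": "2"}],
)
```

Using `csv.writer` rather than joining strings by hand means values that themselves contain commas are quoted correctly.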
28
How does the POC do This?
  Save details of CSV data sets in Physical Instance metadata
    so the data can be found for subsequent operations
      o like producing tables in POC 2.3.3
29
Metadata is in a DDI Instance file
  [Diagram: DDI Instance file]
    Group, DDI Instance, Study Unit, Concepts, Code Schemes and Category Schemes, Variables, Control Constructs, Logical Records, Physical Instance, Instruments, Questions
  [Diagram: Layout Metadata file]
    Layout Metadata
30
DDI is fairly complex
  But so is designing a Questionnaire!
  Constructing a questionnaire involves constructing a lot of metadata
    but DDI is fairly logical
    you need to think about what you want in the questionnaire and how you want it to flow
      o but you need to do this if you are designing a questionnaire manually
    once you design a questionnaire you get re-use advantages
      o easy to modify, add languages
      o easy to adapt to some other purposes
      o easy to reuse useful questions and constructs
      o easy to capture the data
31
Let us look at my test questionnaire
  Simple questions about internet access
    based on Eurostat ICT survey
    includes two If-Then-Else constructs to manage flow
  we will look at the structure first
    o a good way to plan your own questionnaire
    o a good way to see how the DDI metadata works
32
ICT question flow
  1 Country
  2 Sex
  3 Age
  A1 Do you have access to a computer at home?
  A2 Do you have access to the internet at home?
  B1 When did you last use a computer?
    If (within last 3 months) B2 How often on average?
  C1 When did you last use the internet?
    If (within last 3 months) C2 How often on average?
33
ICT question flow
  Seq C 1
    1 Country ------- QC 1
    2 Sex ------- QC 2
    3 Age ------- QC 3
    A1 Do you have access to a computer at home? ------- QC 4
    A2 Do you have access to the internet at home? ------- QC 5
    B1 When did you last use a computer? ------- QC 6
    If C 1: If (within last 3 months) B2 How often on average? ------- QC 7
    C1 When did you last use the internet? ------- QC 8
    If C 2: If (within last 3 months) C2 How often on average? ------- QC 9
  QC – Question Construct
  If C – If-Then-Else Construct
  Seq C – Sequence Construct
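The ICT flow can be written down as nested constructs and walked by a very small interpreter, which is roughly what "executing the Control Constructs" amounts to. This is a sketch, not the POC's code: the tuple encoding, the `run` function, the answer value `"within last 3 months"` as a literal string, and the ordering of the first five questions (taken from the QC numbering) are all assumptions.

```python
from typing import Callable, Dict

RECENT = "within last 3 months"

# Seq C 1: one Sequence holding Question Constructs and two If-Then-Else
# constructs. Each construct is a tuple whose first element names its kind.
SEQ_C1 = ("seq", [
    ("q", "Country"),                                   # QC 1
    ("q", "Sex"),                                       # QC 2
    ("q", "Age"),                                       # QC 3
    ("q", "A1"),                                        # QC 4
    ("q", "A2"),                                        # QC 5
    ("q", "B1"),                                        # QC 6
    ("if", lambda a: a["B1"] == RECENT, ("q", "B2")),   # If C 1 -> QC 7
    ("q", "C1"),                                        # QC 8
    ("if", lambda a: a["C1"] == RECENT, ("q", "C2")),   # If C 2 -> QC 9
])


def run(construct, ask: Callable[[str], str], answers: Dict[str, str]) -> Dict[str, str]:
    """Execute a construct, asking questions and recording the answers."""
    kind = construct[0]
    if kind == "q":
        answers[construct[1]] = ask(construct[1])
    elif kind == "seq":
        for child in construct[1]:
            run(child, ask, answers)
    elif kind == "if":
        if construct[1](answers):
            run(construct[2], ask, answers)
    return answers
```

Passing a function that looks answers up in a prepared dictionary as `ask` makes it easy to check which questions a given respondent would actually see; in an interactive setting `ask` would present the question and return the response.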
36
Let us have a look at the POC Demo
37
What does the POC show?
  It is realistic to use the DDI metadata to design and present a survey questionnaire
    The POC depended entirely on the metadata
      o absolutely no knowledge about the survey built into the system
  Designing a survey questionnaire in DDI is fairly complex
    So is designing one manually
  There are some clear advantages
    modification and reuse is easy
    multi-lingual presentation is easy
41
What does the POC show?
  Really need some form of Layout Metadata
    DDI Instrument and Control Construct metadata gives no guidance on layout
    POC case was very simple
      o but still needed some layout information
      o realistic survey questionnaire needs considerable layout information
    needs more thought to design this metadata
42
What does the POC show?
  POC used a Windows Form questionnaire
    no practical use for production survey
    but other questionnaire formats should be easy
      o hope to have paper example (in Word) by end of week
      o script for a Web Form questionnaire is straightforward
        – but Web Form system still needs to process Control Constructs
      o script for CATI is easy
      o script for Blaise should be easy
    easy to have questionnaire available in multiple formats
43
Thank you
Questions?
Bryan Fitzpatrick
Rapanea Consulting Limited
BryanMFitzpatrick@Yahoo.CO.UK
Ph +44-7789-886536