IMS 2.1.3 Proof of Concept for Data Capture using Metadata Bryan Fitzpatrick Rapanea Consulting Limited June 2014.

Slides:



Advertisements
Similar presentations
SDMX in the Vietnam Ministry of Planning and Investment - A Data Model to Manage Metadata and Data ETV2 Component 5 – Facilitating better decision-making.
Advertisements

Organisation Of Data (1) Database Theory
Understanding Relational Databases Basic Concepts and Applications for Qualitative Content Analysis.
Mobile Surveyor A Windows PDA/Mobile based survey Software for easy, fast and error free data collection.
Foundational Objects. Areas of coverage Technical objects Foundational objects Lessons learned from review of Use Case content Simple Study Simple Questionnaire.
Computer Programming Rattapoom Waranusast Department of Electrical and Computer Engineering Faculty of Engineering, Naresuan University.
The Variable Explosion Or how the DDI variable spread out to inhabit multiple modules in DDI 3.0.
DDI 3.0 Conceptual Model Chris Nelson. Why Have a Model Non syntactic representation of the business domain Useful for identifying common constructs –Identification,
IASSIST Conference 2006 – Ann Arbor, May Metadata as report and support A case for distinguishing expected from fielded metadata Reto Hadorn S I.
Codebook Centric to Life-Cycle Centric In the beginning….
8/28/97Information Organization and Retrieval Files and Databases University of California, Berkeley School of Information Management and Systems SIMS.
Chapter 1 Program Design
Attribute databases. GIS Definition Diagram Output Query Results.
 Name and organization  Have you worked with DDI before? (2 or 3)  If not, are you familiar with XML?  What kind of CAI systems do you use?  Goals.
Tutorial 11: Connecting to External Data
Database Design IST 7-10 Presented by Miss Egan and Miss Richards.
Automation Repository - QTP Tutorials Made Easy The Zero th Step TEST AUTOMATION AND QTP.
CASE Tools And Their Effect On Software Quality Peter Geddis – pxg07u.
Background Data validation, a critical issue for the E.S.S.
Topics Covered: Data preparation Data preparation Data capturing Data capturing Data verification and validation Data verification and validation Data.
WP.5 - DDI-SDMX Integration
WP.5 - DDI-SDMX Integration E.S.S. cross-cutting project on Information Models and Standards Marco Pellegrino, Denis Grofils Eurostat METIS Work Session6-8.
NSI 1 Collect Process AnalyseDisseminate Survey A Survey B Historically statistical organisations have produced specialised business processes and IT.
Case Studies: Statistics Canada (WP 11) Alice Born Statistics UNECE Workshop on Statistical Metadata.
What is Sure BDCs? BDC stands for Batch Data Communication and is also known as Batch Input. It is a technique for mass input of data into SAP by simulating.
Survey Data Management and Combined use of DDI and SDMX DDI and SDMX use case Labor Force Statistics.
Introduction to Microsoft Access 2003 Mr. A. Craig Dixon CIS 100: Introduction to Computers Spring 2006.
CHRIS NELSON METADATA TECHNOLOGY WORK SESSION ON STATISTICAL METADATA GENEVA 6-8 MAY 2013 Designing a Metadata Repository Metadata Technology Ltd.
CountryData Technologies for Data Exchange SDMX Information Model: An Introduction.
C++ Implementation ( Version 1 – Text Interface ) Elimination of services of our system. Elimination of services of our system. General Flow of the program.
© 2007 by Prentice Hall 1 Introduction to databases.
I Information Systems Technology Ross Malaga 4 "Part I Understanding Information Systems Technology" Copyright © 2005 Prentice Hall, Inc. 4-1 DATABASE.
DDI-RDF Leveraging the DDI Model for the Linked Data Web.
Chapter 8 Collecting Data with Forms. Chapter 8 Lessons Introduction 1.Plan and create a form 2.Edit and format a form 3.Work with form objects 4.Test.
Key Applications Module Lesson 21 — Access Essentials
1 CS 350 Data Structures Chaminade University of Honolulu.
Chapter 17 Creating a Database.
 Three-Schema Architecture Three-Schema Architecture  Internal Level Internal Level  Conceptual Level Conceptual Level  External Level External Level.
Property of Jack Wilson, Cerritos College1 CIS Computer Programming Logic Programming Concepts Overview prepared by Jack Wilson Cerritos College.
Environment Change Information Request Change Definition has subtype of Business Case based upon ConceptPopulation Gives context for Statistical Program.
ITGS Databases.
+ Information Systems and Databases 2.2 Organisation.
SOFTWARE DESIGN. INTRODUCTION There are 3 distinct types of activities in design 1.External design 2.Architectural design 3.Detailed design Architectural.
Grade Book Database Presentation Jeanne Winstead CINS 137.
Database Management Systems (DBMS)
Data Structures and Algorithms Dr. Tehseen Zia Assistant Professor Dept. Computer Science and IT University of Sargodha Lecture 1.
Survey Data Management and the Combined Use of DDI and SDMX Arofan Gregory Chris Nelson Metadata Technology Eurostat, June
Eurostat 4. SDMX: Main objects for data exchange 1 Raynald Palmieri Eurostat Unit B5: “Central data and metadata services” SDMX Basics course, October.
ABS Issues for DDI Futures Bryan Fitzpatrick October 2012.
Objectives Understand Corrective, Perfective and Preventive maintenance Discuss the general concepts of software configuration management.
Database (Microsoft Access). Database A database is an organized collection of related data about a specific topic or purpose. Examples of databases include:
>> Metadata What is it, and what could it be? EU Twinning Project Activity E.2 26 May 2013.
N5 Databases Notes Information Systems Design & Development: Structures and links.
GO! with Microsoft Office 2016
End-to-End Management of the Statistical Process An Initiative by ABS
GO! with Microsoft Access 2016
REDCap Data Migration from CSV file
What’s New in Colectica 5.3 Part 1
SDMX Information Model
Basic Concepts in Data Management
SDMX: A brief introduction
Introduction to Database Programs
SDMX Information Model: An Introduction
Question Banks, Reusability, and DDI 3.2 (Use Parameters)
Spreadsheets, Modelling & Databases
DATABASES WHAT IS A DATABASE?
Prepared by Peter Boško, Luxembourg June 2012
The ultimate in data organization
Introduction to Database Programs
Introduction to DDI Mogens Grosen Nielsen,
Presentation transcript:

IMS Proof of Concept for Data Capture using Metadata Bryan Fitzpatrick Rapanea Consulting Limited June 2014

Data Capture using Metadata Aim is to demonstrate designing and running a survey questionnaire based entirely on metadata Aim is to use DDI metadata design the questions organise the questions into a questionnaire present the questionnaire capture and save the responses all based entirely on the metadata

Using DDI Metadata for Questionnaires DDI has metadata for Questions a simple question goes in a Question Item – What is your age in years? a complex question goes in a Multiple Question Item – Did you do paid work last week? » Full Time or Part Time? » How many hours? o A Multiple Question Item can contain Question Items or other Multiple Question Items

Using DDI Metadata for Questionnaires Questions can link to one or more Concepts to indicate what the question is seeking to cover o Age, Sex, Country, Income, Occupation,... o perhaps to qualify what is being covered – eg Non-farm income, Tertiary qualifications

Using DDI Metadata for Questionnaires Questions have: Name – just a multi-lingual name, not used in questionnaires Text – the question that is asked – can be conditional, multi-lingual, formatted » can even have mixed language Question Intent – some elaboration about what is being sought » multi-lingual, formatted POC just uses simple unformatted multi-lingual Text

Using DDI Metadata for Questionnaires Questions have Response Domains what sort of answer is expected or valid o Numeric domain – can specify integer of decimal, valid formats and ranges, etc o Text domain – can specify format, length o Category Domain – valid list of multi-lingual values » not really very much use o Code Domain – valid list of multi-lingual values with codes » a classification

Using DDI Metadata for Questionnaires Questions have Response Domains what sort of answer is expected or valid o Date-Time Domain – can specify formats o Geographic – eg coordinates, other units o Structured Mixed Response Domain – a combination of all of the above all domain type can have labels and descriptions

Using DDI Metadata for Questionnaires Questions do not go directly into a questionnaire o DDI calls a questionnaire an Instrument questions constitute a library available for use o a “Question Bank” questions are selected and assembled into an Instrument the assembling of questions is done with Control Constructs an Instrument identifies a single Control Construct that builds the questionnaire

Control Constructs Control Constructs are the critical component in building a questionnaire they select the questions they control the flow of the questions – branching and looping they insert non-question text – “Now I want to ask you about other people in the household” they can compute values they link to Interviewer Instructions o structured DDI Interviewer Instructions o unstructured external interviewer instructions material

Control Constructs Several types of Control Constructs Question Construct – selects a Question Item or Multiple Question Item Sequence – selects a sequence of other control constructs of any type If-Then-Else – defines an If condition with optional ElseIf clauses (multiple) and optional Else clause » each condition selects a single Control Construct to include

Control Constructs Several types of Control Constructs Loop, Repeat-Until, Repeat-While – eg to loop over people in a household Statement Item – inserts non-question multi-lingual text (conditional, formatted) Computation Item – a calculation in some language that is assigned to a Variable

Instrument Identifies a single Control Construct to assemble the questionnaire o probably a Sequence construct Instruments can have an Type o a single value taken from some Controlled Vocabulary – a user-managed list of valid values » eg, Paper, Internet, CATI,... Instruments can have multiple Software specifications o basically just identifying “software” used with instrument – not a great deal of use

Instrument Instruments do not have any place for useful layout metadata just the type of the layout a fairly serious limitation We need quite a lot of information to do the layout how to represent lists o tick boxes, list boxes, combo boxes, radio buttons how to show flow logic which questions to show at once, which to separate can the respondent backtrack We need additional Layout Metadata I have designed some

Interviewer Instructions A formal DDI metadata type Organised, structured instructions formatted multi-lingual text o may be conditional May link to external, non-DDI material eg, PDF, Word documents Not used in this Proof of Concept

Classifications DDI holds Classifications as linked Code Schemes and Category Schemes a Category Scheme is a list of Categories o flat list of multi-lingual names and descriptions o eg, Country names, Occupation names, etc a Code Schemes selects Categories from Category Schemes, assigns a Code (not multi-lingual), and may specify a hierarchy o a Code Scheme may select Categories from multiple Category Schemes o multiple Code Schemes may select the same Categories

Code Schemes and Category Schemes Used for Classifications – a Classification is a Code Scheme Controlled Vocabularies – lists of standardised terms » defined by DDI, an organisation, a local area

Code Schemes and Category Schemes Used for Response Domains for Questions – Code Domains and Category Domains – Category domains are not much use in a multi-lingual environment » Categories have different names in different languages with no unique handle except a meaningless Id Representations for Variables – Code Representation

Variables A Variable is a container that will hold a data value has a Name and Description (both multi-lingual) can be linked to a single Concept – to indicate what the data represents can be linked to multiple Questions – to indicate where the data comes might come from can have a Representation – Code, Date/Time, Numeric, Text » with constraints on values can identify a Response Unit and an Analysis Unit – a population that it can apply to

Logical Record A Logical record consists of a sequence of Variables o groups data values for a purpose – data from a questionnaire goes into one or more Logical Records o Logical Records can be linked – eg, Households and Persons o Logical Records are independent of any storage or stored format

Record Layouts and Physical Structures Map a Logical record to a physical record and an actual stored file format Can support a very wide range of structures and storage formats CSV, Binary file, XML, database multiple record types, linkages of many kinds POC does not actually use this Simple CSV file maps directly from Logical record

Physical Instance Holds information about actual data sets produced links to Physical Structures, Record Layouts, and Logical records provides a central management of data from a collection POC uses Physical Instance to manage data o POC builds on this POC to show how to use SDMX and DDI metadata together – produces tables from SDMX DSD using data collected with DDI » uses the Physical Instance information to find the datasets

What does the POC do? Collects survey data based entirely on metadata builds or imports all the metadata assembles a survey instrument (questionnaire) presents the questionnaire in a Windows Form collects data into Logical Records saves the data in CSV files POC builds on this POC duplicates Concepts and Classifications to SDMX o tightly-coupled set of metadata uses SDMX DSD to produces tables from the collected data

How does the POC do This? Basically POC system is a metadata creator/editor Build, import Concepts o build in UI, import from CSV, SDMX V2.0 Import Classifications o import CSV, SDMX V2.0 o did not implement build in UI Construct Questions in UI o multi-lingual text with links to Concepts o POC almost supports Multiple Question Items – by the end of the week

How does the POC do This? Build Variables o links to Concepts and Questions Build Control Constructs o Question constructs » link to a question o Sequence constructs » define a sequence of other Control Constructs o If-Then-Else constructs » allows conditional questionnaire flow o POC does not support Loop, Repeat-Until, Repeat-While, Computation Item, and Statement Item constructs

How does the POC do This? Define Instruments o link to a single Control Construct – probably to a Sequence construct o has an instrument type – Windows Form, Web, Paper, CATI,.. – POC only supports Windows Form Define Logical Records o collection of Variables Map Logical record to Instrument o map Variables to Questions – uses Concept links and question links if present – allows user override in UI

How does the POC do This? Present the Instrument o Render the instrument in a Windows Form – using some layout metadata I made up o Execute the Control Constructs to select questions and manage question flow o present questions in list box, combo box, radio buttons, text box – depending on response Domain and some Layout Metadata » not DDI metadata, my design o capture the responses into a Logical record – based on the Logical Record – Instrument mapping o present questions in language of choice » limited choice

How does the POC do This? Retrieve Logical Record set o run interview multiple times to get a set of logical records o results displayed on screen in tabular form o mapping as defined in Response Domains and Logical record to Instrument map Save the Logical records in data file o CSV file o no actual Record Layout and Physical Structure metadata – simplest Record Layout and Physical Structure metadata is almost empty anyway » it is just saving the Logical Records with Variables separated by commas

How does the POC do This? Save details of CSV data sets in Physical Instance metadata so the data can be found for subsequent operations o like producing tables in POC 2.3.3

Metadata is in a DDI Instance file Group DDI Instance Study Unit Concepts Code Schemes and Category Schemes Variables Control Constructs Logical Records Physical Instance Instruments Questions Layout Metadata file Layout Metadata

DDI is fairly complex But so is designing a Questionnaire! Constructing questionnaire involves constructing a lot of metadata but DDI is fairly logical you need to think about what you want in the questionnaire and how you want it to flow o but you need to do this if you are designing a questionnaire manually once you design questionnaire you get re-use advantages o easy to modify, add languages o easy to adapt to some other purposes o easy to reuse useful questions and constructs o easy to capture the data

Let us look at my test questionnaire Simple questions about internet access based on Eurostat ICT survey includes two If-Then-Else constructs to manage flow we will look at the structure first o a good way to plan your own questionnaire o a good way to see how the DDI metadata works

ICT question flow 1 Country A1 Do you have access to a computer at home? 2 Sex 3 Age A2 Do you have access to the internet at home? B1 When did you last use a computer? If (within last 3 months)B2 How often on average? C1 When did you last use the internet? If (within last 3 months)C2 How often on average?

ICT question flow 1 Country QC 1 A1 Do you have access to a computer at home? QC 4 2 Sex QC 2 3 Age QC 3 A2 Do you have access to the internet at home? QC 5 B1 When did you last use a computer? QC QC QC QC 9 If (within last 3 months)B2 How often on average? C1 When did you last use the internet? If (within last 3 months)C2 How often on average? QC – Question Construct

ICT question flow 1 Country QC 1 A1 Do you have access to a computer at home? QC 4 2 Sex QC 2 3 Age QC 3 A2 Do you have access to the internet at home? QC 5 B1 When did you last use a computer? QC QC QC QC 9 If (within last 3 months)B2 How often on average? C1 When did you last use the internet? If (within last 3 months)C2 How often on average? QC – Question Construct If C 1 If C 2 If C – If-ThenElse Construct

ICT question flow 1 Country QC 1 A1 Do you have access to a computer at home? QC 4 2 Sex QC 2 3 Age QC 3 A2 Do you have access to the internet at home? QC 5 B1 When did you last use a computer? QC QC QC QC 9 If (within last 3 months)B2 How often on average? C1 When did you last use the internet? If (within last 3 months)C2 How often on average? QC – Question Construct If C 1 If C 2 If C – If-ThenElse Construct Seq C 1 Seq C – Sequence Construct

Let us have a look at the POC Demo

What does the POC show?

It is realistic to use the DDI metadata to design and present a survey questionnaire The POC depended entirely on the metadata o absolutely no knowledge about the survey built into the system

What does the POC show? It is realistic to use the DDI metadata to design and present a survey questionnaire The POC depended entirely on the metadata o absolutely no knowledge about the survey built into the system Designing a survey questionnaire in DDI is fairly complex So is designing one by manually

What does the POC show? It is realistic to use the DDI metadata to design and present a survey questionnaire The POC depended entirely on the metadata o absolutely no knowledge about the survey built into the system Designing a survey questionnaire in DDI is fairly complex So is designing one by manually There are some clear advantages modification and reuse is easy multi-lingual presentation is easy

What does the POC show? Really need some form of Layout Metadata DDI Instrument and Control Construct metadata gives no guidance on layout POC case was very simple o but still needed some layout information o realistic survey questionnaire needs considerable layout information needs more thought to design this metadata

What does the POC show? POC used a Windows Form questionnaire no practical use for production survey but other questionnaire formats should be easy o hope to have paper example (in Word) by end of week o script for a Web Form questionnaire is straight-forward – but Web Form system still needs to process Control Constructs o script for CATI is easy o script for Blaise should be easy easy to have questionnaire available in multiple formats

Thank you Questions? Bryan Fitzpatrick Rapanea Consulting Limited Ph