Presentation is loading. Please wait.

Presentation is loading. Please wait.

Infotools Harmoni IHU: Data Sources

Similar presentations


Presentation on theme: "Infotools Harmoni IHU: Data Sources"— Presentation transcript:

1 Infotools Harmoni IHU: Data Sources
August 2017 Presented by: Janine Takle

2 Data Sources A data source is the starting point for all you do in Infotools Harmoni A source is a collection of data that is either in an imported file, or in a connected data store Each record in a source corresponds to a count in Infotools Harmoni Infotools Harmoni itself is source agnostic

3 Sources vision Our vision is for any appropriate data source to be analysed, have insights visualised, then distributed, all within Infotools Harmoni Connections that facilitate real time updates are expected to be the way of the future We have a continuous development stream dedicated to adding connections and supporting new sources We are strongly influenced by the needs of our clients when identifying additional sources to support

4 Source Types

5 Imported files Imported files capture data for a moment in time
Files that can be imported into Infotools Harmoni include: SAV files X-files MDD/DDF – Dimensions files XLSX – Excel files TXT files – tab delimited text CSV files – comma delimited text

6 Direct connections Direct connections allow updated data to flow into Infotools Harmoni in real time Direct connections are achieved using APIs (Application Programming Interface). If an API is available in a client’s data collection system we can potentially develop a direct connection We currently support direct connections to: Tables in SQL Server Views in SQL Server Projects in Decipher

7 Warning, from here on in we dive into the detail….

8 Source Structures

9 Source content Sources can be divided into 2 types – record level and summary level Record level sources are sources where analysing the count of records is relevant, e.g. where a record represents a respondent, a log entry, a sales transaction, etc. Summary level sources are sources where analysing the count of records is irrelevant, and analysis only has meaning when a measure is applied, e.g. scores, GRPs, Spend, etc. Files in SAV and Dimensions format typically contain record level data with a record representing a respondent

10 Creating sources SAV files can be generated from a variety of data collection applications, but not all make full use of available features X-files is Infotools’ proprietary format read by our legacy Desktop applications Dimensions files are typically created by Dimensions, although some other market research data collection applications can also create them Xlsx files are primarily created from Excel in Office 2007+ Text files, delimited with a tab or comma, can be generated from a variety of systems, and do not have market research specific structures

11 SAV Structure Coded responses have labels
Looped questions have a separate variable for each loop combination All respondents asked the question have a response Respondents not asked the question don’t have any response Multiple response sets capture yes responses Cleaned data, with only valid respondents

12 X-File Structure X-files are typically already designed
Vaxes must be in indexless format (achieved using SUP -Clean) Each item within a common type needs a unique label Headers need to be new style, i.e. version 4+ Matrices, yngrids and vaxes with predefines are not supported X-files more likely for internal consumption, or customers with existing data services

13 Before continuing with the next 6 slides of detail, let’s just come up for air and be inspired by Albert Einstein for a minute… While we have some conditions around the structure of the data that can be analysed in IH, the content of the data is limited only by your imagination

14 Now let’s dive back down into the detail….

15 Dimensions MDD/DDF Structure
The Dimensions converter is currently undergoing an overhaul Supported structures will be confirmed within the next few months

16 XLSX Structure First row contains headers with unique, non-blank descriptions Fields contain labels, not codes. If codes are used they will need to be renamed to labels after importing into Infotools Harmoni (IH) Key identifiers can be added to the header labels to ensure required Infotools Harmoni types are applied No blank rows between valid rows of data Numeric values don’t contain text, e.g. $, commas, % No fields contain |

17 TXT Structure First row contains headers with unique, non-blank, descriptions that don’t include tabs Fields are delimited with a tab Records contain labels, not codes. If codes are used they will need to be renamed to labels after importing into IH Key identifiers can be added to the header labels to ensure required IH types are applied No blank rows between valid rows of data Text containing tabs is enclosed in double-quotes Time/Date fields contain descriptions, not numbers Numeric values don’t contain text, e.g. $, tabs, %

18 CSV Structure First row contains headers with unique, non-blank, descriptions that don’t include commas Fields are delimited with a comma Records contain labels, not codes (if codes are used they will need to be renamed to labels after importing into IH) Key identifiers can be added to the header labels to ensure required IH types are applied No blank rows between valid rows of data Text containing commas is enclosed in double-quotes Time/Date fields contain descriptions, not numbers Numeric values don’t contain text, e.g. $, commas, %

19 SQL Structure All relevant data exists in a single view or table
All data associated with a unique record is reported in a single table or view Key identifiers can be added to field names to advise IH on the default type to import Fields contain labels, not codes (if codes are used they will need to be renamed to labels after importing into IH) Keep in mind that field names in SQL are limited to 128 characters, and cannot contain |

20 Direct Connection Structure
Each Direct Connection is developed to map the data types in the data collection system with the data types available in Infotools Harmoni Direct Connections allow market research specific structures, such as grids and multiple responses, to retain their internal relationships

21 Source Dictionaries

22

23 Source Dictionaries A Dictionary contains meta-data to guide interpretation When loading sources with a dictionary, Infotools Harmoni automatically maps items into Infotools Harmoni item types When loading sources that d on’t contain a dictionary, the user is given the option to override the automated mapping of source variables into Infotools Harmoni types Sources with inherent dictionaries are SAV, X-files, MDD/DDF and Decipher

24 SAV Dictionary Variables that are a Numeric Type, and have defined Value Labels become standard axes in IH, e.g. Variables that are a Numeric or Date Type, and have no defined Value Labels become measures in IH, e.g. Variables that are a String Type, and don’t have defined Value Labels become text items in IH, e.g.

25 Dictionary Tricks If the file doesn’t have a dictionary, key identifiers can be used to pre-determine the data type in text file imports The key identifiers are: $ any field starting with $ becomes a measure in IH $weight any field starting with $weight becomes a weight in IH & any field starting with & becomes a text item in IH

26 Tips & Tricks

27

28 Source Tips & Tricks You can see the sources in a Project that’s not loaded by selecting the project and clicking Sources You can see the sources that exist in a loaded Project by clicking Once in the view/add sources menu, you can add more sources to a project by clicking

29 Source Tips & Tricks Projects can contain multiple sources - harmonization of sources is, of course, one of our key selling points! Infotools Harmoni is source agnostic, so sources in a project can be a mix of imported files and direct connections Sources are specific to a Project

30 Update Tips & Tricks If a source is updated, or a new source is added, dictionary mappings applied to the previous source are applied to the new one When adding new sources, items that don’t already exist in the project are added to the end If an item does not appear with the expected label or type, Infotools Harmoni provides functionality to change these things within your Project … although sometimes it’s easier to change the source!

31 Filetype Tips & Tricks When using an x-files source you need to upload: Xbf Xdf Xef Xlk (optional) Excel files must be .xlsx Infotools Harmoni can use sources in any language, including languages requiring multi-byte storage, e.g. traditional Chinese

32 Tips & Tricks When importing files that do not have an associated dictionary, the elements of standard axes are written into the axis in the order that they appear within the file Multiple response sets in SAV files appear at the bottom of the project Data imported into SQL from Excel, Access or Text, has field names truncated at 64 characters

33 Helpful resources Infotools Harmoni Knowledge Bank Sample files
Sample files Infotools Harmoni – client sites Infotools Harmoni – internal sites

34 Try it yourself

35

36 Try it yourself…? Sample files can be found in us/articles/ Try importing each of the different sources and see if you can answer the following questions

37 Can you…? See what’s new in the 201502 SAV file that wasn’t in 201501?
Work out why the sample size is so high in the x-files? Import Experience feedback in the XLSX to be a verbatim in Infotools Harmoni Add the TXT file into the same project you imported the SAV files and watch the magic happen Take a copy of the CSV and change the source so Num Bevs Per Week is recognised as a measure in the import screen?

38 What is this data…? The following slides show some screenshots of sources. For each data source, consider the following questions: Is this data record level or summary level? Is this data suitably formatted to import into Infotools Harmoni? If not, why not?

39 What is this data…Source 1

40 What is this data…Source 2

41 What is this data…Source 3

42 What is this data…Source 4

43 What is this data…Source 5

44 Answers/Hints What is this data Try it yourself
Source 5 – summary level, suitably formatted Source 4 – record level, suitably formatted Source 3 – summary level, suitably formatted Source 2 – summary level, not suitably formatted – there are blank lines, merged cells in the headers, headings among the entries in column A, etc. Source 1 – record level, suitably formatted What is this data Hint: Insert a $ into the start of the Num Bevs Per Week header Hint: the colouring indicates the common and non-common items Hint: Insert a & into the start of the Experience feedback header Hint: a population projection has been set in the xlk Hint: use the NEW filter at the top of the project view Try it yourself

45 Questions

46


Download ppt "Infotools Harmoni IHU: Data Sources"

Similar presentations


Ads by Google