Saturday, 11 June 2016 Project FoodCASE Workshop Data Quality Research on Food Composition Database Systems © Department of Computer Science | ETH Zürich Karl Presser
Agenda
- Project FoodCASE
- Demo of Software FoodCASE
- Interactive Questionnaire
  - Single Value
  - Aggregated Value
  - Data Quality
  - Recipe
  - General
Project FoodCASE
- FoodCASE is a research project (my Ph.D. project) on the data quality of scientific database systems from an "IT-technical" point of view
- FoodCASE = "Food Composition And System Environment"
- The project contains a food composition database management system, also called FoodCASE
- Project FoodCASE has 23 participants from 20 countries
- More information:
Demo
Interactive Questionnaire
- Fill in your position for data quality research purposes
- We discuss every question first
- Then everybody makes a decision
- Only one answer per (sub)question is allowed
- Additional answers that are not listed are allowed
Question 1 - Single Values
Do we need to store information for more than one sample for a single value?
[Diagram: a single value linked to Sample 1, Sample 2 and Sample 3]
Question 2 - Single Values
Which reference information do we want to store in FoodCASE?
- Original Reference Code KEY, Standard Reference Code, Acquisition Type, Reference Type, Citation (these fields are what will be included in the interchange package)
- Title, Authors, Publication Date, Version, Original Language, ISBN, First Edition Date, Edition Number, Number of Pages, Book Title, Editors, ISBN, Pages, Long Journal Name, Abbreviated Journal Name, ISSN, Volume, Issue, Series Name, Series Number, Report Title, File Format, WWW, Publication Medium, Operating System, Primary Publication Media, Valid from, Remarks
Question 3 - Single Values
How many references can a single value have?
[Diagram: a single value linked to Reference 1, Reference 2 and Reference 3]
Question 4 - Single Values
Do we need to store information for more than one method for a single value?
[Diagram: a single value linked to Method 1, Method 2 and Method 3]
Question 5 - Aggregated Values
How do we round aggregated values?
- Example: > ?, > ?, > ?
- Significant digits: 2, 3, 4, 5, 6
  E.g. 3 digits: > 10.0, > 9.8*10^-7, > 5.86
- Relative error: 1%, 2%, 3%, 4%, 5%
  E.g. 1%: > 10, > 9.8*10^-7, > 5.9
- Problem with y*10^-7: convert to smaller units; if the value still needs a power-of-ten factor, set it to 0
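The two rounding options in Question 5 could be sketched in Python as follows. This is a minimal illustration, not FoodCASE code: the function names `round_significant` and `round_relative` are hypothetical, and the input values in the usage lines are assumptions, since the original inputs of the slide's examples are not preserved. The relative-error variant is interpreted here as "keep the fewest significant digits whose rounding error stays within the tolerance".

```python
import math


def round_significant(value: float, digits: int) -> float:
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 1 for 10.0049 and -7 for 9.8e-7.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


def round_relative(value: float, max_rel_error: float, max_digits: int = 15) -> float:
    """Round to the fewest significant digits whose relative error
    does not exceed max_rel_error (0.01 means 1%)."""
    if value == 0:
        return 0.0
    for digits in range(1, max_digits + 1):
        candidate = round_significant(value, digits)
        if abs(candidate - value) <= max_rel_error * abs(value):
            return candidate
    return value


# Hypothetical inputs chosen for illustration only.
for raw in (10.0049, 9.8e-7, 5.8649):
    print(raw, round_significant(raw, 3), round_relative(raw, 0.01))
```

Under these assumptions a fixed digit count gives every value the same number of significant digits, while the relative-error rule lets values that round well keep fewer digits.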
Question 6 - Aggregated Values
How should versioning work?
- Kind of versioning: no versioning, 2-fold versioning (current and rest), archive
- Content of versioning: only values, a subset (which?), all
- Functionality I: creating a new version -> copy the preceding version or start with an empty set
- Functionality II: publish the current version on the website, or make it definable which version is published
- Functionality III: older versions are editable to correct errors
Question 7 - Aggregated Values
How do we calculate the confidence code from the quality indexes?
[Diagram: Single value 1 (weight 1, QI: 2.4), Single value 2 (weight 2, QI: 5.1) and Single value 3 (weight 3, QI: 8.3) feed into one aggregated value with Confidence Code: ?]
- Confidence code = smallest quality index
- Confidence code = weighted sum of quality indexes
- Other possibilities?
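The two candidate rules from Question 7 can be compared with a short Python sketch. The function names are hypothetical, and the weights 0.5, 0.3 and 0.2 are assumptions made up for illustration (the slide only names them weight 1 to weight 3). The weighted variant divides by the sum of the weights so that the result stays on the same scale as the quality indexes.

```python
from typing import Sequence


def confidence_smallest(quality_indexes: Sequence[float]) -> float:
    """Option 1: the confidence code is the smallest quality index."""
    return min(quality_indexes)


def confidence_weighted(quality_indexes: Sequence[float],
                        weights: Sequence[float]) -> float:
    """Option 2: weighted sum of the quality indexes, normalised by the
    total weight so the result stays within the range of the inputs."""
    if len(quality_indexes) != len(weights):
        raise ValueError("one weight per quality index is required")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(q * w for q, w in zip(quality_indexes, weights))
    return weighted / total_weight


qis = [2.4, 5.1, 8.3]        # quality indexes from the slide
weights = [0.5, 0.3, 0.2]    # hypothetical weights, not from the slide

print(confidence_smallest(qis))
print(confidence_weighted(qis, weights))
```

The smallest-index rule is pessimistic: one poorly documented single value determines the code. The weighted rule lets well-documented values compensate, in proportion to how much each contributes to the aggregated value.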
Questions 8-13 - Data Quality Research
Please fill in your answers to questions 8 to 13
Question 14 - Recipes
How can a retention factor be determined?
Recipe calculation report of Ana Lucia, Simone and Bernd:
- Retention factors are defined according to the EuroFIR food classification and LanguaL codes
- Food groups can be country specific
- What about multiple cooking methods?
Question 15 - Recipes
What should FoodCASE do if yield factors are missing?
- Can recipe calculation be done without yield factors?
- Is there a default value that can be used?
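To make the "default value" option of Question 15 concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name, the simplified formula (nutrient content per 100 g of the prepared dish, with the yield factor applied to the total raw weight and retention factors left out), and the default of 1.0, which means "no weight change during preparation". It is not the calculation FoodCASE uses.

```python
from typing import Optional, Sequence, Tuple

# Assumed default: 1.0 means the dish weighs the same as its raw ingredients.
DEFAULT_YIELD_FACTOR = 1.0


def nutrient_per_100g(ingredients: Sequence[Tuple[float, float]],
                      yield_factor: Optional[float] = None) -> float:
    """Nutrient content per 100 g of the prepared dish.

    ingredients: pairs of (raw weight in g, nutrient content per 100 g raw).
    yield_factor: prepared weight / raw weight; None means it is missing.
    """
    if yield_factor is None:
        yield_factor = DEFAULT_YIELD_FACTOR  # fall back to the assumed default
    raw_weight = sum(weight for weight, _ in ingredients)
    nutrient_total = sum(weight * content / 100 for weight, content in ingredients)
    prepared_weight = raw_weight * yield_factor
    return nutrient_total / prepared_weight * 100


# Hypothetical recipe: 200 g at 10 units/100 g and 100 g at 4 units/100 g.
recipe = [(200.0, 10.0), (100.0, 4.0)]
print(nutrient_per_100g(recipe, yield_factor=0.8))  # known yield factor
print(nutrient_per_100g(recipe))                    # missing yield factor
```

In this sketch the fallback silently changes the result, so a real system would probably also flag the value as calculated with a default yield factor.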
Question 16 - General
What information do we need from the Codex Alimentarius?
list.do?lang=en
- No information
- Link to the homepage
- Certain fields, which?
Question 17 - General
What information do we need from the E-Number and the INS-Code?
itives/add_labelling_en.htm
- No information
- INS-Code
Question 18 - General
What information should be deletable?
- On single value level
- On aggregated level
- On recipe level
Question 19 - General
What happens with the data after the import through the web service?
- Nothing, because the web service is only for the e-search facility
- Imported data is read-only and will not be stored in FoodCASE
- FoodCASE should allow imported data to be stored
Question 20 - General
There is a trade-off between information density and clarity on a GUI. What do you prefer?
Thank you very much for your participation