STATISTICS
Lecturer Dr. Veronika Alhanaqtah
Topic 1 Introduction to data
1. Introductory concepts and vocabulary
Four basic concepts: data set, unit of observation, variable, variable type.
Data Set
Data is typically organized in a data table or data matrix (a data set) made up of rows and columns. Rows in a data set are the units on which information is gathered. Columns are the variables, that is, the pieces of information gathered about each unit.
Example: Dataset on movies

Name          Genre  Budget  Studio  Audience
50/50         C      8       Ind     93
Warrior       A      25      LG
Harry Potter  F      125     WB
The help      D              DW      91
Money ball           50      Col     89

Legend: C - Comedy, A - Action, F - Fantasy, D - Documentary
Source: www.informationisbeautiful.net
Unit of observation
Each row of the movie table describes one movie, so the movie is our unit of analysis (unit of observation).
Variable
Each column of the movie table is a variable: a characteristic that varies from movie to movie.
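The three ideas so far (data set, unit of observation, variable) can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the names `columns`, `rows`, and `movies` are chosen here for the example, and `None` is an assumed way of marking the cells that are blank in the movie table.

```python
# Column names of the movie data set: each one is a variable.
columns = ["Name", "Genre", "Budget", "Studio", "Audience"]

# Each tuple is one row, i.e. one unit of observation (one movie).
# None marks a cell that is blank in the table on the slide.
rows = [
    ("50/50", "C", 8, "Ind", 93),
    ("Warrior", "A", 25, "LG", None),
    ("Harry Potter", "F", 125, "WB", None),
    ("The help", "D", None, "DW", 91),
    ("Money ball", None, 50, "Col", 89),
]

# The data set as a list of records: one dict per unit of observation.
movies = [dict(zip(columns, row)) for row in rows]

# A row holds everything recorded about a single movie.
first_movie = movies[0]

# A column holds one variable, observed across all movies.
budgets = [movie["Budget"] for movie in movies]

print(len(movies), "units of observation,", len(columns), "variables")
print(first_movie)
print(budgets)
```

Reading across a record gives one unit of observation; reading the same key down all records gives one variable.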
Variable Type
Categorical variables (qualitative, nominal) don't have numeric values. Numeric variables (quantitative) are those whose values we could put on a number line and perform mathematical operations on. In the movie data set, Genre and Studio are categorical, while Budget and Audience are numeric.
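The distinction between categorical and numeric variables can be sketched as a rough check on the stored values. The helper `variable_type` below is hypothetical and deliberately simple: it assumes blank cells are stored as `None`, and it would misclassify a categorical variable that happens to be coded with numbers (for example, a postal code), so treat it as an illustration rather than a general rule.

```python
# Columns of the movie data set from the slides; None marks a blank cell.
data = {
    "Name": ["50/50", "Warrior", "Harry Potter", "The help", "Money ball"],
    "Genre": ["C", "A", "F", "D", None],
    "Budget": [8, 25, 125, None, 50],
    "Studio": ["Ind", "LG", "WB", "DW", "Col"],
    "Audience": [93, None, None, 91, 89],
}


def variable_type(values):
    """Hypothetical helper: guess a variable's type from its stored values."""
    observed = [v for v in values if v is not None]
    all_numbers = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in observed
    )
    # A column with no observed values is treated as categorical here.
    return "numeric" if observed and all_numbers else "categorical"


types = {name: variable_type(values) for name, values in data.items()}
print(types)

# Arithmetic is meaningful for a numeric variable, e.g. the mean budget
# over the movies for which a budget is recorded.
known_budgets = [b for b in data["Budget"] if b is not None]
mean_budget = sum(known_budgets) / len(known_budgets)
print(mean_budget)
```

Computing a mean makes sense for Budget but not for Genre, which is the practical reason for checking the variable type before choosing a statistical method.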
Data and Story Library (DASL)
DASL is an online library of data files and stories that illustrate the use of basic statistical methods with real-world examples. http://lib.stat.cmu.edu
Data files are data sets. Stories are abstracts that discuss the statistical concept illustrated by a particular data file.
Information Is Beautiful online library
Information Is Beautiful is the website of the British data journalist and information designer David McCandless. The team behind the site is dedicated to distilling the world's data, information and knowledge into beautiful, interesting and, above all, useful visualizations, infographics and diagrams. http://www.informationisbeautiful.net
Homework
Visit the instructor's website: www.alveronika.wordpress.com, section "Statistics".
Open DASL (online library of data files): www.lib.stat.cmu.edu. Choose "DASL".
Look through the data sets and find a couple that you enjoy.
Open up the data and identify the data set, the unit of observation, the variables and the variable types (categorical or numeric).