Organizing & Reporting Data: An Intro


1 Organizing & Reporting Data: An Intro
Statistical analysis works with data sets
  → A collection of data values on some variables recorded on a number of cases (records)
  → For example, the student data from last week:

2 Organizing & Reporting Data (cont.):
Structure of most data sets = “rectangular”
  → Columns = Variables
  → Rows = Cases
  → Cells = individual values
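As a minimal sketch of the rectangular layout, the data set can be held as a list of rows with a separate list of variable names. The variable names and values here are hypothetical, invented only for illustration.

```python
# Hypothetical rectangular data set: variable names plus one row per case.
variables = ["id", "age", "gpa"]
rows = [
    [1, 19, 3.2],
    [2, 22, 2.8],
    [3, 20, 3.6],
]

# A row is one case: every recorded value for the second case.
case_2 = rows[1]

# A column is one variable: the "age" value of every case.
age_col = variables.index("age")
ages = [row[age_col] for row in rows]

# A cell is an individual value: the gpa of the third case.
cell = rows[2][variables.index("gpa")]

print(case_2, ages, cell)
```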

3 Managing Data: Basic Tasks
NOTE: Reliance on Codebook for Data Set
  – Specifies information about variables in the data set
  – Indicates Variable Names & Labels
  – Indicates Variable Values (codes) & Value Labels
  – Indicates “missing values”
Can Modify Overall Arrangement of Data Set
  – Sorting → change the order of the cases in the file
  – Selecting → identify a subset of cases to work on
  – Transforming → modify the values of a variable
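The three arrangement tasks can be sketched in plain Python. The cases, variable names, and the age cutoff of 18 are assumptions made for this example, not part of the original data set.

```python
# Hypothetical mini data set: each dict is one case, each key a variable.
cases = [
    {"id": 1, "age": 34, "sex": 1},
    {"id": 2, "age": 17, "sex": 2},
    {"id": 3, "age": 52, "sex": 2},
    {"id": 4, "age": 25, "sex": 1},
]

# Sorting: change the order of the cases (here, ascending age).
sorted_cases = sorted(cases, key=lambda c: c["age"])

# Selecting: identify a subset of cases to work on (here, age 18 or over).
adults = [c for c in cases if c["age"] >= 18]

# Transforming: modify the values of a variable (age in years -> months),
# building new rows so the original data set is left untouched.
in_months = [{**c, "age": c["age"] * 12} for c in cases]

print([c["id"] for c in sorted_cases])
print([c["id"] for c in adults])
print([c["age"] for c in in_months])
```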

4 Organizing & Reporting Data (cont.):
Where do the data values come from?
a) Raw Data: recorded from responses, records, or observations
  – In their (more-or-less) original form
  – Some coding (or editing) operations usually involved
  – Usually coded into numerical values (for ease of use)
b) Transformed Data: modified from original values
  – Computed values (e.g., rates, %, sums, “imputations”)
  – Recoded values (into more correct or meaningful or useful values)
c) Created Data: values are “made up”
  – Simulated values
  – Demonstration values

5 Managing Data: Basic Tasks
Transforming Data: Variable Transformations
a) Computing new variables from prior ones
     Index = Q1 + Q2 + Q3 + Q4
     Utility = probability * outcome
b) Recode Variable by changing its values
     Change missing values (“blanks”) to “0”
c) Recode Variable into a New Variable
     Age (yrs) → Child (1-11); Juvenile (12-17); Adult (18-over)
     Age (yrs) → 10-19 yrs; 20-29 yrs; 30-39 yrs; 40-49 yrs; 50-59 yrs; 60-69 yrs; 70-79 yrs; 80-89 yrs; 90-99 yrs.
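The three transformations above can be sketched as follows. The responses are hypothetical, `None` is assumed to stand for a blank, and `recode_age` is an illustrative helper that treats any age of 11 or under as "Child".

```python
# Hypothetical survey responses; None stands for a missing value ("blank").
cases = [
    {"q1": 3, "q2": 4, "q3": None, "q4": 2, "age": 9},
    {"q1": 5, "q2": 5, "q3": 4, "q4": 5, "age": 15},
    {"q1": 1, "q2": None, "q3": 2, "q4": 3, "age": 42},
]

def recode_age(age):
    """Recode age in years into Child / Juvenile / Adult."""
    if age <= 11:
        return "Child"
    if age <= 17:
        return "Juvenile"
    return "Adult"

for case in cases:
    # (b) Recode a variable by changing its values: blanks become 0.
    for q in ("q1", "q2", "q3", "q4"):
        if case[q] is None:
            case[q] = 0
    # (a) Compute a new variable from prior ones: Index = Q1 + Q2 + Q3 + Q4.
    case["index"] = case["q1"] + case["q2"] + case["q3"] + case["q4"]
    # (c) Recode into a new variable, leaving the original age as is.
    case["age_group"] = recode_age(case["age"])

print([(c["index"], c["age_group"]) for c in cases])
```

Recoding blanks to 0 lowers the index for cases with missing answers, so whether that recode is appropriate depends on what a blank means in the codebook.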

6 Computed Data: Some Useful Forms
Rates – numbers divided by populations
Ratios – one number divided by another
Indexes – new variable = a sum (or other combination) of multiple prior variables
Rescaled Data – a raw score modified by some mathematical function (e.g., logarithm)
Standardized scores – rescaled to standard units → e.g., Z-scores
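A short sketch of several of these computed forms, using made-up numbers. Expressing the rate per 1,000 people and using the population standard deviation for the Z-scores are choices made for this example; a sample standard deviation would give slightly different Z-scores.

```python
import math
import statistics

# Hypothetical event counts and populations for three areas.
events = [50, 120, 30]
population = [10_000, 40_000, 5_000]

# Rate: a number divided by a population (here per 1,000 people).
rates = [1000 * e / p for e, p in zip(events, population)]

# Ratio: one number divided by another.
ratio = events[1] / events[0]

# Rescaled data: raw scores passed through a function (base-10 logarithm).
raw = [1, 10, 100, 1000]
logged = [math.log10(x) for x in raw]

# Standardized scores: distance from the mean in standard-deviation units.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # population SD, an assumption of this sketch
z_scores = [(x - mean) / sd for x in scores]

print(rates, ratio, logged, z_scores)
```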

7 Recoded Data: Some Useful Forms
Collapsed (& abbreviated) scores
Grouped scores – recoding a numeric variable into a discrete (numeric or ordinal) variable
  – Uniform (or fixed-width) groupings → widths of groups are all the same [Note the standard rules for forming grouped variables]
  – Non-uniform (variable or flexible) groupings → widths of groups are not all the same
  – Normed groupings → grouped by proportions of cases → e.g., percentiles, quartiles, median-splits [a special form of non-uniform grouping]
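The difference between uniform and normed groupings can be sketched as below. The ages are hypothetical, `fixed_width_group` and `quartile_group` are illustrative helpers, and the quartile cut points come from `statistics.quantiles` with its default method (Python 3.8 or later); other quantile definitions can place the cuts slightly differently.

```python
import statistics

# Hypothetical ages, already sorted for readability.
ages = [12, 15, 23, 27, 31, 38, 44, 45, 52, 67, 71, 88]

def fixed_width_group(value, width=10):
    """Label the fixed-width interval containing value, e.g. 23 -> '20-29'."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"

# Uniform grouping: every group spans the same width (10 years).
uniform = [fixed_width_group(a) for a in ages]

# Normed grouping: cut points chosen so each group holds about the same
# share of cases; n=4 returns the three quartile cut points.
cuts = statistics.quantiles(ages, n=4)

def quartile_group(value):
    """Return 1..4 according to how many cut points the value exceeds."""
    return 1 + sum(value > c for c in cuts)

quartiles = [quartile_group(a) for a in ages]

print(uniform)
print(cuts, quartiles)
```

With these values the fixed-width groups hold unequal numbers of cases, while each quartile group holds three cases but covers a different range of ages.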

8 How to recode variables in SPSS?
Use the Transform option on the top menu bar to change the data (see Appendix B in Kirkpatrick/Feeney for details)
Compute → allows for computing a new variable from prior variables
Recode → allows for modifying how a variable is coded
a) ‘Into same variables’ (change original variable)
b) ‘Into different variables’ (create new variable with different codes & leave original variable as is)

9 Representing Data Distributions:
In statistics, we are working with a collection of many data points
  → Our focus is on the distribution of the whole set of points
Three forms of presentation for summarizing distributions of data points:
1. Tabular → tables and lists of numbers
2. Graphical → pictures, shapes, and lines (in charts, graphs, and diagrams)
3. Verbal → words and phrases

10 Tabular Presentations: Basic Formats
1) Data Listing: simple inventory of points in the data set
2) Ordered Data Listing: inventory of data sorted into groups or arranged in increasing or decreasing order
3) Frequency Table: summary showing each value and the number of cases having that value (most relevant for discrete variables)
4) Percentage Table: table with percentages of total cases given rather than (or in addition to) numerical counts
5) Cumulative Percentage Table: table reporting, for each value, the percentage of total cases that have that value or lower
6) Cross-Tab Table: a “bivariate” frequency distribution of the values of one variable across the values of another variable
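Formats 3 to 5 can be built from the same counts. The sketch below uses hypothetical scores on a 1-5 scale and prints a combined frequency, percentage, and cumulative percentage table.

```python
from collections import Counter

# Hypothetical responses on a 1-5 scale.
scores = [3, 1, 4, 3, 5, 3, 2, 4, 3, 1]

counts = Counter(scores)  # frequency of each distinct value
n = len(scores)           # total number of cases

# One row per value: (value, frequency, percent, cumulative percent).
table = []
cumulative = 0
for value in sorted(counts):
    freq = counts[value]
    cumulative += freq
    table.append((value, freq, 100 * freq / n, 100 * cumulative / n))

print(f"{'Value':>5} {'Freq':>5} {'Pct':>6} {'Cum Pct':>8}")
for value, freq, pct, cum_pct in table:
    print(f"{value:>5} {freq:>5} {pct:>6.1f} {cum_pct:>8.1f}")
```

The cumulative column only makes sense when the values have a meaningful order, which is why the rows are sorted before accumulating.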

11 Cross-Tabulations (cont.)
What are the parts of a cross-tab?
a) Cells
b) Rows and columns
c) Marginals
d) Grand total
How to set up a cross-tab?
a) Which variables are in the rows and columns?
b) Use Percentages or Frequencies?
c) How to percentage a cross-tab?
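The parts of a cross-tab can be computed directly from the paired values. In this sketch the cases are hypothetical, and the table is percentaged within rows on the assumption that the row variable is the one treated as independent, which is one common convention rather than the only option.

```python
from collections import Counter

# Hypothetical cases: (row variable, column variable) pairs.
cases = [
    ("F", "agree"), ("F", "agree"), ("F", "disagree"),
    ("M", "agree"), ("M", "disagree"), ("M", "disagree"),
    ("F", "agree"), ("M", "disagree"),
]

cells = Counter(cases)                     # cell frequencies
row_totals = Counter(r for r, _ in cases)  # row marginals
col_totals = Counter(c for _, c in cases)  # column marginals
grand_total = len(cases)                   # grand total

# Row percentages: each cell as a share of its row marginal.
row_pct = {
    (r, c): 100 * cells[(r, c)] / row_totals[r]
    for r in row_totals
    for c in col_totals
}

for r in sorted(row_totals):
    line = "  ".join(f"{c}: {cells[(r, c)]} ({row_pct[(r, c)]:.0f}%)"
                     for c in sorted(col_totals))
    print(f"{r}  {line}  total: {row_totals[r]}")
print(f"column totals: {dict(col_totals)}  grand total: {grand_total}")
```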

12 Representing Distributions Graphically: Basic Formats
Pie Charts
Bar Charts
  – Vertical or Horizontal
  – Simple or Grouped
  – Stacked
Histograms
Line Charts
  – Frequency polygons
  – Time (Trend) plots
  – Relationship plots

13 Representing Distributions Graphically: Basic Formats
Other Charts (to be dealt with later):
  – Box Plots (aka “Box-and-Whiskers”)
  – Scatter Plots

