Describing and Organizing Data PSYSTA1 – Weeks 3-4
RAW Data the data set in its original form, with the values recorded in the sequence in which they were collected
PRESENTING and SUMMARIZING Data
Tables, Graphs, Numerical Measures
oftentimes accompanied by textual presentations
TABLE systematic organization of data in rows and columns
TABLE GUIDELINES:
The title should be concise and not a complete sentence.
Column labels should be precise.
The units of measurement must be clearly stated.
Show any relevant totals, subtotals, and percentages.
Indicate if data were taken from another publication by including a source note.
Tables should be self-explanatory, although they may be accompanied by text.
FREQUENCY DISTRIBUTION
presents the score values and their frequency of occurrence
an organized presentation of the number of individuals located in each category on the scale of measurement
presents a picture of how the individual scores are distributed on the measurement scale
may be structured either as a table or as a graph
In either case, the following elements are presented:
the set of categories that make up the original measurement scale
a record of the frequency, or number of individuals, in each category
FREQUENCY DISTRIBUTION Table Two (2) Types: Single Value Grouping usually for qualitative or (limited) discrete quantitative data
Some DEFINITIONS
Relative Frequency: indicates the proportion of the total number of items that occurs in each interval or category
$\text{relative frequency} = \dfrac{\text{frequency}}{\text{total number of observations}}$
Percentage: indicates the relative frequency presented in parts per 100
$\text{percentage} = \text{relative frequency} \times 100\%$
Example 1 Fifty first year college students were asked about their favorite electronic gadget: iPad, PlayStation Portable or PSP, iPhone, camera, iPod. Below is a list of their responses. Construct a frequency distribution table for this data set.
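A minimal Python sketch of a single-value frequency distribution table is given here. The list of responses from the slide is not reproduced in this text, so the `responses` list is made up purely for illustration; only the gadget categories come from the example. The function name `frequency_table` is also just an illustrative choice. Percentages are computed as 100 × frequency / n, which is the same quantity as relative frequency × 100%.

```python
from collections import Counter

# Hypothetical responses (NOT the actual data from Example 1)
responses = ["iPad", "iPhone", "PSP", "iPhone", "camera",
             "iPod", "iPhone", "iPad", "iPod", "iPhone"]

def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, percentage)."""
    n = len(values)
    counts = Counter(values)
    rows = []
    for category, freq in counts.most_common():
        rel_freq = freq / n        # frequency / total number of observations
        percent = 100 * freq / n   # relative frequency expressed in parts per 100
        rows.append((category, freq, rel_freq, percent))
    return rows

table = frequency_table(responses)
for category, freq, rel_freq, percent in table:
    print(f"{category:<8}{freq:>4}{rel_freq:>8.2f}{percent:>8.1f}%")
print(f"{'Total':<8}{len(responses):>4}")
```

The frequencies should add up to the number of observations, and the percentages to 100%, which is a quick check on any table built by hand.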
(2-way) CONTINGENCY Table
a statistical table that shows the observed frequencies or proportions of data elements classified according to two variables, with the rows indicating one variable and the columns indicating the other variable
a table showing the contingency between two variables, where the variables have been classified into mutually exclusive categories and the cell entries are frequencies
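As a rough illustration of the definition, the sketch below cross-classifies observations by two variables and prints the cell frequencies with row and column totals. The two variables (year level and favorite gadget) and all of the observations are hypothetical, chosen only to show the layout.

```python
from collections import Counter

# Hypothetical observations: (year level, favorite gadget) for each student
observations = [
    ("first year", "iPad"), ("first year", "iPhone"), ("second year", "iPhone"),
    ("second year", "iPod"), ("first year", "iPhone"), ("second year", "iPad"),
    ("first year", "iPod"), ("second year", "iPhone"),
]

def contingency_table(pairs):
    """Count how many observations fall in each (row, column) cell."""
    cells = Counter(pairs)
    row_labels = sorted({r for r, _ in pairs})
    col_labels = sorted({c for _, c in pairs})
    return row_labels, col_labels, cells

rows, cols, cells = contingency_table(observations)

# Header row, one line per row category, then the column totals
print(f"{'':<14}" + "".join(f"{c:>8}" for c in cols) + f"{'Total':>8}")
for r in rows:
    counts = [cells[(r, c)] for c in cols]
    print(f"{r:<14}" + "".join(f"{n:>8}" for n in counts) + f"{sum(counts):>8}")
col_totals = [sum(cells[(r, c)] for r in rows) for c in cols]
print(f"{'Total':<14}" + "".join(f"{n:>8}" for n in col_totals)
      + f"{len(observations):>8}")
```

Because the categories of each variable are mutually exclusive, every observation lands in exactly one cell, so the cell frequencies sum to the total number of observations.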
FREQUENCY DISTRIBUTION Table Two (2) Types: Grouping by Class Interval usually for quantitative data
Some DEFINITIONS
Class Mark/Midpoint: midpoint (average) of the upper and the lower real limits (boundaries) of each interval
$CM = \dfrac{\text{lower limit} + \text{upper limit}}{2}$
Cumulative Frequency: indicates the number of scores that fall below the upper real limit (boundary) of each interval
Cumulative Percentage: indicates the percentage of scores that fall below the upper real limit of each interval
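The definitions above can be sketched in a few lines of Python. The grouped data below are hypothetical, and the sketch assumes integer scores, so each real limit lies half a unit beyond the stated (apparent) limit.

```python
# Hypothetical grouped data: (lower apparent limit, upper apparent limit, frequency)
intervals = [(10, 14, 3), (15, 19, 5), (20, 24, 2)]
total = sum(freq for _, _, freq in intervals)

table = []
running = 0
for lower, upper, freq in intervals:
    # Assumption: integer scores, so real limits extend 0.5 beyond the apparent limits
    real_lower, real_upper = lower - 0.5, upper + 0.5
    class_mark = (real_lower + real_upper) / 2   # CM = (lower limit + upper limit) / 2
    running += freq                              # scores below the upper real limit
    cum_percent = 100 * running / total          # the same count as a percentage
    table.append((lower, upper, freq, class_mark, running, cum_percent))

for lower, upper, freq, class_mark, cum_freq, cum_percent in table:
    print(f"{lower}-{upper}  f={freq}  CM={class_mark}  cf={cum_freq}  c%={cum_percent:.1f}")
```

The last interval always has a cumulative frequency equal to the total number of scores and a cumulative percentage of 100.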
FREQUENCY DISTRIBUTION Table
STEPS: (Grouping by Class Interval)
1. Find the range (max – min) of the scores.
2. Determine the width of each class interval, $w$: $w = \dfrac{\text{range}}{\text{number of classes}}$
3. List the limits of each class interval, placing the interval containing the lowest score value at the bottom.
4. Tally the raw scores into the appropriate class intervals.
5. Add the tallies for each interval to obtain the interval frequency.
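The steps above can be sketched in Python using the 40 rat weights from the exercise in this section. Two choices here are assumptions rather than rules stated on the slides: the width is rounded up to a whole number, and the lowest interval starts at the minimum score. The function name `grouped_frequency_table` is illustrative only.

```python
import math

# Rat weights (grams) from the exercise in this section
weights = [320, 282, 341, 324, 340, 302, 336, 265, 310, 335,
           353, 318, 296, 309, 308, 310, 314, 298, 315, 360,
           275, 315, 297, 330, 250, 274, 318, 287, 284, 267,
           292, 348, 270, 263, 269, 292, 298, 343, 284, 352]

def grouped_frequency_table(scores, n_classes):
    """Return (lower limit, upper limit, frequency) rows, lowest interval first."""
    lowest, highest = min(scores), max(scores)
    score_range = highest - lowest                        # step 1: range = max - min
    # Step 2: w = range / number of classes, rounded up (assumption)
    width = max(1, math.ceil(score_range / n_classes))
    rows = []
    lower = lowest              # assumption: lowest interval starts at the minimum
    while lower <= highest:                               # step 3: list the limits
        upper = lower + width - 1                         # integer scores assumed
        freq = sum(lower <= s <= upper for s in scores)   # steps 4-5: tally and add
        rows.append((lower, upper, freq))
        lower += width
    return rows

table = grouped_frequency_table(weights, n_classes=10)

# Print with the interval containing the lowest score at the bottom
for lower, upper, freq in reversed(table):
    print(f"{lower}-{upper}: {freq}")
```

With these assumptions the range is 110, the width is 11, and the table has 11 intervals, which is consistent with the "approximately 10" asked for. A different starting point or a rounder width would give a slightly different but equally valid table.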
Example 2 Given the following 90 scores, construct a frequency distribution of grouped scores having approximately 12 intervals.
EXERCISE The psychology department of a large university maintains its own vivarium of rats for research purposes. A recent sampling of 40 rats from the vivarium revealed the following rat weights (grams): 320 282 341 324 340 302 336 265 310 335 353 318 296 309 308 310 314 298 315 360 275 315 297 330 250 274 318 287 284 267 292 348 270 263 269 292 298 343 284 352 Construct a frequency distribution of grouped scores with approximately 10 intervals.
GRAPH a device for showing numerical values or relationships in pictorial form
GRAPH
Advantages:
the main features and implications of the data can be easily grasped
can attract attention and hold the reader’s interest
simplifies concepts that would otherwise have been expressed in many words
can readily clarify data, frequently bringing out hidden facts and relationships
Qualities of a GOOD Graph
Accurate: should not be deceptive, distorted, or misleading
Simple: should be straightforward, not loaded with irrelevant or trivial symbols and ornamentation
Clear: should be easily read and understood
Appearance: should attract and hold attention
LINE Graph useful for showing trends over a period of time
PIE Chart a circle is divided into sectors in such a way that the area of each sector is proportional to the size of the quantity represented by that sector
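Since the area of a sector is proportional to its central angle, each sector's angle can be found as 360° times the category's relative frequency. The sketch below shows this arithmetic; the frequencies are hypothetical and are not the results of Example 1.

```python
# Hypothetical frequencies for five categories (made up for illustration)
frequencies = {"iPad": 10, "PSP": 5, "iPhone": 20, "camera": 5, "iPod": 10}
total = sum(frequencies.values())

# Central angle of each sector: 360 degrees x relative frequency
angles = {category: 360 * freq / total for category, freq in frequencies.items()}

for category, angle in angles.items():
    percent = 100 * frequencies[category] / total
    print(f"{category:<8}{percent:>6.1f}%{angle:>8.1f} degrees")
```

The angles should add up to 360°, just as the percentages add up to 100%.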
PICTOGRAPH pictures or symbols are used to represent a certain quantity or volume
MAP Chart displays data by shading sections of a map; it must include a key, and the total number of data values should also be stated
DOT Plot a plot of points along a single axis (usually the horizontal axis) that represents a countable scale; when data values repeat, the dots are placed above one another, forming a pile at that numerical location
BAR Graph
consists of a series of rectangular bars, where the length of each bar represents the magnitude
the bars for each category do not touch each other, emphasizing the lack of a continuous relationship between the categories
Example 3 Construct a bar graph for the “electronic gadget data” presented in Example 1.
HISTOGRAM
class limits (boundaries) are represented by the width of the bars, and the frequencies that fall within the classes are represented by the height of the bars
bars are drawn adjacent to one another to emphasize the continuity of the underlying measurement
FREQUENCY Polygon points are plotted over the midpoint of each interval at a height corresponding to the frequency of the interval and are joined with straight lines; the line is then extended to meet the horizontal axis at the midpoints of the two class intervals falling immediately beyond the end class intervals containing scores (hence, a polygon)
OGIVE (CUMULATIVE Curve) a graph showing the curve of a cumulative distribution (either frequencies or percentages), in which each plotted point pairs the upper class limit (boundary) of an interval with the corresponding cumulative frequency or percentage
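The points to plot for a frequency polygon and an ogive can be computed directly from a grouped frequency table, as in the sketch below. The grouped data are hypothetical, integer scores are assumed (so real limits lie 0.5 beyond the stated limits), and starting the ogive at zero on the lower real limit of the first interval is a common convention assumed here rather than something the slides state. The function names are illustrative.

```python
# Hypothetical grouped data: (lower apparent limit, upper apparent limit, frequency)
intervals = [(10, 14, 3), (15, 19, 5), (20, 24, 2)]
width = intervals[0][1] - intervals[0][0] + 1   # interval width (5 here)

def polygon_points(intervals, width):
    """(midpoint, frequency) pairs, closed with zero-frequency points at both ends."""
    mids = [((lo + hi) / 2, freq) for lo, hi, freq in intervals]
    first_mid, last_mid = mids[0][0], mids[-1][0]
    # Midpoints of the empty intervals just beyond the first and last intervals
    return [(first_mid - width, 0)] + mids + [(last_mid + width, 0)]

def ogive_points(intervals):
    """(upper real limit, cumulative frequency) pairs, starting from zero."""
    points = [(intervals[0][0] - 0.5, 0)]   # assumption: start at the lower real limit
    running = 0
    for lo, hi, freq in intervals:
        running += freq
        points.append((hi + 0.5, running))
    return points

polygon = polygon_points(intervals, width)
ogive = ogive_points(intervals)
print("Frequency polygon:", polygon)
print("Ogive:", ogive)
```

Joining the polygon points with straight lines gives a closed figure resting on the horizontal axis, while the ogive never decreases and ends at the total number of scores.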
Example 4 Construct a histogram, a frequency polygon, and an ogive for the given data in Example 2. Be guided by the constructed FDT.
STEM-and-LEAF Display a device for presenting quantitative data in a graphical format by splitting each data value into a “stem” and a “leaf”
STEM-and-LEAF Display
STEPS:
1. Divide each measurement into two parts: the stem and the leaf.
2. List the stems in a column, with a vertical line to their right.
3. For each measurement, record the leaf portion in the same row as its corresponding stem.
4. Order the leaves from lowest to highest in each stem.
5. Provide the key to your stem-and-leaf coding so that the reader can recreate the actual measurements if necessary.
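The steps above can be sketched in Python with the rat weights from the exercise in this section, using a stem unit of 10 and a leaf unit of 1 so that 250 becomes stem 25 and leaf 0. The function name `stem_and_leaf` is illustrative, and the sketch assumes non-negative integer data.

```python
from collections import defaultdict

# Rat weights (grams) from the exercise in this section
weights = [320, 282, 341, 324, 340, 302, 336, 265, 310, 335,
           353, 318, 296, 309, 308, 310, 314, 298, 315, 360,
           275, 315, 297, 330, 250, 274, 318, 287, 284, 267,
           292, 348, 270, 263, 269, 292, 298, 343, 284, 352]

def stem_and_leaf(values):
    """Map each stem (tens and above) to its ordered list of leaves (units digit)."""
    display = defaultdict(list)
    for value in sorted(values):          # sorting first keeps leaves in order
        stem, leaf = divmod(value, 10)    # step 1: split into stem and leaf
        display[stem].append(leaf)        # step 3: record leaf beside its stem
    return dict(display)

display = stem_and_leaf(weights)

# Step 2: list every stem in a column, including any stem with no leaves
for stem in range(min(display), max(display) + 1):
    leaves = " ".join(str(leaf) for leaf in display.get(stem, []))
    print(f"{stem:>3} | {leaves}")
print("Key: 25 | 0 means 250 grams")      # step 5: key to the coding
```

Every original measurement can be read back from the display, so the number of leaves equals the number of observations.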
Example 5 Construct a stem-and-leaf display (with stem unit = 10 and leaf unit = 1) for the data given in Example 2.
Some Notes on STEM-and-LEAF Display
first developed in 1977 by John Tukey, working at Princeton University
a simple alternative to the histogram, most useful for summarizing and describing data when the data set is small
does not lose any of the original data (unlike the histogram)
rotating the stem-and-leaf display 90° counterclockwise, so that the stems are at the bottom, results in a diagram very similar to a histogram
EXERCISE The psychology department of a large university maintains its own vivarium of rats for research purposes. A recent sampling of 40 rats from the vivarium revealed the following rat weights (grams): 320 282 341 324 340 302 336 265 310 335 353 318 296 309 308 310 314 298 315 360 275 315 297 330 250 274 318 287 284 267 292 348 270 263 269 292 298 343 284 352 Construct a histogram, a frequency polygon, and an ogive for the given data. (*Superimpose all three in one graphing plane.)