DOT DIAGRAM (DOT PLOT)
A dot plot is a graphical summary of data. A horizontal axis shows the range of values for the observations. Each data value is represented by a dot placed above the axis. Dot plots show the details of the data and are useful for comparing the distributions of data for two or more variables.
The data shown below are the numbers of case files for 30 “law and consulting firms” in Ankara in […]. Construct a dot plot of the given data.
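The construction described above can be sketched as a small text-based plot. This is only an illustration: the function name `dot_plot` and the data values are hypothetical (they are not the 30 firms' case-file counts), and the sketch assumes non-negative integer observations.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one column per value, one 'o' per observation."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    axis = list(range(lo, hi + 1))      # horizontal axis covers the whole range
    width = len(str(hi)) + 1            # column width so the labels do not touch
    rows = []
    # Build the rows from the tallest stack of dots down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("o" if counts[v] >= level else " ").rjust(width) for v in axis)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(width) for v in axis))
    return "\n".join(rows)

# Hypothetical data, for illustration only.
case_files = [3, 5, 4, 5, 7, 5, 4, 6, 3, 5]
print(dot_plot(case_files))
```

Each observation contributes exactly one dot, so the height of a column is the frequency of that value.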
THE STEM-AND-LEAF DISPLAY
This technique can be used to show both the rank order and the shape of a data set simultaneously. To develop a stem-and-leaf display, we first arrange the leading digits of each data value to the left of a vertical line. To the right of the vertical line, we record the last digit of each data value as we pass through the observations in the order they were recorded. The numbers to the left of the vertical line form the STEM. Each digit to the right of the vertical line is a LEAF.
Advantages:
1- It is easier to construct by hand.
2- Since it shows the actual data, this display provides more information than a histogram.
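A minimal sketch of the procedure above, assuming non-negative integer data where the last digit is the leaf and the remaining leading digits are the stem. The function name `stem_and_leaf` and the sample values are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Build a stem-and-leaf display, keeping leaves in recording order."""
    stems = defaultdict(list)
    for v in values:
        stems[v // 10].append(v % 10)   # leading digits -> stem, last digit -> leaf
    lines = []
    # Include every stem between the smallest and largest, even if it has no leaves.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(d) for d in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return "\n".join(lines)

# Hypothetical data, for illustration only.
scores = [72, 69, 97, 73, 92, 76, 86, 73, 68, 85]
print(stem_and_leaf(scores))
```

Sorting the leaves on each line afterwards would put the display in rank order.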
FREQUENCY DISTRIBUTION FOR QUALITATIVE DATA
FREQUENCY DISTRIBUTION FOR QUANTITATIVE DATA
With quantitative data, we must be more careful in defining the non-overlapping classes to be used in the freq. distribution. There are three steps necessary to define classes for a freq. distribution with quantitative data.
1- Determine the number of non-overlapping classes.
2- Determine the width of each class.
3- Determine the class limits.
1- NUMBER OF CLASSES: Classes are formed by specifying ranges that will be used to group the data. As a general guideline, we recommend using between 5 and 20 classes. For a small number of data items, as few as five or six classes may be used to summarize the data.
Sample Size      Number of Classes
Fewer than 50    5 – 6 Classes
50 to 100        7 – 8 Classes
Over 100         9 – 10 Classes
3- CLASS LIMITS: Class limits must be chosen so that each item belongs to one and only one class. The lower class limit identifies the smallest possible data value assigned to the class. The upper class limit identifies the largest possible data value assigned to the class.
CLASS MIDPOINT (M): The class midpoint is the value halfway between the lower and upper class limits.
The definitions of the Relative Freq. and Percent Freq. Distributions are the same as for qualitative data.
Cumulative Distributions: A Cumulative Distribution is another tabular summary of quantitative data. It uses the number of classes, class widths and class limits developed for the frequency distribution. A Cumulative Distribution shows the number of data items with values “less than or equal to the upper class limit” of each class. Similarly, a cumulative relative frequency distribution shows the proportion of data items, and a cumulative percent frequency distribution shows the percentage of data items, with values less than or equal to the upper class limit of each class.
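The class limits, midpoints, and cumulative frequencies described above can be tabulated with a short sketch. Several things here are assumptions: the data are integers, the class width is taken as the number of integer values in the range divided by the number of classes and rounded up (a rule not stated in the text), and the function name `frequency_table` and the data are hypothetical.

```python
import math

def frequency_table(data, num_classes):
    """Tabulate limits, midpoint, frequency, relative and cumulative frequency."""
    lo, hi = min(data), max(data)
    # Assumed width rule: cover every integer from lo to hi with num_classes classes.
    width = math.ceil((hi - lo + 1) / num_classes)
    rows = []
    cumulative = 0
    for i in range(num_classes):
        lower = lo + i * width
        upper = lower + width - 1          # integer limits, so classes do not overlap
        freq = sum(lower <= x <= upper for x in data)
        cumulative += freq                 # items less than or equal to this upper limit
        rows.append({
            "lower": lower,
            "upper": upper,
            "midpoint": (lower + upper) / 2,
            "freq": freq,
            "rel_freq": freq / len(data),
            "cum_freq": cumulative,
        })
    return rows

# Hypothetical data, for illustration only.
data = [10, 12, 13, 15, 15, 16, 18, 19, 21, 22,
        22, 24, 25, 27, 29, 30, 31, 33, 34, 34]
for row in frequency_table(data, 5):
    print(row)
```

The last cumulative frequency always equals the number of data items, which is a quick check that every item fell into exactly one class.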
In the year 2000, the average salaries of elementary school teachers in Oregon, Washington and Alaska were (in thousands of USD) 40.9, 41.1 and […]. Given that there were 19.7, 28.1 and 5.3 thousand teachers in these states, find the average salary of all the elementary school teachers in the three states.
A wholesaler sold 575, 410 and 520 microwave ovens at prices (in USD) 75, 125 and 100 respectively. What is the mean price of the ovens sold?
If somebody invests TL 1,000 at 7%, TL 3,000 at 7.5% and TL 16,000 at 8%, what is the overall percentage yield of these investments?
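The microwave oven and investment exercises are both weighted means: each value is multiplied by its weight (units sold, amount invested), and the sum is divided by the total weight. A minimal sketch follows; the function name `weighted_mean` is a hypothetical helper, not something defined in these notes.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Ovens: prices in USD weighted by the number of units sold at each price.
oven_price = weighted_mean([75, 125, 100], [575, 410, 520])

# Investments: yields in percent weighted by the amount invested (TL).
overall_yield = weighted_mean([7.0, 7.5, 8.0], [1000, 3000, 16000])

print(round(oven_price, 2))   # 97.26
print(overall_yield)          # 7.875
```

Note that the overall yield is close to 8% because most of the money is invested at that rate; the unweighted mean of the three rates, 7.5%, would understate it.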
The numbers of employees of the municipalities in a certain area of a country are given in the following table. Find the mean and the modal class.
#    Frequency
Eighty randomly selected light bulbs were tested to determine their lifetimes (in hours). This frequency distribution was obtained. Find the mean and the modal class.
Class Boundaries    Frequency
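For grouped data such as these, the mean is usually approximated by treating every item in a class as if it sat at the class midpoint, and the modal class is the class with the largest frequency. The sketch below assumes that approach; the function names and the class boundaries and frequencies are hypothetical, since the tables for these exercises are not reproduced here.

```python
def grouped_mean(classes):
    """Approximate mean of grouped data; classes is a list of (lower, upper, frequency)."""
    total = sum(f for _, _, f in classes)
    # Each class contributes frequency * midpoint to the sum.
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / total

def modal_class(classes):
    """Return the (lower, upper) boundaries of the class with the highest frequency.

    If several classes tie, the first one in the list is returned.
    """
    lo, hi, _ = max(classes, key=lambda c: c[2])
    return (lo, hi)

# Hypothetical grouped data, for illustration only.
table = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
print(grouped_mean(table))   # 16.0
print(modal_class(table))    # (10, 20)
```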
THE MODE
- If a data set has only one value that occurs with the greatest frequency, it is said to be UNIMODAL.
- If a data set has two values that occur with the same greatest frequency, it is said to be BIMODAL.
- If a data set has more than two values that occur with the same greatest frequency, each value is used as a mode, and the data set is said to be MULTIMODAL.
- When no data value occurs more than once, the data set is said to have NO MODE.
Example: The following data represent the duration (in days) of Space Shuttle voyages for the years […] (18 values):
8, 9, 9, 14, 8, 8, 10, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11
Q: Find the mode.
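One way to check the answer is to count how often each value occurs and keep the values with the greatest count. The sketch below follows the definitions of the mode given above, returning an empty list when no value repeats; the function name `modes` is a hypothetical helper and the data set is assumed to be non-empty.

```python
from collections import Counter

def modes(data):
    """Return all values that occur with the greatest frequency (empty if none repeat)."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []                 # no value occurs more than once: no mode
    return sorted(v for v, c in counts.items() if c == top)

durations = [8, 9, 9, 14, 8, 8, 10, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11]
print(modes(durations))           # [8]
```

The value 8 occurs five times, more than any other value, so this data set is unimodal.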
MONTHLY STARTING SALARY (In TRL)
Graduate    Monthly Starting Salary
1           2,850
2           2,950
3           3,050
4           2,880
5           2,755
6           2,710
7           2,890
8           3,130
9           2,…
10          …
11          …
12          …,880
TOTAL:      35,280
A customer in a supermarket selected 6 cartons of eggs (each containing a dozen) from a large display. The egg-filled cartons weighed 25.9, 27.8, 25.8, 26.1, 23.5, and 45.4 ounces respectively. a-) Find the mean weight of these cartons. b-) Find the median weight of these cartons. c-) Is the mean a good average in this exercise?
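As a quick check of parts (a) and (b), the mean and median can be computed with Python's standard `statistics` module. The variable name `egg_weights` is just a label chosen for this sketch.

```python
from statistics import mean, median

egg_weights = [25.9, 27.8, 25.8, 26.1, 23.5, 45.4]   # ounces, from the exercise

mean_weight = mean(egg_weights)
median_weight = median(egg_weights)   # average of the two middle values when sorted

print(round(mean_weight, 2))     # 29.08
print(round(median_weight, 2))   # 26.0

# How many cartons weigh less than the mean?
below_mean = sum(w < mean_weight for w in egg_weights)
print(below_mean)                # 5
```

Five of the six cartons weigh less than the mean, because the single 45.4-ounce carton pulls the mean upward. This is relevant to part (c).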
Five lightbulbs burned out after lasting, respectively, for 867, 849, 840, 852, and 822 hours of continuous use. Find the mean and also determine what the mean would have been if the second value had been recorded incorrectly as 489 instead of 849.
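The effect of the recording error can be seen by computing the mean for both versions of the data. The median is included only for contrast; the exercise itself does not ask for it, and the variable names are hypothetical.

```python
from statistics import mean, median

recorded = [867, 849, 840, 852, 822]      # hours, as given
misrecorded = [867, 489, 840, 852, 822]   # second value entered as 489 instead of 849

mean_ok = mean(recorded)
mean_bad = mean(misrecorded)
median_ok = median(recorded)
median_bad = median(misrecorded)

print(mean_ok, mean_bad)       # the mean drops by 360 / 5 = 72 hours
print(median_ok, median_bad)   # the median moves far less
```

A single mistyped value moves the mean from 846 to 774 hours, while the median only moves from 849 to 840 hours.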
Properties of the Mean
1- The mean can be calculated for any set of NUMERICAL data.
2- The mean is a unique and unambiguous value.
3- The means of several sets of data can always be combined into the overall mean of all the data.
4- If each value in a sample were replaced by the mean, then ∑X would remain unchanged.
5- The mean takes into account the value of each item in a set of data.
6- The mean is relatively reliable in the sense that means of many samples drawn from the same population generally do not fluctuate, or vary, as widely as other statistics used to estimate the mean of a population.
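Properties 3 and 4 can be verified numerically. In the sketch below the two samples are hypothetical; the combined mean is computed by weighting each sample mean by its sample size.

```python
from statistics import mean

# Two hypothetical samples, for illustration only.
a = [2, 4, 6]
b = [10, 20, 30, 40, 50]

# Property 3: combine the sample means, weighting each by its sample size.
combined = (len(a) * mean(a) + len(b) * mean(b)) / (len(a) + len(b))
overall = mean(a + b)             # mean of all the data pooled together
print(combined, overall)          # 20.25 20.25

# Property 4: replacing every value by the mean leaves the sum unchanged.
replaced = [mean(a)] * len(a)
print(sum(replaced), sum(a))      # 12 12
```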
Properties of the Median
In addition to the properties of the mean:
1- It splits the ordered data into two parts of equal size.
2- The median is preferable to the mean when the data contain extreme values, because it is not so easily affected by them.
Problem: The following list gives the duration in minutes of 24 power failures. Find the median.