BCOR 1020 Business Statistics


1 BCOR 1020 Business Statistics
Lecture 3 – January 24, 2008

2 Overview
Chapter 3 – Describing Data Visually:
- Visual Description
- Dot Plots
- Frequency Distributions and Histograms
- Simple Line Charts & Bar Charts
- Scatter Plots
- Tables
- Pie Charts
- Maps and Pictograms
- Deceptive Graphs

3 Chapter 3 – Visual Description
Methods of organizing, exploring and summarizing data include:
- Visual (charts and graphs) – provides insight into characteristics of a data set without using mathematics.
- Numerical (statistics or tables) – provides insight into characteristics of a data set using mathematics.

4 Chapter 3 – Visual Description
Beginning with univariate data (a set of n observations on one variable), consider the following:
- Measurement – What are the units of measurement? Are the data integer or continuous? Any missing observations? Any concerns with accuracy or sampling methods?
- Central Tendency – Where are the data values concentrated? What seem to be typical or middle data values?
- Dispersion – How much variation is there in the data? How spread out are the data values? Are there unusual values?
- Shape – Are the data values distributed symmetrically? Skewed? Sharply peaked? Flat? Bimodal?

5 Chapter 3 – Visual Description
Example: Price/Earnings Ratios: P/E ratios are current stock price divided by earnings per share in the last 12 months. For example:

6 Chapter 3 – Visual Description
Measurement – Look at the data and visualize how it was collected and measured.
Sorting – Sort the data and then summarize it in a graphical display. Sorting allows you to observe central tendency, dispersion and shape, as well as the minimum, maximum and range.
Here are the sorted P/E ratios:
8 10 13 14 15 16 17 18 19 20 21 22 23 26 27 29 34 48 55 68

7 Chapter 3 – Dot Plots
A dot plot is the simplest graphical display of n individual values of numerical data.
- Easy to understand.
- Not good for large samples (e.g., > 5,000).
Steps in Making a Dot Plot:
- Make a scale that covers the data range.
- Mark the axes and label them.
- Plot each data value as a dot above the scale at its approximate location. If more than one data value lies at about the same axis location, the dots are piled up vertically.
* Figure 3.4 in your text details the MegaStat menus for creating a dot plot.
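The steps for making a dot plot can be sketched in plain text. The following is a minimal Python illustration, not part of the lecture or of MegaStat: the function name `dot_plot` is hypothetical, and the sample values are the P/E ratios listed earlier in the lecture. It draws one column per integer position on the scale and piles repeated values vertically.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one column per integer on the scale, dots piled vertically."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    tallest = max(counts.values())
    rows = []
    # Build rows from the top of the tallest pile down to the scale.
    for level in range(tallest, 0, -1):
        row = "".join("*" if counts.get(x, 0) >= level else " "
                      for x in range(lo, hi + 1))
        rows.append(row.rstrip())
    width = hi - lo + 1
    rows.append("-" * width)                                  # the scale itself
    gap = width - len(str(lo)) - len(str(hi))
    rows.append(f"{lo}{' ' * gap}{hi}")                       # labels at both ends
    return "\n".join(rows)

# P/E ratios as listed in the lecture transcript.
pe_ratios = [8, 10, 13, 14, 15, 16, 17, 18, 19, 20,
             21, 22, 23, 26, 27, 29, 34, 48, 55, 68]
plot = dot_plot(pe_ratios)
print(plot)
```

Even in this crude form, the clustering of dots toward the left of the scale and the few isolated dots on the right are visible.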

8 Chapter 3 – Dot Plots
Range of data shows dispersion. Clustering shows central tendency. Dot plots do not tell much about the shape of the distribution. Annotations (text boxes) can be added to call attention to specific features.

9 Chapter 3 – Frequency Distributions and Histograms
Bins and Bin Limits:
- A frequency distribution is a table formed by classifying n data values into k classes (bins).
- Bin limits define the values to be included in each bin. Widths must all be the same.
- Frequencies are the number of observations within each bin.
- Frequencies can also be expressed as relative frequencies (frequency divided by the total) or percentages (relative frequency times 100).

10 Chapter 3 – Frequency Distributions and Histograms
Constructing a Frequency Distribution:
- Sort the data in ascending order (e.g., P/E ratios).
- Choose the number of bins (k). k should be much smaller than n. Too many bins result in sparsely populated bins; with too few, dissimilar data values are lumped together.
- Herbert Sturges proposed the following rule: k = 1 + log2(n)
Sample Size (n)   Number of Bins (k)
16                5
32                6
64                7
128               8
256               9
512               10
1024              11
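Sturges' rule is easy to check numerically. The sketch below is an illustration only; the function name `sturges_bins` is hypothetical, and rounding up to the next whole number when n is not a power of 2 is an assumption, since the slide only tabulates powers of 2.

```python
import math

def sturges_bins(n):
    """Sturges' rule k = 1 + log2(n), rounded up to a whole number of bins (assumed rounding)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(1 + math.log2(n))

# Reproduce the sample sizes tabulated on the slide.
for n in (16, 32, 64, 128, 256, 512, 1024):
    print(n, sturges_bins(n))
```

The rule is a starting point; the lecture itself goes on to pick a bin count that gives convenient limits.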

11 Chapter 3 – Frequency Distributions and Histograms
Constructing a Frequency Distribution:
- Set the bin limits: Bin width ≈ (x_max − x_min) / k
- In our example, we will use k = 7 bins to get convenient bin limits. The approximate bin width is: Bin width ≈ (68 − 8) / 7 ≈ 8.57
- To obtain “nice” limits, we round the width to 10 and start the first bin at 0 to get bin limits: 0, 10, 20, 30, 40, 50, 60, 70
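The bin-limit calculation can be sketched as follows. This assumes the usual definition of approximate bin width, (maximum − minimum) / k, and takes the minimum (8) and maximum (68) from the sorted P/E ratios shown earlier; the function name `bin_limits` and its parameters are hypothetical.

```python
def bin_limits(x_min, x_max, k, nice_width=None, start=None):
    """Return the raw bin width and the k + 1 bin limits.

    nice_width and start let the analyst override the raw width and the
    first limit with rounder numbers, as the lecture does.
    """
    raw_width = (x_max - x_min) / k
    width = nice_width if nice_width is not None else raw_width
    first = start if start is not None else x_min
    limits = [first + i * width for i in range(k + 1)]
    return raw_width, limits

# P/E example: k = 7, width rounded to 10, first bin starting at 0.
raw_width, limits = bin_limits(8, 68, 7, nice_width=10, start=0)
print(round(raw_width, 2))
print(limits)
```

Rounding the width up rather than down matters: the rounded bins must still cover the whole data range.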

12 Chapter 3 – Frequency Distributions and Histograms
Constructing a Frequency Distribution:
- Put the data values in the appropriate bin. In general, the lower limit is included in the bin while the upper limit is excluded.
- Create the table. You can include:
  - Frequencies – counts for each bin.
  - Relative frequencies – absolute frequency divided by the total number of data values.
  - Cumulative frequencies – accumulated relative frequency values as bin limits increase.
Example: Back to the P/E ratio data…
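The convention that the lower limit is included and the upper limit is excluded can be made concrete with a short sketch. The function names are hypothetical, and the data are only the P/E values listed in this transcript, so the counts printed here need not match a table built from the full data set used in class.

```python
def bin_index(x, start, width, k):
    """Index of the bin [start + i*width, start + (i+1)*width) that contains x."""
    i = int((x - start) // width)
    if not 0 <= i < k:
        raise ValueError(f"{x} falls outside the bins")
    return i

def frequencies(values, start, width, k):
    """Count how many values land in each of the k equal-width bins."""
    counts = [0] * k
    for x in values:
        counts[bin_index(x, start, width, k)] += 1
    return counts

# P/E values as listed in the transcript (assumed to be illustrative only).
listed = [8, 10, 13, 14, 15, 16, 17, 18, 19, 20,
          21, 22, 23, 26, 27, 29, 34, 48, 55, 68]
counts = frequencies(listed, start=0, width=10, k=7)
print(counts)
```

Note that a value of exactly 10 goes in the 10-to-20 bin, not the 0-to-10 bin, and a value of exactly 20 goes in the 20-to-30 bin.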

13 Chapter 3 – Frequency Distributions and Histograms
What are the bin limits for the P/E ratio data?
Bin Range             Frequency   Relative Frequency   Cumulative Relative Frequency
0 ≤ P/E Ratio < 10    1           0.0333               0.0333
10 ≤ P/E Ratio < 20   15          0.5000               0.5333
20 ≤ P/E Ratio < 30   10          0.3333               0.8666
30 ≤ P/E Ratio < 40   1           0.0333               0.8999
40 ≤ P/E Ratio < 50   1           0.0333               0.9332
50 ≤ P/E Ratio < 60   1           0.0333               0.9665
60 ≤ P/E Ratio < 70   1           0.0333               0.9998
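Relative and cumulative relative frequencies follow directly from the counts. In the sketch below, the first three frequencies (1, 15, 10) are taken from the slide; the frequency of 1 for each of the last four bins is an assumption inferred from the steps of 0.0333 in the cumulative column.

```python
from itertools import accumulate

# Frequencies per bin; the last four values are inferred, not read from the slide.
freqs = [1, 15, 10, 1, 1, 1, 1]
n = sum(freqs)

relative = [f / n for f in freqs]            # frequency divided by the total
cumulative = list(accumulate(relative))      # running total of relative frequencies

for f, r, c in zip(freqs, relative, cumulative):
    print(f"{f:>3}  {r:.4f}  {c:.4f}")
```

Computed without intermediate rounding, the final cumulative value is 1; a table that accumulates relative frequencies already rounded to four decimals ends at 0.9998 instead.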

14 Chapter 3 – Frequency Distributions and Histograms
A histogram is a graphical representation of a frequency distribution. A histogram is a bar chart. X-axis ticks show the end points of each bin. Y-axis shows the frequency (or relative/cumulative frequency) within each bin. Consider 3 histograms for the P/E ratio data with different bin widths (k = 4, k = 7, k = 13). Do they give you different impressions of the data?
* Figures 3.8 & 3.9 in your text detail the MegaStat menus for creating a histogram.

15 Chapter 3 – Frequency Distributions and Histograms
Modal Class – a histogram bar that is higher than those on either side: Monomodal – a single modal class. Bimodal – two modal classes. Multimodal – more than two modal classes. Caution: Modal classes may be artifacts of the way bin limits are chosen.
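The definition of a modal class translates into a simple neighbour comparison. The sketch below is illustrative: the function names are hypothetical, the frequencies are made up, and treating an end bar as modal when it is higher than its single neighbour is an assumption the slide does not address.

```python
def modal_classes(freqs):
    """Indices of bars that are strictly higher than the bars on either side."""
    modes = []
    for i, f in enumerate(freqs):
        left = freqs[i - 1] if i > 0 else None
        right = freqs[i + 1] if i < len(freqs) - 1 else None
        # End bars have only one neighbour to beat (assumed convention).
        if all(f > nb for nb in (left, right) if nb is not None):
            modes.append(i)
    return modes

def describe(freqs):
    """Label a histogram by its number of modal classes."""
    count = len(modal_classes(freqs))
    if count == 1:
        return "monomodal"
    if count == 2:
        return "bimodal"
    return "multimodal" if count > 2 else "no modal class"

# Hypothetical bin frequencies.
print(modal_classes([2, 5, 3, 6, 1]), describe([2, 5, 3, 6, 1]))
print(modal_classes([1, 4, 9, 4, 1]), describe([1, 4, 9, 4, 1]))
```

Re-binning the same data can change the result, which is the caution the slide raises about artifacts of bin limits.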

16 Chapter 3 – Frequency Distributions and Histograms
Shape: A histogram suggests the shape of the population. Skewness – indicated by the direction of the longer tail of the histogram. Left-skewed – (negatively skewed) a longer left tail. Right-skewed – (positively skewed) a longer right tail. Symmetric – both tail areas approximately the same. Some examples…

17 Chapter 3 – Frequency Distributions and Histograms

18 Clickers
Consider the histogram of the P/E ratio data that was displayed earlier in this lecture. How would you describe the skewness of this histogram?
A = symmetric
B = left-skewed
C = right-skewed

19 Chapter 3 – Simple Line Charts
Simple Line Charts – Used to display a time series or spot trends, or to compare time periods. Can display several variables at once.

20 Chapter 3 – Simple Line Charts
Two-scale line chart – used to compare variables that differ in magnitude or are measured in different units. Grid Lines – A line graph usually has no vertical grid lines. Horizontal lines can be added to make it easier to establish the y value. Which is easier to read?

21 Chapter 3 – Simple Line Charts
Log Scales:
- Arithmetic scale – distances on the Y-axis are proportional to the magnitude of the variable being displayed.
- Logarithmic scale – (ratio scale) equal distances represent equal ratios.
- Use a log scale for the vertical axis when data vary over a wide range, say, by more than an order of magnitude. This will reveal more detail for small data values.
- A log scale is only suited for positive data values.
- A log scale reveals whether the quantity is growing at an increasing percent (concave upward), constant percent (straight line), or declining percent (concave downward).
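The claim that constant percentage growth plots as a straight line on a log scale can be checked numerically. In this sketch the series (a quantity starting at 100 and growing 10% per period) is hypothetical, as is the function name `log_steps`.

```python
import math

def log_steps(series):
    """Differences between successive log10 values: the vertical steps on a log scale."""
    logs = [math.log10(v) for v in series]   # raises ValueError for non-positive values
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical series growing at a constant 10% per period.
constant_growth = [100 * 1.10 ** t for t in range(6)]
steps = log_steps(constant_growth)
print([round(s, 6) for s in steps])
```

Every step equals log10(1.10), so the points lie on a straight line. The `ValueError` raised for zero or negative inputs reflects the slide's point that a log scale suits only positive data.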

22 Example…
Consider the following graphs illustrating U.S. Trade from 1959 to
What does the log scale graph tell you about the growth rate for both series?
(Panels: Arithmetic scale, Log scale)

23 Chapter 3 – Simple Line Charts
When to Use Log Scales: Useful for…
- time series data that might be expected to grow at a compound annual percentage rate (e.g., GDP, national debt, future income)
- financial charts that cover long periods of time, or data that grow rapidly (e.g., revenues)

24 Chapter 3 – Simple Line Charts
Tips for Effective Line Charts:
- Line charts are used for time series data (never for cross-sectional data).
- Y-axis shows the numerical variable while X-axis shows time units, with time increasing left to right.
- Use a zero origin on the Y-axis unless more detail is needed.
- Omit numerical labels on a line chart to avoid clutter. Use gridlines if needed.
- Use data markers (squares, triangles, circles) if they don’t clutter the graph.
- Don’t make lines too thick.

25 Chapter 3 – Bar Charts
Plain Bar Charts – Most common way to display attribute data. Bars represent categories or attributes. Lengths of bars represent frequencies.

26 Chapter 3 – Bar Charts Pareto Charts – Special type of bar chart used in quality management to display the frequency of defects or errors of different types. Categories are displayed in descending order of frequency. Focus on significant few (i.e., few categories that account for most defects or errors).
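The ordering behind a Pareto chart can be sketched in a few lines. The defect categories and counts below are invented for illustration, and `pareto_table` is a hypothetical helper name; the cumulative percentage column is a common companion to the bars, not something the slide specifies.

```python
def pareto_table(defect_counts):
    """Sort categories by descending frequency and attach a cumulative percentage."""
    total = sum(defect_counts.values())
    ordered = sorted(defect_counts.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0
    for category, count in ordered:
        running += count
        rows.append((category, count, 100 * running / total))
    return rows

# Hypothetical defect tallies.
defects = {"scratches": 12, "misaligned label": 45, "dents": 8,
           "wrong color": 30, "other": 5}
rows = pareto_table(defects)
for category, count, cum_pct in rows:
    print(f"{category:<18}{count:>4}{cum_pct:>8.1f}%")
```

In this made-up example the first two categories account for 75% of all defects, which is the "significant few" the chart is meant to expose.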

27 Chapter 3 – Bar Charts
Stacked Bar Chart – Bar height is the sum of several subtotals. Areas may be compared by color to show patterns in the subgroups and total.

28 Chapter 3 – Bar Charts
Bar Charts for Time Series Data – Bar charts can be (and often are) used for time series data, although it may be harder to compare trends.

29 Chapter 3 – Bar Charts
Tips for Effective Bar Charts:
- Show the numerical variable of interest with vertical bars on the Y-axis, category labels on the X-axis.
- For time series quantities, display the category labels on the horizontal X-axis with time increasing from left to right.
- The height or length of each bar should be proportional to the quantity displayed.
- Put numerical values at the top of each bar, unless this makes the chart too cluttered.

30 Chapter 3 – Scatter Plots
Example: Aircraft Fuel Consumption: Consider five observations on flight time and fuel consumption for a twin-engine Piper Cheyenne aircraft.
Trip Leg   Flight Time (hours)   Fuel Used (pounds)
1          2.3                   145
2          4.2                   258
3          3.6                   219
4          4.7                   276
5          4.9                   283
A causal relationship is assumed since a longer flight would consume more fuel.

31 Chapter 3 – Scatter Plots
Example: Aircraft Fuel Consumption: Here is the scatter plot with flight time on the X-axis and fuel use on the Y-axis. Is there an association between variables? * Figure 3.31 in your text details the Excel menus for creating a scatter plot.

32 Chapter 3 – Scatter Plots
Degree of Association/Correlation: Very strong association Strong association Moderate association Little or no association
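One common numerical companion to a scatter plot is the Pearson correlation coefficient, which this lecture does not define; it is used here only as an assumed way to put a number on "degree of association". The sketch applies it to the five aircraft fuel observations given earlier; the function name `pearson_r` is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Flight time (hours) and fuel used (pounds) for the five trip legs.
flight_hours = [2.3, 4.2, 3.6, 4.7, 4.9]
fuel_pounds = [145, 258, 219, 276, 283]
r = pearson_r(flight_hours, fuel_pounds)
print(round(r, 3))
```

A value close to +1 corresponds to the "very strong association" end of the scale, and a value near 0 to "little or no association".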

33 Clickers
Consider the scatter plot (below) comparing birthrates and life expectancies in several countries. True or False: This graph shows a strong association between these two variables.
A = True
B = False

34 Chapter 3 – Tables
Tables are the simplest form of data display. A compound table is a table that contains time series data down the columns and variables across the rows.
Example: School Expenditures
- Arrangement of data is in rows and columns to enhance meaning.
- The data can be viewed by focusing on the time pattern (down the columns) or by comparing the variables (across the rows).
- Units of measure are stated in the footnote.
- Note merged headings to group columns.
                     Elementary and Secondary      Colleges and Universities
Year   All Schools   Total    Public   Private     Total    Public   Private
1960   142.2         99.6     93.0     6.6         42.6     23.3     19.3
1970   317.3         200.2    188.6    11.6        117.2    75.2     41.9
1980   373.6         232.7    216.4    16.2        140.9    93.4     47.4
1990   526.1         318.5    293.4    25.1        207.6    132.9    74.7
2000   691.9         418.2    387.8    30.3        273.8    168.8    105.0
Source: U.S. Census Bureau, Statistical Abstract of the United States: 2002, p
Note: All figures are in billions of constant 2000/2001 dollars.

35 Chapter 3 – Tables
Tips for Effective Tables:
- Keep the table simple, consistent with its purpose.
- Summary tables go in the main body. Detailed tables go in an appendix.
- In a slide show, the main point of a table should be clear within 10 seconds; otherwise, break up the table.
- Display the data to be compared in columns.
- Round off data to 3 or 4 significant figures.
- Table layout should guide the eye towards the desired comparison.
- Use spaces or shading to separate rows or columns. Use lines sparingly.
- Keep row and column headings simple yet descriptive.
- Use a consistent number of decimal digits within a column.
- Right-justify or decimal align the data.

36 Chapter 3 – Pie Charts
An Oft-Abused Chart:
- A pie chart can only convey a general idea of the data.
- Pie charts should be used to portray data which sum to a total (e.g., percent market shares).
- If frequency counts are important, use a bar chart or histogram.
- A pie chart should only have a few (i.e., 2 or 3) slices.
- Each slice should be labeled with data values or percents.

37 Chapter 3 – Maps and Pictograms
Spatial Variation and GIS: Maps can be used for displaying many kinds of data. Appropriate when patterns of variation across space are of interest. Self-explanatory and revealing. Assess patterns based on geography. GIS (geographic information systems) combines statistics, geography and graphics.

38 Chapter 3 – Maps and Pictograms
Example: U.S. population change by county, 1990/2000

39 Chapter 3 – Maps and Pictograms
Example: U.S. presidential election results, 2004 On election night 2004 and in the months and years since then, we have seen many maps that look like this. The amount of red on the map is skewed because there are a lot of large states (geographically) in which a majority voted Republican. One possible way to allow for this, suggested by Robert Vanderbei at Princeton University, is to use not just two colors on the map, red and blue, but instead to use red, blue, and shades of purple to indicate percentages of voters. Here is what the normal map looks like if you do this. Source:

40 Chapter 3 – Maps and Pictograms
Example: U.S. presidential election results, 2004 We can also correct for this by making use of a cartogram, a map in which the sizes of states have been rescaled according to their population. That is, states are drawn with a size proportional not to their sheer topographic acreage -- which has little to do with politics -- but to the number of their inhabitants, states with more people appearing larger than states with fewer, regardless of their actual area on the ground. Source:

41 Chapter 3 – Maps and Pictograms
Pictograms: A pictogram is a visual display in which data values are replaced by pictures. Although entertaining, pictograms can create visual distortion. What do you think?

42 Chapter 3 – Deceptive Graphs
Error 1: Nonzero Origin A nonzero origin will exaggerate the trend.
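The exaggeration caused by a nonzero origin can be quantified by comparing drawn bar heights. The values below (95 and 100, with the axis cut at 90) are hypothetical, as is the function name `apparent_ratio`.

```python
def apparent_ratio(a, b, origin=0.0):
    """Ratio of the drawn heights of values a and b when the axis starts at `origin`."""
    if min(a, b) <= origin:
        raise ValueError("origin must lie below both values")
    return (b - origin) / (a - origin)

true_ratio = apparent_ratio(95, 100)               # axis starts at zero
truncated = apparent_ratio(95, 100, origin=90)     # axis starts at 90

print(round(true_ratio, 3), truncated)
```

With a zero origin the second bar is drawn about 5% taller than the first; with the axis starting at 90 it is drawn twice as tall, although the data are unchanged.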

43 Chapter 3 – Deceptive Graphs
Error 2: Elastic Graph Proportions Keep the aspect ratio (width/height) below 2.00 so as not to exaggerate the graph. By default, Excel uses an aspect ratio of 1.8.

44 Chapter 3 – Deceptive Graphs
Error 3: Dramatic Title – Keep titles short; a title written mainly to grab the reader's attention can slant the message.
Error 4: Distracting Pictures – Avoid them so as not to distract readers or impart an emotional slant.
Error 5: Authority Figures – Pictures of authority figures can be used to impart credibility to self-serving commercial claims.

45 Chapter 3 – Deceptive Graphs
Error 6: 3-D and Rotated Graphs Can make trends appear to dwindle into the distance or loom towards you. Deceptive Correct

46 Chapter 3 – Deceptive Graphs
Error 7: Missing Axis Demarcations – If tick marks are missing, you cannot identify individual data values.
Error 8: Missing Measurement Units or Definitions – Missing or unclear units of measurement can render a chart useless.
Error 9: Vague Source – May indicate lost citation, unknown source, or mixed data sources. Use complete source citations.

47 Chapter 3 – Deceptive Graphs
Error 10: Complex Graphs Avoid if possible. Keep your main objective in mind. If necessary, break graph into smaller parts.

48 Chapter 3 – Deceptive Graphs
Error 11: Gratuitous Effects – Avoid too many annoying special effects when using slide shows.
Error 12: Estimated Data – Estimated points should be noted when used, or avoided if possible.

49 Chapter 3 – Deceptive Graphs
Error 13: Area Trick As figure height increases, so does width, distorting the area.
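The arithmetic behind the area trick is short: if a picture's width is scaled along with its height, its area grows with the square of the height ratio. The sketch below assumes the picture keeps its proportions; `area_ratio` is a hypothetical name.

```python
def area_ratio(height_ratio):
    """Area ratio of two pictures when width is scaled by the same factor as height."""
    return height_ratio ** 2

# A value that is 2x or 3x larger is drawn with 4x or 9x the area.
for h in (1.5, 2, 3):
    print(h, area_ratio(h))
```

A reader who judges by area therefore sees a far larger difference than the data contain.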

50 Clickers
Consider the graph given below. What error is present that makes this a deceptive graph?
A = Non-Zero Origin
B = Dramatic Title
C = 3-D or Rotated
D = Complex Graph

51 Chapter 3 – Appendix: Effective Excel Charts

52 Use the mouse to select (highlight) the data you want to plot.

53 Effective Excel Charts
Chart Wizard Click on the Chart Wizard icon on the toolbar to open a sequence of pop-up menus to guide you through the steps of creating a chart. Step 1: Select the Chart type and then click Next.

54 Effective Excel Charts
Chart Wizard Step 2: Add labels for years on the X-axis by selecting a data range (B4:B13). Click Next.

55 Effective Excel Charts
Chart Wizard Step 3: Embellish the chart by adding a title, axis labels, adjusting the gridlines or appending a data table to the graph by clicking on the appropriate tab.

56 Effective Excel Charts
Chart Wizard Step 4: Click Next to display the finished chart.

57 Effective Excel Charts
Embellished Charts Charts created in Excel can be edited to: - Improve the titles (main, X-axis, Y-axis). - Change the axis scales (minimum, maximum, demarcations). - Display the data values (on the top of each bar).

58 Effective Excel Charts
Embellished Charts Charts created in Excel can be edited to: - Add a data table underneath the graph.

59 Effective Excel Charts
Embellished Charts Charts created in Excel can be edited to: - Change color or patterns in the plot or chart areas.

60 Effective Excel Charts
Embellished Charts Charts created in Excel can be edited to: - Format the decimals (on the axes or data labels). - Edit the gridlines (color, dotted or solid, patterns).

61 Effective Excel Charts
Embellished Charts Charts created in Excel can be edited to: - Alter the appearance of the bars (color, pattern, gap width).

62 Effective Excel Charts
Embellished Charts To alter a chart’s appearance, click on any chart object and then right-click to see a menu of properties that you can change. For example, right-click on the Y-axis scale and choose Format Axis.

63 Effective Excel Charts
Embellished Charts Be careful about over-embellishing your charts. Embellished bar chart Over-embellished chart

64 Effective Excel Charts
Embellished Charts Excel offers many other types of specialized charts. Area (mountain) chart Multiple bar chart

65 Effective Excel Charts
Embellished Charts Other specialized Excel charts: - Bubble chart displays three variables on a 2-dimensional scatter plot. - Note: bubble size is proportional to third variable. Data from

66 Effective Excel Charts
Embellished Charts Other specialized Excel charts: - Stock chart for high/low/close stock prices. Data from

67 Effective Excel Charts
Embellished Charts Other specialized Excel charts: - Radar (or Spider) chart compares individual performance against a benchmark. - Caution, data may be distorted by emphasized areas.

68 Effective Excel Charts
Embellished Charts Other specialized Excel charts: - Use floating bar charts to show a range of data.

