1 SESSION 11 & 12: Introduction to Statistics (Last Update 3rd March 2011)

2 Lecturer: Florian Boehlandt
University: University of Stellenbosch Business School
Domain: http://www.hedge-fund-analysis.net/pages/vega.php

3 Learning Objectives
1. (Cumulative relative) frequency tables revisited
2. Catalogue of graphical representations at your disposal
3. Polygons and ogives: differentiation
Use this presentation as a guide. The contents are all relevant to your examination unless specified to the contrary!

4 Raw Data
1. Determine the number of class intervals. Sample size n = 25. Sturges' formula:
   # of intervals = 1 + 1.4 * LN(n)
   # of intervals = 1 + 1.4 * LN(25)
   # of intervals = 5.506 ≈ 6 (round up to the nearest integer)
2. Find the maximum and minimum observations:
   Maximum = 22.5
   Minimum = -6.8

Obs  Investment A
1    -4.4
2    5.8
3    10.4
4    1.1
5    -5.3
6    0.1
7    11.9
8    9.5
9    22.5
10   -2.3
11   -4.7
12   -6.8
13   2.5
14   1.4
15   5.5
16   7.3
17   4.9
18   13
19   -2.2
20   16.3
21   5.8
22   15.4
23   6.2
24   2.7
25   13.1
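As a minimal sketch, steps 1 and 2 can be reproduced in Python. The constant 1.4 is taken from the slide's version of Sturges' formula; the variable names are illustrative and not part of the original material.

```python
import math

# The 25 observations of Investment A, copied from the slide.
returns = [-4.4, 5.8, 10.4, 1.1, -5.3, 0.1, 11.9, 9.5, 22.5, -2.3,
           -4.7, -6.8, 2.5, 1.4, 5.5, 7.3, 4.9, 13, -2.2, 16.3,
           5.8, 15.4, 6.2, 2.7, 13.1]

n = len(returns)

# Step 1: the slide's form of Sturges' formula, rounded up.
raw_intervals = 1 + 1.4 * math.log(n)      # math.log is the natural log (LN)
num_intervals = math.ceil(raw_intervals)

# Step 2: extremes of the sample.
maximum = max(returns)
minimum = min(returns)

print(f"n = {n}, raw = {raw_intervals:.3f}, intervals = {num_intervals}")
print(f"max = {maximum}, min = {minimum}")
```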

5 Raw Data
3. Calculate the class width:
   Class width = (Max – Min) / # of intervals
   Class width = (22.5 – (-6.8)) / 6
   Class width = 4.883 ≈ 5 (round up to the nearest integer)
4. Determine the next lower integer value from the minimum. This is the starting value for the first class interval:
   Minimum = -6.8 ≈ -7

Obs  Investment A
1    -4.4
2    5.8
3    10.4
4    1.1
5    -5.3
6    0.1
7    11.9
8    9.5
9    22.5
10   -2.3
11   -4.7
12   -6.8
13   2.5
14   1.4
15   5.5
16   7.3
17   4.9
18   13
19   -2.2
20   16.3
21   5.8
22   15.4
23   6.2
24   2.7
25   13.1
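Steps 3 and 4 amount to one division and two roundings. The sketch below assumes the maximum, minimum and interval count found earlier; the variable names are illustrative.

```python
import math

maximum, minimum = 22.5, -6.8   # from the raw data
num_intervals = 6               # from Sturges' formula, rounded up

# Step 3: class width, rounded up to the nearest integer.
raw_width = (maximum - minimum) / num_intervals
class_width = math.ceil(raw_width)

# Step 4: next lower integer below the minimum is the first lower bound.
start = math.floor(minimum)

print(f"raw width = {raw_width:.3f}, class width = {class_width}, start = {start}")
```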

6 Class Intervals
5. Start with the lowest (integer) value = -7. Add the class width to calculate the upper bound. The combination of lower and upper bound gives the class interval (don't forget the inequality to avoid overlaps). Continue in the same fashion until all required class intervals (here 6) are defined.

Lower  Upper  Class Interval  Calculation
-7     -2     -7 to < -2      -7 + 5 = -2
-2     3      -2 to < 3       -2 + 5 = 3
3      8      3 to < 8        3 + 5 = 8
8      13     8 to < 13       8 + 5 = 13
13     18     13 to < 18      13 + 5 = 18
18     23     18 to < 23      18 + 5 = 23
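The interval construction is a simple loop: each lower bound is the previous upper bound. A minimal sketch, using the starting value, width and count from the preceding steps (names are illustrative):

```python
start, class_width, num_intervals = -7, 5, 6

# Each interval is (lower, upper) with the meaning "lower to < upper".
intervals = [
    (start + i * class_width, start + (i + 1) * class_width)
    for i in range(num_intervals)
]

for lower, upper in intervals:
    print(f"{lower} to < {upper}")
```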

7 Midpoints
5. Calculate the midpoints of the class intervals:
   Midpoint = (Upper Bound + Lower Bound) / 2

Lower  Upper  Class Interval  Midpoint  Calculation
-7     -2     -7 to < -2      -4.5      (-2 + (-7)) / 2
-2     3      -2 to < 3       0.5       (3 + (-2)) / 2
3      8      3 to < 8        5.5       (8 + 3) / 2
8      13     8 to < 13       10.5      (13 + 8) / 2
13     18     13 to < 18      15.5      (18 + 13) / 2
18     23     18 to < 23      20.5      (23 + 18) / 2
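The midpoint formula applied to all six intervals, as a short sketch (the list of bounds is copied from the class-interval table; names are illustrative):

```python
intervals = [(-7, -2), (-2, 3), (3, 8), (8, 13), (13, 18), (18, 23)]

# Midpoint = (upper bound + lower bound) / 2
midpoints = [(upper + lower) / 2 for lower, upper in intervals]

print(midpoints)
```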

8 Tally
6. Sort all (return) observations into the class intervals (or bins). You may use a designated tally column to do so manually or use the FREQUENCY function in Excel (the results are integer values).

Lower  Upper  Class Interval  Midpoint  Tally
-7     -2     -7 to < -2      -4.5      ||||||
-2     3      -2 to < 3       0.5       |||||
3      8      3 to < 8        5.5       ||||||
8      13     8 to < 13       10.5      ||||
13     18     13 to < 18      15.5      |||
18     23     18 to < 23      20.5      |

9 Observed Frequencies
7. Convert the tally column to observed frequencies:

Lower  Upper  Class Interval  Midpoint  Tally       Frequency
-7     -2     -7 to < -2      -4.5      |||||| →    6
-2     3      -2 to < 3       0.5       |||||  →    5
3      8      3 to < 8        5.5       |||||| →    6
8      13     8 to < 13       10.5      ||||   →    4
13     18     13 to < 18      15.5      |||    →    3
18     23     18 to < 23      20.5      |      →    1
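The counting step can be sketched in Python. One detail is worth checking by hand: observation 18 equals 13 exactly, which is a class boundary. The frequencies shown on the slide (4 and 3 for the fourth and fifth class) are reproduced when a boundary value is counted in the lower class, which appears to be how an upper-inclusive binning such as Excel's FREQUENCY treats it. A strict reading of "8 to < 13" would place it in the upper class instead. Both rules are shown; the variable names are illustrative.

```python
returns = [-4.4, 5.8, 10.4, 1.1, -5.3, 0.1, 11.9, 9.5, 22.5, -2.3,
           -4.7, -6.8, 2.5, 1.4, 5.5, 7.3, 4.9, 13, -2.2, 16.3,
           5.8, 15.4, 6.2, 2.7, 13.1]
intervals = [(-7, -2), (-2, 3), (3, 8), (8, 13), (13, 18), (18, 23)]

# Rule A: lower < x <= upper (boundary values go to the lower class).
# This reproduces the frequencies printed on the slide.
upper_inclusive = [sum(lo < x <= hi for x in returns) for lo, hi in intervals]

# Rule B: lower <= x < upper (the literal "to <" reading).
strict_less_than = [sum(lo <= x < hi for x in returns) for lo, hi in intervals]

print("upper-inclusive :", upper_inclusive)
print("strict less-than:", strict_less_than)
```

Only the value 13 moves between classes, so whichever rule is chosen should be applied consistently to every boundary.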

10 Cumulative Frequencies
8. Calculate cumulative frequencies as the running subtotal of the frequency column:

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Calculation
-7     -2     -7 to < -2      -4.5      6      6           6
-2     3      -2 to < 3       0.5       5      11          6 + 5 = 11
3      8      3 to < 8        5.5       6      17          11 + 6 = 17
8      13     8 to < 13       10.5      4      21          17 + 4 = 21
13     18     13 to < 18      15.5      3      24          21 + 3 = 24
18     23     18 to < 23      20.5      1      25          24 + 1 = 25
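A running subtotal is what `itertools.accumulate` from the Python standard library computes. A minimal sketch using the slide's frequency column:

```python
from itertools import accumulate

frequencies = [6, 5, 6, 4, 3, 1]

# Running subtotal of the frequency column.
cumulative = list(accumulate(frequencies))

print(cumulative)
```

The last cumulative value must equal the sample size (here 25), which is a quick check on the tally.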

11 Relative Frequencies
9. Calculate the relative frequencies:
   Relative Freq. = Frequency / n

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Calculation
-7     -2     -7 to < -2      -4.5      6      6           0.24        6 / 25 = 0.24
-2     3      -2 to < 3       0.5       5      11          0.20        5 / 25 = 0.20
3      8      3 to < 8        5.5       6      17          0.24        6 / 25 = 0.24
8      13     8 to < 13       10.5      4      21          0.16        4 / 25 = 0.16
13     18     13 to < 18      15.5      3      24          0.12        3 / 25 = 0.12
18     23     18 to < 23      20.5      1      25          0.04        1 / 25 = 0.04
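The same division in Python, as a sketch with illustrative names. Rounding to two decimals is only for display and matches the precision used on the slide.

```python
frequencies = [6, 5, 6, 4, 3, 1]
n = sum(frequencies)  # sample size, 25

# Relative frequency = frequency / n
relative = [round(f / n, 2) for f in frequencies]

print(relative)
```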

12 Cumulative Relative Frequencies
10. Calculate cumulative relative frequencies as the running subtotal of the relative frequency column:

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.  Calculation
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24             0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44             0.24 + 0.20 = 0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68             0.44 + 0.24 = 0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84             0.68 + 0.16 = 0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96             0.84 + 0.12 = 0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00             0.96 + 0.04 = 1.00
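A sketch of this step in Python. Instead of summing the rounded relative frequencies, it divides the cumulative counts by n. The result is the same here, and this variant avoids accumulating rounding error; that choice is an assumption of this sketch, not something the slide prescribes.

```python
from itertools import accumulate

frequencies = [6, 5, 6, 4, 3, 1]
n = sum(frequencies)

# Cumulative relative frequency = cumulative frequency / n
cumulative_relative = [round(c / n, 2) for c in accumulate(frequencies)]

print(cumulative_relative)
```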

13 Histogram – Data required
Select the class intervals as the horizontal axis (x-axis) and the observed frequencies as the vertical axis (y-axis). The height of the bars in the histogram should represent the observed frequencies for each class interval.

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00
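To see the shape without a charting tool, the two columns named above can be printed as a sideways text histogram. This is only an illustrative sketch: the bars run horizontally rather than vertically, and the names are not from the original material.

```python
class_intervals = ["-7 to < -2", "-2 to < 3", "3 to < 8",
                   "8 to < 13", "13 to < 18", "18 to < 23"]
frequencies = [6, 5, 6, 4, 3, 1]

# One row per class interval; bar length equals the observed frequency.
rows = [f"{label:<12}{'#' * freq} ({freq})"
        for label, freq in zip(class_intervals, frequencies)]

print("\n".join(rows))
```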

14 Histogram

15 Frequency Polygon – Add Intervals
First, add two additional class intervals. These should have the same width as the other class intervals. Thus, they can be created by subtracting the class width from the lower bound of the first interval and adding the class width to the upper bound of the last class interval. Midpoints are calculated as before [(-7 + (-12)) / 2 = -9.5 and (28 + 23) / 2 = 25.5]. The observed frequencies are zero for both new intervals as all observations fall within the old intervals.

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
-12    -7     -12 to < -7     -9.5      0
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00
23     28     23 to < 28      25.5      0
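The padding step can be sketched as follows: one empty class is added on each side, and the (midpoint, frequency) pairs become the polygon's coordinates. Variable names are illustrative.

```python
class_width = 5
lower_bounds = [-7, -2, 3, 8, 13, 18]
frequencies = [6, 5, 6, 4, 3, 1]

# Add one empty class of the same width before the first and after the last class.
padded_lowers = ([lower_bounds[0] - class_width]
                 + lower_bounds
                 + [lower_bounds[-1] + class_width])
padded_frequencies = [0] + frequencies + [0]

# The midpoint of each class is its lower bound plus half the class width.
midpoints = [lo + class_width / 2 for lo in padded_lowers]

# (x, y) markers of the frequency polygon.
polygon_points = list(zip(midpoints, padded_frequencies))
print(polygon_points)
```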

16 Frequency Polygon – Data required
Select the midpoints as the horizontal axis (x-axis) and the observed frequencies as the vertical axis (y-axis). Instead of bars, use markers (x/y-coordinates). Draw a line through all markers.

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
-12    -7     -12 to < -7     -9.5      0
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00
23     28     23 to < 28      25.5      0

17 Frequency Polygon

18 Frequency Polygon – continued
Occasionally, the data has a predefined minimum and maximum. Consider the following frequency table of class marks in statistics:

Lower  Upper  Class Interval  Midpoint  Frequency
-10    5      -10 to < 5      -2.5      0
5      20     5 to < 20       12.5      3
20     35     20 to < 35      27.5      2
35     50     35 to < 50      42.5      6
50     65     50 to < 65      57.5      23
65     80     65 to < 80      72.5      12
80     95     80 to < 95      87.5      3
95     110    95 to < 110     102.5     0

Using the previous approach leads to midpoints (or results) and class intervals that are actually impossible. The logical maximum for class marks is 100 and the logical minimum is 0!

19 Frequency Polygon – Data required
The solution is to include the maximum and minimum as two additional points of your frequency polygon (x/y-coordinates: 100/0 and 0/0).

Lower  Upper  Class Interval  Midpoint  Frequency
0      5                      0         0
5      20     5 to < 20       12.5      3
20     35     20 to < 35      27.5      2
35     50     35 to < 50      42.5      6
50     65     50 to < 65      57.5      23
65     80     65 to < 80      72.5      12
80     95     80 to < 95      87.5      3
95     110                    100       0
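A sketch of the bounded variant: instead of padding with whole extra classes, the polygon is closed at the logical minimum and maximum of the scale. The names `logical_min` and `logical_max` are illustrative.

```python
midpoints = [12.5, 27.5, 42.5, 57.5, 72.5, 87.5]
frequencies = [3, 2, 6, 23, 12, 3]

# Class marks cannot fall outside 0..100, so the polygon is anchored there.
logical_min, logical_max = 0, 100

polygon_points = ([(logical_min, 0)]
                  + list(zip(midpoints, frequencies))
                  + [(logical_max, 0)])

print(polygon_points)
```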

20 Frequency Polygon – Class Marks

21 Histogram – Freq. Polygon comb.

22 Cum. Freq. Graph – Data required
Select the class intervals as the horizontal axis (x-axis) and the cumulative frequencies as the vertical axis (y-axis).

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00

23 Cumulative Frequency Graph

24 Less-than Ogive – Add interval
For the less-than ogive graph, an additional data point is required. We can add an additional class interval "< -7". The observed frequency is zero for the new interval as all observations fall within the old intervals.

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
       -7     < -7                      0
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00

25 Less-than Ogive – Data required
Select the upper bounds as the horizontal axis (x-axis) and the cumulative frequencies as the vertical axis (y-axis).

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
       -7     < -7                      0
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00
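The ogive's coordinates pair each upper bound with the number of observations counted up to it. A minimal sketch; the starting point at the first lower bound (-7) with a cumulative frequency of 0 corresponds to the additional "< -7" interval, and the names are illustrative.

```python
from itertools import accumulate

first_lower_bound = -7
upper_bounds = [-2, 3, 8, 13, 18, 23]
frequencies = [6, 5, 6, 4, 3, 1]

# x: the extra starting point plus every upper bound.
x_values = [first_lower_bound] + upper_bounds
# y: zero observations below -7, then the running subtotal.
y_values = [0] + list(accumulate(frequencies))

ogive_points = list(zip(x_values, y_values))
print(ogive_points)
```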

26 less than Ogive

27 Standardising Data
It may be desirable to express data in terms of relative frequencies. These were calculated before and are contained in the table below (both discrete and cumulative). All graphs introduced so far can be based on relative frequency rather than observed frequency.

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00

28 Relative Frequency Polygon – Data required
Select the midpoints as the horizontal axis (x-axis) and the relative frequencies as the vertical axis (y-axis). Instead of bars, use markers (x/y-coordinates). Draw a line through all markers. The relative frequencies for the additional class intervals are 0 (since the observed frequencies are 0). All that changes in comparison to the observed frequency polygon is the scale of the y-axis. The shape of the polygon remains the same.

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
-12    -7     -12 to < -7     -9.5      0                  0.00
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00
23     28     23 to < 28      25.5      0                  0.00

29 Relative Frequency Polygon

30 OR Pie Chart – Data required
Select the class intervals as the categories for the pie slices and the relative frequencies as their corresponding values. The size of each slice should be representative of its proportion. Note that the additional categories have relative frequencies of 0.00. Thus, they may be omitted without altering the pie chart itself. Due to the difficulties associated with drawing pie charts free-hand, they are not relevant to your examination!

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
-12    -7     -12 to < -7     -9.5      0                  0.00
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00
23     28     23 to < 28      25.5      0                  0.00

31 Pie Chart

32 Cumulative Relative Frequency Graph – Data required
Select the class intervals as the horizontal axis (x-axis) and the cumulative relative frequencies as the vertical axis (y-axis).

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00

33 Cumulative Relative Frequency Graph

34 Less-than Ogive (relative Freq.) – Data required
Select the upper bounds as the horizontal axis (x-axis) and the cumulative relative frequencies as the vertical axis (y-axis). For the additional interval "< -7", the associated cumulative relative frequency is 0.00 (since no observations fall below -7).

Lower  Upper  Class Interval  Midpoint  Freq.  Cum. Freq.  Rel. Freq.  Cum. Rel. Freq.
       -7     < -7                      0                              0.00
-7     -2     -7 to < -2      -4.5      6      6           0.24        0.24
-2     3      -2 to < 3       0.5       5      11          0.20        0.44
3      8      3 to < 8        5.5       6      17          0.24        0.68
8      13     8 to < 13       10.5      4      21          0.16        0.84
13     18     13 to < 18      15.5      3      24          0.12        0.96
18     23     18 to < 23      20.5      1      25          0.04        1.00

35 less than Ogive (relative Freq.)

36 Reading the less-than ogive at x = 0 gives P(X < 0%), i.e. the probability of negative performance. Here ≈ 0.31 or 31%.
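Reading a value off the ogive is linear interpolation between two neighbouring points. The sketch below assumes straight line segments between the ogive's markers; the function name `ogive_value` is hypothetical. For x = 0 it returns 0.32, which agrees with the ≈ 0.31 read off the graph to within reading accuracy.

```python
def ogive_value(x, bounds, cum_rel):
    """Linearly interpolate the cumulative relative frequency at x."""
    if x <= bounds[0]:
        return cum_rel[0]
    if x >= bounds[-1]:
        return cum_rel[-1]
    # Find the segment [x0, x1] that contains x and interpolate on it.
    for i in range(len(bounds) - 1):
        x0, x1 = bounds[i], bounds[i + 1]
        if x0 <= x <= x1:
            y0, y1 = cum_rel[i], cum_rel[i + 1]
            return y0 + (x - x0) / (x1 - x0) * (y1 - y0)

# Ogive markers: first lower bound, then every upper bound.
bounds = [-7, -2, 3, 8, 13, 18, 23]
cum_rel = [0.00, 0.24, 0.44, 0.68, 0.84, 0.96, 1.00]

p_negative = ogive_value(0, bounds, cum_rel)
print(f"P(X < 0%) ≈ {p_negative:.2f}")
```

The interpolated figure is an estimate from grouped data: in the raw sample, 6 of the 25 observations (0.24) are actually negative.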

37 Why use Relative Frequencies?
In order to compare the frequency distributions of two datasets (e.g. Investment A and Investment B), the frequencies need to be standardised. This is necessary since the sample sizes, class intervals and class widths may differ across samples.

38 Graphical Representations
Observed Frequencies
  discrete: Histogram, Polygon
  cumulative: Ogive
Relative Frequencies
  discrete: Pie Chart, Polygon

