EDUCATIONAL STATISTICS EDU5950, SEM1 2013-14
DESCRIPTIVE STATISTICS: MEASURES OF DISPERSION
Rohani Ahmad Tarmizi - EDU5950
MEASURES OF DISPERSION
RANGE
MEAN DEVIATION
VARIANCE
STANDARD DEVIATION
MEASURES OF DISPERSION
Having studied the measures of central tendency (measures of level), the next question is how the scores are distributed: are they spread out or clustered together? This leads to the concept of a MEASURE OF DISPERSION, that is, an index of how far the scores in a distribution are spread out. Measures of central tendency and measures of dispersion are very important indicators for quantitative data, so researchers use them often and report them in their writing.
TECHNIQUES FOR DESCRIBING DATA - MEASURES OF DISPERSION
RANGE - the simplest but crudest measure
MEAN DEVIATION - the average absolute difference between the scores and the mean
VARIANCE - the average of the squared deviations of the scores from the mean
STANDARD DEVIATION - the square root of the average of the squared deviations of the scores from the mean
MEASURES OF DISPERSION - RANGE
The range is the difference between the highest and the lowest score.
Set A: 21 22 23 24 25 26 27    Range = 27 - 21 = 6
Set B: 15 18 21 24 27 30 33    Range = 33 - 15 = 18
RANGE
The range is the simplest measure: find the difference between the highest and the lowest score. The larger range (Set B = 18) shows that Set B is more spread out than Set A (range = 6). So although both sets have the same mean (24), their dispersion differs, and this reflects the composition of the scores in each distribution. The usefulness of the range is very limited, however, because it uses only two scores from a set. It is therefore used only to get a quick picture.
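As a minimal sketch of the range calculation, the Python snippet below computes the range and the mean of Set A and Set B. The names `set_a`, `set_b` and `score_range` are illustrative only and are not part of the course material.

```python
# Scores copied from the Set A / Set B slide.
set_a = [21, 22, 23, 24, 25, 26, 27]
set_b = [15, 18, 21, 24, 27, 30, 33]


def score_range(scores):
    """Return the highest score minus the lowest score."""
    return max(scores) - min(scores)


for name, scores in (("Set A", set_a), ("Set B", set_b)):
    mean = sum(scores) / len(scores)
    print(name, "mean =", mean, "range =", score_range(scores))
# Set A mean = 24.0 range = 6
# Set B mean = 24.0 range = 18
```

Both sets share the same mean, so only the range reveals that Set B is more spread out.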
MEAN DEVIATION
The mean deviation is the average of the absolute differences between the scores and the mean. To calculate the mean deviation:
Step 1: Find the mean of the distribution
Step 2: Find the deviation of each score from the mean
Step 3: Take the absolute value of each deviation
Step 4: Sum all the absolute deviations
Step 5: Divide that sum by the number of scores in the distribution
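MEASURES OF DISPERSION - MEAN DEVIATION
Set A: 21 22 23 24 25 26 27
X          21  22  23  24  25  26  27
X - mean   -3  -2  -1   0   1   2   3
Mean = 24; sum of absolute deviations = 12; mean deviation = 12/7 = 1.71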
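MEASURES OF DISPERSION - MEAN DEVIATION
Set B: 15 18 21 24 27 30 33
X          15  18  21  24  27  30  33
X - mean   -9  -6  -3   0   3   6   9
Mean = 24; sum of absolute deviations = 36; mean deviation = 36/7 = 5.14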
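The five steps for the mean deviation can be sketched in Python as follows. This is only an illustration under the definition given in these slides; the function name `mean_deviation` is an assumption, not a name used in the course.

```python
set_a = [21, 22, 23, 24, 25, 26, 27]
set_b = [15, 18, 21, 24, 27, 30, 33]


def mean_deviation(scores):
    """Average absolute difference between each score and the mean."""
    mean = sum(scores) / len(scores)            # Step 1: mean
    deviations = [x - mean for x in scores]     # Step 2: deviations
    absolute = [abs(d) for d in deviations]     # Step 3: absolute values
    total = sum(absolute)                       # Step 4: sum
    return total / len(scores)                  # Step 5: divide by n


print(round(mean_deviation(set_a), 2))  # 1.71
print(round(mean_deviation(set_b), 2))  # 5.14
```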
VARIANCE
The variance is defined as the average of the squared deviations from the mean. To calculate the variance:
Step 1: Find the mean of the distribution
Step 2: Find the deviation of each score from the mean
Step 3: Square each deviation
Step 4: Sum all the squared deviations
Step 5: Divide that sum by the number of scores in the distribution
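MEASURES OF DISPERSION - VARIANCE
Set A
X             21  22  23  24  25  26  27
X - mean      -3  -2  -1   0   1   2   3
(X - mean)²    9   4   1   0   1   4   9
Sum of squared deviations = 28; variance = 28/7 = 4
Set B
X             15  18  21  24  27  30  33
X - mean      -9  -6  -3   0   3   6   9
(X - mean)²   81  36   9   0   9  36  81
Sum of squared deviations = 252; variance = 252/7 = 36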
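A short Python sketch of the five variance steps, applied to Set A and Set B. It divides by the number of scores, as the steps in these slides do; the helper name `variance_by_steps` is illustrative.

```python
set_a = [21, 22, 23, 24, 25, 26, 27]
set_b = [15, 18, 21, 24, 27, 30, 33]


def variance_by_steps(scores):
    """Variance as the average squared deviation from the mean."""
    n = len(scores)
    mean = sum(scores) / n                       # Step 1: mean
    deviations = [x - mean for x in scores]      # Step 2: deviations
    squared = [d ** 2 for d in deviations]       # Step 3: square them
    sum_of_squares = sum(squared)                # Step 4: sum
    return sum_of_squares / n                    # Step 5: divide by n


print(variance_by_steps(set_a))  # 4.0
print(variance_by_steps(set_b))  # 36.0
```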
STANDARD DEVIATION
The standard deviation is defined as the square root of the variance. This means that once the variance is known, you can calculate the standard deviation by taking the square root of the variance. The standard deviation is expressed in the same units as the scores, whereas the variance is in squared units and is usually a much larger value because it is built from squared deviations. The standard deviation is therefore the more useful measure for describing dispersion.
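To illustrate the relation between the two measures, the sketch below takes the square root of the variance for Set A and Set B. The function names are assumptions made for this example.

```python
import math

set_a = [21, 22, 23, 24, 25, 26, 27]
set_b = [15, 18, 21, 24, 27, 30, 33]


def variance(scores):
    """Average squared deviation from the mean (divisor n)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)


def standard_deviation(scores):
    """Square root of the variance, in the same units as the scores."""
    return math.sqrt(variance(scores))


for name, scores in (("Set A", set_a), ("Set B", set_b)):
    print(name, variance(scores), standard_deviation(scores))
# Set A 4.0 2.0
# Set B 36.0 6.0
```

Set B's variance is nine times Set A's, while its standard deviation is three times larger, which matches how far apart the scores actually are.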
CALCULATING THE VARIANCE OF GROUPED DATA
CLASS   FREQ. (f)  MIDPOINT (X)  (X - mean)²  f(X - mean)²
5-9         2           7          103.63        207.26
10-14      11          12           26.83        295.13
15-19      26          17            0.03          0.78
20-24      17          22           23.23        394.91
Total      56                                    898.08
Mean = 962/56 = 17.1786 = 17.18
S² = 898.08/56 = 16.04
S = 4.0046 = 4.00
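The grouped-data calculation can be checked with a short Python sketch. It assumes a frequency of 17 for the class 20-24, which is the value implied by the stated totals (n = 56 and a sum of f·X equal to 962). The code keeps the unrounded mean, so its sum of squares differs slightly from the hand calculation, which uses the rounded mean 17.18.

```python
import math

# ((lower limit, upper limit), frequency); the 17 is inferred from the totals.
classes = [((5, 9), 2), ((10, 14), 11), ((15, 19), 26), ((20, 24), 17)]


def grouped_mean_and_variance(classes):
    """Use class midpoints as the scores; divide by n as in the slide."""
    midpoints = [(low + high) / 2 for (low, high), _ in classes]
    freqs = [f for _, f in classes]
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    sum_of_squares = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))
    return mean, sum_of_squares / n


mean, var = grouped_mean_and_variance(classes)
print(f"mean = {mean:.2f}")                  # mean = 17.18
print(f"variance = {var:.2f}")               # variance = 16.04
print(f"std dev = {math.sqrt(var):.2f}")     # std dev = 4.00
```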
EXERCISE: CALCULATING THE VARIANCE OF GROUPED DATA
CLASS   FREQ. (f)  MIDPOINT (X)  (X - mean)²  f(X - mean)²
5-9         2           7
10-14      11          12
15-19      26          17
20-24      17          22
25-29       8          27
SUMMARY
Measures of central tendency and measures of dispersion are the most popular measures for describing data, alongside presenting data in tables, charts or graphs. Measures of central tendency show level, while measures of dispersion show variability (homogeneity/heterogeneity). The most frequently used measure of central tendency is the mean, and the measure of dispersion reported with it is the standard deviation. Try to give reasons why the mean and the standard deviation are used so often.
INTERPRETING MEASURES OF DISPERSION
A large value indicates a large spread/scatter/variation.
A large value indicates that the scores are heterogeneous (very different from one another).
A small value indicates a small spread/scatter/variation.
A small value indicates that the scores are homogeneous (nearly the same).
Descriptive Statistics
Let's look at the following data from two groups of students taught with two different learning approaches.
PBL Approach   Traditional Approach
     56                33
     56                42
     57                48
     58                52
     61                57
     63                67
     63                67
     67                77
     67                82
     67                90
(Closely alike)   (Very different)
The mean, the median and the mode for each group are as follows:
PBL Approach:         Mean = 61.5   Median = 62   Mode = 67
Traditional Approach: Mean = 61.5   Median = 62   Mode = 67
Use this example to review the measures of central tendency. Both sets of data have the same mean, the same median and the same mode, yet students can clearly see that the data sets are vastly different. This is a good lead-in to measures of variation.
Additional Notes
Measures of Variability/Dispersion
A measure or index that conveys the degree to which the scores differ from one another.
Measures that reflect the amount of variation in the scores of a distribution.
Measures of Variability/Dispersion
♠ Provide a measure of the dispersion of your data
♠ Measures include:
i) Range: presented as the lowest to the highest values
ii) Variance: the average of the squared deviations from the mean
iii) Standard deviation: a measure of deviation from the mean, calculated as the square root of the variance
♠ Of the three measures of dispersion, the standard deviation is the most frequently used.
Measures of Variability
Range = Maximum value - Minimum value
Range of scores for Set A (PBL) = 67 - 56 = 11
Range of scores for Set B (Traditional) = 90 - 33 = 57
The range uses only 2 numbers from a data set, so it is only a rough and quick measure.
Explain that if only one number changes, the range can change greatly; for example, if a $56 stock dropped to $12, the range would change. Emphasize the need for different symbols for population parameters and sample statistics.
Population Variance
Population variance: the sum of the squares of the deviations, divided by N.
\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]
Each deviation is squared to eliminate the negative sign. Students should know how to apply this formula to small data sets. For larger data sets they will use calculators or computer software.
SET A (PBL APPROACH) - Variance
X     X - mean   (X - mean)²
56     -5.5        30.25
56     -5.5        30.25
57     -4.5        20.25
58     -3.5        12.25
61     -0.5         0.25
63      1.5         2.25
63      1.5         2.25
67      5.5        30.25
67      5.5        30.25
67      5.5        30.25
Sum of squares = 188.50
Population variance = 188.50/10 = 18.85
SET B (TRADITIONAL APPROACH) - Variance
X     X - mean   (X - mean)²
33    -28.5       812.25
42    -19.5       380.25
48    -13.5       182.25
52     -9.5        90.25
57     -4.5        20.25
67      5.5        30.25
67      5.5        30.25
77     15.5       240.25
82     20.5       420.25
90     28.5       812.25
Sum of squares = 3018.50
Population variance = 3018.50/10 = 301.85
Population Standard Deviation
The population standard deviation is the square root of the population variance. The variance is expressed in "square units", which are hard to interpret. Taking the square root returns the measure to the original units of the data.
The population standard deviation for students in the PBL group is √18.85 = 4.34.
The population standard deviation for students in the Traditional group is √301.85 = 17.37.
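The population standard deviations of the two groups can be recomputed with the sketch below. It assumes ten scores per group, with 63 occurring twice in the PBL group and 67 occurring twice in the Traditional group, because that is the only reading consistent with the stated mean (61.5), median (62) and mode (67). The function name `population_sd` is illustrative.

```python
import math

# Assumed data: ten scores per group (see the note above).
pbl = [56, 56, 57, 58, 61, 63, 63, 67, 67, 67]
traditional = [33, 42, 48, 52, 57, 67, 67, 77, 82, 90]


def population_sd(scores):
    """Square root of (sum of squared deviations divided by N)."""
    n = len(scores)
    mu = sum(scores) / n
    sum_of_squares = sum((x - mu) ** 2 for x in scores)
    return math.sqrt(sum_of_squares / n)


for name, scores in (("PBL", pbl), ("Traditional", traditional)):
    mu = sum(scores) / len(scores)
    print(f"{name}: mean = {mu}, sigma = {population_sd(scores):.2f}")
```

Both groups have the same mean, but the Traditional scores are spread roughly four times as widely around it.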
Sample Variance (Set A)
To calculate a sample variance, divide the sum of squares by n - 1.
\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]
A sample contains only a portion of the population, and this portion is unlikely to contain the extreme values. Dividing by n - 1 gives a larger value than dividing by n, and so increases the standard deviation.
Sample Variance (Set B)
To calculate a sample variance, divide the sum of squares by n - 1.
Sample Standard Deviation (Set A)
The sample standard deviation, s, is found by taking the square root of the sample variance.
Sample Standard Deviation (Set B)
The sample standard deviation, s, is found by taking the square root of the sample variance.
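To show the effect of the n - 1 divisor, the sketch below compares the population and sample formulas on the PBL scores. It assumes the ten-score version of the PBL data (63 counted twice), which reproduces the sum of squares of 188.50 given for Set A. The variable names are illustrative.

```python
import math

# Assumed ten PBL scores; they reproduce the stated sum of squares, 188.50.
pbl = [56, 56, 57, 58, 61, 63, 63, 67, 67, 67]

n = len(pbl)
mean = sum(pbl) / n
sum_of_squares = sum((x - mean) ** 2 for x in pbl)

population_variance = sum_of_squares / n        # divisor N
sample_variance = sum_of_squares / (n - 1)      # divisor n - 1

print(f"sum of squares      = {sum_of_squares:.2f}")               # 188.50
print(f"population variance = {population_variance:.2f}")          # 18.85
print(f"sample variance     = {sample_variance:.2f}")              # 20.94
print(f"population sd       = {math.sqrt(population_variance):.2f}")  # 4.34
print(f"sample sd           = {math.sqrt(sample_variance):.2f}")      # 4.58
```

The sample values are always the larger of the two, and the gap shrinks as n grows.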
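Summary
A summary of the formulas:
Range = Maximum value - Minimum value
Population variance: \[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]
Population standard deviation: \[ \sigma = \sqrt{\sigma^2} \]
Sample variance: \[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]
Sample standard deviation: \[ s = \sqrt{s^2} \]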
Board Demonstration
Calculate the range, variance and standard deviation for the 3 data sets.
Raw data
Data set 1: 5 8 9 7 6 8 6 7 7 6 5 3 7 8 4
♠ Range = 9 - 3 = 6
♠ Variance (s²)
Before you can solve for the variance, you need to determine: n = 15, ΣX = 96, ΣX² = 652
\[ s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} = \frac{652 - \frac{(96)^2}{15}}{15 - 1} = \frac{37.6}{14} = 2.686 \]
♠ Std dev (s)
\[ s = \sqrt{s^2} = \sqrt{2.686} = 1.639 \]
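The computational formula used in the board demonstration can be sketched in Python as below, using Data set 1. The function name `computational_variance` is an assumption for this example; it returns the intermediate sums so they can be compared with the values on the slide.

```python
import math

data_set_1 = [5, 8, 9, 7, 6, 8, 6, 7, 7, 6, 5, 3, 7, 8, 4]


def computational_variance(scores):
    """Sample variance from n, the sum of X and the sum of X squared."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x ** 2 for x in scores)
    s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return n, sum_x, sum_x2, s2


n, sum_x, sum_x2, s2 = computational_variance(data_set_1)
print(n, sum_x, sum_x2)                        # 15 96 652
print(f"range = {max(data_set_1) - min(data_set_1)}")  # range = 6
print(f"s^2 = {s2:.3f}")                       # s^2 = 2.686
print(f"s = {math.sqrt(s2):.3f}")              # s = 1.639
```

The same function can be applied to any list of raw scores; only the three sums are needed, not the individual deviations.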
Board Demonstration
Calculate the range, variance and standard deviation for the 3 data sets.
Raw data
Data set 2: 15 8 9 7 16 8 16 7 7 6 15 3 7 8 14
♠ Range = 16 - 3 = 13
♠ Variance (s²)
Before you can solve for the variance, you need to determine: n = 15, ΣX = 146, ΣX² = 1,672
\[ s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} = \frac{1672 - \frac{(146)^2}{15}}{15 - 1} = \frac{250.933}{14} = 17.924 \]
♠ Std dev (s)
\[ s = \sqrt{s^2} = \sqrt{17.924} = 4.234 \]
…Cont.
Frequency distribution
Data set:
X       f     fX
25      6    150
28      9    252
30     12    360
34     17    578
38     15    570
43      8    344
45      4    180
Total  71   2434
♠ Range = 45 - 25 = 20
♠ Variance (s²)
Before you can solve for the variance, you need to determine: n = 71, ΣfX = 2,434, ΣfX² = 85,810
\[ s^2 = \frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n} = \frac{85810 - \frac{2434^2}{71}}{71} = \frac{2368.37}{71} = 33.357 \]
(Here the divisor is n, not n - 1.)
♠ Std dev (s)
\[ s = \sqrt{s^2} = \sqrt{33.357} = 5.776 = 5.78 \]
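For a frequency distribution, each score is weighted by its frequency. The sketch below reproduces the sums and the variance from the table above, dividing by n as this slide does. The variable names are illustrative.

```python
import math

# (score X, frequency f) pairs copied from the frequency table.
table = [(25, 6), (28, 9), (30, 12), (34, 17), (38, 15), (43, 8), (45, 4)]

n = sum(f for _, f in table)                 # number of scores
sum_fx = sum(f * x for x, f in table)        # sum of f * X
sum_fx2 = sum(f * x ** 2 for x, f in table)  # sum of f * X squared

variance = (sum_fx2 - sum_fx ** 2 / n) / n   # divisor n, as in the slide
std_dev = math.sqrt(variance)

print(n, sum_fx, sum_fx2)            # 71 2434 85810
print(f"s^2 = {variance:.3f}")       # s^2 = 33.357
print(f"s = {std_dev:.3f}")          # s = 5.776
```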
…Cont.
3. Grouped frequency distribution
Data set:
Group     f
21-30    27
31-40    32
41-50    12
Total    71
♠ Range: not relevant
♠ Variance (s²)
Before you can solve for the variance, you need to determine: X_midpt = 25.5, 35.5 and 45.5; n = 71; Σf·X_midpt = 2,370.5; Σf·X_midpt² = 82,727.75
\[ s^2 = \frac{\sum f X_{midpt}^2 - \frac{(\sum f X_{midpt})^2}{n}}{n} = \frac{82727.75 - \frac{2370.5^2}{71}}{71} = 50.466 \]
♠ Std dev (s)
\[ s = \sqrt{s^2} = \sqrt{50.466} = 7.104 \]
Grouped Data
To approximate the mean of the data in a frequency distribution, treat each value as if it occurs at the midpoint of its class. Class midpoint = x.
\[ \bar{x} = \frac{\sum (x \cdot f)}{n}, \qquad n = \sum f \]
Emphasize that the data consist of 30 values. Students will occasionally try to divide by the number of classes. Always make sure they check that their answer is reasonable.
Grouped Data
To approximate the standard deviation of the data in a frequency distribution, use class midpoint = x, with n = Σf.
Class      f   Midpoint (x)  (x - mean)²  (x - mean)²·f
67-78      3      72.5         739.84       2219.52
79-90      5      84.5         231.04       1155.20
91-102     8      96.5          10.24         81.92
103-114    9     108.5          77.44        696.96
115-126    5     120.5         432.64       2163.20
Total     30                                6316.80
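The grouped mean and standard deviation can be approximated from the class midpoints as in the sketch below. Two assumptions are made here: the last class (115-126) has a frequency of 5, so that the frequencies total the 30 values mentioned in the notes, and the sample formula with divisor n - 1 is used, since the slide does not show which divisor was intended.

```python
import math

# (class midpoint x, frequency f); the final frequency of 5 is assumed.
table = [(72.5, 3), (84.5, 5), (96.5, 8), (108.5, 9), (120.5, 5)]

n = sum(f for _, f in table)                 # n is the sum of f, not the 5 classes
mean = sum(x * f for x, f in table) / n
sum_of_squares = sum((x - mean) ** 2 * f for x, f in table)
s = math.sqrt(sum_of_squares / (n - 1))      # assumed sample formula

print(f"n = {n}")                                # n = 30
print(f"mean = {mean:.1f}")                      # mean = 99.7
print(f"sum of squares = {sum_of_squares:.2f}")  # sum of squares = 6316.80
print(f"s = {s:.2f}")                            # s = 14.76
```

Dividing by the number of classes (5) instead of n would give a mean far above every score in the table, which is the reasonableness check the notes ask for.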