ID1050– Quantitative & Qualitative Reasoning


Computing Statistics

Single-variable Statistics
We will consider six statistics of a data set:
- Three measures of the middle: mean, median, and mode
- Two measures of spread: variance and standard deviation
- One measure of symmetry: skewness
We can compute these values for either discrete or continuous data.

Mean or Average
The mean is defined as the sum of the data divided by the number of data values. The variable often used is $\mu$ (the Greek letter 'mu') or $\bar{x}$. Often $\mu$ is associated with a population and $\bar{x}$ with a sample.
Symbolically, $\bar{x} = \frac{\sum x}{n}$, where $\sum x = x_1 + x_2 + \dots + x_n$ and $n$ is the number of data values. (The capital letter sigma, $\Sigma$, represents summation.)
Example: Data is (1, 2, 3, 4, 5). The sum is 1+2+3+4+5=15. There are 5 data values, so the average is 15/5=3.
Many calculators have a 'statistics' mode, but the way each manufacturer implements statistical calculation varies widely. There are tutorials for this course's standard calculator, the TI-30Xa, for entering data and computing statistics. If you have a different brand or model, consult your calculator's user manual or website for details on how to work with statistics.
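As a minimal sketch of the definition above (the function name `mean` is chosen here for illustration and is not part of the course material), the calculation can be written in Python:

```python
def mean(data):
    """Return the sum of the data divided by the number of data values."""
    return sum(data) / len(data)

# The slide's example: (1 + 2 + 3 + 4 + 5) / 5
print(mean([1, 2, 3, 4, 5]))  # 3.0
```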

Median
The median is the middle number when the data is listed in order. If there is an even number of data points, the median is the average of the two middle values.
Example: Data is (1, 2, 3, 4, 5). The median is 3.
Example: Data is (1, 2, 3, 4, 5, 6). The median is (3+4)/2=3.5.
Why is this quantity useful? The median is largely unaffected by outlying values. What if our data had been (1, 2, 3, 4, 1000)? The mean is 202, which is not characteristic of any of the actual values. The median is 3, which is more typical of most of the values.
The median is helpful when looking for a house to buy. The median house price is the typical price you'd pay, even though the millionaire's house at the corner of the block raises the mean of the house prices above the value most people paid for theirs.
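The rule above (sort, then take the middle value or the average of the two middle values) might be sketched as follows. The function name `median` is illustrative only.

```python
def median(data):
    """Return the middle value of the sorted data.

    With an even number of values, average the two middle values.
    """
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([1, 2, 3, 4, 5]))      # 3
print(median([1, 2, 3, 4, 5, 6]))   # 3.5
print(median([1, 2, 3, 4, 1000]))   # 3 (the mean of this data is 202)
```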

Mode
The mode represents the most populated class, or the group with the most members. This is yet another reasonable way of finding the middle of the data. Determining the mode is different for discrete data than it is for continuous data.
- For discrete data, the mode is simply the number that appears the most times. Example: Data is (1, 1, 2, 3, 4, 4, 5, 5, 5). The mode is 5.
- For continuous data, the mode is the center of the range of the class that has the most members in it. Example: Data is (1.1, 1.2, 1.3, 1.8, 2.0, 2.6, 3.1, 4.6, 4.8, 5.1). The class from 1 to 2 has the most members. The center of this range is 1.5, so the mode is 1.5. (Note: 1.5 does not even appear in the data.)
In both cases, the mode can be quickly determined from the graph. The mode is the x-value at the center of the tallest bar in either the bar graph (discrete data) or the histogram (continuous data).
Data can have two modes (bi-modal), but if there are more, we usually say it is amodal (no distinct mode).
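The two cases above could be sketched like this. The function names are hypothetical, and the sketch assumes classes of width 1 that start at whole numbers and include their lower edge (so 2.0 falls in the class from 2 to 3); the slide does not say how boundary values are assigned. Ties are not handled, so bi-modal data would need extra logic.

```python
import math
from collections import Counter

def discrete_mode(data):
    """Return the value that appears the most times."""
    return Counter(data).most_common(1)[0][0]

def continuous_mode(data, class_width=1.0):
    """Return the center of the class (bin) with the most members.

    Assumption: class k covers [k * class_width, (k + 1) * class_width).
    """
    counts = Counter(math.floor(x / class_width) for x in data)
    fullest_class = counts.most_common(1)[0][0]
    return (fullest_class + 0.5) * class_width

print(discrete_mode([1, 1, 2, 3, 4, 4, 5, 5, 5]))  # 5
print(continuous_mode([1.1, 1.2, 1.3, 1.8, 2.0, 2.6, 3.1, 4.6, 4.8, 5.1]))  # 1.5
```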

Variance
Variance (var., $s^2$, or $\sigma^2$) is a measure of the spread of data about the average. We don't care which direction the difference is, so we ignore the sign of each difference by squaring it.
In words, the variance is the sum of the squares of the differences divided by one less than the number of data values. The equation is
$var. = \frac{\sum (x - \bar{x})^2}{n - 1}$
Example: Data is (1, 2, 3, 4, 5) and the mean ($\bar{x}$) is 3.

x      x̄      x − x̄      (x − x̄)²
1      3      -2         4
2      3      -1         1
3      3       0         0
4      3       1         1
5      3       2         4
Sum                      10

Variance is 10/(5-1)=2.5.
If you are using a calculator, it is most likely that the calculator will compute the standard deviation ($\sigma$) instead. To get the variance from the standard deviation, simply square the standard deviation: $var. = \sigma^2$
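The tabular method can be mirrored in a few lines of Python. This is a sketch only, and the name `sample_variance` is chosen here for illustration. It divides by n − 1, as the slide's formula does, which is also what the standard library's `statistics.variance` computes.

```python
import statistics

def sample_variance(data):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    squared_diffs = [(x - x_bar) ** 2 for x in data]  # last column of the table
    return sum(squared_diffs) / (n - 1)

data = [1, 2, 3, 4, 5]
print(sample_variance(data))      # 2.5
print(statistics.variance(data))  # 2.5, the library's n - 1 variance
```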

Standard Deviation
Standard deviation (std. dev., $s$, or $\sigma$) is a measure of the spread of data about the average. We don't care which direction the difference is, so we ignore the sign of each difference by squaring it.
In words, the standard deviation is the square root of (the sum of the squares of the differences divided by one less than the number of data values). The equation is
$std.\ dev. = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{var.}$
Example (from the previous slide): Data is (1, 2, 3, 4, 5), the mean ($\bar{x}$) is 3, and we previously found that the variance is $var. = 2.5$. Since the standard deviation is the square root of the variance, the standard deviation is $\sigma = \sqrt{2.5} \approx 1.58$.
If you are using a calculator, it is most likely that the calculator will compute the standard deviation ($\sigma$) as part of its normal statistical function. There is a tutorial for using this course's standard calculator, the TI-30Xa, to calculate standard deviation.
Question: Since standard deviation and variance differ by one keystroke, why do we need both?
Answer: The units of standard deviation are the same as the units of the data. Variance has other direct uses (e.g. Analysis of Variance) and is also more easily computed.
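A short check of the relationship between the two measures, using Python's standard library. The variable names are illustrative; `statistics.variance` and `statistics.stdev` both use the n − 1 divisor.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]

variance = statistics.variance(data)  # 2.5
std_dev = math.sqrt(variance)         # square root of the variance

print(round(std_dev, 2))                  # 1.58
print(round(statistics.stdev(data), 2))   # 1.58, computed directly
```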

Skewness
The distribution of a set of data may have symmetry about the mean, or it may have a longer 'tail' to one side or the other. Imagine draping a sheet over the graph of the data. The side of the sheet that is least steep is the side that has the longer tail.
- If the tail points to the right (toward positive x values), the skewness will be a positive number.
- If the tail points to the left, the skewness will be negative.
- Zero skewness indicates symmetric tails to both sides.
It is sometimes difficult to estimate from the graph what the skewness will be, but there is a formula for calculating skewness in all cases:
$\text{Skewness} = \frac{\text{mean} - \text{mode}}{\text{standard deviation}}$
Example: Data is (1.1, 1.2, 1.3, 1.8, 2.0, 2.6, 3.1, 4.6, 4.8, 5.1). Mean is 2.76, mode is 1.5, and std. dev. is 1.56.
$\text{Skewness} = \frac{2.76 - 1.5}{1.56} = 0.81$ (tail to the right)
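The skewness formula above is a mode-based measure (often called Pearson's mode skewness). Library routines such as SciPy's `skew` use a different, moment-based definition and will generally return a different number. A minimal sketch of the slide's formula, with the mode of 1.5 taken from the slide rather than recomputed and a hypothetical function name:

```python
import statistics

def mode_skewness(mean, mode, std_dev):
    """Skewness as defined on the slide: (mean - mode) / standard deviation."""
    return (mean - mode) / std_dev

data = [1.1, 1.2, 1.3, 1.8, 2.0, 2.6, 3.1, 4.6, 4.8, 5.1]

mean = statistics.mean(data)
std_dev = statistics.stdev(data)  # n - 1 divisor, as in the course formula
mode = 1.5                        # center of the fullest class, from the slide

skew = mode_skewness(mean, mode, std_dev)
print(round(mean, 2), round(std_dev, 2), round(skew, 2))  # 2.76 1.56 0.81
```

A positive result corresponds to a tail to the right, as in the example.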

Example: Discrete Data
[Graph: bar graph of the data]
N: 9
Mean: 3
Median: 3
Mode: 4
Variance: 2
Standard Deviation: 1.41
Skewness: -0.71
Table totals (columns x, x̄, x − x̄, (x − x̄)²): Σx = 27, Σ(x − x̄)² = 16

Example: Continuous Data
[Graph: histogram of the data]
Mean: 3.3
Median: 3.1
Mode: 2.5
Variance: 1.82
Standard Deviation: 1.35
Skewness: 0.6

x      x̄      x − x̄      (x − x̄)²
1.5    3.3    -1.8       3.24
1.7    3.3    -1.6       2.56
2.4    3.3    -0.9       0.81
2.5    3.3    -0.8       0.64
2.7    3.3    -0.6       0.36
3.5    3.3     0.2       0.04
3.8    3.3     0.5       0.25
4.7    3.3     1.4       1.96
5.1    3.3     1.8       3.24
5.1    3.3     1.8       3.24
Sum    33                16.34
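The six statistics for this example can be recomputed in one pass. This is a sketch under stated assumptions: the data list is reconstructed from the x column of the slide's table, with 5.1 taken to appear twice so that the column totals (33 and 16.34) work out. Classes are assumed to be 1 unit wide, to start at whole numbers, and to include their lower edge. The helper name `binned_mode` is hypothetical. The results agree with the slide's values up to rounding in the last digit.

```python
import math
import statistics
from collections import Counter

# Assumption: reconstructed from the slide's table (5.1 listed twice).
data = [1.5, 1.7, 2.4, 2.5, 2.7, 3.5, 3.8, 4.7, 5.1, 5.1]

def binned_mode(values, class_width=1.0):
    """Center of the class with the most members (ties not handled)."""
    counts = Counter(math.floor(v / class_width) for v in values)
    fullest_class = counts.most_common(1)[0][0]
    return (fullest_class + 0.5) * class_width

mean = statistics.mean(data)
median = statistics.median(data)
mode = binned_mode(data)
variance = statistics.variance(data)  # sum of squared differences / (n - 1)
std_dev = math.sqrt(variance)
skewness = (mean - mode) / std_dev    # the course's mode-based formula

for name, value in [("Mean", mean), ("Median", median), ("Mode", mode),
                    ("Variance", variance), ("Std. dev.", std_dev),
                    ("Skewness", skewness)]:
    print(f"{name}: {value:.2f}")
```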

Conclusion
We can answer a great many statistical questions by examining the graph and six standard statistical variables for the data:
- Bar graph or histogram
- Measures of the middle
  - Mean (can be done on a calculator)
  - Median (obtained from the sorted list of data)
  - Mode (obtained from the graph)
- Measures of the spread
  - Variance (calculated using a tabular method, or as the square of the std. dev.)
  - Standard Deviation (obtained from the calculator's statistics mode, or as the square root of the variance)
- Measure of symmetry
  - Skewness (calculated from the above values: Mean, Mode, and Std. Dev.)