
2  Statistical analysis involves many mathematical operations that depend on how our variables are measured.  Using the number 1 to represent “Female”: here 1 is only a symbol.  Using the number 1 to represent the only child in a family: here 1 is a real quantity.

3  Nominal: numbers or other symbols are assigned to a set of categories for the purpose of naming, labeling, or classifying the observations.  For example: 1 = female, 2 = male.  The numbers here do not carry any quantitative meaning.

4  Ordinal: numbers are assigned to rank-ordered categories ranging from low to high.  For example: upper class, middle class, or working class.  We know that upper class ranks higher than middle class.  But we do not know the magnitude of the differences between the categories: we cannot say how much higher upper class is than middle class.

5  Interval-ratio: the categories (or values) of a variable can be rank-ordered, and the measurements for all cases are expressed in the same units.  Examples: age, income, SAT scores.  We can compare values not only in terms of which is larger or smaller but also in terms of how much larger or smaller one is than another.  Variables with a natural zero point are also called ratio variables.

9  Several key social factors (gender, employment status, marital status) are dichotomies.  They are nominal.

10  Discrete vs. continuous variables  Discrete: number of kids  Continuous: length or weight

11  The number of people in your family  Place of residence classified as urban, suburban, or rural  The percentage of university students who attended public high school  The rating of the overall quality of a textbook, on a scale from “Excellent” to “Poor”  The type of transportation a person takes to work  Your annual income  The U.S. unemployment rate  The presidential candidate that the respondent voted for in 2012

12  The overall goal of measuring central tendency is to find the single score that is most representative of the distribution.

15  The sample mean is the measure of central tendency that can approximate the population mean.  The mean is very sensitive to extreme scores.  An extreme score pulls the mean in its direction,  which makes the mean less representative  and less useful as a measure of central tendency.

16  Number of annual customers by location:

Location              Number of annual customers
Lanham Park Store     2150
Williamsburg Store    1534
Downtown Store        3564

What is the mean (average) number of shoppers per store?  Use Excel to find it:  write your own formula  use the AVERAGE function
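
As a cross-check on the Excel result, the same mean can be computed in a few lines of Python. This is only a sketch: the name `annual_customers` is made up for illustration, and `statistics.mean` stands in for Excel's AVERAGE function.

```python
from statistics import mean

# Annual customers per store, copied from the slide.
annual_customers = {
    "Lanham Park Store": 2150,
    "Williamsburg Store": 1534,
    "Downtown Store": 3564,
}

values = list(annual_customers.values())

# "Your own formula": the sum of the values divided by how many there are.
manual_mean = sum(values) / len(values)

# Library counterpart of Excel's AVERAGE function.
library_mean = mean(values)

print(manual_mean, library_mean)
```

Both approaches should agree, which is a quick way to confirm a hand-written formula.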

17  The median is defined as the midpoint in a set of scores.  Half of the scores fall above it and half fall below it.

18  Odd number of data points  Rank them  Median = the middle value  Example: 10, 9, 8, 7, 5 (median = 8)  Even number of data points  Rank them  Median = (sum of the two middle values) / 2  Example: 10, 9, 8, 7, 6, 5 (median = (8 + 7) / 2 = 7.5)
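
The two ranking rules on this slide can be written as one short function. The sketch below is illustrative only; the function name `median_by_ranking` is hypothetical and not part of any library.

```python
def median_by_ranking(values):
    """Rank the values, then take the middle one (odd count)
    or the average of the two middle ones (even count)."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of data points: the single middle value.
        return ranked[mid]
    # Even number of data points: mean of the two middle values.
    return (ranked[mid - 1] + ranked[mid]) / 2


odd_example = median_by_ranking([10, 9, 8, 7, 5])
even_example = median_by_ranking([10, 9, 8, 7, 6, 5])
print(odd_example, even_example)
```

Ranking in ascending or descending order gives the same answer, because the middle position does not change.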

19  The median is insensitive to extreme cases, whereas the mean is not.  To measure central tendency:  If there are extreme values, use the median  If there are no extreme values, use the mean  Example: 14, 3, 2, 1 (mean = 5, median = 2.5)  Which better represents the central tendency?
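
One way to see the difference in sensitivity is to compute both measures with and without the extreme score 14. The sketch below uses Python's standard `statistics` module; the variable names are chosen for illustration.

```python
from statistics import mean, median

scores = [14, 3, 2, 1]        # the example from the slide
without_extreme = [3, 2, 1]   # same data with the extreme score removed

mean_all = mean(scores)                   # 5
median_all = median(scores)               # 2.5
mean_trimmed = mean(without_extreme)      # 2
median_trimmed = median(without_extreme)  # 2

# How far each measure moves when the extreme score is added.
mean_shift = mean_all - mean_trimmed
median_shift = median_all - median_trimmed
print(mean_shift, median_shift)
```

Adding the single extreme score moves the mean by 3 but the median by only 0.5.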

20  Calculate the median of income level

21  The mode is the value that occurs most frequently.  Count the frequency of all the values in a distribution  The value that occurs most often is the mode

22  Ten Most Common Foreign Languages Spoken in the United States, 2009

Language      Number of Speakers
Spanish       35,468,501
Chinese       2,600,150
Tagalog       1,513,734
French        1,305,503
Vietnamese    1,251,468
German        1,109,216
Korean        1,039,021
Russian       881,723
Arabic        845,396
Italian       753,992

Mode: Spanish

23  Listed are the weather conditions of 10 US cities on 11/14/2014. What is the mode?

City              Weather
Chicago           Cloudy
Los Angeles       Sunny
Washington DC     Partly Cloudy
New York          Cloudy
Seattle           Cloudy
Salt Lake City    Snow
Boston            Partly Cloudy
Phoenix           Mostly Cloudy
Lexington         Mostly Cloudy
New Orleans       Fair
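
Finding the mode means counting how often each value occurs and picking the most frequent one. A minimal sketch using `collections.Counter` from the Python standard library follows; the dictionary name `weather` is an illustrative choice.

```python
from collections import Counter

# Weather conditions by city, copied from the slide.
weather = {
    "Chicago": "Cloudy",
    "Los Angeles": "Sunny",
    "Washington DC": "Partly Cloudy",
    "New York": "Cloudy",
    "Seattle": "Cloudy",
    "Salt Lake City": "Snow",
    "Boston": "Partly Cloudy",
    "Phoenix": "Mostly Cloudy",
    "Lexington": "Mostly Cloudy",
    "New Orleans": "Fair",
}

# Count the frequency of every condition.
counts = Counter(weather.values())

# Keep every condition that reaches the highest frequency,
# so that ties (more than one mode) are not silently dropped.
highest = max(counts.values())
modes = [condition for condition, n in counts.items() if n == highest]
print(counts, modes)
```

Collecting all values with the top count, instead of just the first, handles distributions that have more than one mode.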

24  Mean:  Use when there are no extreme scores and the data are not categorical  Median:  Use when there are extreme scores and you do not want them to distort the average  Mode:  Use when the data are categorical in nature and each value can fit into only one class  E.g., hair color, political affiliation, religion

25  Input the table into Excel  Select the data as the Input Range  Click Data  Data Analysis  In the Data Analysis box, choose Descriptive Statistics  Tick “Labels in first row”  Output Range = C1  Tick “Summary statistics”  Click “OK”

Income Level
$135,456
$54,365
$37,668
$34,500
$32,456
$25,500
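
For readers without Excel at hand, a few of the summary statistics can be reproduced in Python. This sketch covers only a subset of what the Descriptive Statistics tool reports (count, sum, mean, median, minimum, maximum, range); the names `incomes` and `summary` are assumptions made for the example.

```python
from statistics import mean, median

# Income levels from the slide, in dollars.
incomes = [135456, 54365, 37668, 34500, 32456, 25500]

summary = {
    "count": len(incomes),
    "sum": sum(incomes),
    "mean": mean(incomes),
    "median": median(incomes),
    "minimum": min(incomes),
    "maximum": max(incomes),
    "range": max(incomes) - min(incomes),
}

for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

The single large income pulls the mean well above the median here, which is the situation where the median is the more representative measure.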


28  Write a sales report for your boss based on the figures for items sold today:

Special            Number Sold    Cost
Huge Burger        20             $2.95
Baby Burger        18             $1.49
Chicken Littles    25             $3.50
Porker Burger      19             $2.95
Yummy Burger       17             $1.99
Coney Dog          20             $1.99
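
One possible starting point for such a report is to summarize the Number Sold column with all three measures of central tendency. The sketch below is an illustration, not a prescribed solution. It assumes that "Cost" is the price charged per item, which the slide does not state, so the revenue figure depends on that assumption.

```python
from statistics import mean, median, multimode

# (special, number sold, cost) rows copied from the slide.
sales = [
    ("Huge Burger", 20, 2.95),
    ("Baby Burger", 18, 1.49),
    ("Chicken Littles", 25, 3.50),
    ("Porker Burger", 19, 2.95),
    ("Yummy Burger", 17, 1.99),
    ("Coney Dog", 20, 1.99),
]

units = [number_sold for _, number_sold, _ in sales]

mean_units = mean(units)
median_units = median(units)
mode_units = multimode(units)  # list, because there can be several modes

# Special with the highest number sold.
best_seller = max(sales, key=lambda row: row[1])[0]

# ASSUMPTION: "Cost" is the per-item price, so revenue = number sold * cost.
revenue = sum(number_sold * cost for _, number_sold, cost in sales)

print(f"Mean sold per special: {mean_units:.2f}")
print(f"Median sold per special: {median_units}")
print(f"Mode(s): {mode_units}")
print(f"Best seller: {best_seller}")
print(f"Revenue under the per-item price assumption: ${revenue:.2f}")
```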

29  Calculate the average sale

Toy             July Sale    August Sale    September Sale
slammer         12345.00     14453.00       15435.00
radar zinger    31454.00     34567.00       29678.00
lazertags       3253.00      3121.00        5131.00
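
The slide does not say whether the average is wanted per toy or per month, so the sketch below computes both. The names `monthly_sales`, `toy_means`, and `month_means` are illustrative.

```python
from statistics import mean

months = ["July", "August", "September"]

# Sales per toy for July, August, and September, copied from the slide.
monthly_sales = {
    "slammer": [12345.00, 14453.00, 15435.00],
    "radar zinger": [31454.00, 34567.00, 29678.00],
    "lazertags": [3253.00, 3121.00, 5131.00],
}

# Average across the three months for each toy (row averages).
toy_means = {toy: mean(figures) for toy, figures in monthly_sales.items()}

# Average across the three toys for each month (column averages).
month_means = {
    month: mean(column)
    for month, column in zip(months, zip(*monthly_sales.values()))
}

for toy, value in toy_means.items():
    print(f"{toy}: {value:.2f}")
for month, value in month_means.items():
    print(f"{month}: {value:.2f}")
```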

30  Patient record  Mean or median: which is better, and for what?

               12/1-12/7    12/8-12/15    12/16-12/23
0-4 years      12           14            15
5-9 years      15           12            14
10-14 years    12           24            21
15-19 years    38           12            19
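
Computing both measures for each week makes the comparison concrete. In this sketch the counts are read from the slide's run-together digits as two-digit numbers, which is an assumption about the original table, and the variable names are illustrative.

```python
from statistics import mean, median

weeks = ["12/1-12/7", "12/8-12/15", "12/16-12/23"]

# ASSUMPTION: each cell on the slide is a two-digit patient count.
patients = {
    "0-4 years": [12, 14, 15],
    "5-9 years": [15, 12, 14],
    "10-14 years": [12, 24, 21],
    "15-19 years": [38, 12, 19],
}

week_summary = {}
for index, week in enumerate(weeks):
    # Counts for this week across all age groups (one table column).
    column = [counts[index] for counts in patients.values()]
    week_summary[week] = {"mean": mean(column), "median": median(column)}

for week, stats in week_summary.items():
    print(f"{week}: mean={stats['mean']}, median={stats['median']}")
```

Under this reading, the first week contains one unusually high count (38), so its mean sits well above its median, while the last week has no extreme count and the two measures nearly agree.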

