MEASURES OF CENTRALITY. Last lecture summary Which graphs did we meet? scatter plot (bodový graf) bar chart (sloupcový graf) histogram pie chart (koláčový.

Presentation transcript:

MEASURES OF CENTRALITY

Last lecture summary. Which graphs did we meet? Scatter plot (bodový graf), bar chart (sloupcový graf), histogram, pie chart (koláčový graf). How do they work, and what are their advantages and disadvantages?

SDA women – histogram of heights, 2014. n = 48 (or N = 48), bin size = 3.8

Distributions. Negatively skewed (skewed to the left), e.g., life expectancy. Approximately symmetric, e.g., body height. Positively skewed (skewed to the right), e.g., income.

STATISTICS IS BEAUTIFUL (new stuff)

Life expectancy data. Watch the TED talk by Hans Rosling, Gapminder Foundation: tats_you_ve_ever_seen.html

STATISTICS IS DEEP

Simpson’s paradox: UC Berkeley. Though the data are fake, the paradox is the same. – Introduction to statistics

Male applicants: Applied, Admitted, Rate [%] for MAJOR A and MAJOR B

Female applicants: Applied, Admitted, Rate [%] for MAJOR A (100 applied, 80 admitted) and MAJOR B

Gender bias. What do you think, is there a gender bias? Who do you think is favored, male or female? Male, MAJOR B: 100 applied, 10 admitted. Female, MAJOR A: 100 applied, 80 admitted.

Gender bias. Admission rate [%] for male applicants: MAJOR A 50, MAJOR B 10, Both 46. Admission rate [%] for female applicants: MAJOR A 80, MAJOR B 20, Both 26.
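The reversal can be checked with a short calculation. This is only a sketch: the slides preserve two rows of counts (male, MAJOR B: 100 applied, 10 admitted; female, MAJOR A: 100 applied, 80 admitted). The other two rows below are assumptions, back-solved so that they reproduce the rates shown (50 %, 20 %, 46 %, 26 %).

```python
# (applied, admitted) per gender and major.
# Male/B and female/A come from the slides; male/A and female/B are ASSUMED
# values chosen to match the rates shown on the slides.
admissions = {
    "male": {"A": (900, 450), "B": (100, 10)},
    "female": {"A": (100, 80), "B": (900, 180)},
}


def rate(applied, admitted):
    """Admission rate in percent."""
    return 100.0 * admitted / applied


def summarize(counts):
    """Rate per major plus the pooled rate over both majors."""
    result = {major: rate(*pair) for major, pair in counts.items()}
    total_applied = sum(applied for applied, _ in counts.values())
    total_admitted = sum(admitted for _, admitted in counts.values())
    result["Both"] = rate(total_applied, total_admitted)
    return result


rates = {gender: summarize(counts) for gender, counts in admissions.items()}
for gender, by_major in rates.items():
    print(gender, by_major)
```

Within each major the female rate is higher, yet the pooled rate is higher for males, because under these assumed counts most women applied to the major that admits fewer applicants.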

Statistics is ambiguous. This example illustrates how ambiguous statistics can be. The way you choose to graph your data may strongly affect what people believe to be the case. “I never believe in statistics I didn’t doctor myself.” “Nikdy nevěřím statistice, kterou si sám nezfalšuji.” Who said that? Attributed to Winston Churchill.

What is statistics? Statistics: the science of collecting, organizing, summarizing, analyzing and interpreting data. Goal: use imperfect information (our data) to infer facts, make predictions, and make decisions. Descriptive statistics: describing and summarizing data with numbers or pictures. Inferential statistics: drawing conclusions or making decisions based on data.

Variables. Variable: a value or characteristic that can vary from individual to individual. Example: favorite color, age. How are variables classified? Quantitative variable: numerical values, often with units of measurement, that answer a "how much" or "how many" question. Example: age, annual income, number of children. Continuous (spojitá proměnná), example: height, weight. Discrete (diskrétní proměnná), example: number of children. Continuous variables can be discretized.
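A minimal sketch of discretizing a continuous variable, assuming heights in centimetres and a bin width of 10 cm (the function name, the data and the bin width are illustrative choices, not taken from the slides):

```python
def discretize(height_cm, bin_width=10):
    """Map a continuous height to the label of the bin that contains it."""
    lower = int(height_cm // bin_width) * bin_width
    return f"{lower}-{lower + bin_width} cm"


heights = [158.2, 163.7, 172.4, 179.9]  # hypothetical measurements
labels = [discretize(h) for h in heights]
print(labels)
```

Each measurement is replaced by an interval label, so a continuous variable can be summarized the same way as a discrete one.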

Variables. Categorical (qualitative) variables: categories that have no particular order. Example: favorite color, gender, nationality. Ordinal variables: not numerical, but their values have a natural order. Example: temperature low/medium/high.

Variables: variable (proměnná); quantitative (kvantitativní): continuous (spojitá) or discrete (diskrétní); categorical (kategorická); ordinal (ordinální)

Choosing a profession: Chemistry, Geography – Statistics

Choosing a profession. We made an interval estimate. But ideally we want one number that describes the entire dataset. This allows us to quickly summarize all our data.

Choosing a profession (Chemistry, Geography). 1. The value at which the frequency is highest. 2. The value at which the frequency is lowest. 3. The value in the middle. 4. The biggest value on the x-axis. 5. The mean.

Three big M’s. The value at which the frequency is highest is called the mode, i.e., the most common value is the mode. The value in the middle of the distribution is called the median. The mean is the mean (average is a synonym).
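The three M's can be computed with Python's standard `statistics` module. A small sketch, using made-up salaries (in thousands) rather than the data behind the slides:

```python
import statistics

# Hypothetical salaries in thousands; not the data from the slides.
salaries = [40, 45, 45, 50, 55, 60, 120]

mode = statistics.mode(salaries)      # most common value
median = statistics.median(salaries)  # middle value of the sorted data
mean = statistics.mean(salaries)      # sum divided by the number of values

print(f"mode={mode}, median={median}, mean={mean:.2f}")
```

In this skewed sample the three values differ: the single large salary pulls the mean above both the mode and the median.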

Quick quiz: What is the mode in our data?

Mode in negatively skewed distribution

Mode in uniform distribution

Multimodal distribution

Mode in categorical data

More on mode. True or False? 1. The mode can be used to describe any type of data we have, whether it’s numerical or categorical. 2. All scores in the dataset affect the mode. 3. If we take a lot of samples from the same population, the mode will be the same in each sample. 4. There is an equation for the mode. Note: the mode changes as you change the bin size. Because statement 3 is not true, we can’t use the mode to learn something about our population. The mode depends on how you present the data.
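The dependence of the mode on the bin size can be illustrated with a short sketch. The heights below are invented for the illustration, and `modal_bin` is a hypothetical helper that assumes bins start at 0:

```python
from collections import Counter


def modal_bin(values, bin_size):
    """Return (lower_edge, upper_edge) of the bin holding the most values.

    Bins are [0, bin_size), [bin_size, 2 * bin_size), ...; ties go to the lowest bin.
    """
    counts = Counter(int(v // bin_size) for v in values)
    index, _ = max(counts.items(), key=lambda item: (item[1], -item[0]))
    return index * bin_size, (index + 1) * bin_size


# Hypothetical heights in cm; not the SDA data from the slides.
heights = [150, 152, 158, 161, 162, 163, 166, 168, 171, 172, 173, 174]

print(modal_bin(heights, 10))  # modal bin with 10 cm bins
print(modal_bin(heights, 5))   # modal bin with 5 cm bins
```

With 10 cm bins the fullest bin is 160 to 170; with 5 cm bins it moves to 170 to 175, even though the data are unchanged.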

Life expectancy data – Statistics: Making Sense of Data

Minimum: Sierra Leone, minimum = 47.8

Maximum: Japan, maximum = 83.4

Life expectancy data: all countries

Life expectancy data: Egypt (half larger, half smaller)

Life expectancy data: Minimum = 47.8, Maximum = 83.4, Median (Egypt)

Q1: Sao Tomé & Príncipe, 50 (¼ way), 1st quartile

Q1: ¾ larger, ¼ smaller

Q3: Netherlands Antilles, 148 (¾ way), 3rd quartile = 76.7

Q3: 3rd quartile = 76.7, ¾ smaller, ¼ larger

Life expectancy data: Minimum = 47.8, Maximum = 83.4, Median (Egypt), 1st quartile (Sao Tomé & Príncipe), 3rd quartile = 76.7

Box Plot

Box plot: minimum, 1st quartile, median, 3rd quartile, maximum

Modified box plot: IQR = interquartile range (3rd quartile minus 1st quartile). Points farther than 1.5 × IQR from the box are drawn as outliers.
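The 1.5 × IQR rule can be sketched as follows. The data are hypothetical, and the quartile convention (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption; other conventions give slightly different fences.

```python
import statistics


def iqr_fences(values):
    """Lower and upper fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values):
    """Values lying outside the fences; a modified box plot draws them as points."""
    low, high = iqr_fences(values)
    return [v for v in values if v < low or v > high]


# Hypothetical data with one extreme value.
data = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]

print(iqr_fences(data))
print(outliers(data))
```

Here the quartiles are 15.0 and 18.75, so the IQR is 3.75 and only the value 45 falls outside the fences.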

Quartiles, median – how to do it? Data: 79, 68, 88, 69, 90, 74, 87, 93, 76. Find the min, max, median, Q1 and Q3 in these data. Then draw the box plot.
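One way to check the exercise, as a sketch. Several quartile conventions exist; the one used here (`method="inclusive"`, which interpolates between the sorted values and treats the data as the whole population) is an assumption, and other conventions can give different Q1 and Q3 for the same data.

```python
import statistics

scores = [79, 68, 88, 69, 90, 74, 87, 93, 76]

# quantiles() sorts the data itself and returns the three cut points for n=4.
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

five_number_summary = {
    "min": min(scores),
    "q1": q1,
    "median": median,
    "q3": q3,
    "max": max(scores),
}
print(five_number_summary)
```

Sorted, the data are 68, 69, 74, 76, 79, 87, 88, 90, 93, so the median is the fifth value (79) and, under this convention, the quartiles fall on the third and seventh values (74 and 88).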

Another example Min. 1st Qu. Median 3rd Qu. Max , 93, 68, 84, 90, 74

Percentiles (axis: age [years])

3rd M – Mean

Robust statistic. Salaries of 25 players of the American soccer team New York Red Bulls: median vs. mean. The mean is not a robust statistic. The median is a robust statistic.
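Robustness can be illustrated by adding one extreme value to a sample and seeing which statistic moves. The salaries below (in thousands) are invented for the sketch; they are not the Red Bulls figures.

```python
import statistics

# Hypothetical salaries in thousands; not the team data from the slide.
salaries = [45, 50, 52, 55, 60]
with_star = salaries + [5000]  # add one very highly paid player

print("median:", statistics.median(salaries), "->", statistics.median(with_star))
print("mean:  ", statistics.mean(salaries), "->", statistics.mean(with_star))
```

The median shifts from 52 to 53.5, while the mean jumps from 52.4 to 877: a single extreme salary dominates the mean but barely affects the median.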

Trimmed mean. 10% trimmed mean: eliminate the upper and lower 10% of the data and average what remains. The trimmed mean is more robust than the mean. Compare the median, the mean and the 10% trimmed mean.
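A minimal sketch of a trimmed mean, assuming the number of values cut from each end is rounded down (so 10% of 25 values removes 2 from each end). The function name and the salaries are illustrative only.

```python
def trimmed_mean(values, proportion=0.10):
    """Mean after dropping the lowest and highest `proportion` of the values."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # values removed from each end, rounded down
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical salaries in thousands, with one extreme value.
salaries = [45, 48, 50, 52, 55, 58, 60, 62, 65, 5000]

print("mean:            ", sum(salaries) / len(salaries))
print("10% trimmed mean:", trimmed_mean(salaries))
```

With ten values, one is dropped from each end, so the extreme salary no longer enters the average: the trimmed mean is 56.25, against a plain mean of 549.5.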