Newsroom Math and Statistics
Percentage change
(NEW/OLD – 1)
Comparing new values to old values
Or: (new – old)/old
€8 million this year, €5 million last year:
  (8/5 – 1) = 1.6 – 1 = 0.6 = 60%
  So the budget has increased 60%
€5 million this year, €8 million last year:
  (5/8 – 1) = 0.625 – 1 = –0.375 = –37.5%
  So the budget has decreased 37.5%
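The percentage-change arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name percent_change is made up for this example, and the figures are the budget numbers from the slide.

```python
def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    if old == 0:
        # Dividing by zero: percentage change from 0 is undefined.
        raise ValueError("percentage change is undefined when the old value is 0")
    return (new - old) / old * 100


# Budget figures from the slide, in millions of euros.
print(round(percent_change(5, 8), 1))  # 60.0  (budget rose from 5 to 8)
print(round(percent_change(8, 5), 1))  # -37.5 (budget fell from 8 to 5)
```

Note that the two results are not mirror images: going from 5 to 8 is +60%, but going back from 8 to 5 is only -37.5%, because the base (the old value) is different.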
Rates
Number of events per some standard unit (per capita, per 100,000, etc.)
Crime rates, accident rates, etc.
RATE = (EVENTS / POPULATION) * (“PER” unit)
Use to compare places of different size
Make the “per” unit large enough so that the smallest events/population result is greater than 1
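A minimal sketch of the rate formula in Python. The function name rate and the two towns with their burglary counts are invented for illustration; they are not real data.

```python
def rate(events, population, per=100_000):
    """Events per `per` people: (events / population) * per."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    return events * per / population


# Hypothetical example: a small town and a large city.
small_town = rate(events=150, population=50_000)
big_city = rate(events=900, population=600_000)

print(small_town)  # 300.0 burglaries per 100,000 residents
print(big_city)    # 150.0 burglaries per 100,000 residents
```

The city has six times as many burglaries in raw numbers, but the town's rate is twice as high, which is why rates are the fairer comparison for places of different size.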
Consumer Price Index
Used to correct for inflation
Get the CPI at http://www.ine.pt
Price Now / Price Then = CPI Now / CPI Then
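Solving the CPI proportion for Price Now gives Price Now = Price Then * (CPI Now / CPI Then). A small Python sketch of that adjustment follows; the function name adjust_for_inflation is made up, and the CPI values of 80 and 120 are placeholders, not real index figures.

```python
def adjust_for_inflation(price_then, cpi_then, cpi_now):
    """Express an old price in today's money using the CPI ratio."""
    if cpi_then <= 0:
        raise ValueError("CPI must be positive")
    return price_then * cpi_now / cpi_then


# Hypothetical: something cost 100 when the CPI was 80; the CPI is now 120.
print(adjust_for_inflation(price_then=100, cpi_then=80, cpi_now=120))  # 150.0
```

Both CPI values must come from the same index series with the same base year, otherwise the ratio is meaningless.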
Estimating crowds
Beware the “official” estimate from organizers or opponents
Better method:
  Estimate the area in square meters (L x W)
  1 person per square meter in a loose crowd
  Divide by 0.75 for a tighter crowd
  Account for turnover?
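The area method can be sketched as follows, reading the slide's figures as square meters per person: 1.0 for a loose crowd and 0.75 for a tighter one. The function name estimate_crowd and the 100 m x 60 m square are assumptions for illustration, and the sketch ignores turnover.

```python
def estimate_crowd(length_m, width_m, sq_m_per_person=1.0):
    """Estimate head count as occupied area divided by space per person."""
    if sq_m_per_person <= 0:
        raise ValueError("space per person must be positive")
    area = length_m * width_m
    return round(area / sq_m_per_person)


# Hypothetical square measuring 100 m by 60 m (6,000 square meters).
print(estimate_crowd(100, 60))                        # 6000 in a loose crowd
print(estimate_crowd(100, 60, sq_m_per_person=0.75))  # 8000 in a tighter crowd
```

Only count the area people actually occupied; fountains, stages and empty corners should be left out of the length and width.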
Sample surveys
Random sampling: Everyone has the same chance to be picked
Margin of sampling error: Covers chance variation from sampling only; error can also be caused by other factors
Avoid unscientific samples such as convenience or phone-in polls
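The slide does not give a formula for the margin of sampling error. One common textbook approximation, assuming a simple random sample and a 95% confidence level, is 1.96 * sqrt(p * (1 - p) / n), with p = 0.5 as the worst case. The sketch below uses that assumption; the function name margin_of_error is made up.

```python
from math import sqrt


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of sampling error for a proportion.

    Assumes a simple random sample of size n; z=1.96 corresponds to
    a 95% confidence level, and p=0.5 gives the widest margin.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * sqrt(p * (1 - p) / n)


print(f"{margin_of_error(1000):.1%}")  # 3.1%
print(f"{margin_of_error(400):.1%}")   # 4.9%
```

This figure reflects chance variation only. It says nothing about question wording, non-response, or a sample that was not random to begin with.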
Summarizing data sets
Three useful measures
  Center: Mean, median, mode
  Variability: Maximum/minimum, range, n-tiles, standard deviation
  Shape: Standard (bell) curve, skewed… use histogram or stemplot to view
Empirical Rule for bell curve data:
  68% of values within 1 StDev of the mean
  95% of values within 2 StDevs of the mean
  99.7% of values within 3 StDevs of the mean
Look for outliers!
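The measures of center and variability can be computed with Python's standard statistics module. The salary list below is invented to show how a single outlier pulls the mean away from the median; the function name summarize is also just for this sketch.

```python
import statistics


def summarize(values):
    """Return basic measures of center and variability for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# Hypothetical salaries in thousands of euros; 200 is an outlier.
salaries = [30, 32, 35, 35, 38, 40, 200]
for name, value in summarize(salaries).items():
    print(f"{name}: {value:.2f}")
```

Here the median and mode are both 35, while the mean is above 58 because of the one very large value. When a data set is skewed like this, the median is usually the fairer "typical" figure to report.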
Relationships between measurement variables
Correlation: Measures the strength (linearity) of the relationship of two variables
Pearson’s r
  Ranges from -1 to 1
  -1 and 1 are perfectly strong
  0 means no linear relationship
  Positive: Both variables rise together
  Negative: One goes down as the other goes up
REMEMBER: CORRELATION DOESN’T NECESSARILY MEAN CAUSATION
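For readers who want to see what sits behind Pearson's r, here is a minimal hand-rolled sketch using only the standard library. The function name pearson_r and the small data lists are invented for illustration; in practice a spreadsheet or statistics package would do this for you.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: y rises exactly in step with x, then falls in step.
print(round(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]), 3))  # 1.0
print(round(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]), 3))  # -1.0
```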
Linear regression
Equation of the line on an x-y scatterplot that falls closest to all the points
Computer will tell you the slope and intercept
R measures how close the points are to the line
R² measures “percent of variance explained” by the independent variable
Use to predict a dependent value based on an independent value
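A minimal sketch of a least-squares fit with one independent variable, written out by hand so the slope, intercept and R squared are visible. The function name fit_line and the five data points are made up for this example; it assumes "closest" means ordinary least squares.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("cannot fit a line when one variable never changes")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data points.
slope, intercept, r_squared = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))  # 0.6 2.2 0.6

# Predict the dependent value for a new independent value, x = 6.
print(round(intercept + slope * 6, 2))  # 5.8
```

An R squared of 0.6 here means that 60% of the variation in y is accounted for by the line; the remaining 40% is not.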
Using Excel
What is “data”?
Information in table form
Columns are the variables
  Name, date, time, address, age, etc.
Rows are the records
  Persons, incidents, etc.
What Excel can do
Import data from many formats
Sort data by one or more variables
Filter data to show only selected rows
Transform data using functions and formulas
Summarize data into categories
Importing data
Common formats
  *.xls (or *.xlsx)
  Fixed-width text
  Delimited text (comma, tab, etc.)
  *.dbf files (old dBase)
  HTML tables
Data Import Wizard will help