1
1. Data Processing Sci Info Skills
2
Statistics – making sense of numbers
Data Collection
Data Organisation
Data Interpretation
Data Presentation
3
Types of statistics
Descriptive: the processed data provides a summary of the observations/measurements, such as averages, variations and graphs
Inferential: the processed data is used to make judgements or predictions, such as trends or indications of variations between different samples
4
Class Exercise 1.1
the average December temperature in Sydney has increased by 1 °C in the last 50 years
  Answer: Descriptive
it is expected that the average December temperature in Sydney will increase by another 1 °C within 25 years
  Answer: Inferential
25% of people surveyed at a shopping centre indicated that they were aware of increasing temperatures in Sydney
  Answer: Descriptive
a survey has shown that 75% of Sydneysiders are ignorant of the changing climatic conditions in their city
  Answer: Inferential
5
Class Exercise 1.2
Identify the sample and the population in the following.
(a) a bottle of water is taken from a dam to be tested
  Sample – the water in the bottle
  Population – all the water in the dam
(b) the frog population of a large wetland is checked by looking at two separate hectares
  Sample – the two hectares
  Population – the whole wetland
6
Class Exercise 1.2
(c) the levels of lead in fallout around a smelter are assessed by testing a selection of properties
  Sample – the selected properties
  Population – the whole area
(d) people in a shopping centre are asked their opinions … to determine the level of awareness in the community
  Sample – the people asked
  Population – the community
7
Variables
characteristic being measured
category – result of measurement is a “word”, e.g. yes (or no), truck, bird, sparrow, first (or second) etc
numerical – measurement produces a number
  could be limited to certain values (e.g. whole numbers)
  could be any value (e.g. mass of an object)
Exercise 1.3
lead levels in fallout
  Answer: numerical – any value
types of birds observed
  Answer: category
numbers of birds observed in different locations
  Answer: numerical – set values
8
Presenting & organising data
large quantities of raw data are not useful for presenting the results of tests
they need to be organised to show the results on a smaller scale:
  tables
  graphs
  averages
  comparisons
9
Tabulating data
organising it so that it can be evaluated more easily
generally some sort of table
category data is most usually grouped (tallied)
  the number of times each different category occurs is the recorded result
  tallying can also be used where the data is numerical, with:
    fixed and pre-known values
    a large number of data points
numerical (all values) data presents a problem
  must be grouped into ranges
  information is lost, e.g. 0.1 and 4.9 both fit into the 0-5 range
10
Grouping numerical data
identify the minimum and maximum values
decide how many groups are appropriate for the size of the dataset
determine the groups, which should cover equivalent ranges (for example, 0-5, 6-10 etc, but not 0-5, 6-20)
Class Exercise 1.4
You have a data set of 100 pH measurements of river water, ranging from 5 to 9. What would be an appropriate way of grouping them?
  Answer: 8 ranges of 0.5
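The grouping steps for Exercise 1.4 can be sketched in Python. This is only an illustration: the pH readings are made up, the function names make_groups and tally are hypothetical, and the choice to put a boundary value (such as 5.5) in the higher group is an assumption the slides do not specify.

```python
def make_groups(minimum, maximum, n_groups):
    """Split the span from minimum to maximum into n_groups equal-width ranges."""
    width = (maximum - minimum) / n_groups
    return [(minimum + i * width, minimum + (i + 1) * width)
            for i in range(n_groups)]


def tally(values, groups):
    """Count how many values fall in each (low, high) range.

    Assumption: a value equal to a boundary goes in the higher range,
    except the overall maximum, which stays in the last range.
    """
    counts = [0] * len(groups)
    last = len(groups) - 1
    for value in values:
        for i, (low, high) in enumerate(groups):
            if low <= value < high or (i == last and value == high):
                counts[i] += 1
                break
    return counts


# Made-up pH readings, for illustration only
ph_values = [5.0, 5.2, 6.1, 6.4, 6.4, 7.0, 7.3, 8.8, 9.0]

groups = make_groups(5.0, 9.0, 8)   # 8 ranges of 0.5
counts = tally(ph_values, groups)

for (low, high), count in zip(groups, counts):
    print(f"{low:.1f} to {high:.1f}: {count}")
```

With equal-width groups, the frequencies can be compared directly between ranges, which is why a mix such as 0-5 and 6-20 is avoided.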
11
Frequencies
the number of times a particular value or range occurs is the frequency
the spread of data across the range of values is the distribution
  Is it evenly spread across the groups?
  Do certain groups have higher frequencies?
  Is there any pattern?
frequency should be considered in relation to the total number of data values
relative frequency – the frequency as a proportion (often a percentage) of the total dataset
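As a small illustration of frequency and relative frequency, the following Python sketch tallies some category data. The bird observations are invented for the example; only the standard-library Counter class is used.

```python
from collections import Counter

# Invented category data: types of birds observed
observations = ["sparrow", "magpie", "sparrow", "galah",
                "sparrow", "magpie", "galah", "sparrow"]

# Frequency: number of times each category occurs
frequency = Counter(observations)

# Relative frequency: each frequency as a percentage of the total
total = sum(frequency.values())
relative = {bird: 100 * count / total for bird, count in frequency.items()}

for bird, count in frequency.items():
    print(f"{bird}: {count} ({relative[bird]:.0f}%)")
```

The relative frequencies add up to 100%, so they can be compared between data sets of different sizes.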
12
Excel & tally charts
manually tallying large data sets (counting how many occurrences of each value) is boring, tiring and potentially inaccurate
Excel has some functions which help:
  COUNT(cell range)
  COUNTIF(range, criterion)
  FREQUENCY(range, group) – probably more trouble than it’s worth
13
COUNT( )
returns the total number of cells with numerical data
ignores blank cells and non-numerical values
[Worksheet example: column A, rows 1 to 9, holding a mix of numbers, blank cells and the text entries n/a and ***]
=COUNT(A1:A9)
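To make the behaviour of COUNT concrete, here is a rough Python equivalent. It is a sketch, not Excel's implementation: the function name count_numeric is hypothetical, blank cells are represented by None, and the cell values are illustrative rather than the exact contents of the slide's worksheet.

```python
def count_numeric(cells):
    """Count cells holding numbers, ignoring blanks (None) and text."""
    return sum(
        1 for cell in cells
        # bool is excluded because Python treats True/False as integers
        if isinstance(cell, (int, float)) and not isinstance(cell, bool)
    )


# Illustrative column: numbers, blanks and text entries
column_a = [10, None, 7, None, "n/a", "***", 3, 8, 9]

numeric_cells = count_numeric(column_a)
print(numeric_cells)
```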
14
COUNTIF( )
returns the number of cells meeting a given criterion
criteria can use =, > or <
[Worksheet example: column A, rows 1 to 9, holding a mix of numbers, blank cells and the text entries n/a and ***]
=COUNTIF(A1:A9,">5")
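The ">5" criterion can be mimicked in Python as follows. This is a simplified sketch: the helper name count_if_greater is hypothetical, it handles only a "greater than" test on numeric cells, and the cell values are illustrative.

```python
def count_if_greater(cells, threshold):
    """Count numeric cells whose value is greater than threshold."""
    return sum(
        1 for cell in cells
        if isinstance(cell, (int, float))
        and not isinstance(cell, bool)
        and cell > threshold
    )


# Illustrative column: numbers, blanks (None) and text entries
column_a = [10, None, 7, None, "n/a", "***", 3, 8, 9]

above_five = count_if_greater(column_a, 5)
print(above_five)
```

Blank and text cells never satisfy the numeric test, so they are left out of the count.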
15
FREQUENCY( , )
tally data into user-chosen groups
entered as an array formula:
  highlight a group of cells where you want the frequencies to appear
  type in the formula and then hit the key combination CTRL+SHIFT+ENTER
[Worksheet example: column A, rows 1 to 9, holding the data; cells B6:B7 hold the values for groups 0-5, 6-10]
=FREQUENCY(A1:A9,B6:B7)
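The way FREQUENCY tallies values against group boundaries can be sketched in Python. This assumes the group cells hold the upper limit of each group (5 and 10 for the groups 0-5 and 6-10), that a value equal to a limit is counted in that group, and that one extra count is returned for values above the highest limit. The function name frequency and the cell values are illustrative.

```python
from bisect import bisect_left


def frequency(cells, upper_bounds):
    """Tally numeric cells into groups defined by their upper limits.

    Returns one count per limit plus a final count for values
    above the highest limit. Blanks (None) and text are ignored.
    """
    bounds = sorted(upper_bounds)
    counts = [0] * (len(bounds) + 1)
    for cell in cells:
        is_number = isinstance(cell, (int, float)) and not isinstance(cell, bool)
        if is_number:
            # Index of the first limit that is >= the value
            counts[bisect_left(bounds, cell)] += 1
    return counts


# Illustrative column: numbers, blanks and text entries
column_a = [10, None, 7, None, "n/a", "***", 3, 8, 9]

group_counts = frequency(column_a, [5, 10])
print(group_counts)  # counts for: up to 5, above 5 up to 10, above 10
```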
16
Two-way frequency tables
One sample set – two variables

                           Sex of koala
General state of health    Male    Female
Healthy                    45      28
Ill                        21      9

Two sample sets – one variable

                    Type of parkland
Origin of plant     Urban (%)    Undeveloped (%)
Native              37           65
Introduced          51           20
Not identified      12           15
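A two-way table is usually read together with its row and column totals. The Python sketch below computes them for the koala counts given on this slide; the nested-dictionary layout is just one convenient way to hold the table, not something prescribed by the presentation.

```python
# Koala counts from the slide: health status by sex
table = {
    "Healthy": {"Male": 45, "Female": 28},
    "Ill": {"Male": 21, "Female": 9},
}

# Row totals: one per state of health
row_totals = {health: sum(counts.values()) for health, counts in table.items()}

# Column totals: one per sex
col_totals = {}
for counts in table.values():
    for sex, n in counts.items():
        col_totals[sex] = col_totals.get(sex, 0) + n

grand_total = sum(row_totals.values())

# Relative frequency within a column, e.g. percentage of males that are healthy
healthy_males_pct = 100 * table["Healthy"]["Male"] / col_totals["Male"]

print(row_totals, col_totals, grand_total)
print(f"{healthy_males_pct:.1f}% of males are healthy")
```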
17
The typical value
represents all the data values with one or two
average – some way of representing the “most common” value
variation – how much spread there is in the data set
category variables
  average – the class with the highest frequency (mode)
  variation cannot be measured
numerical variables
  mean – what we normally refer to as the average
  mode – most common value (used in grouped data)
  median – the value in the middle when arranged in order
  range – highest value minus lowest value
  standard deviation – calculated from the differences of all points from the mean
mean & std dev normally used in scientific data
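The measures listed above can be computed with Python's standard statistics module. The data values are invented for the example. The slides do not say whether the sample or population standard deviation is intended, so both are shown; treating the data as a sample is an assumption.

```python
import statistics

# Invented numerical data set
values = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(values)          # sum of values / number of values
median = statistics.median(values)      # middle value when sorted
mode = statistics.mode(values)          # most common value
value_range = max(values) - min(values) # highest minus lowest

# Sample standard deviation (divides by n - 1)
sample_sd = statistics.stdev(values)
# Population standard deviation (divides by n)
population_sd = statistics.pstdev(values)

print(mean, median, mode, value_range)
print(round(sample_sd, 3), round(population_sd, 3))
```

With an even number of values, the median is the mean of the two middle values, here 4 and 5.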
18
Assignment 1
large amount of data
simple formulas required
all questions and directions contained in Excel spreadsheet