Week 4 Frequencies.

Presentation on theme: "Week 4 Frequencies."— Presentation transcript:

1 Week 4 Frequencies

2 Using statistics in Nursing
Carrying out research means the collection of data. Statistics are a way of making use of this data. They “tell the story”.
Descriptive Statistics: used to describe characteristics of our sample
Statistics describe samples
Inferential Statistics: used to generalise from our sample to our population
Parameters describe populations
Any samples used should therefore be representative of the target population

We talked last time about procedures and the ways in which observations are made and data are collected. The rest of the course is really about what to do with your data once you’ve got it. Raw data as it is collected is often confusing and hard to make sense of. Using statistics is a way of making sense of your data, to see if it reveals any patterns or significant relationships between variables. In order to analyse your data you first need to sort through it, looking at certain characteristics that tell you exactly what kind of data you have. These two processes of description and analysis are the two different types of statistical process used: descriptive statistics (DS) are really a way of making sense of your data, and once you have described your data in detail you can then go on to inferential statistics (IS). So (read DS).

In collecting, describing and analysing our data, we are dealing with two types of groups: our target population, which is every individual we are interested in, and our sample, which is the group of participants we actually test. Most of the time we cannot measure all of the population, so we use a sample to gain a picture of it. Normally when using DS we are describing the characteristics of our sample data. When we use IS, we are studying sample data in order to make generalisations about characteristics or relationships present in the target population. Therefore any sample that we select should be representative of the population; however, no sample will give a perfectly accurate picture of its corresponding population.

When using techniques to describe or analyse our data, we talk about statistics when describing samples and parameters when describing populations: so the average of a sample would be a statistic, while the average of a population would be a parameter. It is good to keep in mind the difference and relationship between a population and a sample taken from that population, because certain statistics take into account the fact that samples do not represent populations 100% accurately.
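The statistic/parameter distinction can be sketched in a few lines of Python. This is purely illustrative and rests on an assumption that is not in the lecture: the 20 scores from the next slides are treated as if they were a complete (tiny) population, and a random sample of 5 is drawn from it. The names `population`, `sample`, `parameter` and `statistic` are made up for this sketch.

```python
import random
from statistics import mean

# ASSUMPTION for illustration only: pretend these 20 scores are the whole population.
population = [4, 6, 3, 7, 5, 7, 8, 4, 5, 10,
              10, 6, 8, 9, 3, 5, 6, 4, 11, 6]

# The mean of the whole population is a parameter.
parameter = mean(population)

# Draw a sample of 5 individuals; the fixed seed only makes the run repeatable.
rng = random.Random(0)
sample = rng.sample(population, 5)

# The mean of the sample is a statistic; it estimates the parameter.
statistic = mean(sample)

print("population mean (parameter):", parameter)
print("sample:", sample)
print("sample mean (statistic):", statistic)
```

Running this with different seeds gives different sample means, which is the point made above: a sample rarely matches its population exactly.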

3 Descriptive Statistics
Statistical procedures used to summarise, organise, and simplify data. This process should be carried out in such a way that reflects overall findings
Raw data is made more manageable
Raw data is presented in a logical form
Patterns can be seen from organised data
Frequency tables
Graphical techniques
Measures of Central Tendency
Measures of Spread (variability)

So DS, as I’ve said, are a way of making sense of your data. To begin with you may just have a table of raw scores that you need to organise in some way. Today we’re going to concentrate on the ways in which you can summarise, organise and simplify your data. DS should complement any use of IS by illustrating any of the significant relationships found: so if you find that manipulating levels of variable A causes changes in dependent variable B, you want statistics that show this clearly (if changing noise level affects mood, you want to show DS for the two different mood levels). So we start off with our raw data: what do we want to achieve with DS (read). In order to do this we have a variety of options available, including (read types). Apart from the first two types (which are different ways of showing the same information), these should be used in conjunction with each other, as each technique or statistic shows a different important aspect of the data.

4 Plotting Data: describing spread of data
A researcher is investigating Diabetes self-care knowledge: scores are recorded for 20 participants:
4, 6, 3, 7, 5, 7, 8, 4, 5, 10, 10, 6, 8, 9, 3, 5, 6, 4, 11, 6
We can describe our data by using a Frequency Distribution. This can be presented as a table or a graph.
Always presents:
The set of categories that make up the original measurement scale
The frequency of each score/category
Three important characteristics: shape, central tendency, and variability

5 Frequency Distribution Tables
Highest Score is placed at top
All observed scores are listed
Gives information about distribution, variability, and centrality
X = score value
f = frequency
fX = total value associated with frequency
Σf = N
ΣX = ΣfX

So here is the data from the example memory study presented as a frequency distribution (FD) table: the column labelled X refers to the different scores (so here the number of symbols that were remembered) and the f column shows the frequency of these scores (so how many participants remembered 3, etc.). A FD table always has the highest score at the top and lists all possible scores down to the lowest score at the bottom. With such a table it is easy to see that most participants remembered around 6 symbols. The last column on the right, labelled fX, is not necessary for a FD table; it shows how much of the data set can be attributed to each possible score. For the score of 11 symbols remembered, 1 person scored 11, so the fX value is 11. For the score of 10 symbols remembered, 2 people remembered 10, which makes fX 2 x 10 = 20. The fX column is the (read fX). You should see from the FD table that you can use it to work out certain values: the total number of participants (N) is obtained simply by adding all the values in the f column (which should give us 20). The total sum of the scores is obtained by adding all the values in the fX column (which gives us a value of 127).
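The table described above can be rebuilt from the raw scores. The following is a minimal Python sketch, not the SPSS procedure used in the session; the function name `frequency_table` is made up for illustration. It follows the notes in listing every value from the highest to the lowest score.

```python
from collections import Counter

# Scores for the 20 participants, as listed on slide 4.
scores = [4, 6, 3, 7, 5, 7, 8, 4, 5, 10,
          10, 6, 8, 9, 3, 5, 6, 4, 11, 6]

def frequency_table(data):
    """Return rows (X, f, fX), highest score first, one row per value in the range."""
    counts = Counter(data)
    rows = []
    for x in range(max(data), min(data) - 1, -1):
        f = counts.get(x, 0)  # a value nobody scored would still get a row with f = 0
        rows.append((x, f, f * x))
    return rows

table = frequency_table(scores)
n = sum(f for _, f, _ in table)        # sum of the f column = N
total = sum(fx for _, _, fx in table)  # sum of the fX column = sum of all scores

print(f"{'X':>3} {'f':>3} {'fX':>4}")
for x, f, fx in table:
    print(f"{x:>3} {f:>3} {fx:>4}")
print(f"N = {n}, sum of scores = {total}")  # N = 20, sum of scores = 127
```

The two totals printed at the end are the same checks described in the notes: the f column adds up to N and the fX column adds up to the sum of the raw scores.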

6 Frequency Table Additions
Frequency tables can display more detailed information about distribution
Percentages and proportions
p = fraction of total group associated with each score (relative frequency)
p = f/N
As %: p(100) = 100(f/N)
What does this tell about this distribution of scores?

As well as information about scores and the frequency of scores, a FD table can also give us more detailed information about the distribution of scores within a data set. Two common measures concern the proportion of the group associated with each score. This involves adding columns that show the frequency of each score as a proportion or as a percentage. So column p shows what proportion of the group remembered each number of symbols: we can see that a proportion of 0.05 of participants recalled 11 symbols. We convert the frequencies into proportions by simply dividing the frequency by the number of participants, so 1/20 = 0.05. We can also convert the frequencies into % values simply by multiplying the proportion by 100. So we know that 1 participant recalled 11 symbols; this is a proportion of 0.05 (1/20), and 5% (0.05 x 100). Including the proportions and percentages is another way of summarising the data: again it shows which score was most common, and how many symbols were recalled how often. So it clearly shows information about the distribution of the scores.
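The proportion and percentage columns follow directly from p = f/N. A minimal Python sketch, using the same 20 scores (the function name `relative_frequencies` is made up for illustration):

```python
from collections import Counter

# Scores for the 20 participants, as listed on slide 4.
scores = [4, 6, 3, 7, 5, 7, 8, 4, 5, 10,
          10, 6, 8, 9, 3, 5, 6, 4, 11, 6]

def relative_frequencies(data):
    """Map each observed score to (f, p, percent), where p = f / N."""
    n = len(data)
    counts = Counter(data)
    return {x: (f, f / n, 100 * f / n) for x, f in counts.items()}

rel = relative_frequencies(scores)

print(f"{'X':>3} {'f':>3} {'p':>6} {'%':>5}")
for x in sorted(rel, reverse=True):  # highest score at the top, as in the table
    f, p, pct = rel[x]
    print(f"{x:>3} {f:>3} {p:>6.2f} {pct:>5.0f}")
```

For the score of 11 this gives f = 1, p = 0.05 and 5%, matching the worked example in the notes; the proportions across all scores add up to 1.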

7 Representing data as graphs
Frequency Distribution Graph presents all the info available in a Frequency Table (can be fitted to a grouped frequency table)
Uses Histograms
Bar width corresponds to real limits of intervals
Histograms can be modified to include blocks representing individual scores

Another way to present the basic information in a FD table (about distribution) is to use a graph. The two axes of the graph present the two types of information: the x (horizontal) axis shows the scores (the X column), while the y (vertical) axis shows the frequencies with which those scores occur (the f column). Graphs that can display FD information about interval/ratio data are histograms. The top graph is a histogram of the data from the remembering symbols study; it shows all the information from the X and f columns. The y axis represents frequency, with the height of the bars increasing with increasing frequency, while the x axis shows the different scores. The width of the bars corresponds to the real limits of the score intervals. The bottom right histogram shows the information from the grouped FD table: here you can see the class intervals on the x axis, so the first bar on the left represents the first class interval (which had a frequency of 3: 3 people had scores that fell into that interval).

One other way of displaying FD information clearly and simply is to slightly modify a histogram by removing the vertical y axis. The x axis still shows the different scores, but instead of drawing a bar above each score, each participant is represented by a block. These blocks are then positioned above a score, stacked on top of each other when the score frequency is over 1. The bottom left hand graph is an example of this kind of graph, which uses the same data from the symbol recall example (so it shows exactly the same information as the top graph): there are 20 blocks in total, each placed over the individual’s score. 4 people remembered 6 symbols, so you can see 4 blocks stacked over the score of 6.

These types of FD graphs show the distribution of scores clearly.
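The modified histogram with one block per participant can be approximated in plain text. This is only a rough sketch of the idea in Python, not a reproduction of the slide's graph; `block_plot` is a made-up name and `#` stands in for a block.

```python
from collections import Counter

# Scores for the 20 participants, as listed on slide 4.
scores = [4, 6, 3, 7, 5, 7, 8, 4, 5, 10,
          10, 6, 8, 9, 3, 5, 6, 4, 11, 6]

def block_plot(data):
    """Draw one '#' per participant, stacked above that participant's score."""
    counts = Counter(data)
    xs = list(range(min(data), max(data) + 1))
    lines = []
    # Work from the tallest stack down to the bottom row of blocks.
    for level in range(max(counts.values()), 0, -1):
        cells = [" # " if counts.get(x, 0) >= level else "   " for x in xs]
        lines.append("".join(cells).rstrip())
    # The x axis: one centred label per score, no y axis.
    lines.append("".join(f"{x:^3}" for x in xs).rstrip())
    return "\n".join(lines)

print(block_plot(scores))
```

The printed figure has 20 blocks in total, with the tallest stack (4 blocks) over the score of 6.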

8 Frequency Distribution Polygons
Shows same information with lines: traces ‘shape’ of distribution
Both histograms and polygons represent continuous data
For non-numerical data, frequency distribution can be represented by bar graphs
Bar graphs have spaces between adjacent bars to represent distinct categories

An alternative to histograms is a graph that shows lines rather than bars. FD polygons show exactly the same information as histograms, and the axes are arranged in exactly the same way, but dots are used to represent both score and frequency. The vertical position of the dot represents frequency, while the horizontal position represents the score. The dots are then connected by a line, which traces the shape of the distribution. The top graph is an example of a polygon (again showing the example data from remembering symbols): if we look at this we can see that the most common score was 6 symbols, since the dot for this is exactly above 6 on the x axis and exactly across from 4 on the y axis. We can clearly see how many participants remembered each number of symbols within the data set range. A FD polygon can be used for frequency data organised by class intervals, just as with histograms: this is done by placing the dot exactly in the centre of the class interval range.

The frequencies of non-numerical data (e.g. nominal data) can also be represented on a graph, by using a bar chart (BC). A BC is basically the same as a histogram except that there are spaces between the bars: for nominal data these spaces emphasise that these are separate, distinct categories. The bottom graph shows an example of a BC. Say, for example, we classified our 20 participants who remembered symbols on the basis of whether they were better at remembering phone numbers, historical dates, or family-related dates. This would be the sort of information that can be represented on a BC: the x axis shows the different categories, with the height of the bars showing the frequency.
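The dots of a FD polygon are just (score, frequency) pairs, and for grouped data the dot sits at the centre of each class interval. A minimal Python sketch follows; `polygon_points` is a made-up name, and the interval width of 3 used in the second call is an assumption chosen for illustration, not the grouping used on the slide.

```python
# Scores for the 20 participants, as listed on slide 4.
scores = [4, 6, 3, 7, 5, 7, 8, 4, 5, 10,
          10, 6, 8, 9, 3, 5, 6, 4, 11, 6]

def polygon_points(data, width=1):
    """Return (x, f) dots for a frequency polygon.

    With width=1 each dot sits above an individual score. With a larger
    width, scores are grouped into class intervals starting at the lowest
    score, and each dot sits at the midpoint of its interval.
    """
    points = []
    start = min(data)
    while start <= max(data):
        end = start + width - 1                         # interval covers start..end
        f = sum(1 for x in data if start <= x <= end)   # frequency in the interval
        points.append(((start + end) / 2, f))           # dot at the interval's centre
        start += width
    return points

print(polygon_points(scores))           # ungrouped: includes the dot (6.0, 4)
print(polygon_points(scores, width=3))  # hypothetical intervals 3-5, 6-8, 9-11
```

Joining the ungrouped dots with straight lines gives the polygon described above, peaking at the dot above 6 on the x axis and across from 4 on the y axis.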

9 Frequencies of Populations and Samples
Population: All the individuals of interest to the study
Sample: The particular group of participants you are testing, selected from the population
Although it is possible to have graphs of population distributions, unlike graphs of sample distributions, exact frequencies are not normally possible. However, you can:
Display graphs of relative frequencies (categorical data)
Use smooth curves to indicate relative frequencies (interval or ratio data)

So far I have just talked about FD graphs in relation to the example data I presented. Suppose you wanted to construct a graph showing the FD of a population rather than a sample. At the beginning of the lecture I mentioned the differences between samples and populations, and FD graphs are one area where these differences matter. It is perfectly possible to construct any FD graph for population data, a histogram or a polygon as I just talked about; the principles would be exactly the same. The differences lie in the data available for a population graph: in a sample data set all scores are available, while you may not have access to all the individuals in a population, and so will probably not have scores for the entire population (e.g. for the symbol recall data, the target population may be every adult with a normally functioning memory, which is not possible to obtain). You may know certain characteristics of a population, but it is very rare that you will have all the scores available.

You may still draw a graph describing the FD of a population, though. You can either:

Use relative frequencies: for categorical (nominal) data you may not know exact numbers, but you may know enough to talk about relative frequencies in populations. If you remember, for the example data we categorised people into those who could remember phone numbers best, those who could remember historical dates best, and those who could remember family dates best. We might know that in the general population people are twice as good at remembering family birthdays as historical dates. You could draw a bar chart representing relative frequency: it does not matter what frequency value you assign to family dates so long as the bar above it is twice as tall as the bar above historical dates.

Use smooth curves: for populations with interval or ratio data, you could still use a FD graph. If you remember, the FD polygon traced the shape of the sample data distribution with a point for each score. For population data you can still use the polygon, but because not all the scores (or points) are known, a smooth curve is drawn to show that these are relative frequencies rather than known scores.

10 Let’s try it….. Using SPSS sample data…

