Published by Nathan Curtis. Modified over 9 years ago.
1
Statistics for Social Sciences I (E563)
Prof. Sudip Ranjan Basu, Ph.D.
25 September 2008
2
Lecture 2-Sudip R. Basu
« A statistical tie »
Think about these bar diagrams…
3
Measurement in Statistics
Concepts of measurement:
Measurement: a specific process of assigning numbers to a variable
–Assignment by category (categorical/qualitative attributes)
»assignment of a person to a particular category of a variable
–Assignment by amount
–Validity: to describe the objective and accurately reflect the concept being measured by a particular scale or index
»likelihood that the scale is actually measuring what it is supposed to measure
–Face validity/Content validity/Criterion validity/Construct validity
–Reliability: consistency of the data collected
»free of measurement errors
–Split-half reliability/test-retest reliability
4
Forms of ‘variable’
Variables: concepts that vary, or change, from one observation to another in a sample or population
–Measurement scales differ
–Different statistical methods apply to quantitative and qualitative variables
Variable
–Quantitative: the measurement scale has numerical values that imply amounts (e.g. annual income)
–Categorical/Qualitative: the measurement scale is a set of categories that do not imply amounts (e.g. marital status)
5
Scales of measurement
Quantitative variable: Interval scale
–Differences between values are meaningful: annual income (CHF 50 - CHF 30 = CHF 20)
Qualitative variable: Unordered/nominal scale
–Primary mode of transportation (bus, tram, bicycle, walk)
Qualitative variable: Ordered/ordinal scale
–Involves a rank order or other ordering
–Political philosophy (liberal, moderate, conservative)
6
Quantitative aspects of ordinal data
Interval scale:
–Class interval: an interval that indicates the space between two end points
–Quantitative: levels vary in magnitude
Nominal scale:
–Qualitative: levels vary in quality, not in quantity
Ordinal scale:
–Falls between quantitative and qualitative
–Each level has a greater or smaller magnitude than another level
–Can be given a numerical scale by assigning numerical scores to categories
–Closer to interval than to nominal
–Sensitivity analysis: check whether conclusions change under different score assignments
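The idea of scoring ordinal categories and then running a sensitivity analysis can be sketched in Python (not STATA). The responses, the two groups, and both scoring schemes below are made up for illustration; the only thing the sketch checks is whether a comparison of group means points the same way under each scoring.

```python
from statistics import mean

# Hypothetical ordinal responses (political philosophy) for two made-up groups.
group_a = ["liberal", "liberal", "moderate", "conservative"]
group_b = ["moderate", "conservative", "conservative", "conservative"]

# Two alternative score assignments that both respect the category order.
equal_spacing = {"liberal": 1, "moderate": 2, "conservative": 3}
unequal_spacing = {"liberal": 1, "moderate": 2, "conservative": 5}

def mean_score(responses, scores):
    """Average of the numerical scores assigned to ordinal responses."""
    return mean(scores[r] for r in responses)

def group_b_higher(scores):
    """True if group B has the larger mean score under this scoring."""
    return mean_score(group_b, scores) > mean_score(group_a, scores)

# Sensitivity analysis: does the conclusion survive a change of scores?
results = {
    "equal spacing": group_b_higher(equal_spacing),
    "unequal spacing": group_b_higher(unequal_spacing),
}
conclusion_is_stable = len(set(results.values())) == 1

for name, higher in results.items():
    print(f"{name}: group B mean higher than group A? {higher}")
print("Conclusion stable across scorings:", conclusion_is_stable)
```

If the answer changed between scorings, the result would depend on an arbitrary choice of scores and should be reported with that caveat.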
7
Discrete and Continuous
Discrete: the possible values form a set of separate numbers, such as 0, 1, 2, …
–Unit of measurement cannot be subdivided
»Number of siblings
»Number of visits to a physician last year
Continuous: an infinite continuum of possible real-number values
–Any real number is possible between two values
»Height
»Weight
Categorical variables: nominal or ordinal
Quantitative variables: discrete (number of siblings) or continuous (age)
8
Summarize types of variables
9
Describing data
Categorical data:
–Frequency: headcounts or tallies indicating the number of cases in a particular category, or the total number of cases measured (the number of observations)
–Scores: numbers that are used to represent amounts or rankings
–Relative frequency: the proportion (number of observations in a category divided by the total number of observations) or percentage (proportion multiplied by 100) of the observations that fall in that category
»The sum of the proportions equals 1.00
–Frequency distribution: a tabulation that lists the possible values for a variable, together with the number of observations at each value
–Relative frequency distribution: a listing of possible values together with their proportions or percentages
Quantitative data:
–Frequency distribution
»Intervals of values in frequency distributions are usually of equal width
»Intervals are mutually exclusive
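The definitions of frequency, proportion, and percentage can be made concrete with a short Python sketch (not STATA). The marital-status observations and the ages are made up; the interval width of 10 is an arbitrary choice for illustration.

```python
from collections import Counter

# Hypothetical observations of a categorical variable (marital status).
observations = ["single", "married", "single", "divorced",
                "married", "single", "married", "single"]

# Frequency distribution: number of observations in each category.
frequency = Counter(observations)
n = len(observations)

# Relative frequency distribution: proportion = frequency / total.
proportion = {category: count / n for category, count in frequency.items()}

print(f"{'category':<10}{'freq':>6}{'prop':>8}{'percent':>9}")
for category, count in frequency.most_common():
    p = proportion[category]
    print(f"{category:<10}{count:>6}{p:>8.3f}{100 * p:>9.1f}")

# The proportions sum to 1 (up to floating-point rounding).
print("sum of proportions:", round(sum(proportion.values()), 10))

# Quantitative data: equal-width, mutually exclusive intervals.
ages = [19, 22, 25, 31, 34, 38, 41, 47]  # hypothetical whole-number ages
width = 10
interval_counts = Counter((age // width) * width for age in ages)
for lower in sorted(interval_counts):
    print(f"{lower}-{lower + width - 1}: {interval_counts[lower]}")
```

Because each age is mapped to exactly one lower bound, no observation can be counted in two intervals.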
10
Bar graphs
11
Comparing groups
Compare: the same variable across different groups
–Relative frequency distributions
–Histograms
–Stem-and-leaf plots
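A stem-and-leaf plot for two groups can be sketched in Python (not STATA) as follows. The exam scores are made up, and the helper `stem_and_leaf` is a hypothetical name; it assumes non-negative whole numbers, uses the tens digit as the stem, and does not print stems that have no leaves.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} using tens as stems and units as leaves."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical exam scores for two made-up groups of students.
group_1 = [52, 58, 61, 64, 67, 73, 75, 88]
group_2 = [45, 49, 53, 55, 62, 66, 71, 79]

for name, scores in (("Group 1", group_1), ("Group 2", group_2)):
    print(name)
    plot = stem_and_leaf(scores)
    for stem in sorted(plot):
        leaves = " ".join(str(leaf) for leaf in plot[stem])
        print(f"  {stem} | {leaves}")
```

Printing the two plots one under the other, with the same stems, makes it easy to see which group sits higher on the scale.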
12
Population and sample distribution
The sample distribution is a ‘blurry’ picture of the population distribution
–As the sample size increases, the sample proportion in any interval gets closer to the true population proportion
Sample distribution → population distribution
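The claim that the sample proportion approaches the population proportion can be illustrated with a small simulation in Python (not STATA). The population proportion of 0.30, the seed, and the sample sizes are all assumptions chosen for the sketch.

```python
import random

random.seed(2008)  # fixed seed so the run is repeatable

TRUE_PROPORTION = 0.30  # assumed population proportion in some interval

def sample_proportion(n):
    """Proportion of n simulated observations that fall in the interval."""
    hits = sum(random.random() < TRUE_PROPORTION for _ in range(n))
    return hits / n

for n in (10, 100, 1_000, 10_000, 100_000):
    p_hat = sample_proportion(n)
    print(f"n={n:>7}  sample proportion={p_hat:.4f}  "
          f"error={abs(p_hat - TRUE_PROPORTION):.4f}")
```

The error does not have to shrink at every single step, since each sample is random, but it tends to become small as n grows.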
13
Shape of a distribution
Shapes of distributions differ
–Symmetric
–Skewed
14
SESSION 2 of Lecture 2
15
Working with STATA
stata@stata.com
http://www.stata.com
16
Getting started with STATA
Four windows open automatically after you click the STATA icon:
–The most visible window is the Results Window, which shows the results of commands you have typed in the Command Window.
–The Command Window is below the Results Window; all your commands are typed here.
–The Review Window lists all commands that have been entered in the Command Window. When you click on a command in the Review Window, it is pasted into the Command Window.
–The Variables Window lists all variables in the working file. When you click on a variable, it appears in the Command Window.
17
STATA window
18
Simple commands
The data editor allows you to enter, view, or edit your working data file.
Caution: This window must be closed in order to run commands in STATA.
The do-file editor allows you to write, edit, and save STATA commands. STATA commands can be run from the do-file editor.
–These files are called do-files because they have the file extension .do
Note: STATA treats lines that begin with an asterisk (*) and text between a pair of /* and */ as comments.
19
Save-Close files
Open, save, or close a data file using the icons at the top of the screen, the File menu, or commands in the Command Window.
A STATA dataset is saved in the .dta format.
You can use a separate programme called Stat/Transfer to translate a dataset from its current format into STATA format. Researchers prefer this programme for large datasets. It retains any variable or value labels from the original file.
20
Help-Search
Setting the memory size allows you to handle large datasets. For example, you can set a memory size of 20m with the following command in the Command Window:
. set memory 20m
The Help/Search facilities in STATA let you look up any command. You can use the help command by typing help in the Command Window or by using the drop-down Help menu, which opens a separate window. You can also type the findit command for more information. If you do not know the STATA command name, use the Search facility in the drop-down Help menu. For example, if you want help with describe, type:
. help describe
STATA uses a simple command syntax. Almost all commands follow the structure:
. command variable (variable variable…), options
21
Creating a new dataset
The easy way to create a dataset is to type the values for each variable in the Data Editor, in columns that STATA automatically calls var1, var2, etc. Thus, var1 contains the names of students, var2 statistics competency, and so forth.
Rename and label a variable:
. rename var1 students
. label variable students "Students in Statistics, 2008-2009"
After typing in the information, close the window and save the data as, say, stat2.dta:
. save stat2
22
Working with Sample
Specifying subsets of the data: you can restrict a command to a subset of the data by adding an in or if qualifier. For example, to use only the 1st through 25th observations, type
. list in 1/25
. sort origin
. list origin programme in 1/25
The if qualifier also has broad applications; it selects observations based on specific variable values, such as
. summarize if stat==1
23
Describing data
Frequency Tables and Two-Way Cross-Tabulations: you can tabulate categorical variables.
Use the dataset stat to tabulate the categorical variable programme:
. tabulate programme
To cross-tabulate programme by stat:
. tabulate programme stat
To get column percentages, type
. tabulate programme stat, column
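To make explicit what a two-way table with column percentages contains, here is a sketch in Python rather than STATA. The records, the programme labels "MA" and "PhD", and the 0/1 coding of stat are all made up for illustration and are not taken from the course dataset.

```python
from collections import Counter

# Hypothetical (programme, stat) pairs; stat is a made-up 0/1 indicator.
records = [("MA", 1), ("MA", 0), ("MA", 1), ("PhD", 1),
           ("PhD", 1), ("MA", 0), ("PhD", 0), ("MA", 1)]

cell = Counter(records)                            # joint frequencies
col_total = Counter(stat for _, stat in records)   # column totals
rows = sorted({programme for programme, _ in records})
cols = sorted(col_total)

def column_percent(programme, stat):
    """Cell count as a percentage of its column total."""
    return 100 * cell[(programme, stat)] / col_total[stat]

# Each cell shows the frequency followed by the column percentage.
print("programme " + "".join(f"{'stat=' + str(c):>16}" for c in cols))
for r in rows:
    cells = "".join(
        f"{cell[(r, c)]:>8}{column_percent(r, c):>7.1f}%" for c in cols
    )
    print(f"{r:<10}{cells}")
```

Within each column the percentages add up to 100, which is what distinguishes column percentages from row percentages.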
24
Data tabulation
Multiple Tables and Multi-way Cross-Tabulations:
To produce one-way tables for several variables at once, type
. tab1 origin programme stat
. tab1 programme-education
To get multiple two-way tables, that is, cross-tabulations of every two-way combination of the listed variables, type
. tab2 origin programme stat
The table command produces multi-way tables when we do not need percentages or statistical tests. For a one-way frequency table, type
. table programme, contents(freq)
To produce a two-way frequency table or cross-tabulation, type
. table origin programme, contents(freq)
To produce more complicated tables, type
. table origin programme, contents(freq) by(stat)
25
GRAPHS with STATA
You can draw bar charts, type:
. graph bar stat, over(programme) blabel(bar) bar(1, bcolor(gs10))
. graph bar stat, over(programme) legend(label(1 "Frequency")) ytitle("Native Language Speakers") title("Bar diagram of native language speakers, E563") subtitle("by languages") note("Source: Statistics Class 1, SRBasu")
. graph bar stat word, over(programme) blabel(bar) bar(1, bcolor(gs10)) bar(2, bcolor(gs7))
You can draw horizontal bar charts, type:
. graph hbar stat, over(programme) blabel(bar) bar(1, bcolor(gs10))
. graph hbar stat word, over(programme) blabel(bar)
26
Working with datasets
See Week 2 web-course material
1) Assignment_1
Datasets:
2) Week2_Students Profile
3) Week2_World Socio-economic data
27
Week 3 - 2 October
Descriptive Statistics
»Measures of Central Tendency and Dispersion, Moments, Skewness, and Kurtosis
Readings:
»AF-Chapter 3 (p. 39-60)
»MS-Chapter 4, MS-Chapter 5
Assignment: Assignment 2
»Note: each student should turn in his/her own paper in hard copy to the teaching assistant at Rigot Office No. 31 or in class on Thursday 9 October (Week 4).