1
MBA 7025 Statistical Business Analysis
Displaying Data – Charts & Graphs
Jan 20, 2015
2
Displaying Data – Charts & Graphs
Agenda
Basic Concepts
Displaying Data – Charts & Graphs
3
Basic Concepts in Data Analysis
Data, Information, and Knowledge
Populations and Samples
Variables and Observations
Types of Data: Categorical and Numerical
Types of Data: Cross Sectional and Time Ordered
4
Data, Information, and Knowledge
Data are the building blocks of information. They are observations on entities (observation units). Variables are used to measure observations.
Information is processed data (organized, summarized, analyzed, and filtered) made meaningful and relevant to the situation or phenomenon being studied.
Knowledge is the ability to apply information to decision situations. It is the meaning associated with information: actionable information!
[Diagram labels: Processing, Analysis, Reports, Application, Meaning, Relevance]
5
Populations and Samples
Sample: a subset of the collection of all possible entities (observation units). Data on the sample are what is available (KNOWN). Statistics are used to describe samples; they can vary across samples.
Population: the collection of all possible entities (observation units). Data on the whole population are usually not available (UNKNOWN). Parameters are used to describe populations; they are constants for a given population.
Statistical Inference is the art and science of drawing conclusions about a population of interest. It is the process by which characteristics of a population come to be understood: conclusions about the population are drawn (inferred) based on the knowledge gained from the sample. A sample should therefore be a good representation of the population.
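The difference between a parameter (a constant for the population) and a statistic (which varies by sample) can be sketched in a few lines of Python. This is only an illustration: it assumes, hypothetically, that the 20 ages used later in the deck make up an entire population, and the two systematic samples (`sample_a`, `sample_b`) are names invented here, not part of the slides.

```python
from statistics import mean

# ASSUMPTION: treat the 20 ages from the frequency-table slide as the whole population.
population = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
              32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

# Parameter: describes the population and is a constant for it.
population_mean = mean(population)

# Statistics: describe samples and can differ from one sample to the next.
sample_a = population[0::4]   # every 4th entity, starting with the 1st
sample_b = population[1::4]   # every 4th entity, starting with the 2nd
mean_a = mean(sample_a)
mean_b = mean(sample_b)

print(f"population mean (parameter): {population_mean}")
print(f"sample A mean (statistic):   {mean_a}")
print(f"sample B mean (statistic):   {mean_b}")
```

In practice the population mean is unknown, and a sample mean serves as the estimate from which it is inferred.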
6
Variables and Observations
Entity    Height (inches)  Weight (pounds)  Age (years)  Sex (category)
Person 1  67               170              33           Male
Person 2  61               120              38           Female
Person 3  72               220              62
*
[Diagram labels: the rows are marked OBSERVATIONS; the columns are marked Measurement]
Variables are characteristics (aspects) of entities that differ from one entity to another. Observations on an entity are the measured values of these characteristics. A dataset is therefore a collection of observations on a group (sample) of entities.
Each row is an observation on a particular entity. Each column is an aspect or characteristic of the individual entities (measured as a variable).
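The row-and-column view of a dataset can be sketched as a list of records. The field names (`height_in`, `weight_lb`, `age_yr`) and the `column` helper are made up for this illustration; the sex of Person 3 does not appear in the slide text, so it is left as `None` rather than guessed.

```python
# Each dict is one row: an observation on a single entity.
dataset = [
    {"entity": "Person 1", "height_in": 67, "weight_lb": 170, "age_yr": 33, "sex": "Male"},
    {"entity": "Person 2", "height_in": 61, "weight_lb": 120, "age_yr": 38, "sex": "Female"},
    {"entity": "Person 3", "height_in": 72, "weight_lb": 220, "age_yr": 62, "sex": None},  # not given
]

def column(rows, name):
    """Return one variable (column) as a list, in row order."""
    return [row[name] for row in rows]

heights = column(dataset, "height_in")   # one variable, all entities
person_2 = dataset[1]                    # one entity, all variables

print("heights:", heights)
print("person 2:", person_2)
```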
7
Types of Data: Categorical and Numerical
We can do arithmetic on numerical data (such as age and salary). These data are actual measurements.
Categorical data are qualitative. Sometimes qualitative data are coded: for example, opinion can be coded 1-5, and calculations can then be performed. Such data are ordinal (they have an implied order).
State is a categorical variable and cannot be used for calculations. Such data are nominal.
[Figure labels: Categorical, Numerical]
8
Types of Data: Cross-sectional and Time Ordered
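Questions
What was the absenteeism at Plant 1 in Jan. 2008? (a single observation)
Was the annual absenteeism the same for all plants? (a cross-sectional comparison across plants)
Was absenteeism stable at Plant 1 during 2008? (a time-ordered question about one plant)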
9
Displaying Data – Charts & Graphs
10
Frequency Tables
A Frequency Table showing a classification of the AGE of attendees at an event.
[Table: columns Class, Frequency, Relative Frequency, Percentage; rows 10 but under 20, 20 but under 30, 30 but under 40, 40 but under 50, 50 but under 60, Total; cell values not preserved]
Class is a range for the values of a variable.
Frequency is the number of observations associated with a class.
Relative Frequency is the proportion of observations (frequency) associated with a class.
11
Frequency Histograms
A graphical display of the distribution of frequencies.
How about cumulative frequency? When do we apply cumulative frequency?
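One way to approach the cumulative-frequency questions: a cumulative frequency is a running total of the class frequencies, so it tells how many observations fall below each class's upper limit ("how many attendees are under 40?"). The sketch below uses the class counts obtained from the 20 raw ages listed under "Developing Frequency Tables and Histograms". The upper limit of 60 for the last class is an assumption that follows from a class width of 10.

```python
from itertools import accumulate

upper_limits = [20, 30, 40, 50, 60]   # "under 20", "under 30", ... (60 is assumed)
frequencies = [3, 6, 5, 4, 2]         # counts per class for the 20 raw ages

# Running totals of the class frequencies.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]
cumulative_pct = [100 * c / total for c in cumulative]

for limit, cum, pct in zip(upper_limits, cumulative, cumulative_pct):
    print(f"under {limit}: {cum:2d} observations ({pct:.0f}%)")
```

Cumulative frequencies are useful when the question is about a threshold ("at most", "fewer than") rather than about a single class.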
12
Developing Frequency Tables and Histograms
Sort Raw Data in Ascending Order: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find Range: 58 - 12 = 46
Select Number of Classes: 5 (usually between 5 and 15)
Compute Class Interval (width): 10 (range/classes = 46/5 = 9.2, then round up)
Determine Class Boundaries (lower limits): 10, 20, 30, 40, 50
Compute Class Midpoints: 15, 25, 35, 45, 55
Count Observations & Assign to Classes
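The steps above can be sketched in Python as follows. This is a minimal illustration, not a general-purpose routine: `build_frequency_table` is a name invented here, `math.ceil` stands in for "round up to a convenient width" (it happens to give 10 for these data), and starting the first class at the largest multiple of the width not exceeding the minimum is an assumption that reproduces the slide's boundaries.

```python
import math

ages = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

def build_frequency_table(values, num_classes):
    data = sorted(values)                            # sort in ascending order
    data_range = data[-1] - data[0]                  # range: 58 - 12 = 46
    width = math.ceil(data_range / num_classes)      # 46 / 5 = 9.2, rounded up to 10
    start = (data[0] // width) * width               # ASSUMPTION: first class starts at 10
    lower_limits = [start + i * width for i in range(num_classes)]
    midpoints = [lower + width / 2 for lower in lower_limits]

    # Count observations: a value v belongs to the class "lower but under lower + width".
    counts = [0] * num_classes
    for v in data:
        index = (v - start) // width
        if index >= num_classes:
            raise ValueError(f"{v} falls outside the last class")
        counts[index] += 1

    relative = [c / len(data) for c in counts]
    return {
        "range": data_range,
        "width": width,
        "lower_limits": lower_limits,
        "midpoints": midpoints,
        "counts": counts,
        "relative": relative,
    }

table = build_frequency_table(ages, num_classes=5)
for lower, count, rel in zip(table["lower_limits"], table["counts"], table["relative"]):
    upper = lower + table["width"]
    # A text histogram: one '*' per observation in the class.
    print(f"{lower} but under {upper}: {'*' * count} {count} ({rel:.0%})")
```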
13
Bar and Pie Charts
Displaying Categorical Data
[Table: columns Investment Category, Amount (in thousands $), Percentage; rows Stocks, Bonds, CD, Savings, Total. Only the Stocks row is preserved: 46.5, 42.27]
[Pie chart: Stocks 42%, Bonds 29%, Savings 15%, CD 14%]
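As a rough stand-in for the missing bar chart, the pie-chart percentages from this slide can be drawn as a text bar chart. The `text_bar_chart` helper is a name made up for this sketch; only the four rounded percentages come from the slide.

```python
# Rounded percentage shares as shown on the pie chart.
shares = {"Stocks": 42, "Bonds": 29, "Savings": 15, "CD": 14}

def text_bar_chart(percentages):
    """Return one line per category, largest share first, one '#' per percentage point."""
    ordered = sorted(percentages.items(), key=lambda item: item[1], reverse=True)
    return [f"{category:<8} {'#' * pct} {pct}%" for category, pct in ordered]

chart = text_bar_chart(shares)
for line in chart:
    print(line)
```

A bar chart makes it easy to compare categories against each other, while a pie chart emphasizes each category's share of the whole.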
14
Side-by-Side Chart
Categorical bivariate data: side-by-side charts
15
Scatter Plot for bivariate numerical data
Shows the relationship between two variables. Can one be used to predict the other?
Time-series and regression analysis are used to predict one variable's value based on the other.
Correlation analysis is used to measure the strength of the linear relationship between two variables.
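As an illustration of measuring the strength of a linear relationship, the sketch below computes the Pearson correlation coefficient for the heights and weights of the three people on the "Variables and Observations" slide. The `pearson_r` function is written out by hand for clarity and is not taken from the deck; three points are far too few for a real analysis and serve only to show the arithmetic.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

heights = [67, 61, 72]      # inches, Persons 1-3
weights = [170, 120, 220]   # pounds, Persons 1-3

r = pearson_r(heights, weights)
print(f"r = {r:.4f}")
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 a weak one. For these three people r is close to +1: the taller the person, the heavier.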