Graphics in S-Plus Jagdish S. Gangolly School of Business


Graphics in S-Plus Jagdish S. Gangolly School of Business State University of New York at Albany 12/4/2018 Acc 522 Statistical Methods for Business Decisions (J Gangolly)

Trellis Graphics I
A matrix of graphs
Example:
> par(mfrow=c(2,2))    # 2 x 2 matrix of figures
> x <- 1:100/100:1     # (1:100)/(100:1), since ":" binds tighter than "/"
> plot(x)              # plot cell (1,1)
> plot(x, type="l")    # plot cell (1,2): line
> hist(x)              # plot cell (2,1): histogram
> boxplot(x)           # plot cell (2,2): boxplot

Trellis Graphics II
Syntax: dependent variable ~ explanatory variable | conditioning variable, data = data set
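The pieces of such a formula can be inspected directly. The following is a minimal sketch in R, whose formula objects follow the S conventions; the names yield, year and site are borrowed from the barley data used on later slides and serve only as symbols here, so no data set is needed.

```r
# A Trellis-style formula: response ~ explanatory | conditioning.
# The names are unevaluated symbols; building the formula needs no data.
f <- yield ~ year | site

response     <- f[[2]]     # left-hand side: yield
rhs          <- f[[3]]     # right-hand side: the call `year | site`
explanatory  <- rhs[[2]]   # year
conditioning <- rhs[[3]]   # site

print(all.vars(f))         # every variable named in the formula
```

For a histogram the left-hand side is simply omitted, which is why the later examples start with a bare `~`.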

Trellis Graphics III
Example:
> histogram(~height | voice.part, data=singer)
No dependent variable for histogram

Trellis Graphics IV: Singer Data
Height is the explanatory variable
Data set is singer

Trellis Graphics V: Barley data

Trellis Graphics VI: Sunspots v. Time

Trellis Graphics VI: CO2 levels

Trellis Graphics VII: Scatterplots

Graphics I: Summary
> summary(barley)
      yield                variety              year          site
 Min.   :14.43333   Svansota,No. 462      :24   1932:60   Grand Rapids   :20
 1st Qu.:26.87500   Manchuria             :12   1931:60   Duluth         :20
 Median :32.86667   No. 475               :12             University Farm:20
 Mean   :34.42056   Velvet,Peatland       :24             Morris         :20
 3rd Qu.:41.40000   Glabron               :12             Crookston      :20
 Max.   :65.76670   No. 457               :12             Waseca         :20
                    Wisconsin No. 38,Trebi:24

Graphics II: Stem and Leaf Display
> attach(barley)
> stem(yield)

N = 120   Median = 32.86667
Quartiles = 26.85, 41.46666

Decimal point is 1 place to the right of the colon

   1 : 4
   1 : 579
   2 : 0011122223333
   2 : 555666666667777777889999999
   3 : 0000001112222223333344444
   3 : 5555667777888899
   4 : 000112223334444
   4 : 567777779999
   5 : 00
   5 : 5889
   6 : 4
   6 : 6
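To see how such a display is assembled, the sketch below splits each value into a stem (tens digit) and a leaf (units digit) by hand. It is an illustration in base R on hypothetical toy data, not the algorithm stem() itself uses; in particular it keeps one row per stem, whereas the display above splits each stem into two rows (leaves 0 to 4 and 5 to 9).

```r
# Hypothetical toy data, not taken from the barley yields.
x <- c(14, 15, 17, 19, 20, 20, 21, 25, 26, 32)

o      <- order(x)
stems  <- (x %/% 10)[o]   # tens digit, in sorted order
leaves <- (x %% 10)[o]    # units digit, in sorted order

# Paste together the leaves that share a stem.
display <- tapply(leaves, stems, paste, collapse = "")

for (s in names(display)) cat(s, ":", display[[s]], "\n")
```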

Graphics III: Quantiles
> quantile(yield, seq(0.1, 0.9, by=0.1))
      10%   20%   30%      40%      50%      60%      70%   80%      90%
 22.49667 26.08 28.09 29.94667 32.86667 35.13333 38.97333 43.32 47.45666
>
> quantile(yield, c(0.25, 0.5, 0.75))
    25%      50%  75%
 26.875 32.86667 41.4
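The same calls can be tried on a vector small enough to check by hand. This is a minimal sketch in R on hypothetical values; it uses R's default quantile definition, which interpolates linearly between order statistics.

```r
x <- c(10, 20, 30, 40, 50)              # hypothetical values, already sorted

# Quartiles: with five equally spaced values they fall exactly on data points.
q <- quantile(x, c(0.25, 0.5, 0.75))

# Deciles: most of these fall between data points and are interpolated.
deciles <- quantile(x, seq(0.1, 0.9, by = 0.1))

print(q)
print(deciles)
```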

Graphics IV: tapply
> tapply(yield, list(site, year), mean)
                    1932     1931
   Grand Rapids 20.81000 29.05334
         Duluth 25.70000 30.29333
University Farm 29.50667 35.82667
         Morris 41.51333 29.28667
      Crookston 31.18000 43.66000
         Waseca 41.87000 54.34667
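What tapply does with two grouping factors is easiest to see on a few rows. The sketch below uses a small hypothetical data frame (the site labels A and B and all yields are made up) and produces a site-by-year table of means in the same layout as the output above.

```r
# Hypothetical data: two sites, two years, six observations.
d <- data.frame(
  yield = c(20, 30, 40, 50, 24, 36),
  site  = c("A", "A", "B", "B", "A", "B"),
  year  = c("1931", "1932", "1931", "1932", "1931", "1932")
)

# One cell per (site, year) combination; each cell holds the mean yield.
m <- tapply(d$yield, list(d$site, d$year), mean)
print(m)
```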

Graphics V: by
> by(barley, year, summary)
year:1932
      yield                variety              year          site
 Min.   :14.43333   Svansota,No. 462      :12   1932:60   Grand Rapids   :10
 1st Qu.:25.48334   Manchuria             : 6   1931: 0   Duluth         :10
 Median :30.98334   No. 475               : 6             University Farm:10
 Mean   :31.76333   Velvet,Peatland       :12             Morris         :10
 3rd Qu.:37.80000   Glabron               : 6             Crookston      :10
 Max.   :58.16667   No. 457               : 6             Waseca         :10
                    Wisconsin No. 38,Trebi:12
-------------------------------------------------------------------------------
year:1931
      yield                variety              year          site
 Min.   :19.70000   Svansota,No. 462      :12   1932: 0   Grand Rapids   :10
 1st Qu.:29.09166   Manchuria             : 6   1931:60   Duluth         :10
 Median :34.20000   No. 475               : 6             University Farm:10
 Mean   :37.07778   Velvet,Peatland       :12             Morris         :10
 3rd Qu.:43.85000   Glabron               : 6             Crookston      :10
 Max.   :65.76670   No. 457               : 6             Waseca         :10
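Where tapply works on a single vector, by hands each group's whole data frame to the function. A minimal sketch in R with a hypothetical four-row data frame, using the group mean instead of summary so the result is easy to verify:

```r
# Hypothetical data: two observations per year.
d <- data.frame(
  yield = c(20, 30, 40, 50),
  year  = factor(c("1931", "1932", "1931", "1932"))
)

# The function receives the sub-data-frame for each level of year.
res <- by(d, d$year, function(g) mean(g$yield))
print(res)
```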

Graphics VI: histogram
> histogram(~yield)

Graphics VII: histogram in trellis
> histogram(~yield | site)
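In R, the lattice package implements Trellis graphics and ships a copy of the barley data, so the conditioned histogram can be reproduced there. The sketch assumes lattice is installed; the layout argument (3 columns by 2 rows) is a choice made for this illustration and does not come from the slide.

```r
library(lattice)   # assumption: lattice and its barley data set are available

# One histogram panel of yield per site; data= replaces attach(barley).
p <- histogram(~ yield | site, data = barley, layout = c(3, 2))

# A trellis object is drawn when printed: call print(p), or type p
# at an interactive prompt.
```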

Graphics VIII: Box-and-Whiskers Plot
The boxplot is interpreted as follows:
The box itself contains the middle 50% of the data.
The upper edge (hinge) of the box marks the 75th percentile of the data set, and the lower hinge marks the 25th percentile.
The distance between the hinges, which spans the middle two quarters of the data, is the inter-quartile range (IQR).
The line inside the box marks the median of the data.
If the median line is not equidistant from the hinges, the data are skewed.
The ends of the lines or "whiskers" mark the minimum and maximum data values unless outliers are present, in which case each whisker stops at the most extreme data value lying within 1.5 times the IQR of its hinge.
Points beyond the ends of the whiskers are outliers or suspected outliers.
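These rules can be checked numerically. The sketch below uses R's boxplot.stats on hypothetical data with one extreme value; the 1.5 multiplier is passed explicitly as coef, and the fences are recomputed by hand for comparison.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)    # hypothetical data, one extreme value

b <- boxplot.stats(x, coef = 1.5)

hinges   <- b$stats[c(2, 4)]             # lower and upper edges of the box
med      <- b$stats[3]                   # line inside the box
spread   <- diff(hinges)                 # distance between the hinges
fences   <- c(hinges[1] - 1.5 * spread,  # limits beyond which a point
              hinges[2] + 1.5 * spread)  # is flagged as an outlier
whiskers <- b$stats[c(1, 5)]             # most extreme values inside the fences
outliers <- b$out                        # points drawn individually

print(list(hinges = hinges, median = med, whiskers = whiskers, outliers = outliers))
```

Here the upper whisker stops at 9, the largest value inside the upper fence, and 50 is plotted as a separate point.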

Graphics IX: Box-and-Whiskers Plot
> bwplot(site~yield | year, data=barley)

Graphics X: Box-and-Whiskers Plot in Trellis Graphs
> bwplot(year~yield | site, data=barley)

Graphics X: Persp
> attach(geyser)
> persp(hist2d(waiting, duration))
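The idea behind this call is to count observations in a two-dimensional grid of bins and draw the counts as a surface. The following is a rough base R sketch of that idea, not a reimplementation of hist2d: it assumes R's built-in faithful data as a stand-in for geyser, and the helpers bin_index and bin_mid and the choice of 8 bins per axis are hypothetical.

```r
# Assumption: `faithful` (columns waiting, eruptions) stands in for `geyser`.
nb <- 8                                   # bins per axis, chosen arbitrarily

# Hypothetical helper: equal-width bin number (1..nb) for each value.
bin_index <- function(v, nb) {
  width <- diff(range(v)) / nb
  pmin(floor((v - min(v)) / width) + 1, nb)   # the maximum goes in the last bin
}

# Hypothetical helper: midpoint of each bin, used as axis coordinates.
bin_mid <- function(v, nb) {
  min(v) + (seq_len(nb) - 0.5) * diff(range(v)) / nb
}

w <- faithful$waiting
d <- faithful$eruptions

# Cross-tabulate bin numbers; factor() keeps empty bins in the table.
counts <- table(factor(bin_index(w, nb), levels = seq_len(nb)),
                factor(bin_index(d, nb), levels = seq_len(nb)))
z <- matrix(as.vector(counts), nrow = nb)  # rows: waiting bins, cols: duration bins

# Draw the surface only in an interactive session.
if (interactive()) {
  persp(bin_mid(w, nb), bin_mid(d, nb), z, theta = 30, phi = 30,
        xlab = "waiting", ylab = "duration", zlab = "count")
}
```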