Fundamental Graphics in R


Fundamental Graphics in R Prof. Ke-Sheng Cheng Dept. of Bioenvironmental Systems Eng. National Taiwan University

Histogram
hist(x, freq=FALSE, breaks=…)
breaks= A vector giving the breakpoints between histogram cells, or a single number giving the number of cells for the histogram (R treats a single number as a suggestion only).
freq= If TRUE, the histogram shows frequencies (the counts component of the result); if FALSE, it shows probability densities (the density component), so that the histogram has a total area of one.
12/9/2018 Laboratory for Remote Sensing Hydrology and Spatial Modeling, Dept of Bioenvironmental Systems Engineering, National Taiwan Univ.
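
The hist call above can be sketched as follows. The sample x, the seed, and the integer-spaced breakpoints are illustrative assumptions, not taken from the slides. The sketch checks that, with freq=FALSE, the bar areas sum to one.

```r
set.seed(1)                          # illustrative seed (assumption)
x <- rnorm(200, mean = 10, sd = 2)   # made-up sample (assumption)

# Explicit breakpoints covering the whole range of x
brk <- seq(floor(min(x)), ceiling(max(x)), by = 1)

# Density histogram: bar heights are probability densities
h <- hist(x, freq = FALSE, breaks = brk, main = "Density histogram of x")

# Area of each bar = density * cell width; the areas sum to one
total_area <- sum(h$density * diff(h$breaks))
```

The returned object h also holds the counts component, which is what hist plots when freq=TRUE.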

Empirical CDF, ecdf

In R, ecdf(x) returns a function: the empirical CDF of x, which can be evaluated at any value.
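
A small sketch of this point, using a made-up data vector (an assumption for illustration). The object returned by ecdf is called like any other R function.

```r
x <- c(3, 1, 4, 1, 5, 9, 2, 6)   # made-up data (assumption)

Fn <- ecdf(x)                    # Fn is itself an R function
fn_is_function <- is.function(Fn)

# 5 of the 8 values (3, 1, 4, 1, 2) are <= 4, so Fn(4) is 5/8 = 0.625
p4 <- Fn(4)

# The ecdf object has its own plot method (a step function)
plot(Fn, main = "Empirical CDF of x")
```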

Sample quantiles
Linear interpolation

Using the quantile function to calculate sample quantiles
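
As a hedged illustration, the sketch below reproduces the result of quantile by hand for one probability. It assumes R's default quantile definition (type = 7) and a made-up data vector; the slides do not state which definition they use.

```r
x <- c(2, 4, 7, 11, 16, 22)   # made-up data (assumption)
p <- 0.25

# Manual linear interpolation between order statistics (default type = 7)
xs <- sort(x)
n  <- length(xs)
h  <- (n - 1) * p + 1         # fractional position: 2.25
j  <- floor(h)                # lower order statistic: 2
q_manual <- xs[j] + (h - j) * (xs[j + 1] - xs[j])   # 4 + 0.25 * (7 - 4) = 4.75

# Built-in calculation
q_builtin <- quantile(x, probs = p, names = FALSE)
```

Both values equal 4.75: the position 2.25 lies a quarter of the way from the second order statistic (4) to the third (7).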

Not linear interpolation! These three numbers define the box. Whiskers are defined differently.

Lower and upper hinges
The lower hinge is the median of the lower half of the data, and the upper hinge is the median of the upper half of the data.
When the number of data points, say n, is even, there are n/2 data points in each of the lower and upper halves.
When n is odd, there are (n+1)/2 data points in each half, because the median is counted as a data point in both the lower and upper halves.
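
The definition above can be sketched in a few lines. The helper name hinges and the data vectors are hypothetical, introduced only for illustration; fivenum is the base R function that returns the hinges as its second and fourth elements.

```r
# Hypothetical helper implementing the hinge definition above
hinges <- function(x) {
  xs <- sort(x)
  n  <- length(xs)
  k  <- ceiling(n / 2)   # n/2 if n is even, (n+1)/2 if n is odd
  c(lower = median(xs[1:k]), upper = median(xs[(n - k + 1):n]))
}

x_even <- c(2, 4, 7, 11, 16, 22)   # made-up data, n = 6
x_odd  <- c(2, 4, 7, 11, 16)       # made-up data, n = 5

h_even <- hinges(x_even)   # halves {2,4,7} and {11,16,22}: hinges 4 and 16
h_odd  <- hinges(x_odd)    # halves {2,4,7} and {7,11,16}: hinges 4 and 11

f_even <- fivenum(x_even)  # 2 4 9 16 22
f_odd  <- fivenum(x_odd)   # 2 4 7 11 16
```

For x_even the lower hinge is 4, whereas the interpolated first quartile from quantile is 4.75, which is why the hinges are not obtained by linear interpolation.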

Box-and-whisker plot (or boxplot)
A box-and-whisker plot includes two major parts: the box and the whiskers.
The parameter range determines how far the plot whiskers extend out from the box. If range is positive, the whiskers extend to the most extreme data point which is no more than range times the interquartile range (IQR) from the box. A value of zero causes the whiskers to extend to the data extremes.
Points that fall beyond the whiskers are marked as outliers.
Hinges and the five-number summary

In R, a boxplot is essentially a graphical representation determined by the five-number summary (5NS). The box is defined by the hinges and the median, not by "linear interpolation".
The summary function in R yields six numbers: the minimum, first quartile, median, mean, third quartile, and maximum.
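
A brief sketch contrasting the two summaries, using a made-up data vector (an assumption). boxplot.stats is the base R function that computes the numbers a boxplot draws.

```r
x <- c(2, 4, 7, 11, 16, 22)   # made-up data (assumption)

s  <- summary(x)              # six numbers; quartiles use linear interpolation
f  <- fivenum(x)              # five-number summary based on hinges
bs <- boxplot.stats(x)$stats  # numbers actually drawn by boxplot()

q1_summary <- as.numeric(s["1st Qu."])   # 4.75 (interpolated)
q1_hinge   <- f[2]                       # 4    (lower hinge)
```

Because this sample has no outliers, bs is identical to f. The first quartile reported by summary (4.75) differs from the lower hinge (4) that defines the bottom of the box.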

Box-and-whisker plot of X

Seasonal variation of average monthly rainfalls in CDZ, Myanmar
Boxplots are based on average monthly rainfalls of 54 rainfall stations.

Barplots
Barplots are useful for summarizing categorical data. A barplot can also show how often the values of a variable fall in each level, and it can be used to display time series data.
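
A minimal sketch of both uses. All values below are made up for illustration and are not data from the slides.

```r
# Categorical data: frequency of each level (made-up station types)
station_type <- c("lowland", "mountain", "coastal",
                  "lowland", "lowland", "mountain")
counts <- table(station_type)
barplot(counts, ylab = "Number of stations", main = "Frequency by level")

# Time series: one bar per month (made-up rainfall depths in mm)
rain <- c(12, 8, 15, 40, 120, 160, 110, 130, 150, 100, 40, 10)
names(rain) <- month.abb
mids <- barplot(rain, ylab = "Rainfall (mm)", main = "Monthly rainfall")
```

barplot returns the bar midpoints (stored here in mids), which is handy when adding text or lines on top of the bars.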

The QQ plots
A quantile-quantile (QQ) plot is a type of scatter plot used to compare the distributions of two groups, or to compare a sample with a reference distribution.
When the groups are of different sizes, R reduces the larger group to the size of the smaller one by keeping the minimum and maximum values and choosing equally spaced quantiles in between.
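
The size reduction described above can be observed in the value returned by qqplot. The two samples below are simulated for illustration (an assumption); qqnorm and qqline cover the comparison with a normal reference distribution.

```r
set.seed(2)                 # illustrative seed (assumption)
a <- rnorm(100)             # larger group, made-up
b <- rnorm(30, mean = 1)    # smaller group, made-up

# Two-sample QQ plot; the plotted coordinates are returned invisibly
qq <- qqplot(a, b, xlab = "Quantiles of a", ylab = "Quantiles of b")
abline(0, 1, lty = 2)       # reference line y = x

# Sample against a normal reference distribution
qqnorm(b)
qqline(b)
```

Here qq$x has 30 values, not 100, and its first and last elements are the minimum and maximum of a.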

The boxplot in R
boxplot(x, range=0)
boxplot(x) [Default, range=1.5]
A box-and-whisker plot includes two major parts: the box and the whiskers.
The parameter range determines how far the plot whiskers extend out from the box. If range is positive, the whiskers extend to the most extreme data point which is no more than range times the interquartile range from the box. A value of zero causes the whiskers to extend to the data extremes.
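
A short sketch of the effect of range, using a made-up sample with one extreme value (an assumption). boxplot returns the statistics it draws, so the whisker ends can be inspected directly.

```r
x <- c(2, 4, 7, 11, 16, 22, 60)   # made-up data with one extreme value

# Default range = 1.5: whiskers stop at the most extreme point within
# 1.5 * IQR of the box; points beyond are drawn as outliers
b_default <- boxplot(x, main = "range = 1.5 (default)")

# range = 0: whiskers extend to the data extremes, no outliers drawn
b_full <- boxplot(x, range = 0, main = "range = 0")

upper_whisker_default <- b_default$stats[5, 1]   # 22
outliers_default      <- b_default$out           # 60
upper_whisker_full    <- b_full$stats[5, 1]      # 60
```

The hinges are 5.5 and 19, so the upper limit is 19 + 1.5 * 13.5 = 39.25. The value 60 lies beyond it and is marked as an outlier in the default plot.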

Comparison of multiple boxplots
Can also use boxplot(x1, x2, x3, names=c("x1", "x2", "x3"))
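
A minimal sketch of this call with three simulated samples (made-up data, an assumption). Passing a named list is shown as an equivalent alternative; which form the original slides used first is not recoverable from the transcript.

```r
set.seed(3)                   # illustrative seed (assumption)
x1 <- rnorm(50, mean = 0)     # made-up samples
x2 <- rnorm(50, mean = 1)
x3 <- rnorm(50, mean = 2)

# Several vectors side by side, labelled through names=
b <- boxplot(x1, x2, x3, names = c("x1", "x2", "x3"),
             main = "Comparison of multiple boxplots")

# Equivalent: a named list supplies the labels automatically
b2 <- boxplot(list(x1 = x1, x2 = x2, x3 = x3))
```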

Comparison of multiple boxplots

Low-level graphics
In R graphics, the display is divided into:
The plot region, where data will be drawn.
Four margin areas, numbered clockwise from 1 to 4, starting at the bottom.
After establishing the plot region and margins, we can start adding points, lines, polygons, and symbols to the plot region.

Graphic components
points(x, y, …)
lines(x, y, …)
text(x, y, labels, …)
abline(a, b, …) # adds the line y=a+bx
abline(h=y, …)
abline(v=x, …)
polygon(x, y, …)
segments(x0, y0, x1, y1, …)
arrows(x0, y0, x1, y1, …)
symbols(x, y, …)
legend(x, y, legend, …)
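
The sketch below builds one plot from these components. All coordinates, labels, and styling are made up for illustration (assumptions). mtext, which writes into a numbered margin, is a base R function not listed on the slide and is included only to show the margin numbering.

```r
x <- 1:9
y <- c(2, 3, 5, 4, 6, 7, 6, 8, 9)   # made-up data (assumption)

# Establish the plot region without drawing the data (type = "n")
plot(x, y, type = "n", xlim = c(0, 10), ylim = c(0, 10),
     xlab = "x", ylab = "y", main = "Low-level graphics")
usr <- par("usr")                   # limits of the plot region

points(x, y, pch = 16)              # data points
lines(x, y, lty = 2)                # connecting line
abline(a = 0, b = 1, col = "gray")  # the line y = 0 + 1 * x
abline(h = 5, lty = 3)              # horizontal line at y = 5
abline(v = 5, lty = 3)              # vertical line at x = 5
polygon(c(7, 9, 8), c(1, 1, 3), col = "lightblue")   # filled triangle
segments(1, 8, 3, 8)                # segment from (1, 8) to (3, 8)
arrows(6, 2, 7.5, 1.5, length = 0.1)
text(2, 8.5, labels = "a segment")
symbols(2, 2, circles = 0.5, inches = FALSE, add = TRUE)
legend(6, 10, legend = c("data", "y = x"),
       lty = c(2, 1), col = c("black", "gray"))

# Margins are numbered 1 (bottom), 2 (left), 3 (top), 4 (right)
mtext("margin 3 (top)", side = 3, line = 0.2)
mtext("margin 4 (right)", side = 4, line = 0.2)
```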

Multiple Scatter Plots

Saving Graphs to Files
R can draw to various graphics devices. The default device is the screen. However, it is also possible to save a graph to a file by opening a different graphics device.

Assigning an R graphics device: a pdf file
Use jpeg("filename.jpg") if you want to save a graph to a JPEG file.
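
A hedged sketch of the workflow: open a file device, draw, then close the device with dev.off() so the file is written. The file names and the use of tempdir() are illustrative assumptions. The JPEG part is guarded because not every R build has JPEG support.

```r
# Hypothetical output path in the session's temporary directory
out_pdf <- file.path(tempdir(), "example_plot.pdf")

pdf(out_pdf, width = 6, height = 4)   # open a PDF device (size in inches)
plot(1:10, (1:10)^2, type = "b", main = "Saved to a PDF file")
invisible(dev.off())                  # close the device; the file is complete

pdf_written <- file.exists(out_pdf)

# Same pattern for a JPEG file (size in pixels), if this build supports it
if (capabilities("jpeg")) {
  out_jpg <- file.path(tempdir(), "example_plot.jpg")
  jpeg(out_jpg, width = 600, height = 400)
  plot(1:10, (1:10)^2, type = "b", main = "Saved to a JPEG file")
  invisible(dev.off())
}
```

Forgetting dev.off() is a common cause of empty or unreadable output files, because plotting commands keep going to the open file device until it is closed.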

Random number generation in R
R commands for stochastic simulation (for the normal distribution):
pnorm – cumulative probability
qnorm – quantile function
rnorm – generating a random sample of a specific sample size
dnorm – probability density function
For other distributions, simply change the distribution name. For example, use punif, qunif, runif, and dunif for the uniform distribution, and ppois, qpois, rpois, and dpois for the Poisson distribution.
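
A few calls illustrating the four prefixes. The argument values and the seed are arbitrary choices for illustration.

```r
set.seed(4)                      # illustrative seed (assumption)

d0 <- dnorm(0)                   # density of N(0, 1) at 0, i.e. 1 / sqrt(2 * pi)
p  <- pnorm(1.5)                 # P(Z <= 1.5)
q  <- qnorm(p)                   # quantile function inverts pnorm: gives 1.5
z  <- rnorm(5, mean = 10, sd = 2)   # random sample of size 5

# Same naming pattern for other distributions
u_cdf    <- punif(0.3, min = 0, max = 1)   # uniform CDF: 0.3
pois_cdf <- ppois(2, lambda = 3)           # P(X <= 2) for Poisson(3)
pois_sum <- sum(dpois(0:2, lambda = 3))    # same value from the pmf
```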

Approximation of the Poisson distribution by normal distribution
Demonstration using stochastic simulation in R.
Estimated by normal approximation of Poisson distribution

Poisson CDF by stochastic simulation
Estimated by stochastic simulation of the Poisson distribution.
Direct calculation using the theoretical CDF of the Poisson distribution.

Approximation by normal distribution
Poisson CDF by stochastic simulation
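
The original code for this demonstration is not in the transcript, so the following is only a sketch of the comparison under stated assumptions. The rate lambda, the evaluation point k, the simulation size, and the seed are illustrative. The continuity correction (k + 0.5) is a common choice added here and may differ from what the slides used.

```r
set.seed(123)        # illustrative seed (assumption)
lambda <- 50         # Poisson mean (assumption)
k      <- 55         # point at which the CDF is evaluated (assumption)
n_sim  <- 100000     # number of simulated values (assumption)

# 1. Stochastic simulation: proportion of simulated values <= k
sim   <- rpois(n_sim, lambda)
p_sim <- mean(sim <= k)

# 2. Direct calculation from the theoretical Poisson CDF
p_exact <- ppois(k, lambda)

# 3. Normal approximation N(lambda, lambda), with continuity correction
p_normal <- pnorm(k + 0.5, mean = lambda, sd = sqrt(lambda))

# Visual comparison of the simulated CDF and the normal approximation
plot(ecdf(sim), main = "Poisson CDF: simulation vs normal approximation")
curve(pnorm(x, mean = lambda, sd = sqrt(lambda)), add = TRUE, lty = 2)
```

The three estimates should be close to each other for a rate this large; the normal approximation deteriorates as lambda gets small.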

3-D Graphics