Published by Isabel Summers. Modified over 8 years ago.
1
Lecture 11
Introduction to R and Accessing USGS Data from Web Services
Jeffery S. Horsburgh
Hydroinformatics, Fall 2013
This work was funded by National Science Foundation Grants EPS 1135482 and EPS 1208732.
2
Objectives
Discover and access data from major data sources
Write and execute computer code to automate difficult and repetitive data-related tasks
Manipulate and analyze data using code
Retrieve and use data from Web services
Create reproducible data visualizations
3
Exercise
Download daily mean discharge values for the USGS gage in the Logan River above State Dam, near Logan, UT (USGS 10109000) for the past 10 years (2003-10-01 through 2013-09-30)
Create a time series plot of the data (e.g., in Excel)
Calculate overall summary statistics (min, max, mean)
4
R Demo
Retrieve the USGS discharge data using R
Apply the USGS dataRetrieval package
Create a plot and summary statistics using R code
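The demo steps could look roughly like the sketch below. It assumes the dataRetrieval package is installed and a network connection is available, and it uses the readNWISdv and renameNWISColumns functions from recent package releases, which may differ from the functions shown in the 2013 lecture (newer releases may also steer users toward other functions). The helper names fetch_daily_discharge and summarize_discharge are hypothetical.

```r
# Sketch only: requires the dataRetrieval package and a network connection.
# install.packages("dataRetrieval")   # one-time setup

# Hypothetical helper: download daily mean discharge for one site.
fetch_daily_discharge <- function(site, start, end) {
  daily <- dataRetrieval::readNWISdv(
    siteNumbers = site,
    parameterCd = "00060",   # USGS parameter code for discharge
    startDate   = start,
    endDate     = end
  )
  # Give the value column a readable name (Flow)
  dataRetrieval::renameNWISColumns(daily)
}

# Hypothetical helper: the three summary statistics from the exercise.
summarize_discharge <- function(q) {
  c(min  = min(q, na.rm = TRUE),
    max  = max(q, na.rm = TRUE),
    mean = mean(q, na.rm = TRUE))
}

# The download only runs in an interactive session.
if (interactive()) {
  flow <- fetch_daily_discharge("10109000", "2003-10-01", "2013-09-30")
  plot(flow$Date, flow$Flow, type = "l",
       xlab = "Date", ylab = "Daily mean discharge")
  print(summarize_discharge(flow$Flow))
}
```

Keeping the download and the statistics in separate functions means the statistics can be checked on any numeric vector without touching the web service.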
5
USGS Data Retrieval Package for R
https://github.com/USGS-R/dataRetrieval
Obtain streamflow and water quality sample data from the USGS National Water Information System (NWIS)
Data access is through web services
  – USGS daily discharge data
  – USGS unit discharge values (15-minute data)
  – USGS water quality data
  – EPA STORET water quality data
6
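How Does the Magic Work?
[Diagram: data flow from the National Water Information System (Oracle) to R through the Daily Values Web Service and the Sites Web Service at http://waterservices.usgs.gov]
1. The dataRetrieval R package calls the Daily Values Web Service (GetDVData!)
2. The web service queries the database
3. The web service formats the results as WaterML XML
4. The dataRetrieval R package parses the WaterML into an R data frame (Sweet!)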
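To make the request step concrete, here is a minimal sketch that only builds a Daily Values request URL as a string; it does not contact the service. The helper name build_dv_url is hypothetical, and the path and query parameter names (format, sites, startDT, endDT, parameterCd) are written from memory of the service's REST interface, so they should be checked against the current USGS documentation before use.

```r
# Hypothetical helper: build a Daily Values request URL (no network access).
# The path and query parameter names are assumptions to verify against
# the USGS web service documentation.
build_dv_url <- function(site, start, end, parameter = "00060") {
  paste0(
    "https://waterservices.usgs.gov/nwis/dv/",
    "?format=waterml,1.1",        # ask for WaterML XML
    "&sites=", site,              # USGS site number
    "&startDT=", start,           # first date, yyyy-mm-dd
    "&endDT=", end,               # last date, yyyy-mm-dd
    "&parameterCd=", parameter    # 00060 is discharge
  )
}

url <- build_dv_url("10109000", "2003-10-01", "2013-09-30")
print(url)
```

A package such as dataRetrieval hides this step, along with the XML parsing that follows it.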
7
How do I find sites?
http://maps.waterdata.usgs.gov/mapper/index.html
8
USGS Daily Streamflow Values Data Frame
Name      Definition                                               Units or type
Date      The date                                                 yyyy-mm-dd
Q         The discharge on that date                               m3/sec
Julian    The date expressed as days starting with Jan 1, 1850     days
Month     Month of the year, from 1 to 12                          months
Day       Day of the year, from 1 to 366                           days
DecYear   Year expressed as a decimal                              years
MonthSeq  Month sequence: an index starting with 1 at Jan, 1850    months
LogQ      ln(Q)                                                    numeric
i         Index of days from the start of the data frame           days
Q7        Mean discharge for the 7 days, up to day i               m3/sec
Q30       Mean discharge for 30 days, up to day i                  m3/sec
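Several of these columns can be derived from Date and Q alone. The sketch below uses base R and made-up discharge values, purely for illustration; it is not the package's own code. The helper trailing_mean is hypothetical. Julian and DecYear are left out because their exact conventions are not stated on the slide.

```r
# Made-up discharge values for ten consecutive days (illustration only).
daily <- data.frame(
  Date = seq(as.Date("2003-10-01"), by = "day", length.out = 10),
  Q    = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
)

# Hypothetical helper: mean of the n values ending at each day.
# The first n - 1 entries are NA because the window is incomplete.
trailing_mean <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

year <- as.integer(format(daily$Date, "%Y"))
daily$Month    <- as.integer(format(daily$Date, "%m"))   # 1 to 12
daily$Day      <- as.integer(format(daily$Date, "%j"))   # day of the year
daily$MonthSeq <- (year - 1850L) * 12L + daily$Month     # 1 at Jan 1850
daily$LogQ     <- log(daily$Q)                           # natural log
daily$i        <- seq_len(nrow(daily))                   # index of days
daily$Q7       <- trailing_mean(daily$Q, 7)              # 7-day mean

print(daily)
```

Q30 would follow the same pattern with n = 30, given at least 30 days of data.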
9
What is R?
R is a programming language and software environment for statistical computing and graphics
Wide variety of statistical and graphing techniques
Highly extensible
Free and open source
http://www.r-project.org
10
R
R is an interpreted language and can run interactively
  – R statements are evaluated by the interpreter as they are executed, rather than compiled ahead of time
  – This is flexible, but generally slower than compiled code
11
R Packages and Libraries
Implement many common data analysis and statistical procedures
Provide excellent graphics functionality
Serve as a starting point for many data analysis tasks
A huge community of R developers exists – it's likely that there's an R package for many of the tasks you commonly do
12
R Programming Language
R defaults to a graphical user interface that presents users with a prompt for entering code
Each input expression is evaluated and then a result is returned
13
R Graphical User Interface
14
Simple Mathematical Expressions in R
> 1 + 1        # Simple arithmetic
[1] 2
> 2 + 3 * 4    # Operator precedence
[1] 14
> exp(1)       # Basic mathematical functions are available
[1] 2.718282
> sqrt(10)
[1] 3.162278
15
Variables in R
Numeric – floating point values
Logical (Boolean) – TRUE or FALSE
Character – strings (character sequences)
Types are determined automatically when a variable is created with the assignment operator "<-"
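A quick way to see the automatically assigned type is the class() function. The variable names below are made up for illustration.

```r
x    <- 3.5       # numeric (floating point)
flag <- TRUE      # logical
name <- "Logan"   # character (string)

# class() reports the type R chose for each variable
types <- c(class(x), class(flag), class(name))
print(types)
```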
16
Variables in R
> a <- 1       # Variables are defined
> b <- 30      # using the "<-" operator to set values
> c <- 3.5
> a * b * c
[1] 105
> A * b * c    # Variable names are case sensitive
Error: object 'A' not found
17
Vectors in R
A series of numbers
Created with
  – c() to concatenate elements or sub-vectors
  – rep() to repeat elements or patterns
  – seq() or m:n to generate sequences
Most mathematical functions and operations can be applied to vectors – no looping required!
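To illustrate the "no looping required" point, this small sketch computes the same square roots twice, once with an explicit loop and once with a single vectorized call. The variable names are made up for illustration.

```r
q <- c(2, 0, 0, 4)

# Explicit loop: visit each element in turn
roots_loop <- numeric(length(q))
for (k in seq_along(q)) {
  roots_loop[k] <- sqrt(q[k])
}

# Vectorized: sqrt() is applied to every element in one call
roots_vec <- sqrt(q)

# Both approaches give the same values
print(all.equal(roots_loop, roots_vec))
```

The vectorized form is shorter and is usually the idiomatic choice in R.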
18
R Vectors
> rep(1,10)          # Repeats the number 1 ten times
[1] 1 1 1 1 1 1 1 1 1 1
> seq(1,10)          # Sequence of integers between 1 and 10
[1] 1 2 3 4 5 6 7 8 9 10
> seq(5,20,by=5)     # Every 5th integer from 5 to 20
[1] 5 10 15 20
19
Vector Operations
> x <- c(2,0,0,4)    # Creates a vector with elements 2,0,0,4
> y <- c(1,9,9,9)
> x + y              # Sums elements of 2 vectors
[1] 3 9 9 13
> x * 4              # Multiplies elements
[1] 8 0 0 16
> sqrt(x)            # Function applies to each element and returns a vector
[1] 1.414214 0.000000 0.000000 2.000000
20
Accessing Vector Elements
> x <- c(10,20,30,40,50)    # Create a vector called x
> x[1]                      # Select the first element
[1] 10
> x[1] <- 300               # Set the value of an element in a vector
> x
[1] 300 20 30 40 50
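Square brackets accept more than a single position. As a brief sketch beyond what the slide shows, the same syntax can select a range, drop an element, or filter by a condition. The variable names are made up for illustration.

```r
x <- c(10, 20, 30, 40, 50)

middle    <- x[2:4]      # elements 2 through 4
all_but_1 <- x[-1]       # a negative index drops that element
large     <- x[x > 25]   # a logical condition keeps matching elements

print(middle)
print(all_but_1)
print(large)
```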
21
Data Frames
A group of related vectors
The equivalent of a table in R
Create from scratch using data.frame()
> newDataFrame <- data.frame(height=c(150,160), weight=c(65,72))
> newDataFrame
  height weight
1    150     65
2    160     72
22
Data Frames
Read into R from a text file:
newDataFrame <- read.table("table.txt", header=TRUE)
With header=TRUE, the first line of the file needs to have a name for each column (vector)
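A self-contained way to try this is to write a small text file first and then read it back. The sketch below uses a temporary file in place of table.txt, and its contents are made up for illustration.

```r
# Write a small whitespace-separated file; the first line holds column names
path <- tempfile(fileext = ".txt")
writeLines(c("height weight",
             "150 65",
             "160 72"), path)

# header = TRUE tells read.table to use the first line as column names
newDataFrame <- read.table(path, header = TRUE)

print(newDataFrame)
print(names(newDataFrame))
```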
23
Accessing Data Frames
Multiple ways to retrieve columns of data:
newDataFrame["columnName"] – returns a one-column data frame
newDataFrame[,n] – where n is the column index; returns the column as a vector
newDataFrame$columnName – returns the column as a vector
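A small sketch showing what each access form returns, using a made-up two-column data frame. The variable names as_frame, by_index, and by_name are chosen for illustration.

```r
newDataFrame <- data.frame(height = c(150, 160), weight = c(65, 72))

as_frame <- newDataFrame["height"]   # one-column data frame
by_index <- newDataFrame[, 1]        # plain vector
by_name  <- newDataFrame$height      # plain vector

print(class(as_frame))
print(class(by_index))
print(identical(by_index, by_name))
```

The values are the same in all three cases; only the container differs, which matters when the result is passed to a function that expects a vector.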
24
Lists
Collections of other R objects (e.g., vectors, data frames)
Created with the list function: newList <- list(x = 1, y = 5)
Access to components follows rules similar to data frames:
newList$x – returns the component itself
newList["x"] and newList[1] – return a one-element list containing the component
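The difference between the access forms is easiest to see by inspecting the results. This sketch also uses double brackets, which are not on the slide but are standard R for extracting a single component by name or position.

```r
newList <- list(x = 1, y = 5)

component      <- newList$x        # the component itself
same_component <- newList[["x"]]   # double brackets also return the component
sub_list       <- newList["x"]     # single brackets return a list of length 1

print(component)
print(class(sub_list))
print(length(sub_list))
```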
25
R Workspaces
As you create objects in R, they are added to your current workspace
Use ls( ) to list your workspace contents
Use rm( ) to delete objects from your workspace
When you quit R, you can save the current workspace for later use and pick up where you left off
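A minimal sketch of these workspace commands follows. For safety it saves a single object to a temporary file with save() and restores it with load(); save.image() would write the whole workspace instead. The object names are made up for illustration.

```r
a <- 1
b <- 30
print(ls())    # lists the objects in the workspace, including a and b

rm(a)          # delete a from the workspace
print(exists("a", inherits = FALSE))

# Save b to a temporary file, remove it, then restore it
workspace_file <- tempfile(fileext = ".RData")
save(b, file = workspace_file)
rm(b)
load(workspace_file)
print(b)
```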
26
Summary
R is a general purpose statistical computing environment – it is software and a language
R can get data directly from the USGS using a custom package
R provides a powerful environment for manipulating, analyzing, and visualizing data
Coding analyses in R can make them more reproducible
27
References
GitHub repository with USGS R Tools: https://github.com/USGS-R
GitHub repository with USGS dataRetrieval package: https://github.com/USGS-R/dataRetrieval