R: Packages and Data Retrieval


1 R: Packages and Data Retrieval
Hydroinformatics – Fall 2016

2 Learning Objectives
Describe the difference between packages and libraries in R
Install and load packages
Use documentation and other resources to learn how to use unfamiliar packages
Use data retrieval packages and web services to obtain hydrologic data
Use packages to interact with databases

3 R Packages and Libraries
Package: R functions, data, and compiled code in a well-defined format.
Library: The directory where packages are stored.
Packages implement many common data analysis and statistical procedures, provide excellent graphics functionality, and serve as a starting point for many data analysis tasks.
A huge community of R developers exists, so it is likely that there is an R package for many of the tasks you commonly do.
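
To make the package/library distinction concrete, here is a minimal base-R sketch. `.libPaths()` and `installed.packages()` are standard R functions; the helper name `package_library` is hypothetical and only for illustration.

```r
# A library is a directory; packages are what is installed inside it.
lib_dirs <- .libPaths()          # character vector of library directories
print(lib_dirs)

# installed.packages() returns a matrix with one row per installed package;
# its "LibPath" column records which library each package lives in.
pkgs <- installed.packages()

# Hypothetical helper: report the library directory that holds a package,
# or NA if the package is not installed.
package_library <- function(pkg) {
  if (pkg %in% rownames(pkgs)) unname(pkgs[pkg, "LibPath"]) else NA_character_
}

package_library("stats")   # "stats" ships with R, so this is never NA
```

On a machine with more than one library directory, the same package name can appear in several of them; the helper reports the first match.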

4 Installing and loading Packages
library() # displays available packages
Tools > Install Packages, or install.packages("package_name") # installs a package
library(package_name) # loads package
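
The install and load steps can be combined so that a script only installs what is missing. This is a sketch, not a required pattern; the helper name `load_or_install` is hypothetical.

```r
# Hypothetical helper: attach a package, installing it first if needed.
load_or_install <- function(pkg) {
  # requireNamespace() checks availability without attaching the package.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs a CRAN mirror and network access
  }
  # character.only = TRUE lets library() take the name from a variable.
  library(pkg, character.only = TRUE)
  invisible(pkg %in% .packages())   # TRUE once the package is attached
}

load_or_install("stats")            # already installed, so nothing is downloaded
# load_or_install("dataRetrieval")  # would install from CRAN if missing
```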

5 USGS Data Retrieval Package for R
A collection of functions to help retrieve hydrologic and water quality data from:
the U.S. Geological Survey (USGS), using National Water Information System (NWIS) tools
the U.S. Environmental Protection Agency (EPA)
Data access is through web services.
Try installing and loading the dataRetrieval package.

6 Demo – dataRetrieval We want to automate the retrieval and import of data from a specific site into R using the dataRetrieval package. But how? Use the source, Luke!

7 Demo – dataRetrieval for Gage Height
Try: Retrieve USGS gage height data (parameter code: 00065) for the site “ ” for May 2014 using the readNWISdata function. This package is well documented, and its examples are easy to adapt to specific needs. For example, this code uses the “service” argument: in the package examples it is “iv” in one readNWISdata() call and “site” in another. What do these terms mean, and where can you find out about them?

8 instGage <- readNWISdata(sites=" ", service="iv", parameterCd="00065", startDate=' T00:00Z', endDate=' T00:00Z')
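
One way to keep such a call reusable is to wrap it in a function and build the date strings separately. The sketch below assumes the dataRetrieval package is installed; the helper names `month_range` and `get_gage_height` are hypothetical, the site number is left for the caller to supply, and using the first day of the following month as the end date is an assumption.

```r
# Hypothetical helper: build start/end strings for one calendar month
# in the "YYYY-MM-DDT00:00Z" form used by the call above.
month_range <- function(year, month) {
  fmt <- "%Y-%m-%dT00:00Z"
  start <- as.Date(sprintf("%04d-%02d-01", year, month))
  end <- seq(start, by = "month", length.out = 2)[2]  # first day of next month
  c(start = format(start, fmt), end = format(end, fmt))
}

# Hypothetical wrapper around dataRetrieval::readNWISdata(); `site` is a USGS
# site number supplied by the caller. Nothing is downloaded until it is called.
get_gage_height <- function(site, year, month) {
  rng <- month_range(year, month)
  dataRetrieval::readNWISdata(
    sites = site,
    service = "iv",          # instantaneous values
    parameterCd = "00065",   # gage height
    startDate = rng[["start"]],
    endDate = rng[["end"]]
  )
}

month_range(2014, 5)
```

Calling `get_gage_height()` with a site number, 2014, and 5 would then issue the web service request; it needs network access.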

9 WaterML Package for R
WaterML is a standard information model for representing water observations data, intended to allow such data sets to be exchanged across information systems. The WaterML package can retrieve and analyze data from the HydroServers of multiple organizations that are listed in the CUAHSI Water Data Center catalog. It is a great example of a semester project.

10 Demo - WaterML
In partners, follow the tutorial to retrieve data and fit a linear model between two parameters (dissolved oxygen and temperature). Post your team's results after step 6 to the Google Doc:
* Make sure to specify the column (Temp$DataValue).
* Label the axes, and use different colors or markers.
* Put your team members' names by your plot.
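
The model-fitting step can be sketched in base R once both series are in data frames. The sketch assumes each data frame has a `time` column alongside `DataValue` (the `time` name is an assumption, not taken from the tutorial), the helper name `fit_do_vs_temp` is hypothetical, and the values are made up purely to show the call.

```r
# Hypothetical helper: pair two observation tables on their timestamps and
# fit dissolved oxygen as a linear function of temperature.
fit_do_vs_temp <- function(Temp, DO) {
  both <- merge(Temp[, c("time", "DataValue")],
                DO[, c("time", "DataValue")],
                by = "time", suffixes = c(".temp", ".do"))
  lm(DataValue.do ~ DataValue.temp, data = both)
}

# Made-up values, only to show the call; real data would come from the
# tutorial's retrieval steps.
t0 <- as.POSIXct("2014-05-01 00:00", tz = "UTC")
Temp <- data.frame(time = t0 + 3600 * 0:3, DataValue = c(8, 10, 12, 14))
DO   <- data.frame(time = t0 + 3600 * 0:3, DataValue = c(11, 10, 9, 8))

fit <- fit_do_vs_temp(Temp, DO)
coef(fit)

# Plot the paired values with labeled axes and a colored regression line.
plot(fit$model$DataValue.temp, fit$model$DataValue.do, col = "blue", pch = 16,
     xlab = "Temperature", ylab = "Dissolved oxygen")
abline(fit, col = "red")
```

Merging on the timestamp keeps only observations present in both series, which matters when the two sensors do not report at identical times. Add units to the axis labels once they are known.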

11 RMySQL Package

12 RMySQL Package
Note: You need to actually set the password!
dbDriver tells R which type of database management system you are working with.
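
A minimal sketch of opening a connection, assuming the RMySQL package is installed and a MySQL server is reachable. The helper names, the environment variable `MYSQL_PWD_R`, and the default host and user are all assumptions for illustration; only the database name comes from the slides.

```r
# Hypothetical helper: collect connection settings, reading the password from
# an environment variable instead of hard-coding it in the script.
db_settings <- function(dbname = "LoganRiverODM", host = "localhost",
                        user = "root") {
  password <- Sys.getenv("MYSQL_PWD_R")   # assumed variable name
  if (!nzchar(password)) {
    stop("Set the password first, e.g. Sys.setenv(MYSQL_PWD_R = '...')")
  }
  list(dbname = dbname, host = host, user = user, password = password)
}

# Hypothetical wrapper: open the connection with the MySQL driver.
connect_db <- function(settings = db_settings()) {
  library(RMySQL)            # also attaches DBI
  drv <- dbDriver("MySQL")   # selects the database management system
  dbConnect(drv, dbname = settings$dbname, host = settings$host,
            user = settings$user, password = settings$password)
}

# con <- connect_db()     # needs a running MySQL server
# dbDisconnect(con)       # close the connection when finished
```

Keeping the password out of the script means the file can be shared or committed without exposing credentials.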

13 Interacting with Your Database
Try: Listing all available tables in the database.
Hint 1: Look up functions with ls("package:RMySQL").
Hint 2: Use help("function_name") to get the arguments.
This code was modified from the SQL script that we used to create the tables in the LoganRiverODM database.
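
The function-hunting hint can be narrowed with a pattern search. The sketch below uses base R's `getNamespaceExports()`, which works without attaching the package; the helper name `find_functions` is hypothetical, and the commented lines assume the DBI package is installed.

```r
# Hypothetical helper: list a package's exported names that match a pattern.
find_functions <- function(pkg, pattern) {
  sort(grep(pattern, getNamespaceExports(pkg), value = TRUE))
}

find_functions("stats", "^median")   # base example that runs anywhere
# find_functions("DBI", "List")      # candidates for listing tables or fields
# help("dbListTables")               # then read the arguments
```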

14 Interacting with Your Database
What does the argument “n=-1” mean? (Look at help(dbFetch) for more information on the syntax.) No need to parse the data: the fetch function puts the queried data directly into an R data.frame and automatically assigns column names.
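
The effect of `n` can be seen by fetching the same query twice. In this sketch an in-memory SQLite database (via the RSQLite package) stands in for the MySQL server so that it runs without one; that swap, the helper name `run_query`, and the made-up `demo` table are all assumptions for illustration. The DBI calls are the same ones used with an RMySQL connection.

```r
# Hypothetical helper: run a query and fetch n rows (n = -1 means all rows).
run_query <- function(con, sql, n = -1) {
  res <- DBI::dbSendQuery(con, sql)
  on.exit(DBI::dbClearResult(res))   # always free the result set
  DBI::dbFetch(res, n = n)           # returns a data.frame with column names
}

# Stand-in demonstration with SQLite; table name and values are made up.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  DBI::dbWriteTable(con, "demo", data.frame(id = 1:3, value = c(2.5, 3.1, 2.8)))
  all_rows <- run_query(con, "SELECT * FROM demo")          # n = -1: every row
  first_two <- run_query(con, "SELECT * FROM demo", n = 2)  # only two rows
  print(DBI::dbListTables(con))
  DBI::dbDisconnect(con)
}
```

Fetching a limited number of rows is useful for previewing a large table before pulling all of it into memory.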

15 Homework:
Use one of the data retrieval packages to obtain a time series of streamflow data (any parameter that you are interested in) from any site in Utah, and then use the RMySQL package to obtain a time series of the same parameter for a different site from your LoganRiverODM database. Create a plot and do some basic comparisons/statistical summaries of the datasets.
How does the documentation style of RMySQL compare to that of the USGS dataRetrieval package? To that of the WaterML package? What is most useful?
What is the most confusing thing about using a new package?
How does the creator of a package communicate most effectively with new users?
What other tools do you have (besides the GitHub page) to help you learn how to use a package?

