An Introduction – UCF, Methods in Ecology, Fall 2008
Working In R (with speaker notes)
By Danny K. Hunt & Eric D. Stolen

Presentation transcript:

2 What We Will Learn
  More About Dataframes
  Slicing, Dicing and Sorting Data
  Manipulating Dataframes & Aggregating Data
  Simple Iterative Processing
  Basic Data Visualization

3 More About Dataframes
  Dataframes Reviewed
    – Are objects that consist of rows and columns
    – Closely resemble MS Access or Excel tables
    – Specific requirements for dataframes
        Observations are in rows
        The response variable and explanatory variables are in columns
        All values of the same variable go into the same column

4 More About Dataframes
  Dataframes Reviewed
    – Specific requirements for dataframes (continued)
        Observations are in rows
        The response variable and explanatory variables are in columns
  Observation  Response  Treatment
  Obs1         1.5       A
  Obs2         1.6       B
  Obs3         1.3       C
  …            …         …

5 More About Dataframes
  Dataframes Reviewed
    – Specific requirements for dataframes (continued)
        Same variable results go into the same column
        Good dataframe conventions
  Control  X1  X
  ID  Response  Treatment
  1   1.3       Control
  2   1.1       Control
  3   1.4       Control
  4   1.4       X1
  5   1.3       X1
  6   1.5       X1
  7   1.3       X2
  8   1.6       X2
  9   1.8       X2
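
The long layout on this slide can be built directly in R. The sketch below uses the nine values from the slide's table; the object names `trial` and `treatment_means` are illustrative and not part of the original deck.

```r
# Values copied from the slide's ID / Response / Treatment table.
# `trial` and `treatment_means` are hypothetical names used only here.
trial <- data.frame(
  ID = 1:9,
  Response = c(1.3, 1.1, 1.4, 1.4, 1.3, 1.5, 1.3, 1.6, 1.8),
  Treatment = factor(rep(c("Control", "X1", "X2"), each = 3))
)

# With one observation per row, a per-treatment summary is a single call.
treatment_means <- tapply(trial$Response, trial$Treatment, mean)
print(treatment_means)
```

Keeping every response in one column and the treatment label in another is what lets functions such as `tapply()`, `aggregate()` and `boxplot()` work without reshaping the data first.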

6 More About Dataframes
  Loading Dataframes with Data
    – Use the read.table() family of functions
        read.delim() is particularly simple and efficient for tab-delimited files
    – Load the Snake dataset and get acquainted
        snake <- read.delim("c:\\pract_i-t\\snake_data.txt", row.names="ID")
        names(snake)
        snake[1:5,]
        summary(snake)
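
The snake data file is not reproduced in the slides, so the following sketch reads a small made-up tab-delimited text instead, using the `text` argument that `read.delim()` passes on to `read.table()`. The rows and the name `snake_demo` are assumptions for illustration; only the column names `Name`, `SEX`, `landc`, `mcp` and `times` come from the slides.

```r
# Hypothetical stand-in for snake_data.txt: a header line plus three rows,
# with fields separated by tabs (\t).
raw_text <- paste(
  "ID\tName\tSEX\tlandc\tmcp\ttimes",
  "1\tmabel\t1\t1\t250.5\t45",
  "2\toscar\t2\t3\t120.0\t70",
  "3\tmolly\t1\t2\t310.2\t55",
  sep = "\n"
)

# row.names = "ID" turns the ID column into row names instead of data.
snake_demo <- read.delim(text = raw_text, row.names = "ID")

print(names(snake_demo))   # column names, without ID
print(snake_demo[1:2, ])   # first two rows
print(summary(snake_demo)) # per-column summaries
```

With a real file, the first argument would be the file path, as on the slide.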

7 More About Dataframes
  Loading Dataframes with Data (continued)
    – Identifying columns as categorical variables
        During read operations alphanumeric fields are automatically encoded as factors (the default before R 4.0.0; later versions keep them as character unless stringsAsFactors = TRUE)
        Numeric fields are automatically assumed to be continuous values
        Identify numeric fields as factors using:
          – snake$landc <- factor(snake$landc)
          – snake$SEX <- factor(snake$SEX)
        summary(snake)
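
A small sketch of what the factor conversion changes, using a made-up data frame (`snake_demo` and its values are hypothetical; the column names follow the slides). Before conversion, `summary()` treats the land-cover code as a number; afterwards it counts observations per level.

```r
# Hypothetical data; landc and SEX are numeric codes for categories.
snake_demo <- data.frame(
  Name = c("mabel", "oscar", "molly", "rex"),
  landc = c(1, 3, 2, 1),
  SEX = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

print(summary(snake_demo$landc))  # numeric summary: min, quartiles, mean, max

# Recode the numeric codes as categorical variables.
snake_demo$landc <- factor(snake_demo$landc)
snake_demo$SEX <- factor(snake_demo$SEX)

landc_counts <- summary(snake_demo$landc)  # now a count per level
print(landc_counts)
```

The same `factor()` call also converts a character column, which is useful in R versions that no longer do so automatically on import.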

8 Slicing, Dicing and Sorting Data
  Subscripts
    – Performing data extraction by indexing:
        snake[1:5,]
        snake[40,]
        snake[,1]
        snake[,c(7, 1, 2)]
        snake[-c(10, 20, 30, 40), c("Name", "SEX", "ha")]
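
The indexing forms above follow the pattern `dataframe[rows, columns]`. The sketch below applies them to a five-row made-up data frame so the results are easy to check by eye; `snake_demo` and its values are hypothetical.

```r
# Hypothetical data with the column names used on the slide.
snake_demo <- data.frame(
  Name = c("mabel", "oscar", "molly", "rex", "maud"),
  SEX = c(1, 2, 1, 2, 1),
  ha = c(12.5, 8.0, 20.1, 5.4, 9.9),
  stringsAsFactors = FALSE
)

first_two <- snake_demo[1:2, ]                    # rows 1 and 2, all columns
name_col  <- snake_demo[, 1]                      # first column, as a plain vector
reordered <- snake_demo[, c(3, 1)]                # columns 3 and 1, in that order
trimmed   <- snake_demo[-c(2, 4), c("Name", "ha")] # drop rows 2 and 4, keep two columns

print(trimmed)
```

Selecting a single column returns a vector rather than a data frame; adding `drop = FALSE`, as in `snake_demo[, 1, drop = FALSE]`, keeps the data frame structure.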

9 Slicing, Dicing and Sorting Data
  Subscripts (continued)
    – Performing data extraction by filtering:
        snake[snake$landc == 1,]
        snake[snake$landc %in% c(1, 3),]
        snake[snake$mcp > 200,]
        snake[snake$mcp > 200 & snake$times < 60,]
        snake[grep("^m", snake$Name),]
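
Filtering works by placing a logical test (or a vector of matching row positions) in the row slot. A sketch on made-up data follows; `snake_demo` and its values are hypothetical, chosen so each filter keeps a different subset.

```r
# Hypothetical data with the column names used on the slide.
snake_demo <- data.frame(
  Name = c("mabel", "oscar", "molly", "rex", "maud"),
  landc = c(1, 3, 2, 1, 3),
  mcp = c(250.5, 120.0, 310.2, 90.3, 205.0),
  times = c(45, 70, 55, 30, 65),
  stringsAsFactors = FALSE
)

land1    <- snake_demo[snake_demo$landc == 1, ]            # exact match
land13   <- snake_demo[snake_demo$landc %in% c(1, 3), ]    # match any of several values
big      <- snake_demo[snake_demo$mcp > 200, ]             # numeric threshold
big_few  <- snake_demo[snake_demo$mcp > 200 &
                       snake_demo$times < 60, ]            # both conditions must hold
m_names  <- snake_demo[grep("^m", snake_demo$Name), ]      # names starting with "m"

print(big_few)
```

`grep()` returns row positions while the comparison operators return logical vectors; both are valid row indices.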

10 Slicing, Dicing and Sorting Data
  Sorting Data
    – snake[order(snake$Name),]
    – snake[order(snake$landc, -snake$mcp),]
    – snake[order(-snake$times),][1:10,]
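
`order()` returns the row positions that would sort its arguments, so sorting a dataframe is just indexing with that result. A sketch on made-up data (`snake_demo` and its values are hypothetical):

```r
# Hypothetical data with the column names used on the slide.
snake_demo <- data.frame(
  Name = c("mabel", "oscar", "molly", "rex", "maud"),
  landc = c(1, 3, 2, 1, 3),
  mcp = c(250.5, 120.0, 310.2, 90.3, 205.0),
  times = c(45, 70, 55, 30, 65),
  stringsAsFactors = FALSE
)

by_name     <- snake_demo[order(snake_demo$Name), ]                 # alphabetical
by_land_mcp <- snake_demo[order(snake_demo$landc, -snake_demo$mcp), ] # landc up, mcp down within landc
top2        <- snake_demo[order(-snake_demo$times), ][1:2, ]        # two largest values of times

print(by_land_mcp)
```

The unary minus only reverses numeric columns; for a character column, `order(..., decreasing = TRUE)` does the same job.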

11 Manipulating Dataframes & Aggregating Data
  Adding Columns to a Dataframe
    – Close out and restart R
    – Load the rapid fish dataset
        rapid <- read.delim("c:\\pract_i-t\\rapid_fish.txt", row.names="ID")
        rapid$impoundment <- factor(rapid$impoundment)
        rapid$season <- factor(rapid$season)
        rapid$open_veg <- factor(rapid$open_veg)
        rapid$sea_code <- factor(rapid$sea_code)
        rapid$imp_code <- factor(rapid$imp_code)
        rapid$cov_code <- factor(rapid$cov_code)

12 Manipulating Dataframes & Aggregating Data
  Adding Columns to a Dataframe (cont.)
    – Add "unique" column
        rapid$unique <- paste(rapid$Point, rapid$season, sep = "")
        rapid$unique <- factor(rapid$unique)
    – Add log transformation of count column
        rapid$lncount <- log(rapid$count + 1)
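
Assigning to a column name that does not yet exist creates the column. The sketch below repeats both additions on a four-row made-up data frame (`rapid_demo` and its values are hypothetical; the column names follow the slides).

```r
# Hypothetical data: two sampling points visited in two seasons.
rapid_demo <- data.frame(
  Point = c("P1", "P1", "P2", "P2"),
  season = c("dry", "wet", "dry", "wet"),
  count = c(0, 9, 3, 24),
  stringsAsFactors = FALSE
)

# Combine point and season into one identifier, then treat it as categorical.
rapid_demo$unique <- factor(paste(rapid_demo$Point, rapid_demo$season, sep = ""))

# Adding 1 before taking the log keeps zero counts finite: log(0 + 1) is 0.
rapid_demo$lncount <- log(rapid_demo$count + 1)

print(rapid_demo)
```

Base R's `log1p(x)` computes the same quantity as `log(x + 1)`.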

13 Manipulating Dataframes & Aggregating Data
  Overview of the Rapid Fish Dataset
    – names(rapid)
    – rapid[1:5,]
    – summary(rapid)
  Filtered Summaries
    – summary(rapid[rapid$open_veg == "open",])
    – summary(rapid[rapid$open_veg == "vegetated",])

14 Manipulating Dataframes & Aggregating Data
  Cross Tabulation
    – Using the table() function
        table(rapid$impoundment)
        table(rapid[, c("open_veg", "impoundment")])
        table(rapid[, c("open_veg", "impoundment", "season")])
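
`table()` counts how many rows fall into each level, or each combination of levels. A sketch on six made-up rows (`rapid_demo` and its values are hypothetical):

```r
# Hypothetical data with two categorical columns named as on the slide.
rapid_demo <- data.frame(
  impoundment = factor(c("A", "A", "A", "B", "B", "B")),
  open_veg = factor(c("open", "vegetated", "open", "open", "vegetated", "vegetated"))
)

one_way <- table(rapid_demo$impoundment)                      # rows per impoundment
two_way <- table(rapid_demo[, c("open_veg", "impoundment")])  # open_veg by impoundment

print(one_way)
print(two_way)

# The same counts in long form, one row per combination, with a Freq column.
two_way_long <- as.data.frame(two_way)
print(two_way_long)
```

The long form is what `merge()` works with when counts are combined with other per-group summaries.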

15 Manipulating Dataframes & Aggregating Data
  Data Aggregation
    – Using the aggregate() function
        aggregate(rapid$count, list(impoundment=rapid$impoundment), mean)
        aggregate(rapid$count,
                  list(open_veg=rapid$open_veg, impoundment=rapid$impoundment, season=rapid$season),
                  mean)
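
`aggregate()` splits a column by one or more grouping variables and applies a function to each group; the result column is named `x` when a bare vector is passed. A sketch on made-up data (`rapid_demo` and its values are hypothetical), including the formula interface as an alternative the slides do not show:

```r
# Hypothetical counts for two impoundments over two seasons.
rapid_demo <- data.frame(
  impoundment = factor(c("A", "A", "A", "B", "B", "B")),
  season = factor(c("dry", "wet", "wet", "dry", "dry", "wet")),
  count = c(2, 4, 6, 10, 20, 3)
)

# Mean count per impoundment.
by_imp <- aggregate(rapid_demo$count,
                    list(impoundment = rapid_demo$impoundment), mean)

# Mean count per impoundment and season.
by_both <- aggregate(rapid_demo$count,
                     list(impoundment = rapid_demo$impoundment,
                          season = rapid_demo$season), mean)

# Formula interface: same grouping, result column keeps the name "count".
by_formula <- aggregate(count ~ impoundment + season, data = rapid_demo, FUN = mean)

print(by_imp)
print(by_both)
```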

16 Manipulating Dataframes & Aggregating Data
  Merging Results
    – Using the merge() function
        n <- table(rapid[, c("open_veg", "impoundment", "season")])
        m <- aggregate(rapid$count,
                       list(open_veg=rapid$open_veg, impoundment=rapid$impoundment, season=rapid$season),
                       mean)
        names(m)[4] <- "mean"
        s <- aggregate(rapid$count,
                       list(open_veg=rapid$open_veg, impoundment=rapid$impoundment, season=rapid$season),
                       sd)
        names(s)[4] <- "sd"
        nm <- merge(n, m)
        names(nm)[4] <- "n"
        habimpsea <- merge(nm, s)
        habimpsea
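
The merge steps above join three per-group summaries on their shared grouping columns. The sketch below does the same with two grouping variables on made-up data (`rapid_demo`, `groups` and `group_stats` are hypothetical names); it converts the table to a dataframe explicitly and names the count column at that point.

```r
# Hypothetical data; the vegetated/B combination is deliberately absent.
rapid_demo <- data.frame(
  open_veg = factor(c("open", "open", "vegetated", "vegetated", "open", "open")),
  impoundment = factor(c("A", "A", "A", "A", "B", "B")),
  count = c(2, 4, 10, 20, 5, 7)
)
groups <- list(open_veg = rapid_demo$open_veg,
               impoundment = rapid_demo$impoundment)

# Group sizes in long form, with the count column named "n".
n <- as.data.frame(table(rapid_demo[, c("open_veg", "impoundment")]),
                   responseName = "n")

m <- aggregate(rapid_demo$count, groups, mean)
names(m)[3] <- "mean"

s <- aggregate(rapid_demo$count, groups, sd)
names(s)[3] <- "sd"

# merge() joins on the columns the two inputs have in common.
group_stats <- merge(merge(n, m), s)
print(group_stats)
```

By default `merge()` keeps only groups present in both inputs, so the empty vegetated/B cell that `table()` reports with a count of 0 does not appear in the result; `all = TRUE` would retain it with missing values for the mean and standard deviation.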

17 Simple Iterative Processing
  Repeating a Process
    – Using the for() control-flow construct
        site <- sort(unique(rapid$impoundment))
        for (i in seq_along(site)) {
          print(summary(rapid[rapid$impoundment == site[i],]))
        }
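
A loop like the one above can also collect a result per site instead of printing it. The sketch below stores the mean count for each impoundment in a pre-allocated vector; `rapid_demo`, `site_means` and the values are hypothetical.

```r
# Hypothetical data: two impoundments, two observations each.
rapid_demo <- data.frame(
  impoundment = c("B", "A", "B", "A"),
  count = c(10, 2, 20, 4),
  stringsAsFactors = FALSE
)

site <- sort(unique(rapid_demo$impoundment))

# Pre-allocate one slot per site and label the slots with the site names.
site_means <- numeric(length(site))
names(site_means) <- site

for (i in seq_along(site)) {
  in_site <- rapid_demo$impoundment == site[i]
  site_means[i] <- mean(rapid_demo$count[in_site])
}
print(site_means)

# The same result without an explicit loop.
same_result <- tapply(rapid_demo$count, rapid_demo$impoundment, mean)
```

`seq_along(site)` is used rather than `1:length(site)` because it yields an empty sequence when `site` is empty, so the loop body is skipped instead of running with invalid indices.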

18 Basic Data Visualization
  Visualizing Data
    – Using the plot() function
        plot(as.numeric(rapid$sea_code), rapid$count)
    – Using the boxplot() function
        boxplot(rapid$count~rapid$impoundment)
    – Using the hist() function
        par(mfrow=c(1,2))
        hist(rapid$count)
        hist(rapid$lncount)
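
A sketch of the histogram and boxplot calls on made-up data (`rapid_demo` and its values are hypothetical). It opens a PDF device with no output file so the code also runs non-interactively; in an interactive session the `pdf(NULL)` and `dev.off()` lines can be dropped so the plots appear on screen.

```r
# Hypothetical, strongly right-skewed counts.
rapid_demo <- data.frame(
  impoundment = factor(c("A", "A", "A", "B", "B", "B")),
  count = c(0, 3, 8, 15, 40, 120)
)
rapid_demo$lncount <- log(rapid_demo$count + 1)

pdf(NULL)  # assumption: no screen device is available, so discard the drawing

# Two panels side by side: raw counts and log-transformed counts.
old_par <- par(mfrow = c(1, 2))
h_raw <- hist(rapid_demo$count, main = "count")
h_log <- hist(rapid_demo$lncount, main = "log(count + 1)")
par(old_par)  # restore the single-panel layout

# Formula form of boxplot(): count split by impoundment.
boxplot(count ~ impoundment, data = rapid_demo)

invisible(dev.off())
```

Placing the raw and transformed histograms next to each other makes it easy to judge whether the log transformation reduces the skew in the counts.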

19 The End