Data Management Module: Creating, Adding and Dropping Variables


Programming in R Data Management Module: Creating, Adding and Dropping Variables

Data Management Module: Importing and Exporting; Inputting data directly into R; Creating, Adding and Dropping Variables; Assigning objects; Subsetting and Formatting; Working with SAS Files; Using SQL in R

Managing Variables: Accessing Variables There are two basic ways to access variables: by column number (the variable's position in the data set) or by variable name. PRIESTLEY/STAT4030
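As a minimal sketch of the two access styles, the snippet below builds a small stand-in for the fallsurvey data and pulls the same column by position and by name. Only the column names come from the slides; the values and the column order are assumptions made for illustration.

```r
# Hypothetical stand-in for fallsurvey; values and column order are invented.
fallsurvey <- data.frame(
  Sem...Year = c("Fall 2012", "Fall 2012", "Fall 2013"),
  Adj.GPA = c(3.1, 3.6, 2.9),
  Drinks.before.Noon = c(0, 1, 2),
  Drinks.after.Noon = c(2, 0, 3),
  stringsAsFactors = FALSE
)

# Access by column number (position in the data set)
gpa_by_number <- fallsurvey[, 2]

# Access by variable name, with bracket or $ notation
gpa_by_name <- fallsurvey[, "Adj.GPA"]
gpa_by_dollar <- fallsurvey$Adj.GPA

# All three refer to the same column
identical(gpa_by_number, gpa_by_name)
identical(gpa_by_name, gpa_by_dollar)
```

Access by name keeps working if columns are later reordered, which is one reason it is often preferred over position.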

Managing Variables: Accessing Variables Data frames in R can be indexed like matrices, by row and column (r x c). To access a cell: X[r,c]. To access a column: X[,c]. To access a row: X[r,].

Managing Variables: Accessing Variables For example, this will access all rows of column 1:
fallsurvey[,1]
This will access row 1 and all columns:
fallsurvey[1,]
This will access just the first observation in the first column:
fallsurvey[1,1]
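The three indexing forms can be tried on a small stand-in data frame. This is a sketch only: the values below are invented, and the column order is an assumption. It also shows what type of object each form returns.

```r
# Hypothetical stand-in for fallsurvey; values and column order are invented.
fallsurvey <- data.frame(
  Sem...Year = c("Fall 2012", "Fall 2012", "Fall 2013"),
  Adj.GPA = c(3.1, 3.6, 2.9),
  Drinks.before.Noon = c(0, 1, 2),
  Drinks.after.Noon = c(2, 0, 3),
  stringsAsFactors = FALSE
)

col1 <- fallsurvey[, 1]   # all rows, column 1: returned as a plain vector
row1 <- fallsurvey[1, ]   # row 1, all columns: returned as a one-row data frame
cell <- fallsurvey[1, 1]  # first observation in the first column

# drop = FALSE keeps a single column as a data frame instead of a vector
col1_df <- fallsurvey[, 1, drop = FALSE]

dim(fallsurvey)  # number of rows, then number of columns
```

Selecting a single column silently simplifies the result to a vector; `drop = FALSE` is the usual way to avoid that when later code expects a data frame.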

Managing Variables - Keep There are a few different ways to specify the variables to keep in the data set. We can select columns by column number, or we can keep columns by specifying their names.

Managing Variables - Keep
# Keep based on column number
fallsurvey1 <- fallsurvey[,1:2]
head(fallsurvey1)
str(fallsurvey1)
# Keep based on variable name
fallsurvey2 <- fallsurvey[,c("Sem...Year","Adj.GPA")]
head(fallsurvey2)
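A slightly more defensive version of the keep-by-name step might look like the sketch below. It assumes a small invented stand-in for fallsurvey; the `keep_vars` helper vector is a hypothetical name, not part of the slides.

```r
# Hypothetical stand-in for fallsurvey; values and column order are invented.
fallsurvey <- data.frame(
  Sem...Year = c("Fall 2012", "Fall 2012", "Fall 2013"),
  Adj.GPA = c(3.1, 3.6, 2.9),
  Drinks.before.Noon = c(0, 1, 2),
  Drinks.after.Noon = c(2, 0, 3),
  stringsAsFactors = FALSE
)

# Names of the variables to keep (hypothetical helper vector)
keep_vars <- c("Sem...Year", "Adj.GPA")

# Fail early if a requested name is misspelled or missing
stopifnot(all(keep_vars %in% names(fallsurvey)))

kept_by_name <- fallsurvey[, keep_vars]
kept_by_number <- fallsurvey[, 1:2]

# Both routes select the same columns in this stand-in data
all.equal(kept_by_name, kept_by_number)
```

The two results agree here only because the named columns happen to be columns 1 and 2; the name-based selection does not depend on that.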

Managing Variables - Drop Similarly, there are a few different ways to specify the variables to drop from the data set. We can specify columns by column number, using negative indices:
# Drop based on column number
fallsurvey1 <- fallsurvey[,-2]
head(fallsurvey1)
# Drop columns 2 and 3 by keeping every name except theirs
fallsurvey2 <- fallsurvey[,names(fallsurvey)[c(-2,-3)]]
head(fallsurvey2)
While variables CAN be dropped by referencing the variable name, this is more difficult and will be covered later.
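For reference, one common base-R way to drop columns by name is sketched below. It is not necessarily the approach the course covers later. The data frame is an invented stand-in, and `drop_vars` is a hypothetical helper name.

```r
# Hypothetical stand-in for fallsurvey; values and column order are invented.
fallsurvey <- data.frame(
  Sem...Year = c("Fall 2012", "Fall 2012", "Fall 2013"),
  Adj.GPA = c(3.1, 3.6, 2.9),
  Drinks.before.Noon = c(0, 1, 2),
  Drinks.after.Noon = c(2, 0, 3),
  stringsAsFactors = FALSE
)

# Names of the variables to drop (hypothetical helper vector)
drop_vars <- c("Adj.GPA", "Drinks.before.Noon")

# A minus sign only works on numeric indices, so negate a logical test instead:
# keep every column whose name is NOT in drop_vars
dropped_a <- fallsurvey[, !(names(fallsurvey) %in% drop_vars)]

# Equivalent: keep the set difference of all names and the names to drop
dropped_b <- fallsurvey[, setdiff(names(fallsurvey), drop_vars)]

names(dropped_a)
```

Both forms leave the original `fallsurvey` untouched and return a new data frame, matching the pattern used in the slide code.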

Managing Variables - Renaming To rename a variable, use the names() function. You call the function on the data set, select the element of the names vector that matches the old column name, and assign the new name to it:
names(fallsurvey)[names(fallsurvey)=="Sem...Year"] <- "Sem/Year"
str(fallsurvey)
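The renaming step can be checked on a small stand-in data frame, as in the sketch below. The values are invented, and `old_name` and `new_name` are hypothetical helper variables.

```r
# Hypothetical stand-in for fallsurvey; values and column order are invented.
fallsurvey <- data.frame(
  Sem...Year = c("Fall 2012", "Fall 2012", "Fall 2013"),
  Adj.GPA = c(3.1, 3.6, 2.9),
  stringsAsFactors = FALSE
)

old_name <- "Sem...Year"
new_name <- "Sem/Year"

# Guard against silently renaming nothing when the old name is misspelled
stopifnot(old_name %in% names(fallsurvey))

# The logical test picks out the matching position in the names vector
names(fallsurvey)[names(fallsurvey) == old_name] <- new_name
names(fallsurvey)

# "Sem/Year" is not a syntactic R name, so $ access needs backticks
fallsurvey$`Sem/Year`
```

Without the guard, a misspelled old name makes the logical test all FALSE and the assignment changes nothing, with no warning.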

Managing Variables - New variables Variables can be easily created from other variables available in the data set. Variables can be created in the context of SQL statements. Care must be taken to ensure the resulting variable appears in the expected location.
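One reading of "expected location" is the column position inside the data frame. Under that assumption, the sketch below shows that a column added with `$` is appended as the last column, and that it can be moved by listing the columns in the wanted order. The data frame values and the `new_order` vector are invented for illustration.

```r
# Hypothetical stand-in for fallsurvey; values and column order are invented.
fallsurvey <- data.frame(
  Sem...Year = c("Fall 2012", "Fall 2012", "Fall 2013"),
  Adj.GPA = c(3.1, 3.6, 2.9),
  Drinks.before.Noon = c(0, 1, 2),
  Drinks.after.Noon = c(2, 0, 3),
  stringsAsFactors = FALSE
)

# A column created with $ is appended after the existing columns
fallsurvey$total.drinks <- fallsurvey$Drinks.before.Noon + fallsurvey$Drinks.after.Noon
names(fallsurvey)

# Reorder by listing the column names in the wanted order (hypothetical order)
new_order <- c("Sem...Year", "Adj.GPA", "total.drinks",
               "Drinks.before.Noon", "Drinks.after.Noon")
fallsurvey_reordered <- fallsurvey[, new_order]
names(fallsurvey_reordered)
```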

Managing Variables - New variables Think about the difference between these two sets of code…
# First set
total.drinks <- fallsurvey$Drinks.before.Noon+fallsurvey$Drinks.after.Noon
head(total.drinks)
ls()
# Second set
fallsurvey$total.drinks <- fallsurvey$Drinks.before.Noon+fallsurvey$Drinks.after.Noon
head(fallsurvey$total.drinks)
head(fallsurvey)
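The difference between the two sets of code can be made visible by checking where the result ends up. The sketch below uses an invented stand-in for fallsurvey; `in_df_before` and `in_df_after` are hypothetical helper names.

```r
# Hypothetical stand-in for fallsurvey; values are invented for illustration.
fallsurvey <- data.frame(
  Drinks.before.Noon = c(0, 1, 2),
  Drinks.after.Noon = c(2, 0, 3)
)

# First set: the result is a free-standing vector in the workspace
total.drinks <- fallsurvey$Drinks.before.Noon + fallsurvey$Drinks.after.Noon
exists("total.drinks")                                 # the vector exists on its own
in_df_before <- "total.drinks" %in% names(fallsurvey)  # FALSE: not yet a column

# Second set: the result is stored as a new column of the data frame
fallsurvey$total.drinks <- fallsurvey$Drinks.before.Noon + fallsurvey$Drinks.after.Noon
in_df_after <- "total.drinks" %in% names(fallsurvey)   # TRUE: now a column

head(fallsurvey)
```

The free-standing vector is not tied to the data frame: subsetting or sorting `fallsurvey` later will not carry it along, whereas the column form stays aligned with its rows.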