DATA MANAGEMENT MODULE: Concatenating, Stacking and Merging


HMI 7530 – Programming in R
DATA MANAGEMENT MODULE: Concatenating, Stacking and Merging
Jennifer Lewis Priestley, Ph.D.
Kennesaw State University

DATA MANAGEMENT MODULE
Importing and Exporting
Inputting data directly into R
Creating, Adding and Dropping Variables
Assigning objects
Subsetting and Formatting
Merging, Stacking and Recoding
Using SQL in R

Data Management Module: Concatenating
To “concatenate” means to bring together columns (vectors) of data side by side. In R, this is accomplished with the function cbind:
    Newdata <- cbind(data1, data2)
The result has as many columns as data1 and data2 have combined. Rows are paired by position, so data1 and data2 should have the same number of rows. Note that a “matchkey” is not needed.
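As a minimal sketch of the cbind call above, the example below builds two small data frames and binds them side by side. The data frames and their column names (id, age, score, passed) are hypothetical and chosen only for illustration.

```r
# Hypothetical example data: both data frames have three rows.
data1 <- data.frame(id = 1:3, age = c(34, 51, 27))
data2 <- data.frame(score = c(88, 92, 75), passed = c(TRUE, TRUE, FALSE))

# cbind pairs rows by position; no key column is involved.
Newdata <- cbind(data1, data2)

# The column count is the sum of the two inputs; the row count is unchanged.
dim(Newdata)
names(Newdata)
```

Because rows are paired purely by position, both inputs should already be in the same row order before they are bound.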

Data Management Module: Stacking
To “stack” means to bring together rows of data. In R, this is accomplished with the function rbind:
    Newdata <- rbind(data1, data2)
The result has as many rows as data1 and data2 have combined. Note that data1 and data2 MUST have the same column names. Note that a “matchkey” is not needed.
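A minimal sketch of the rbind call above, using two hypothetical data frames with the same column names (id and score). The second one lists its columns in a different order to show that, for data frames, rbind lines columns up by name.

```r
# Hypothetical example data: same column names in both data frames.
data1 <- data.frame(id = 1:2, score = c(88, 92))
data2 <- data.frame(score = c(75, 60, 81), id = 3:5)  # same names, different order

# rbind appends the rows of data2 beneath the rows of data1,
# matching data frame columns by name.
Newdata <- rbind(data1, data2)

# The row count is the sum of the two inputs; the column order follows data1.
nrow(Newdata)
names(Newdata)
```

If the column names differ between the two data frames, rbind stops with an error instead of stacking them.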

Data Management Module: Merging
To “merge” means to bring together dataframes by matching rows on a shared key column. In R, this is accomplished with the function merge:
    Newdata <- merge(data1, data2, by = "PrimaryKey", all = TRUE)
Note that TRUE is a logical value and is written without quotes. all = TRUE keeps every row from both data1 and data2, filling in NA where one side has no match – essentially an outer join. all = FALSE (the default) keeps only the rows whose key value appears in both data1 and data2 – essentially an inner join. In both cases the result contains the columns of both dataframes. Note that a “matchkey” IS needed.
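A minimal sketch contrasting the two settings of all in merge. The data frames, the key values, and the result names Newdata_outer and Newdata_inner are hypothetical; only the key column name PrimaryKey comes from the slide.

```r
# Hypothetical example data: keys 2 and 3 appear in both data frames,
# key 1 only in data1, key 4 only in data2.
data1 <- data.frame(PrimaryKey = c(1, 2, 3), age = c(34, 51, 27))
data2 <- data.frame(PrimaryKey = c(2, 3, 4), score = c(92, 75, 60))

# Outer join: every key from either side is kept; unmatched cells become NA.
Newdata_outer <- merge(data1, data2, by = "PrimaryKey", all = TRUE)

# Inner join: only keys present in both sides are kept.
Newdata_inner <- merge(data1, data2, by = "PrimaryKey", all = FALSE)

Newdata_outer
Newdata_inner
```

merge also accepts all.x = TRUE or all.y = TRUE to keep every row from only the first or only the second dataframe, which corresponds to a left or right join.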

Data Management Module: Missing Values
At this point, let's recode values using the same logic you would use in Excel:
    IF(condition, value if true, value if false)
In R:
    newvariable <- ifelse(oldvariable test, value if true, value if false)
Here “test” stands for a comparison applied to oldvariable, such as > 10 or == "A".
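A minimal sketch of recoding with ifelse, assuming a hypothetical numeric vector with one missing value. The cutoff of 60, the labels "pass" and "fail", and the name filledvariable are illustrative assumptions, not part of the original slides.

```r
# Hypothetical example data with one missing value.
oldvariable <- c(45, 72, NA, 88, 59)

# Recode to a label based on a test; an NA in the test stays NA in the result.
newvariable <- ifelse(oldvariable >= 60, "pass", "fail")

# Recode missing values themselves by testing with is.na().
filledvariable <- ifelse(is.na(oldvariable), 0, oldvariable)

newvariable
filledvariable
```

ifelse is vectorized, so the test is evaluated for every element of oldvariable at once, much like filling an Excel IF formula down a column.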