Uploading and handling databases


1 Uploading and handling databases
Maria Novosolov

2 Understanding datasets
Example: a dataset can contain a continuous variable, a date variable, an ordinal variable, a case identifier, and a nominal variable.

3 Variables
Types of variables: continuous and categorical (nominal or ordinal).
Categorical variables are stored in R as factors*
*Factors determine how data will be analyzed and presented visually

4 Factors
The function factor() stores the categorical values as a vector of integers in the range [1... k]*
*k is the number of unique values in the nominal variable

5 Example: nominal variables
diabetes <- c("Type1", "Type2", "Type1", "Type1")
diabetes <- factor(diabetes)
The levels are coded 1 = Type1 and 2 = Type2, so the vector is stored as (1, 2, 1, 1).
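To see the integer coding described above, the factor can be inspected directly. This is a minimal sketch using the same `diabetes` vector from the slide; `levels()`, `as.integer()` and `nlevels()` are base R functions.

```r
# Nominal factor: levels are the sorted unique values of the character vector.
diabetes <- c("Type1", "Type2", "Type1", "Type1")
diabetes <- factor(diabetes)

levels(diabetes)      # the k unique categories, in alphabetical order
as.integer(diabetes)  # the underlying integer codes, one per observation
nlevels(diabetes)     # k, the number of unique values
```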

6 Example: ordinal variables
status <- c("Poor", "Improved", "Excellent", "Poor")
status <- factor(status, ordered=TRUE)
The levels are coded 1 = Excellent, 2 = Improved and 3 = Poor, so the vector is stored as (3, 2, 1, 3).

7 Important!
By default, factor levels for character vectors are created in alphabetical order. To change the order, specify the levels explicitly:
status <- factor(status, ordered=TRUE, levels=c("Poor", "Improved", "Excellent"))
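The effect of the `levels` argument can be checked by comparing the integer codes with and without it. The sketch below reuses the `status` vector from the slide; the names `status_default` and `status_ordered` are introduced here only for illustration.

```r
status <- c("Poor", "Improved", "Excellent", "Poor")

# Default: alphabetical levels, so Excellent < Improved < Poor.
status_default <- factor(status, ordered = TRUE)

# Explicit levels give the intended order: Poor < Improved < Excellent.
status_ordered <- factor(status, ordered = TRUE,
                         levels = c("Poor", "Improved", "Excellent"))

as.integer(status_default)  # 3 2 1 3
as.integer(status_ordered)  # 1 2 3 1

# Ordered factors support comparisons that follow the level order.
status_ordered < "Excellent"  # TRUE TRUE FALSE TRUE
```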

8 Data structures

9 Data structures

10 Data structures

11 Data structures

12 Data structures

13 The $ notation
The $ is used to indicate a particular variable from a given data frame.
Example:
> table(patientdata$diabetes, patientdata$status)
The result is a cross-tabulation with the diabetes types (Type1, Type2) as rows and the status levels (Excellent, Improved, Poor) as columns.
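The transcript does not show how `patientdata` was built, so the following sketch assumes it holds the `diabetes` and `status` vectors from the earlier factor examples; the `patientID` column is a hypothetical addition.

```r
# Assumed data frame: built here from the example vectors on slides 5 and 6.
patientdata <- data.frame(
  patientID = 1:4,  # hypothetical case identifier
  diabetes  = factor(c("Type1", "Type2", "Type1", "Type1")),
  status    = factor(c("Poor", "Improved", "Excellent", "Poor"))
)

# $ extracts a single column from the data frame.
patientdata$diabetes

# Cross-tabulate two columns: rows are diabetes types, columns are statuses.
counts <- table(patientdata$diabetes, patientdata$status)
counts
```

Under this assumption, two Type1 patients fall in the Poor column and one in Excellent, while the single Type2 patient is Improved.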

14 The attach() function adds the data frame to the R search path

15 The detach() function removes the data frame from the search path
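A short sketch of how attach() and detach() pair up. The `patients` data frame and its values are hypothetical. The last line uses base R's with(), which is not mentioned in the slides, as one possible alternative that avoids changing the search path.

```r
# Hypothetical data frame for illustration.
patients <- data.frame(age    = c(25, 34, 28, 52),
                       weight = c(60, 72, 65, 80))

attach(patients)        # columns can now be referenced by name alone
mean_age <- mean(age)
detach(patients)        # remove the data frame from the search path again

# Alternative: with() evaluates an expression inside the data frame
# without touching the search path.
mean_weight <- with(patients, mean(weight))
```

Forgetting detach() leaves the columns visible for the rest of the session, which can mask or be masked by other objects with the same names.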

16 Data input

17 Importing data files
You can import data from:
Delimited text files, using read.table()
Comma-separated values files, using read.csv()
Excel spreadsheets, using read.xlsx() from the xlsx package

18 Preparing your table
Things you need to remember in preparing your data for R:
Erase all irrelevant columns
Change the column names to short and easy names
Put NA in all the empty cells
Change all the spaces to "_" (find and replace with Ctrl + H)
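The slide describes cleaning the table in the spreadsheet before import. Some of the same clean-up can also be done in R after import; the sketch below is one possible approach, not the method from the slides. The file contents and column names are hypothetical, while `na.strings`, `check.names` and `gsub()` are standard base R.

```r
# Hypothetical raw file: column names with spaces and one empty cell.
raw_file <- tempfile(fileext = ".csv")
writeLines(c("Species Name,Body Mass",
             "Lizard A,12.5",
             "Lizard B,"), raw_file)

# na.strings turns empty cells into NA;
# check.names = FALSE keeps the column names exactly as written.
dat <- read.csv(raw_file, na.strings = c("", "NA"), check.names = FALSE)

# Replace spaces with underscores in the column names and in the text column.
names(dat) <- gsub(" ", "_", names(dat))
dat$Species_Name <- gsub(" ", "_", dat$Species_Name)

dat
```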

19 Importing from txt
Save your data as a tab-delimited txt file
Set the working directory with setwd()
Use the function read.table()
The read.table() function has many additional options for fine-tuning the data import. See help(read.table) for details
In R:
1. setwd("your path to the file")
2. mydataframe <- read.table("file.name", header=TRUE, sep="delimiter", row.names="name")
Use header=FALSE if the file has no header row; for a tab-delimited file the delimiter is "\t".
Or give the full path instead of setting the working directory:
mydataframe <- read.table("the.path.to.the.file+file.name", header=TRUE, ...)
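The read.table() call above can be tried end to end without a real data file. This sketch writes a small tab-delimited file to a temporary location first; the file contents and the column names `id`, `mass` and `site` are hypothetical.

```r
# Write a small hypothetical tab-delimited file to a temporary location.
txt_file <- tempfile(fileext = ".txt")
writeLines(c("id\tmass\tsite",
             "a1\t12.5\tnorth",
             "a2\tNA\tsouth"), txt_file)

# header = TRUE: the first line holds the column names
# sep = "\t":    fields are separated by tabs
# row.names:     use the "id" column as row names
mydataframe <- read.table(txt_file, header = TRUE, sep = "\t", row.names = "id")

str(mydataframe)
```

Passing the full path, as here, makes the setwd() step unnecessary.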

20 Importing from csv
Save your data in a .csv file
Set the working directory with setwd()
Import your data using read.csv()
In R:
1. setwd("your path to the file")
2. mydataframe <- read.csv("file.name.csv", header=TRUE, sep=",", row.names="name")
Use header=FALSE if the file has no header row. Argument names are case-sensitive, so write sep, not Sep.
Or give the full path instead of setting the working directory:
mydataframe <- read.csv("the.path.to.the.file+file.name.csv", header=TRUE, sep=",", row.names="name")
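read.csv() is a wrapper around read.table() with header=TRUE and sep="," as defaults, so both arguments can usually be left out. The sketch below checks this on a small hypothetical file; the column names and values are made up for illustration.

```r
# Write a small hypothetical csv file to a temporary location.
csv_file <- tempfile(fileext = ".csv")
write.csv(data.frame(name = c("a1", "a2"), mass = c(12.5, 14.1)),
          csv_file, row.names = FALSE)

# read.csv() with its defaults...
from_csv <- read.csv(csv_file, row.names = "name")

# ...reads this file the same way as read.table() with explicit arguments.
from_table <- read.table(csv_file, header = TRUE, sep = ",", row.names = "name")

all.equal(from_csv, from_table)
```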

21 Importing data from Excel
Use the RODBC package to access Excel files. The xlsx package can be used to access spreadsheets in xlsx format.

22 RODBC package
# Install the package
install.packages("RODBC")
# Load the package for use
library(RODBC)
# myfile.xls = your file; mysheet = the sheet in the file
channel <- odbcConnectExcel("myfile.xls")
mydataframe <- sqlFetch(channel, "mysheet")
odbcClose(channel)

23 Useful functions

24 Useful functions
NA: not available
NaN: not a number
The function is.na() allows you to test for the presence of missing values.
Example: if y <- c(1, 2, 3, NA), then is.na(y) returns c(FALSE, FALSE, FALSE, TRUE)
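A few base R calls that build on is.na() are sketched here with the same vector `y`. The variable `z` is introduced only to show how NaN behaves.

```r
y <- c(1, 2, 3, NA)

is.na(y)               # FALSE FALSE FALSE TRUE
sum(is.na(y))          # number of missing values: 1
mean(y)                # NA, because a missing value propagates
mean(y, na.rm = TRUE)  # 2, missing values are dropped first

z <- 0 / 0             # NaN: the result is not a number
is.na(z)               # TRUE, is.na() also flags NaN
is.nan(NA)             # FALSE, NA is missing rather than "not a number"
```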

25 Date values
Dates are typically entered into R as character strings and then translated into date variables that are stored numerically. The function as.Date() is used to make this translation.
Syntax: as.Date(x, "input_format")

26 Date values Input format:

27 Date values
Examples:
strDates <- c("01/05/1965", "08/16/1975")
dates <- as.Date(strDates, "%m/%d/%Y")
myformat <- "%m/%d/%y"
leadership$date <- as.Date(leadership$date, myformat)
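Once converted, the dates behave like numbers, as the following sketch with the same `strDates` vector shows. The last line illustrates the difference between `%Y` (four-digit year) and `%y` (two-digit year); the string "01/05/65" is a hypothetical input, and the century R picks for two-digit years is worth verifying on your own data.

```r
strDates <- c("01/05/1965", "08/16/1975")
dates <- as.Date(strDates, "%m/%d/%Y")

format(dates, "%Y-%m-%d")  # "1965-01-05" "1975-08-16"
as.numeric(dates)          # stored as days relative to 1970-01-01
dates[2] - dates[1]        # difference between the two dates, in days

# With %y the century is inferred, which may not be the one intended.
as.Date("01/05/65", "%m/%d/%y")
```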

28 Type conversions

