
Data in R

General form of data

      ID number   Sex   Weight   Length   Diseased   …
1     12          m     4.53     38.6     0          …
2     56          f     3.61     NA       1          …
3     …           …     …        …        …          …
4     …           …     …        …        …          …
n     91          m     5.17     1        1          …

NOTE: A DATASET IS NOT A MATRIX! WHY?

Datasets in R
A dataset in R is a collection of vectors.
One vector holds the observations of one variable.
The length of the vectors is the number of observations, e.g. the number of sampled individuals.
A row in a dataset is one "data unit", i.e. one observation: the information collected from one individual.
A dataset can be created by joining measurement vectors together with the data.frame() function.
When creating a dataset, names for the variables (vectors) can be given: data = data.frame(ID = aa, Sex = bb, …)
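As a minimal sketch of the points above, the following builds a small dataset from separate vectors. The vector names, the dataset name fish and all values are invented for illustration. The last lines hint at why a dataset is not a matrix: a matrix holds a single data type, so mixing numbers and text turns everything into character strings.

```r
# Hypothetical measurement vectors, one per variable (values invented)
id       <- c(12, 56, 91)
sex      <- c("m", "f", "m")
weight   <- c(4.53, 3.61, 5.17)
len      <- c(38.6, NA, 40.2)   # NA marks a missing observation
diseased <- c(0, 1, 1)

# Join the vectors; the argument names become the variable names
fish <- data.frame(ID = id, Sex = sex, Weight = weight,
                   Length = len, Diseased = diseased)

# In a data frame each column keeps its own type
sapply(fish, class)

# A matrix can hold only one type, so the numbers are coerced to character
m <- cbind(id, sex, weight)
class(m[, "weight"])
```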

Anatomy of a data frame
Structure of a dataset:
– "name of the dataset"$"name of the variable", e.g. data$ID
– In this way you can point to one variable in a dataset
Rows, columns and individual values:
– Column: "name of the dataset"[, "column number of the variable"], e.g. data[, 1] OR data$ID
– Row: "name of the dataset"["row number", ], e.g. data[1, ]
– Cell: "name of the dataset"["row number", "column number"], e.g. data[1, 1] OR data$ID["row number"]
Basic information about a dataset:
– Structure: str("name of the dataset"), e.g. str(data)
– Dimensions: dim("name of the dataset"), e.g. dim(data)
– Basic statistics: summary("name of the dataset"), e.g. summary(data)
-> DEMO 1
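The ways of pointing to columns, rows and cells can be tried out on a small example. This is only a sketch: the dataset name fish and its values are made up for illustration.

```r
# Hypothetical example dataset (values invented for illustration)
fish <- data.frame(ID     = c(12, 56, 91),
                   Sex    = c("m", "f", "m"),
                   Weight = c(4.53, 3.61, 5.17))

fish$ID        # one variable, by name
fish[, 1]      # the same variable, by column number
fish[1, ]      # first row: all variables for one individual
fish[1, 1]     # a single cell: row 1, column 1
fish$ID[1]     # the same cell, via the variable name

str(fish)      # structure: variable names and types
dim(fish)      # dimensions: number of rows, number of columns
summary(fish)  # basic statistics for each variable
```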

Importing and exporting data
Most commonly, data has to be imported into R from Excel.
There are R functions for reading Excel files directly, e.g. read.xls() in the gdata package (loaded with library(gdata)).
But the most universal way to import data into R is to first save it as a .txt file and read it with read.table().
To make this easy:
– Save missing values as NA
– Use . rather than , as the decimal separator
– Separate variables by tabs
– Do not leave empty spaces in variable names or values

Function read.table()
read.table(file, header = TRUE, sep = "\t", …)
– file: e.g. "F:/data.txt"
– sep: e.g. "," or "\t"
– If decimals are separated with a comma, add dec = ","
Success of the data reading can be checked with:
– dim("name of the dataset")
– summary("name of the dataset")
Raw data can also be viewed and edited in R using the data editor: fix("name of the dataset")
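A small, self-contained sketch of reading a tab-separated text file. A temporary file with invented contents stands in for a file saved from a spreadsheet, so the file paths and values here are assumptions, not part of the course material.

```r
# Write a small tab-separated text file to a temporary location
# (contents are invented for illustration)
path <- tempfile(fileext = ".txt")
writeLines(c("ID\tSex\tWeight\tLength",
             "12\tm\t4.53\t38.6",
             "56\tf\t3.61\tNA",
             "91\tm\t5.17\t40.2"),
           path)

# header = TRUE: first line holds the variable names; sep = "\t": tab-separated
fish <- read.table(path, header = TRUE, sep = "\t")

# Check that the data were read as intended
dim(fish)
summary(fish)

# If decimals were saved with a comma, say so with dec = ","
path2 <- tempfile(fileext = ".txt")
writeLines(c("ID\tWeight", "12\t4,53", "56\t3,61"), path2)
fish2 <- read.table(path2, header = TRUE, sep = "\t", dec = ",")
fish2$Weight
```

Values saved as NA are read as missing values by default, which is why saving them that way makes the import easy.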

Exporting data from R
Data can be exported from R using the write.table() function:
write.table("name of the data", "file path", sep = " ", …)
Or using write.csv()
-> DEMO 2
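A possible round trip, sketched with an invented dataset and temporary files. The arguments row.names = FALSE and quote = FALSE are not mentioned on the slide; they are added here as one reasonable choice to keep row numbers and quotation marks out of the exported file.

```r
# Hypothetical example dataset (values invented for illustration)
fish <- data.frame(ID     = c(12, 56, 91),
                   Sex    = c("m", "f", "m"),
                   Weight = c(4.53, 3.61, 5.17))

# Tab-separated text file, without row numbers or quotation marks
txt_path <- tempfile(fileext = ".txt")
write.table(fish, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Comma-separated file
csv_path <- tempfile(fileext = ".csv")
write.csv(fish, csv_path, row.names = FALSE)

# Read the files back to confirm the export
back_txt <- read.table(txt_path, header = TRUE, sep = "\t")
back_csv <- read.csv(csv_path)
dim(back_txt)
dim(back_csv)
```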

Exploring data loaded to R
Once again, summary() and dim() are the first functions to use to investigate the contents and size of the data.
It is important to check that the variable types are correct! (Use e.g. summary() or str() for this.)
For categorically structured data, the tapply() function is very handy:
tapply(target vector, list(factor1, factor2, …), "function to be applied", na.rm = TRUE)
This returns the value of the function for every combination of categories of the factors given in the list.
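To make the tapply() call concrete, here is a sketch with an invented dataset: it computes the mean weight for every combination of sex and disease status. The dataset name and values are assumptions for illustration only.

```r
# Hypothetical example dataset (values invented for illustration)
fish <- data.frame(Sex      = c("m", "f", "m", "f", "m", "f", "f"),
                   Diseased = c(0, 1, 1, 0, 0, 1, 0),
                   Weight   = c(4.53, 3.61, 5.17, NA, 4.80, 3.90, 3.20))

# Check that the variable types are what you expect
str(fish)

# Mean weight for every combination of Sex and Diseased;
# na.rm = TRUE is passed on to mean(), so missing weights are skipped
means <- tapply(fish$Weight, list(fish$Sex, fish$Diseased), mean, na.rm = TRUE)
means
```

The result is a table with one row per category of the first factor and one column per category of the second.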

Some additional utilities
R comes with a large variety of built-in datasets.
– To get a list of those: data()
Pointing to variables without $: attach("name of the dataset"), e.g. attach(data)
To remove this effect: detach("name of the dataset"), e.g. detach(data)
DEMO 3
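A short sketch of these utilities using women, one of the datasets that ships with R (variables height and weight). The choice of dataset is only for illustration.

```r
# data() lists the built-in datasets; storing the result allows a quick peek
built_in <- data()
head(built_in$results[, c("Item", "Title")])

# Pointing to a variable with $
mean(women$height)

attach(women)              # make the variables visible by name
avg_height <- mean(height) # no $ needed while the dataset is attached
detach(women)              # remove the effect of attach()
```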