1 The R Project for statistical computing Eric Fouh, Christopher Poirel CS 5604 Fall 2010.



2 What is R?

3 Usages of R
- statistics system
- data handling and storage facility
- calculations on arrays, in particular matrices
- integrated collection of tools for data analysis
- graphical tool for data analysis
- programming language (called 'S')

4 Structure of R
- R functions and datasets are stored in packages
- R is provided with 25 "standard" packages
- Hundreds of contributed packages (written by different authors) are available

Package name   Description
base           Base R functions
datasets       Base R datasets
graphics       R functions for base graphics
stats          R statistical functions
utils          R utility functions
Matrix         Matrix package
class          Functions for classification
cluster        Functions for cluster analysis

5 R and Information Retrieval

IR concept: Text preprocessing; term weighting, scoring
R package:  tm package. Constructs a term-document matrix using one of the following weighting functions: TF (weightTf) or TF-IDF (weightTfIdf).
            e.g. tdm <- TermDocumentMatrix(crude, control = list(weighting = weightTfIdf, stopwords = TRUE))

IR concept: Vector space model for scoring
R package:  clv package. The dot_product function returns a cosine similarity measure of two vectors.

IR concept: Vector space classification
R package:  class package. Performs k-nearest neighbour classification on a dataset.

IR concept: Hierarchical clustering
R package:  cluster package. Computes agglomerative hierarchical clusters on a dataset.

IR concept: Latent Semantic Indexing
R package:  base package. Performs Singular Value Decomposition on a matrix.
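The cosine similarity used for vector space scoring can be sketched in base R without any contributed package. This is a minimal illustration, not the clv implementation; the function name cosine_similarity and the query and document vectors are hypothetical.

```r
# Cosine similarity of two numeric vectors, using base R only.
cosine_similarity <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Hypothetical term-weight vectors for a query and two documents.
query <- c(1, 1, 0, 0)
doc_a <- c(2, 2, 0, 0)   # points in the same direction as the query
doc_b <- c(0, 0, 3, 1)   # shares no terms with the query

scores <- c(doc_a = cosine_similarity(query, doc_a),
            doc_b = cosine_similarity(query, doc_b))

# Rank documents by decreasing similarity to the query.
print(sort(scores, decreasing = TRUE))
```

Because the measure depends only on the angle between vectors, doc_a scores as a perfect match even though its weights are twice those of the query.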

6 Getting started with R

To start R (from the shell)   $ R
To quit R                     > q()
To see installed packages     > library()
To load a package             > library(class)
To start help                 > help.start()
To create a vector            > x <- c(10.4, 5.6, 3.1, 6.4, 21.7)
To create a matrix            > x <- array(1:20, dim=c(4,5))   # generates a 4 by 5 array filled with the numbers 1 to 20
To display an object          > x
To delete an object           > rm(x)
To load data from a file      > HousePrice <- read.table("houses.data")
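The file-loading step can be tried without the slide's houses.data file, whose contents are not shown. The sketch below writes a small made-up table to a temporary file and reads it back; the column names and values are assumptions chosen for illustration.

```r
# Hypothetical data standing in for the slide's "houses.data" file.
houses <- data.frame(area  = c(120, 85, 200),
                     price = c(250000, 180000, 410000))

# Write it to a temporary file so the example is self-contained.
path <- tempfile(fileext = ".data")
write.table(houses, file = path, row.names = FALSE)

# header = TRUE tells read.table that the first line holds column names.
HousePrice <- read.table(path, header = TRUE)

str(HousePrice)          # inspect the structure of the loaded data frame
mean(HousePrice$price)   # columns are accessed with $
```

If a data file has no header line, read.table's default (header = FALSE) names the columns V1, V2, and so on.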

7 Examples (1): Term-Document Matrix

8 Examples (1)
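The content of the term-document matrix slides did not survive in this transcript. As a stand-in, here is a minimal base R sketch of how such a matrix can be built and TF-IDF weighted by hand. The three documents are made up, no stopwords are removed, and the weighting is a plain tf * log2(N/df), which is an assumption and not necessarily what tm's weightTfIdf computes with its default settings.

```r
# Three hypothetical documents.
docs <- c(d1 = "oil prices rise",
          d2 = "oil demand falls",
          d3 = "prices of gold rise")

# Tokenize: lower-case, then split on anything that is not a letter.
tokens <- strsplit(tolower(docs), "[^a-z]+")
terms  <- sort(unique(unlist(tokens)))

# Raw term frequencies: one row per term, one column per document.
tf <- sapply(tokens, function(tok) {
  tabulate(match(tok, terms), nbins = length(terms))
})
rownames(tf) <- terms

# Inverse document frequency: log2(N / number of documents containing the term).
df    <- rowSums(tf > 0)
idf   <- log2(ncol(tf) / df)
tfidf <- tf * idf   # idf is recycled down each column, i.e. applied per term

print(tf)
print(round(tfidf, 3))
```

Terms that occur in several documents, such as "oil", receive a lower weight than terms confined to a single document, such as "gold".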

9 Examples (2): Eigenvalues and eigenvectors
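The matrix used on this slide is not preserved in the transcript. The sketch below uses a small symmetric matrix chosen for illustration and computes its eigenvalues and eigenvectors with base R's eigen() function.

```r
# A small symmetric matrix (hypothetical; not the one from the slide).
A <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

e <- eigen(A)
print(e$values)    # eigenvalues, in decreasing order
print(e$vectors)   # one unit-length eigenvector per column

# Check the defining property A v = lambda v for the first eigenpair.
v1 <- e$vectors[, 1]
print(all.equal(as.vector(A %*% v1), e$values[1] * v1))
```

The sign of each eigenvector column is arbitrary, so results may differ by a factor of -1 across platforms while remaining valid.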

10 Examples (3)

11 Examples (3): Low-rank approximation

12 Examples (3)

13 Examples (3)
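The worked low-rank approximation from these slides is not preserved in the transcript. A minimal sketch of the idea, using base R's svd() on a small made-up term-document matrix, is shown here; the matrix C and the choice k = 2 are assumptions for illustration.

```r
# Hypothetical 5-term by 6-document incidence matrix.
C <- matrix(c(1, 0, 1, 0, 0, 0,
              0, 1, 0, 0, 0, 0,
              1, 1, 0, 0, 0, 0,
              1, 0, 0, 1, 1, 0,
              0, 0, 0, 1, 0, 1), nrow = 5, byrow = TRUE)

s <- svd(C)   # C = U diag(d) V^T, singular values in decreasing order
k <- 2        # number of singular values to keep

# Rank-k approximation: keep only the k largest singular values.
C_k <- s$u[, 1:k, drop = FALSE] %*%
       diag(s$d[1:k], nrow = k) %*%
       t(s$v[, 1:k, drop = FALSE])

print(round(C_k, 2))

# The Frobenius error equals the root sum of squares of the discarded singular values.
err <- norm(C - C_k, type = "F")
print(err)
```

In Latent Semantic Indexing, documents and queries are then compared in the k-dimensional space spanned by the retained singular vectors instead of the full term space.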

14 Resources
- IIR book (Introduction to Information Retrieval)
- http://www.r-project.org/

Questions?

