1
Bioinformatics for biologists (2)
Dr. Habil Zare, PhD
PI of Oncinfo Lab
Assistant Professor, Department of Computer Science, Texas State University
Presented at University of Texas Health Science Center – San Antonio, 25 March 2015
2
Session 2 Part 1 Pathway and functional analyses (String manipulation in R, InnateDB, MouseMine) Bioinformatics for biologists workshop (2), Dr. Habil Zare, Oncinfo Lab UTHSC San Antonio, 25 March 2016
3
List of proteins
We start with a list of proteins obtained from mass spectrometry (MS). A sample protein_MS.xlsx file is provided in the workshop material. It was exported from Scaffold software and contributed by Dr. Janice Deng.
4
Converting xlsx to csv format
In Excel, click on File>Save as... Choose .csv format.
5
Extracting and saving RefSeq IDs
Open R and run the following. You can copy from the cheat sheet.
## Import the csv file in R.
r1 <- read.csv("protein_MS.csv", stringsAsFactors=FALSE)
## We consider only this column.
numbers <- r1[,"Accession.Number"]
## Extracting and saving RefSeq IDs.
inds <- grep(numbers, pattern="ref")
numbers <- numbers[inds]
numbers <- gsub(numbers, pattern=".*ref\\|", replacement="")
numbers <- gsub(numbers, pattern="\\|.*", replacement="")
## No row or column names. No quotations.
write.table(numbers, file="refseq.csv", row.names=FALSE, col.names=FALSE, sep=",", quote=FALSE)
6
Extracting and saving RefSeq IDs
numbers is one column of r1 (the data frame returned by read.csv), namely Accession.Number.
7
Extracting and saving RefSeq IDs
At each step, type “numbers” at the R prompt to see its current value and follow the process.
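To follow the process without the workshop file, the same calls can be run on a small made-up vector. This is only a sketch: the accession strings below are hypothetical and assume a "gi|...|ref|...|" style, which may differ from what the Scaffold export actually contains.

```r
## Toy accession strings (hypothetical, for illustration only).
numbers <- c("gi|111|ref|NP_000001.1|",
             "gi|222|sp|P00001|",
             "gi|333|ref|NP_000002.2|")

## Keep only the entries that contain "ref".
inds <- grep(numbers, pattern = "ref")
numbers <- numbers[inds]

## Drop everything up to and including "ref|".
numbers <- gsub(numbers, pattern = ".*ref\\|", replacement = "")

## Drop the first remaining "|" and everything after it.
numbers <- gsub(numbers, pattern = "\\|.*", replacement = "")

print(numbers)
```

Printing numbers after each assignment shows how the strings shrink to bare RefSeq IDs.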
8
The csv file
Open the refseq.csv file and make sure it is in the appropriate format, e.g., no row or column names.
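The same check can also be done from R. The sketch below writes two hypothetical IDs to a temporary file with the write.table settings used earlier, then reads the raw lines back; for the real file, pass "refseq.csv" to readLines instead of the temporary path.

```r
## Two hypothetical IDs, written the same way as refseq.csv.
ids <- c("NP_000001.1", "NP_000002.2")
f <- tempfile(fileext = ".csv")
write.table(ids, file = f, row.names = FALSE, col.names = FALSE,
            sep = ",", quote = FALSE)

## Read the raw lines back and inspect them.
lines <- readLines(f)
print(head(lines))

## One ID per line: no quotation marks, no commas, no header line.
no_quotes  <- !any(grepl("\"", lines, fixed = TRUE))
one_column <- !any(grepl(",", lines, fixed = TRUE))
same_ids   <- identical(lines, ids)
```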
9
InnateDB
Use InnateDB to perform pathway and network analysis. Click on Pathway Analysis and then Analysis.
10
Uploading data to InnateDB
Click on Upload a file. Upload the refseq.csv file you created using R.
11
Uploading data to InnateDB
Choose Cross-reference ID. Choose Ensembl and click on OK. Click on Column 1.
12
Pathway analysis
Pathway overrepresentation analysis
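As background, over-representation analysis asks whether a pathway contains more of the uploaded genes than expected by chance, and a hypergeometric test is a common way to answer that. The sketch below uses made-up counts and base R only; it illustrates the general idea and is not a description of InnateDB's exact settings.

```r
## Made-up counts, for illustration only.
background   <- 20000  # genes in the background set
in_pathway   <- 100    # background genes annotated to the pathway
list_size    <- 200    # genes in the uploaded list
overlap_size <- 8      # uploaded genes that fall in the pathway

## P(overlap >= overlap_size) when list_size genes are drawn at random.
p_value <- phyper(overlap_size - 1,
                  m = in_pathway,
                  n = background - in_pathway,
                  k = list_size,
                  lower.tail = FALSE)

## The same probability written as an explicit sum of point probabilities.
p_check <- sum(dhyper(overlap_size:min(in_pathway, list_size),
                      m = in_pathway,
                      n = background - in_pathway,
                      k = list_size))

print(p_value)
```

With these counts the expected overlap is about one gene, so an overlap of eight gives a small p-value.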
13
Settings
Leave the defaults.
14
Pathway analysis results
Hover the mouse over a column to see the overlap with the pathways.
15
Pathway analysis results
Click to choose database.
16
Pathway analysis results
Click on Details to see the overlap.
17
Pathway analysis results
The overlap.
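The overlap is simply the set of uploaded IDs that also belong to the pathway. A minimal sketch in R, using hypothetical IDs for both the uploaded list and the pathway members (real pathway annotations may use a different ID type):

```r
## Hypothetical IDs, for illustration only.
uploaded <- c("NP_000001.1", "NP_000002.2", "NP_000003.3")
pathway  <- c("NP_000002.2", "NP_000003.3", "NP_000009.9")

## IDs present in both the uploaded list and the pathway.
overlap <- intersect(uploaded, pathway)
print(overlap)

## Number of overlapping IDs.
n_overlap <- length(overlap)
```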
18
Other analyses
19
Gene Ontology
20
Gene Ontology results Click to choose.
21
Network analysis
Moving the mouse over nodes highlights interactions.
22
MouseMine
Use MouseMine to see pathway and functional analysis results on one page. Click on advanced to upload the file.
23
UniProt keywords
24
Pathway enrichment
Click on the number of Matches for more details.
25
Pathway enrichment
You can download your favorite table.
26
Pathway enrichment
Click on a protein name for more details.
27
Pathway enrichment
28
References:
InnateDB: Lynn, David J., et al. "InnateDB: facilitating systems-level analyses of the mammalian innate immune response." Molecular Systems Biology 4.1 (2008).
MouseMine: Motenko, H., Neuhauser, S.B., O'Keefe, M., and Richardson, J.E. "MouseMine: a new data warehouse for MGI." Mamm Genome, (7-8):