*Supported by the NSF Plant Genome Research and REU Programs


Tutorial of bioinformatics and tree generation at the Cell Wall Genomics website
Bryan Penning

Bioinformatics Goals
We currently have a wealth of Arabidopsis thaliana cell wall gene information on the website. We wanted to:
Add family information about rice and maize Type II cell walls to compare with A. thaliana Type I cell walls.
Add links to outside information on rice genes, as we have for A. thaliana.
Include annotated composite trees of A. thaliana, rice, and maize gene families.
Add links to the sites used to generate the data.
Add the source protein sequences used for our family trees so other researchers can make their own trees that include their genes of interest.
Generate a tutorial on how researchers can make use of the bioinformatics data on our site.

Diagram of our bioinformatics approach
[Flowchart: Genes from A. thaliana -> Blast TIGR -> Homologous rice genes -> Choose genes -> A. thaliana & rice genes -> Make tree -> Good tree? If yes (Y): Draw rice dendrogram -> Publish to website. If no (N): either "Too few genes, Blast other sites" or "Too many genes, tighten criteria".]
Diagram of the process used to find the genes and draw family trees for cell wall related rice genes. The same approach is used for maize.

Diagram of our bioinformatics approach
[Flowchart: A. thaliana genes, Rice genes, and Maize genes -> Draw tree with all family members -> Annotate -> Publish to web.]
Diagram of the process used to integrate cell wall related genes from all three family trees into a composite tree.

BLASTing genes
To be considerate of the bioinformatics community, given the number of BLASTs to be performed, and to speed the process, we downloaded the text or “flat file” of the TIGR rice protein sequences (available at: http://www.tigr.org/tdb/e2k1/osa1/data_download.shtml) and performed local BLASTs using blastall from NCBI (available at: http://www.ncbi.nlm.nih.gov/BLAST/download.shtml).
Directions for using these tools are available at the above sites and are beyond the scope of this tutorial.
For a small number of BLASTs, you can use web-based methods and common programs such as Word and Excel, plus any of a number of downloadable tree-drawing programs, to make these kinds of trees on your own, even if you are not familiar with programming languages such as Perl that could automate the process.
Although web searches can be more time consuming, they work just as well for a few sequences.

Web BLASTing
For smaller numbers of BLASTs to the rice genome, TIGR provides an excellent Web BLAST at: http://tigrblast.tigr.org/euk-blast/index.cgi?project=osa1
You can also use the new BLAST tool at Gramene, http://www.gramene.org/multi/blastview, for most cereal sequences.
Note: gene model versions sometimes differ between Gramene and TIGR, as one site may update to the latest model before the other.

Web BLASTing
After downloading the protein sequence for Arabidopsis SUD1 (At3g46440) from TIGR, you can BLAST it against the TIGR Rice Pseudomolecules – Protein database using BLASTp.

Web BLASTing
You get a series of “hits” to the gene of interest.
A higher score and a smaller probability indicate a better match to the original gene.
This procedure is followed for all of the genes in a family to gather the best possible hits. The hits are then sorted to remove duplicates, and the best rice matches to the Arabidopsis families are chosen.
You can use NCBI’s blastall tool to run multiple BLASTs at once, as we do for this step.
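The ranking rule on this slide (higher score first, then smaller probability) can be sketched in a few lines of Python. This is only an illustration and not the script used for the site: the `Hit` record, the `rank_hits` function, and the locus names and numbers are made-up placeholders.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLAST hit (hypothetical record layout for this sketch)."""
    subject: str   # name of the matching rice sequence
    score: float   # BLAST score: higher is better
    prob: float    # BLAST probability: smaller is better


def rank_hits(hits):
    """Return hits best-first: highest score, ties broken by smallest probability."""
    return sorted(hits, key=lambda h: (-h.score, h.prob))


# Placeholder hits, not real BLAST output.
hits = [
    Hit("LOC_Os00g00002", 310.0, 2e-84),
    Hit("LOC_Os00g00001", 655.0, 0.0),
    Hit("LOC_Os00g00003", 310.0, 5e-85),
]
ranked = rank_hits(hits)
for hit in ranked:
    print(hit.subject, hit.score, hit.prob)
```

Choosing how many of the top-ranked hits to keep is still a judgment call, as described in the flowchart ("too few genes" or "too many genes").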

Organizing BLASTs
This is a Word document generated by BLASTing SUD1 and SUD2 of Arabidopsis against the TIGR Rice Protein database.
The hits were copied into Word, set to the font Courier New, 9 pt, and saved as a text-only document (to remove the HTML code).
The file was reloaded in Word and converted to a table (Table menu), choosing Other as the separator and entering the character | (Shift + \) to separate the columns.

Organizing BLASTs
The Word file is copied into Excel, and the Data – Sort menu is used to sort by the first column.
This brings all of the same-named genes together (the two highlighted lines, for example).
Duplicate genes are removed from the spreadsheet, and only the far-right column, which holds the LOC_Osxxgxxxxxx tags, is copied back into Word.
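For readers who prefer a script to Word and Excel, the same clean-up can be sketched in Python. The sketch assumes each hit line is separated by | characters and that the last field begins with the LOC_Os tag, as on the slide; the function name `unique_loci` and the sample lines are hypothetical.

```python
def unique_loci(lines):
    """Collect LOC_Os tags from pipe-separated hit lines, dropping duplicates.

    Assumes (as in the slide) that the far-right column starts with the
    LOC_Os tag; anything after it on the line (score, probability) is ignored.
    """
    seen = set()
    loci = []
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 2 or not fields[-1]:
            continue  # not a hit line
        locus = fields[-1].split()[0]
        if locus.startswith("LOC_Os") and locus not in seen:
            seen.add(locus)
            loci.append(locus)
    return loci


# Placeholder lines imitating hits pasted from two BLAST searches.
hit_lines = [
    "model1|protein|hypothetical protein|LOC_Os00g00001   655  0.0",
    "model2|protein|hypothetical protein|LOC_Os00g00002   310  2e-84",
    "model1|protein|hypothetical protein|LOC_Os00g00001   300  1e-80",
    "a line without any separators",
]
loci = unique_loci(hit_lines)
print("\n".join(loci))
```

The printed list has one gene per line, which matches the list of genes the tutorial produces in Word in the next step.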

Organizing BLASTs
You can use the Table menu to convert the table to text (Paragraph Marks) to generate a list of genes.
These genes can be looked up in a downloaded database using the NCBI fastacmd tool (included in the BLAST download tools), or you can search for them one at a time using a web-based database such as the locus name search on TIGR: (http://www.tigr.org/tdb/e2k1/osa1/LocusNameSearch.shtml)
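If fastacmd is not available, a list of genes can also be pulled from a downloaded flat file with a short script. The following is a minimal pure-Python stand-in, not fastacmd itself; it assumes the flat file is in FASTA format with the locus name appearing as a separate word in each header line. The function names and the sample records are hypothetical.

```python
import re


def read_fasta(text):
    """Parse FASTA text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif line and header is not None:
            records[header] += line
    return records


def select_by_locus(records, loci):
    """Keep records whose header contains one of the wanted locus names."""
    wanted = set(loci)
    selected = {}
    for header, sequence in records.items():
        # Split the header on | and whitespace so only whole names match.
        tokens = set(re.split(r"[|\s]+", header))
        if tokens & wanted:
            selected[header] = sequence
    return selected


# Placeholder flat file content (not real rice proteins).
flat_file = """>LOC_Os00g00001|hypothetical protein
MKVL
AAGT
>LOC_Os00g00002|hypothetical protein
MSSP
>LOC_Os00g00003|hypothetical protein
MTTR
"""
picked = select_by_locus(read_fasta(flat_file), ["LOC_Os00g00001", "LOC_Os00g00003"])
for header, sequence in picked.items():
    print(">" + header)
    print(sequence)
```

For a real flat file, read the text with `open(path).read()` before passing it to `read_fasta`.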

Generating a tree
Once you have found all of your sequences, check that each sequence name has a > in front of it (denoting a new sequence name) and that the sequence starts on a new line.
Copy and paste all of your sequences into an alignment program like ClustalW (we use http://align.genome.jp/ from the Kyoto University Bioinformatics Center, but any ClustalW program will work).
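The format check described on this slide can be automated before pasting into ClustalW. The sketch below is one possible way to do it; the function name `fasta_problems` and the sample sequences are illustrative only.

```python
def fasta_problems(text):
    """Return a list of formatting problems found in FASTA text.

    Checks that every record starts with a '>' name line and that the
    sequence itself begins on a new line after the name.
    """
    problems = []
    current = None    # name of the record being read
    has_seq = False   # whether the current record has any sequence yet
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None and not has_seq:
                problems.append(f"record '{current}' has no sequence")
            current = line[1:].strip()
            has_seq = False
            if not current:
                problems.append(f"line {number}: name line is empty")
        else:
            if current is None:
                problems.append(f"line {number}: sequence before any '>' name")
            elif ">" in line:
                problems.append(f"line {number}: '>' in the middle of a line")
            has_seq = True
    if current is not None and not has_seq:
        problems.append(f"record '{current}' has no sequence")
    return problems


good = ">seq_a\nMKVL\n>seq_b\nMSSP\n"
bad = "MKVL\n>seq_a\n>seq_b\nMK>seq_c\n"
print(fasta_problems(good))
for problem in fasta_problems(bad):
    print(problem)
```

An empty list means the text is ready to paste into the alignment program.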

Generating a tree
For our trees we use Slow/Accurate pairwise comparisons and Gonnet for our weight matrix (set in two places on the website).
Click Execute Alignment to get your sequence alignment.
At the end of the alignment page is the information needed by tree-drawing programs.
You can click on clustal.dnd for a quick tree, or take the information after it, a Newick-format tree, copy it into a new Word file, and save it as a text file (include all parentheses).
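Because a missing parenthesis is the easiest mistake to make when copying the Newick text, a quick check can save a failed load in the tree-drawing program. The sketch below tests that the parentheses balance and lists the sequence names in the tree; the function names and the small example tree are made up for illustration, and the name pattern assumes unquoted names without spaces.

```python
import re


def parens_balanced(newick):
    """True if every '(' in the Newick text has a matching ')'."""
    depth = 0
    for char in newick:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def leaf_names(newick):
    """List the sequence (leaf) names, which follow '(' or ',' in Newick."""
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


# Placeholder tree with line breaks, as pasted from an alignment page.
tree = "(\n(At_seq1:0.1,\nOs_seq1:0.2)\n:0.05,\nZm_seq1:0.3);"
print(parens_balanced(tree))
print(leaf_names(tree))
```

If the leaf list is shorter than the number of sequences you aligned, part of the tree text was probably lost during copying.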

Creating a tree
We use the program TreeDyn to generate our trees (available at: http://www.treedyn.org/).
This is an example of the Arabidopsis and rice 1.1 family.
The tree text file was loaded into TreeDyn and the frame enlarged.
The Arabidopsis sequences were colored red by changing the font color to red and using the find panel to find all At* sequences (which turn red).
The scale at the bottom was added by right-clicking on that space and choosing the tree name, annotation, and scale sub-menus.
This square tree is useful for seeing associations between genes of different species.
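The At* search used in the find panel is a simple wildcard match, and the same idea can be tried outside TreeDyn, for example to check which names a pattern will catch before coloring. This sketch does not call TreeDyn; the function `color_for` is hypothetical, and the *_wheat rule is an assumption that follows the naming of the wheat EST used later in this tutorial.

```python
from fnmatch import fnmatchcase


def color_for(name, rules, default="black"):
    """Return the color of the first wildcard rule that matches the name."""
    for pattern, color in rules:
        if fnmatchcase(name, pattern):
            return color
    return default


# Wildcard rules: Arabidopsis names start with "At"; everything else
# keeps the default color (black, used for rice in our trees).
rules = [
    ("At*", "red"),
    ("*_wheat", "blue"),  # assumed naming for a user-added sequence
]

for name in ["At3g46440", "LOC_Os05g29990", "CV523101_wheat"]:
    print(name, color_for(name, rules))
```

The match is case-sensitive, so a name such as "ATPase_like" would not be picked up by At*.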

Square tree example
This is part of the family 1.1 square dendrogram of Arabidopsis, rice, and maize from our website.
The red names are Arabidopsis sequences, the black names are rice, and the green names are maize.
Regions alternate between grey-shaded and white backgrounds (added with Photoshop) to indicate clades of genes with similar sequences, which may be related in function (such as AUD/SUD or GME).

Radial dendrograms
TreeDyn can also draw radial dendrograms, such as the one shown for rice family 1.1.
To do this, right-click on the tree area to bring up the grey box in TreeDyn, choose your tree, then choose Conformation - Radial.
TreeDyn allows you to resize, rotate, and flip clades around (see http://www.treedyn.org/ for detailed tutorials on these processes).
For our site, we export the radial trees as JPEG images.

Finishing a radial dendrogram
The TreeDyn tree JPEG is finished as a Flash file, where the ovals and family names are added (rice family 1.1 shown).
Each individual clade of a family tree is also prepared in TreeDyn, and link buttons are added later in Flash (AUD/SUD-like shown).

Viewing your gene of interest
We provide protein sequence files that you can download and to which you can add your own sequence of interest for comparison with these three species.
Under each tree (family 1.1 shown) is the link “View the protein sequence file”.
Right-click the link and choose Save Target as… to download the sequence file with a filename and location you will remember.
You can do this for each Arabidopsis, rice, and maize family.

Viewing your gene of interest
You may have a sequence you think is related to a particular family, such as the nucleotide interconversion pathway (family 1.1).
For example, according to information from Gramene, the wheat EST CV523101 from GenBank (http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=CV523101) might be related to the TIGR rice gene Os05g29990 in the AUD/SUD clade of family 1.1.

Viewing your gene of interest
You can take the nucleotide sequence and convert it to a protein sequence using a program such as GeneMark: (http://opal.biology.gatech.edu/GeneMark/eukhmm.cgi)
Protein sequence returned:
>CV523101_wheat
IARIFNTYGPRMCIDDGRVVSNFVAQALRKEPLTVYGDGKQTRSFQYVSDLVEGLMRLMEGDHIGPFNLGNPGEFTMLELAKVVQDTIDPNARIEFRENTQDDPHKRKPDITKAKEQLGWEPKIALRDGLPLMVTDFRKRIFGDQDSAATATEG

Viewing your gene of interest
Paste all of the sequences for family 1.1 (Arabidopsis, rice, and maize), plus the wheat EST converted to protein (CV523101_wheat), into a ClustalW program such as http://align.genome.jp/ from the Kyoto University Bioinformatics Center.
Perform the multiple alignment, copy the Newick tree data generated into a new Word file, and save it as a text file as previously shown.

Viewing your gene of interest
Loading the Newick tree from ClustalW into TreeDyn, as previously shown, will allow you to visualize the tree.
The AUD/SUD clade of the tree generated by TreeDyn shows that the wheat EST (in blue) is most closely related to the rice gene Os05g29990 in the AUD clade.
Figure: the AUD/SUD clade of the family 1.1 tree for Arabidopsis (red), rice (black), and maize (green), with a wheat EST (blue) added to demonstrate how you can visualize the relatedness of your own genes using our protein sequences.

Bioinformatics sites used
General
Multiple alignment for trees, ClustalW (http://align.genome.jp/)
Making trees, TreeDyn (http://www.treedyn.org/)
BLASTing, NCBI (http://www.ncbi.nlm.nih.gov/BLAST/)
Proteins translated by GeneMark (http://opal.biology.gatech.edu/GeneMark/eukhmm.cgi)
Rice
Sequence BLAST using TIGR (http://www.tigr.org/tdb/e2k1/osa1/)
Downloading rice protein sequences from TIGR (http://www.tigr.org/tdb/e2k1/osa1/LocusNameSearch.shtml)
Maize
Sequence BLAST using TIGR ZmGI (http://www.tigr.org/tigr-scripts/tgi/T_index.cgi?species=maize)
Sequence BLAST using Gramene (http://www.gramene.org/multi/blastview)