Published by Guadalupe Scripture. Modified over 10 years ago.
1
Weixi Zhong
Mentor: Dr. Andrew Cameron
Center for Computational Regulatory Genomics
California Institute of Technology
2
Set up an accessible database for the E. tribuloides transcriptome
Compare the quality of Eucidaris tribuloides RNA sequence assemblies
Choose the best assembly
Create a sequence database
Create a web interface to access the database
Facilitate future E. tribuloides gene studies
Share findings on the E. tribuloides transcriptome
Extensions after further research (e.g. more search options, feedback, etc.)
Image 1. Image courtesy of http://www.peteducation.com/
3
Strongylocentrotus purpuratus
  Only echinoderm with a fully sequenced genome
  Evolutionarily closer to humans than many other model organisms used in developmental biology
Eucidaris tribuloides
  Distant relative of S. purpuratus (~275 million years)
  Useful in comparative studies
Image 2. Image courtesy of SpBase (http://www.spbase.org/)
4
Gene regulatory differences?
S. purpuratus
E. tribuloides
*Red arrows point to mesenchyme cells, which develop later in E. tribuloides than in other sea urchins; circles indicate the location of the blastopore.
Image 3. Image courtesy of http://www.palaeos.com/
Microscope images of sea urchin gastrula courtesy of Dr. Andrew Cameron
5
No available E. tribuloides genome
Assemble transcriptome: RNA → cDNA → Solexa reads → Velvet assembly
Expression studies
Quality comparison!
Early Et gastrula
Database
6
High-throughput short read sequencing technology
cDNA: ...AGGTCTTAC...
Sequenced reads
7
De novo genome assembly software developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI) in the UK
Contig: the sequence formed by a set of contiguous, overlapping reads
Contigs from a single Velvet run are assumed to be unique and non-overlapping
Information from http://www.ebi.ac.uk/~zerbino/velvet/
SOLEXA reads (each overlaps the next): AGCAT, GCATA, CATAC, ATACC, TACCT, ACCTG, CCTGT, CTGTA, TGTAA
Contig: AGCATACCTGTAA
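The read-to-contig picture on this slide can be reproduced with a toy sketch. This is not Velvet's algorithm: Velvet builds de Bruijn graphs and deals with sequencing errors, repeats, and coverage, none of which is handled here. The sketch assumes error-free, equal-length reads that overlap by all but one base and form a single unbranched chain; the function name `chain_reads` is hypothetical.

```python
def chain_reads(reads):
    """Chain equal-length reads that overlap by (length - 1) bases.

    Toy illustration only; assumes one unbranched chain and no errors.
    """
    unique = list(dict.fromkeys(reads))      # drop duplicate reads, keep order
    k = len(unique[0])
    by_prefix = {r[:k - 1]: r for r in unique}
    suffixes = {r[1:] for r in unique}

    # The first read is the one whose prefix is not the suffix of any read.
    starts = [r for r in unique if r[:k - 1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("reads do not form a single chain")

    contig = current = starts[0]
    seen = {current}
    while current[1:] in by_prefix:
        nxt = by_prefix[current[1:]]
        if nxt in seen:                       # guard against cycles
            break
        contig += nxt[-1]                     # each new read adds one base
        seen.add(nxt)
        current = nxt
    return contig


# Reads taken from the slide.
reads = ["AGCAT", "GCATA", "CATAC", "ATACC", "TACCT",
         "ACCTG", "CCTGT", "CTGTA", "TGTAA"]
print(chain_reads(reads))  # AGCATACCTGTAA
```

Each read after the first contributes exactly one new base, which is why nine 5-base reads yield a 13-base contig.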
8
Assess quality of assembly using length distribution: N50 and 90% complexity calculations
N50: length of the shortest contig such that the summed length of equal or longer contigs constitutes at least 50% of the total length of all contigs*
90% complexity: similar, assuming unique contigs
*N50 definition based on the definition by Jeremy Leipzig (http://jermdemo.blogspot.com/2008/11/calculating-n50-from-velvet-output.html)
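The N50 definition above translates directly into a few lines of code. The following is a minimal sketch, not the script used in the project; the function name `n50` and the example contig lengths are made up for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Walk the contigs from longest to shortest and return the length of the
    contig at which the running total first reaches half of the summed
    length of all contigs.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:   # integer test for "at least 50%"
            return length


# Made-up lengths: total is 54, and 10 + 9 + 8 = 27 reaches half of it.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

A higher N50 means that more of the assembled sequence sits in long contigs, which is why it serves as a quick way to compare assemblies.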
9
Use the S. purpuratus proteome as reference
Map contigs to the proteome
Using the proteome “removes” silent-mutation differences between genes
Record metadata: count of matches, annotated matches, unique matches
GLEAN3_00299:   LEU-MET-TYR-PHE-GLU-GLY-CYS-LEU-LYS
S. purpuratus:  CTC-ATG-TAC-TTC-GAG-GGA-TGC-TTG-AAG
E. tribuloides: TTG-ATG-TAT-TTT-GAA-GGA-TGC-CTG-AAA
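The codon example on this slide can be checked with a short sketch: translating both DNA sequences with the standard genetic code gives the same peptide even though several codons differ. The names `CODON_TABLE` and `translate` are hypothetical, and the sketch uses the standard code only; it does not reproduce the mapping step used in the project.

```python
from itertools import product

# Standard genetic code in the usual compact form (base order T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codons):
    """Translate a list of DNA codons into a one-letter peptide string."""
    return "".join(CODON_TABLE[c] for c in codons)


# Codons copied from the slide.
sp = "CTC-ATG-TAC-TTC-GAG-GGA-TGC-TTG-AAG".split("-")   # S. purpuratus
et = "TTG-ATG-TAT-TTT-GAA-GGA-TGC-CTG-AAA".split("-")   # E. tribuloides

# Codon pairs that differ at the DNA level but encode the same amino acid.
silent = [(a, b) for a, b in zip(sp, et)
          if a != b and CODON_TABLE[a] == CODON_TABLE[b]]

print(translate(sp))  # LMYFEGCLK
print(translate(et))  # LMYFEGCLK
print(len(silent))    # 6
```

Six of the nine codons differ between the two species, yet the protein-level comparison sees an exact match.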
10
Create database using PostgreSQL
  User information table
  Contig information table
  Gene information table
  Contig-gene match information table
  Sequences
Write webpage to access database
  Ability to search using both species
  Display in text and graphical formats
*Sample database site courtesy of Autumn Yuan
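The table layout listed above might look roughly like the following sketch. Every table and column name here is an assumption, since the slides do not show the actual schema, and Python's built-in `sqlite3` module stands in for PostgreSQL so that the example runs without a server. The inserted row uses placeholder values, not real results.

```python
import sqlite3

# Hypothetical schema: one table per item on the slide.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE contigs (
    contig_id INTEGER PRIMARY KEY,
    name      TEXT UNIQUE NOT NULL,
    length    INTEGER,
    coverage  REAL,
    sequence  TEXT
);
CREATE TABLE genes (
    gene_id INTEGER PRIMARY KEY,
    name    TEXT UNIQUE NOT NULL
);
CREATE TABLE contig_gene_matches (
    contig_id    INTEGER REFERENCES contigs(contig_id),
    gene_id      INTEGER REFERENCES genes(gene_id),
    blastx_score REAL,
    e_value      REAL
);
"""


def contigs_for_gene(conn, gene_name):
    """Return (contig name, score, e-value) rows for one gene, best score first."""
    rows = conn.execute(
        """SELECT c.name, m.blastx_score, m.e_value
           FROM contig_gene_matches AS m
           JOIN contigs AS c ON c.contig_id = m.contig_id
           JOIN genes   AS g ON g.gene_id   = m.gene_id
           WHERE g.name = ?
           ORDER BY m.blastx_score DESC""",
        (gene_name,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Placeholder data for illustration only.
conn.execute("INSERT INTO contigs VALUES (1, 'contig_1', 13, 2.0, 'AGCATACCTGTAA')")
conn.execute("INSERT INTO genes VALUES (1, 'GLEAN3_00299')")
conn.execute("INSERT INTO contig_gene_matches VALUES (1, 1, 50.0, 1e-5)")
print(contigs_for_gene(conn, "GLEAN3_00299"))
```

Keeping matches in their own table lets one contig match several genes and one gene match several contigs, which is what searching from either species requires.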
11
Eucidaris tribuloides RNA Sequence Database
Search
Sp genome search results
Contig information popup
Gene match information graphical display
Et contigs search results
Match details popup
Tabular display
Search history
Change display order
12
Conduct research using database information
Share data with researchers through website
Add functionality to website as research findings evolve
13
Special thanks to:
The SoCalBSI faculty and staff
  Dr. Jamil Momand, Dr. Sandy Sharp, Dr. Nancy Warter-Perez, Dr. Wendie Johnston, Dr. Beverly Krilowicz, and Ronnie Cheng
My mentor: Dr. Andrew Cameron
The CCRG staff
  Autumn Yuan, Dong He, Dave Felt
All the SoCalBSI interns
Funded by:
15
Search using either species
Choose search criterion
Enter search terms
Narrow down results
Examples
16
Sp gene name with link to match display page
Official identifier for this sequence in the Sp genome
Link to result page for all Et contigs that match to this gene
Link to SpBase page for this gene
17
Contig name, link to popup with contig information
Contig length
Top SPU matches for this contig
Corresponding genes, if any, with link to display page
Blastx score and e-value
18
Contig name
Contig length
Contig coverage
Contig sequence
19
Gene name
Basic gene information and links to more comprehensive webpages
Change display format
Alignment
Link to popup with detailed alignment
20
Change display format
Tabulated alignment summary
Link to popup with detailed alignment
21
Alignment details
Contig information