Published by Jean Ray. Modified over 9 years ago.
1
July 2015 CSHL
Navigating data at the Saccharomyces Genome Database
Rob Nash, Senior Biocuration Scientist, rnash@stanford.edu
2
Outline
History and background
How to stay current
Basic organization (home page, search, Locus Summary page (LSP))
Tabs: access to detailed information (sequence, Gene Ontology, phenotype, interaction, expression, and regulation)
Data analysis: GO tools, YeastMine basics and use cases
3
About SGD
Totally public, open, non-profit academic group
Funded by the NIH (NHGRI)
Mike Cherry at Stanford is the P.I. (since 1992)
Most of SGD is housed at Stanford, with a few remote curators who work from home
4
Key early decisions
People who understand the biology (Ph.D. biologists) are required to design the database, summarize the literature, etc.
Full-time staff positions are needed for project stability.
Our top priority is to serve the needs of the research community (yeast and other), so communication with users is critically important.
5
SGD Today
Over 1.7 million visits from unique IP addresses over the past year; 175,000 page views per week; worldwide usage
About 15 full-time staff (curators, programmers, system and database administrators)
[Usage chart by country] "Other" represents 30 countries with more than 100 visits, and 49 additional countries with 10-100 visits.
6
SGD Staff, Cherry lab
7
Basic organization of information on the home page
Search
YeastMine
YouTube tutorials
New data and updates
Research spotlight
Upcoming meetings
Analysis and sequence tools
Functional information
Literature
Community
Colleague info
Gene registry
Wiki
Newsletter
Social media: Facebook, Twitter, LinkedIn
8
Elastic search with autocomplete
Gene names (ACT1) => Locus Summary page
Other terms (actin; "act1 *") => Instant Search page
Some IDs go directly to the matching page: 5634, 25721128
Single quotes (OR) vs. double quotes (AND)
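The routing described on this slide can be sketched roughly as follows. This is only an illustration, not SGD's implementation: the function name `route_query`, the return labels, and the regular expressions are all assumptions. A real site would look the query up in its list of known gene names and identifiers rather than guess from its shape, and the slide notes that only some IDs resolve directly.

```python
import re

# Assumed patterns (not taken from SGD's code): yeast standard gene names are
# three letters plus a number (ACT1); systematic ORF names look like YFL039C.
GENE_NAME = re.compile(r"[A-Za-z]{3}\d{1,3}")
SYSTEMATIC_NAME = re.compile(r"Y[A-P][LR]\d{3}[WC](-[A-Z])?", re.IGNORECASE)
NUMERIC_ID = re.compile(r"\d+")


def route_query(query: str) -> str:
    """Return a hypothetical label for the page a query would land on."""
    q = query.strip()
    if NUMERIC_ID.fullmatch(q):
        # All-digit queries are treated as identifiers that may resolve directly.
        return "direct_id"
    if GENE_NAME.fullmatch(q) or SYSTEMATIC_NAME.fullmatch(q):
        # Looks like a gene name: go to the Locus Summary page.
        return "locus_summary"
    # Anything else (keywords, wildcards, quoted phrases) goes to Instant Search.
    return "instant_search"


if __name__ == "__main__":
    for example in ["ACT1", "actin", '"act1 *"', "25721128"]:
        print(example, "->", route_query(example))
```

The shape-based check is deliberately simple; it would misroute any keyword that happens to look like a gene name, which is why a lookup against curated names is the safer design.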
9
Modify your search
Autocomplete (suggestions)
Instant search (predictive results)
Next iteration to include facets!
10
Website redesign: staying current and modern
To store new data and leverage new web development tools, SGD was completely overhauled.
Pages, data transfer methods, and the underlying database schema were restructured, all while keeping the site live and actively curated.
The goal was to make the website faster and easier to maintain.
New visualization methods and a responsive layout.
11
Locus Summary Page
Responsive layout: better for all devices
Organization:
– moved sequence info up, with improved graphics
– some basic protein info
– regulation summary
– improved expression histogram
Navigation has changed:
– sectional navigation bar with back-to-top
– tabs and details link
– new tabs for sequence and locus history
12
What’s behind the tabs?
13
Sequence details
S288C overview
–map
–subfeatures, with coordinates
–sequence (genomic, coding and protein)
Alternative reference strains
–map
–subfeatures, with coordinates
–sequence (genomic, coding and protein)
Other strains
14
Other ref strains
Alternative ref strains
15
Sequence tools
BLASTN, BLASTP
BLASTN vs fungi, BLASTP vs fungi
Strain alignment (YRR1)
Variant viewer (new)
16
Variant viewer
Access from:
1) Sequence (home page navigation bar) -> Strain and species
2) Analyze sequence section of the LSP
3) Resources section of the sequence tab
17
Protein details
Overview
Domains table and location graphic
Shared domains diagram
Post-translational modifications
Physico-chemical properties
External IDs
Resources
18
The Gene Ontology (GO) Project
A collaboration among model organism databases, initiated in 1998 by a consortium of researchers from FlyBase, SGD, and MGD, to improve queries within and across databases.
The problem across databases: “Biologists would rather share their toothbrush than share a gene name. Gene nomenclature is beyond redemption” - Michael Ashburner
19
Neither genetic names nor common names are consistently used
CDC25 (S. cerevisiae) = Son of Sevenless (D. melanogaster) = SOS1 (H. sapiens)
fructose-bisphosphate aldolase = 1,6-diphosphofructose aldolase = D-fructose-1,6-bisphosphate D-glyceraldehyde-3-phosphate-lyase = diphosphofructose aldolase = fructoaldolase = fructose 1,6-diphosphate aldolase = fructose 1-monophosphate aldolase = fructose 1-phosphate aldolase = fructose diphosphate aldolase = fructose-1,6-bisphosphate triosephosphate-lyase = ketose 1-phosphate aldolase = phosphofructoaldolase = zymohexase
20
The solution: GO, a set of three independent structured, controlled vocabularies for describing the molecular function, biological process, and cellular component of gene products
Molecular function: the tasks performed by individual gene products, for example, fructose-bisphosphate aldolase activity or protein serine/threonine kinase activity.
Biological process: the broad biological goals, such as mitosis or DNA replication, that are accomplished by ordered assemblies of molecular functions.
Cellular component: subcellular structures, locations, and macromolecular complexes, such as nucleus, cellular bud tip, and origin recognition complex.
21
GO Annotation Details
GO Summary
Biological Process
Molecular Function
Cellular Component
22
Phenotype details
Use SGD search to locate observables and ALL text
Browsable list of all phenotypes
23
Interaction details
Operations: sort, filter, analyze
24
Expression details
25
SPELL expression tool
See expression of an individual gene in selected dataset(s)
Enter a set of genes and find genes with similar expression profiles (optional filtering by tags)
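To make "similar expression profiles" concrete, here is a minimal sketch that ranks genes by plain Pearson correlation against one query gene. It is not SPELL's algorithm, which the slide does not describe; the function names, the single-gene query (the tool accepts a gene set), and the toy numbers are all made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile has no defined correlation; treat as 0
    return cov / (sd_x * sd_y)


def rank_similar(query, profiles):
    """Return (gene, correlation) pairs for all other genes, most similar first."""
    q = profiles[query]
    scores = [(g, pearson(q, p)) for g, p in profiles.items() if g != query]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy, invented profiles across four conditions (hypothetical gene labels).
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}

if __name__ == "__main__":
    for gene, r in rank_similar("geneA", toy_profiles):
        print(f"{gene}\t{r:.3f}")
```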
26
Regulation details
Overview
Domains/classifications
Targets
Shared GO for targets
Regulators
27
Biochemical Pathways
28
GBrowse
Navigation:
* landmark
* scrolling
* zooming
Selecting:
* tracks
* subtracks
29
Navigation: region (chrVI:48,978..58,977), gene name (CDC28), keyword (invasive growth)
Highlighted rectangle in the overview is the region of the genome displayed in the detail panel
Region panel displays a portion of the genome surrounding the region of interest
Detail panel displays a zoomed-in view that corresponds to the overview selection rectangle
Select tracks:
SGD Annotations: sequence features
Chromatin structure: histone modifications, nucleosome organization
Gene Structure: transcription start sites, 5’ and 3’ UTRs
RNA expression: mRNA, ncRNA, cell cycle
Replication and Recombination: meiotic recombination, origins of replication
Transcription Regulation: transcription factors, RNAPII, preinitiation factors
Analysis: restriction sites
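The region syntax shown above (chromosome, colon, start, two dots, end, with optional thousands separators) can be parsed with a few lines of code. This is a small sketch for working with such strings in your own scripts; `Region` and `parse_region` are hypothetical names, and the pattern is inferred from the single example on the slide rather than from the browser's full grammar.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Coordinates are treated as 1-based and inclusive (an assumption).
        return self.end - self.start + 1


# Assumed shape: chrNAME:START..END, where numbers may contain commas.
_REGION = re.compile(r"(?P<chrom>chr\w+):(?P<start>\d[\d,]*)\.\.(?P<end>\d[\d,]*)")


def parse_region(text: str) -> Optional[Region]:
    """Parse a region string; return None for gene names or keywords."""
    m = _REGION.fullmatch(text.strip())
    if m is None:
        return None
    start = int(m.group("start").replace(",", ""))
    end = int(m.group("end").replace(",", ""))
    return Region(m.group("chrom"), start, end)


if __name__ == "__main__":
    region = parse_region("chrVI:48,978..58,977")
    print(region, region.length)
    print(parse_region("CDC28"))
```

Returning `None` for anything that is not a region leaves room for a caller to fall back to gene-name or keyword lookup, mirroring the three navigation options on the slide.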
30
Data files for download
31
Search full text with Textpresso
32
Genome Snapshot: global questions about the genome and its annotation status