July 2015 CSHL Navigating data at the Saccharomyces Genome Database Rob Nash, Senior Biocuration Scientist

1 July 2015 CSHL Navigating data at the Saccharomyces Genome Database Rob Nash, Senior Biocuration Scientist rnash@stanford.edu

2 July 2015 CSHL Outline: History and background; How to stay current; Basic organization (home page, search, Locus Summary Page [LSP]); Tabs and access to detailed information (sequence, gene ontology, phenotype, interaction, expression and regulation); Data analysis: GO tools, YeastMine basics and use cases

3 July 2015 CSHL About SGD: Totally public, open, non-profit academic group. Funded by the NIH (NHGRI). Mike Cherry at Stanford is the P.I. (since 1992). Most of SGD is housed at Stanford, with a few remote curators who work from home.

4 July 2015 CSHL Key early decisions People who understand the biology (Ph.D. biologists) are required to design the database, summarize the literature, etc. Full-time staff positions are needed for project stability. Our top priority is to serve the needs of the research community (yeast and other), so communication with users is critically important.

5 July 2015 CSHL SGD Today: Over 1.7 million visits from unique IP addresses over the past year; 175,000 page views per week; worldwide usage. About 15 full-time staff (curators, programmers, system and db admins). Figure legend: “Other” represents 30 countries with more than 100 visits, and 49 additional countries with 10-100 visits.

6 July 2015 CSHL SGD Staff, Cherry lab

7 July 2015 CSHL Basic organization of information on the home page: Search; YeastMine; YouTube tutorials; New data and updates; Research spotlight; Upcoming meetings; Analysis and seq. tools; Functional information; Literature; Community; Colleague Info.; Gene registry; Wiki; Newsletter; Social Media: Facebook, Twitter, LinkedIn

8 July 2015 CSHL Elastic search with autocomplete: Gene names (ACT1) => Locus Summary page; other terms (actin; “act1 *”) => Instant Search page; some IDs go directly to their page (5634, 25721128); single quotes (OR) vs double quotes (AND)
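The routing described on this slide can be sketched as a small classifier. This is only a toy model under stated assumptions: the two regular expressions (a standard gene name as three letters plus a number, a systematic ORF name such as YFL039C) and the function name `classify_query` are illustrative and are not taken from SGD's code, and the slide says only that some IDs resolve directly, so treating every all-digit query as an ID is a simplification.

```python
import re

# Assumed patterns (hypothetical, not SGD's implementation):
# standard yeast gene names are three letters plus a number, e.g. ACT1;
# systematic ORF names look like YFL039C.
GENE_NAME = re.compile(r"^[A-Za-z]{3}\d+$")
SYSTEMATIC_NAME = re.compile(r"^Y[A-P][LR]\d{3}[WC](-[A-Z])?$", re.IGNORECASE)


def classify_query(query: str) -> str:
    """Return a label for where a query would land in this toy model."""
    q = query.strip()
    if q.isdigit():
        # Simplification: every all-digit query is treated as an ID.
        return "direct ID lookup"
    if GENE_NAME.match(q) or SYSTEMATIC_NAME.match(q):
        return "Locus Summary page"
    # Free text, quoted phrases and wildcards fall through to search.
    return "Instant Search page"


if __name__ == "__main__":
    for example in ["ACT1", "actin", '"act1 *"', "5634"]:
        print(f"{example!r} -> {classify_query(example)}")
```

The order of the checks matters: IDs are tested first so that a numeric query is never mistaken for free text.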

9 July 2015 CSHL Modify your search Autocomplete (suggestions) Instant search (predictive results) Next iteration to include facets!

10 July 2015 CSHL Website redesign: staying current and modern. To store new data and leverage new web development tools, SGD was completely overhauled. Pages, data transfer methods, and the underlying database schema were restructured, all while keeping the site live and actively curated. The goal was to make the website faster and easier to maintain, with new visualization methods and a responsive layout.

11 July 2015 CSHL Locus Summary Page. Responsive layout: better for all devices. Organization: moved seq. info up + improved graphics; some basic protein info.; regulation summary; improved expression histogram. Navigation has changed: sectional nav. bar with back to top; tabs and details link; new tabs for seq. and locus history.

12 July 2015 CSHL What’s behind the tabs?

13 July 2015 CSHL Sequence details. S288C overview: map; subfeatures, with coordinates; sequence (genomic, coding and protein). Alternative reference strains: map; subfeatures, with coordinates; sequence (genomic, coding and protein). Other strains.

14 July 2015 CSHL Other ref strains Alternative ref strains

15 July 2015 CSHL Sequence tools BLASTN, BLASTP BLASTN vs fungi, BLASTP vs fungi Strain alignment (YRR1) Variant viewer (new)

16 July 2015 CSHL Variant viewer Access from: 1) Sequence (home page navigation bar) -> Strain and species 2) Analyze sequence section of LSP, and 3) resources section of sequence tab

17 July 2015 CSHL Protein details Overview Domains table, and location graphic Shared domains diagram Post-translational modifications Physico-chemical properties External IDs Resources

18 July 2015 CSHL The Gene Ontology (GO) Project A collaboration among model organism databases, initiated in 1998 by a consortium of researchers from FlyBase, SGD, and MGD, to improve queries within and across databases. The problem across databases: “Biologists would rather share their toothbrush than share a gene name. Gene nomenclature is beyond redemption” - Michael Ashburner

19 July 2015 CSHL Neither genetic names nor common names are consistently used. Gene names: CDC25 (S. cerevisiae) = Son of Sevenless (D. melanogaster) = SOS1 (H. sapiens). Enzyme names: fructose-bisphosphate aldolase = 1,6-diphosphofructose aldolase = D-fructose-1,6-bisphosphate D-glyceraldehyde-3-phosphate-lyase = diphosphofructose aldolase = fructoaldolase = fructose 1,6-diphosphate aldolase = fructose 1-monophosphate aldolase = fructose 1-phosphate aldolase = fructose diphosphate aldolase = fructose-1,6-bisphosphate triosephosphate-lyase = ketose 1-phosphate aldolase = phosphofructoaldolase = zymohexase

20 July 2015 CSHL The solution: GO, a set of three independent structured, controlled vocabularies for describing the molecular function, biological process, and cellular component of gene products Molecular function: the tasks performed by individual gene products, for example, fructose-bisphosphate aldolase activity or protein serine/threonine kinase activity. Biological process: the broad biological goals, such as mitosis or DNA replication, that are accomplished by ordered assemblies of molecular functions. Cellular component: subcellular structures, locations, and macromolecular complexes, such as nucleus, cellular bud tip, and origin recognition complex.
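A minimal sketch of the idea, assuming a hand-written lookup rather than the real ontology files: each controlled term belongs to exactly one of the three vocabularies, and free-text synonyms such as those on the previous slide collapse onto one term. The dictionaries and the function name `to_controlled_term` are hypothetical, and mapping an enzyme's common name onto an "activity" term is a simplification made for illustration.

```python
# Terms and aspects are the examples given on the slide.
TERM_ASPECT = {
    "fructose-bisphosphate aldolase activity": "molecular function",
    "protein serine/threonine kinase activity": "molecular function",
    "mitosis": "biological process",
    "DNA replication": "biological process",
    "nucleus": "cellular component",
    "cellular bud tip": "cellular component",
    "origin recognition complex": "cellular component",
}

# A few of the common names from the previous slide, mapped (as an
# illustrative assumption) onto a single controlled term.
SYNONYMS = {
    "zymohexase": "fructose-bisphosphate aldolase activity",
    "fructoaldolase": "fructose-bisphosphate aldolase activity",
    "phosphofructoaldolase": "fructose-bisphosphate aldolase activity",
}


def to_controlled_term(name: str) -> tuple[str, str]:
    """Map a free-text name to (controlled term, vocabulary)."""
    key = name.strip()
    term = SYNONYMS.get(key.lower(), key)
    if term not in TERM_ASPECT:
        raise KeyError(f"no controlled term known for {name!r}")
    return term, TERM_ASPECT[term]


if __name__ == "__main__":
    print(to_controlled_term("Zymohexase"))
    print(to_controlled_term("nucleus"))
```

Because every synonym resolves to the same term, a query for any of the names retrieves the same set of gene products, which is the point of a controlled vocabulary.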

21 July 2015 CSHL GO Annotation Details GO Summary Biological Process Molecular Function Cellular Component

22 July 2015 CSHL Phenotype details Use SGD search to locate observables and ALL text Browsable list of all phenotypes

23 July 2015 CSHL Interaction details. Operations: sort, filter, analyze

24 July 2015 CSHL Expression details

25 July 2015 CSHL SPELL expression tool: See expression of an individual gene in selected dataset(s). Enter a set of genes and find genes with similar expression profiles (optional filtering by tags).
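The second use, finding genes whose expression profiles resemble those of a query set, can be illustrated with a toy ranking. This sketch is not SPELL's actual method; it swaps in a plain Pearson correlation averaged over the query genes, and the gene labels and expression values are made up for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_similar(query_genes, profiles, top=5):
    """Rank non-query genes by mean correlation with the query genes."""
    results = []
    for gene, profile in profiles.items():
        if gene in query_genes:
            continue
        score = sum(pearson(profiles[q], profile) for q in query_genes)
        results.append((gene, score / len(query_genes)))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results[:top]


if __name__ == "__main__":
    # Made-up profiles across four conditions, for illustration only.
    toy_profiles = {
        "geneA": [1, 2, 3, 4],
        "geneB": [2, 4, 6, 8],
        "geneC": [4, 3, 2, 1],
        "geneD": [1, 3, 2, 4],
    }
    for gene, score in rank_similar(["geneA"], toy_profiles):
        print(f"{gene}\t{score:.2f}")
```

Correlation is used rather than a raw distance so that genes with the same shape of response rank together even when their absolute expression levels differ.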

26 July 2015 CSHL Regulation details: Overview; Domains/classifications; Targets; Shared GO for targets; Regulators

27 July 2015 CSHL Biochemical Pathways

28 July 2015 CSHL GBrowse. Navigation: landmark, scrolling, zooming. Selecting: tracks, subtracks

29 July 2015 CSHL Navigation: region (chrVI:48,978..58,977), gene name (CDC28), keyword (invasive growth). The highlighted rectangle in the overview is the region of the genome displayed in the detail panel. The region panel displays a portion of the genome surrounding the region of interest. The detail panel displays a zoomed-in view that corresponds to the overview selection rectangle. Select tracks: SGD Annotations (sequence features); Chromatin structure (histone modifications, nucleosome org.); Gene Structure (transcription start sites, 5’ and 3’ UTRs); RNA expression (mRNA, ncRNA, cell cycle); Replication and Recomb’n (meiotic recomb’n, origins of replication); Transcription Regulation (txn factors, RNAPII, preinitiation factors); Analysis (restriction sites)
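The region syntax in the navigation example (chrVI:48,978..58,977) can be unpacked with a short parser. This is a sketch for working with such strings in your own scripts, not GBrowse code; the `Region` type and `parse_region` function are hypothetical names, and the pattern assumes a "chr" prefix, 1-based inclusive coordinates, and optional thousands separators.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chromosome: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumes coordinates are 1-based and inclusive at both ends.
        return self.end - self.start + 1


REGION_PATTERN = re.compile(
    r"^(?P<chrom>chr\w+):(?P<start>[\d,]+)\.\.(?P<end>[\d,]+)$"
)


def parse_region(text: str) -> Region:
    """Parse a string such as 'chrVI:48,978..58,977' into a Region."""
    match = REGION_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    start = int(match.group("start").replace(",", ""))
    end = int(match.group("end").replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in {text!r}")
    return Region(match.group("chrom"), start, end)


if __name__ == "__main__":
    region = parse_region("chrVI:48,978..58,977")
    print(region, region.length)
```

Gene names and keywords (CDC28, invasive growth) do not match this pattern and raise `ValueError`, so a caller can use that to fall back to a name or keyword search.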

30 July 2015 CSHL Data files for download

31 July 2015 CSHL Search full text with Textpresso

32 July 2015 CSHL Genome Snapshot: global questions about the genome and its annotation status

