Welcome - webinar instructions


1 Welcome - webinar instructions
The webinar will start soon.
- GoToTraining works best in Chrome or, on Linux, Firefox.
- All microphones will be muted while the trainer is speaking.
- If you have a question, please use the chat box at the bottom of the GoToTraining box.
- Please complete the feedback survey, which will launch at the end of the webinar.
- The webinar will be recorded and added to Train online.

2 An Introductory Webinar Wojtek Bazant & Faye Rodgers

3 Outline Why WormBase ParaSite? Our genomes Data available BioMart
Questions

4 Why WormBase ParaSite? Helminths (parasitic roundworms and flatworms) are the causative agents of many diseases of humans, animals and plants Increasing amounts of genomic data are becoming available to the helminth research community WormBase ParaSite processes and presents that data in a consistent and accessible way

5 Analyses run for all genomes
Genomes and primary annotation (from the community)
Analyses run for all genomes:
- Protein domain prediction
- GO term annotation
- Repeat annotation
- ncRNA annotation
- Alignment of publicly available RNASeq data
- Linking IDs to external databases
Comparative analysis:
- Build gene trees incorporating all genomes in the release (plus comparators) to predict orthologues and paralogues.
Website - browsing:
- Gene and species pages
- JBrowse
Website - tools:
- BLAST
- BioMart
- REST API

6 Structure and features of the front page

7 Our genomes

8 Genome and species descriptions

9 Finding information related to your scientific question
If you know the gene name or ID, it’s just a search task! Otherwise, it’s more like research. Common avenues:
- BLAST the sequence
- Text search to try to match a gene description
- Search through a protein feature or GO term
- Navigate through an orthologous gene in other species

10 Data available for each gene

11 Transcript and protein pages

12 Data available for each gene

13 “Region in detail” - embedded genome browser

14 Alternative genome browser – JBrowse
Better for a workbench view with multiple tracks

15 Data available for each gene

16 Links and references - UniProt etc.

17 Literature

18 Comparative Genomics
Gene trees are computed with every release, classifying genes into families. These are reconciled with the species trees to infer orthologous and paralogous relationships. Tree views can be configured for exploring the gene family.
Tree legend: speciation node, duplication node.

19 Comparative Genomics
E.g., highlight all of the paralogues:

20 Comparative Genomics
Orthologues and paralogues are also available in tabular format:
- Lists can be exported from BioMart
- Full gene trees can be accessed programmatically via the API
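As a rough illustration of programmatic gene tree access, the sketch below builds a request URL and walks a returned tree. It assumes the WormBase ParaSite REST service lives at `https://parasite.wormbase.org/rest` and exposes an Ensembl-style `genetree/member/id` endpoint returning nested JSON with a `children` key; both the base URL and the endpoint path are assumptions to check against the API documentation, and `count_leaves` is a hypothetical helper.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the REST service; verify against the API docs.
BASE_URL = "https://parasite.wormbase.org/rest"


def genetree_url(gene_id, base_url=BASE_URL):
    """Build the URL for the gene tree that contains `gene_id`.

    The endpoint path is assumed to follow the Ensembl REST convention.
    """
    path = "/genetree/member/id/" + urllib.parse.quote(gene_id, safe="")
    return base_url.rstrip("/") + path + "?content-type=application/json"


def fetch_genetree(gene_id):
    """Download and decode the gene tree JSON (needs network access)."""
    with urllib.request.urlopen(genetree_url(gene_id), timeout=30) as response:
        return json.load(response)


def count_leaves(node):
    """Count leaf nodes (genes) under a tree node given as nested dicts."""
    children = node.get("children", [])
    if not children:
        return 1
    return sum(count_leaves(child) for child in children)


if __name__ == "__main__":
    # Smp_035270 is one of the gene IDs quoted on the BioMart slide.
    print(genetree_url("Smp_035270"))
```

If the response follows the Ensembl layout, the root node would sit under a top-level `tree` key, so `count_leaves(fetch_genetree("Smp_035270")["tree"])` would give the size of the gene family.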

21 BioMart
A very powerful tool for accessing data in bulk without any programming knowledge.
Filters: the data type you’re basing your query on, e.g.:
- Genome
- Genomic region
- A list of gene IDs
- All genes annotated with a protein domain or a GO term
- All genes that have an orthologue in a species
Values: the actual data you’re basing your query on, e.g.:
- Schistosoma mansoni PRJEA36577
- Schistosoma mansoni Sm_V7_1
- Smp_035270, Smp_010250, Smp_244010…
- SignalP
- Genes with an orthologue in Schistosoma haematobium
Attributes: the data you want, e.g.:
- Protein stable IDs
- cDNA sequences
- UniProt IDs
- Protein domains
- Orthologue names, % identity
Filters can be combined to build more complex queries.

22 BioMart
Walk-through example: using BioMart to retrieve S. mansoni genes from the ZW chromosome that have an orthologue in both S. japonicum and S. haematobium. We want to return the S. mansoni, S. haematobium and S. japonicum gene IDs.

23 To access BioMart from the home page

24 Add a species filter

25 Add a region filter

26 Add homology filters

27 Count how many genes fulfil our filter criteria

28 Select output attributes

29 Previewing the results we get by default

30 Add orthologues to output attributes

31 Scroll down to find the species that we’re interested in

32 View a preview of your output, and download full results.
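The same walk-through could, in principle, be scripted by sending a BioMart XML query to the mart service instead of clicking through the web form. The sketch below only builds the XML. The dataset name `wbps_gene`, the schema name `parasite_mart`, and every filter and attribute name used here are placeholders assumed for illustration; the real names are easiest to obtain from the XML button on the BioMart results page.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(filters, attributes,
                        dataset="wbps_gene", schema="parasite_mart"):
    """Return a BioMart XML query string.

    `filters` is a list of dicts holding the XML attributes of each
    <Filter> element; `attributes` is a list of attribute names.
    The default dataset and schema names are assumptions.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": schema,
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    dataset_el = ET.SubElement(
        query, "Dataset", {"name": dataset, "interface": "default"})
    for filter_attrs in filters:
        ET.SubElement(dataset_el, "Filter", dict(filter_attrs))
    for name in attributes:
        ET.SubElement(dataset_el, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


if __name__ == "__main__":
    # All names and values below are hypothetical placeholders.
    walkthrough_filters = [
        {"name": "species_filter", "value": "s_mansoni_placeholder"},
        {"name": "chromosome_filter", "value": "ZW"},
        # BioMart boolean filters use excluded="0" to mean "only".
        {"name": "has_s_japonicum_orthologue", "excluded": "0"},
        {"name": "has_s_haematobium_orthologue", "excluded": "0"},
    ]
    walkthrough_attributes = [
        "gene_id",
        "s_japonicum_orthologue_gene_id",
        "s_haematobium_orthologue_gene_id",
    ]
    print(build_biomart_query(walkthrough_filters, walkthrough_attributes))
```

Each filter narrows the gene set and each attribute adds an output column, which mirrors the filter and attribute steps of the web interface.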

33 BioMart Other examples of questions that can be answered with BioMart:
For a list of gene IDs:
- Convert to other types of identifier (UniProt, RefSeq, NCBI)
- Retrieve associated protein domains, GO terms
- Retrieve their genomic coordinates
- Generate FASTA files of protein, cDNA, UTR, flanking region sequences etc.
Retrieve a list of genes that:
- Have a given protein domain/GO term
- Have/do not have orthologues in species X, Y, Z
- Are on genomic region X
For R users, WormBase ParaSite BioMart supports the biomaRt R package: see our help and documentation pages to get started.

34 Outline Why WormBase ParaSite? Our genomes Data available BioMart
Questions

35 Outline
Why WormBase ParaSite? Our genomes Data available BioMart Questions
If we don’t get to your question: email parasite-help@sanger.ac.uk

36 Sample question
I need the sequences for a set of Schistosoma mansoni genes. I have the chromosome, start, and stop for each.
The suggested option
Other, more creative approaches?
- download the GFF and the sequence files from the FTP, and write a program
- check the cases one by one
- use the API: first the “region” endpoint to get gene IDs, then the “sequence” endpoint
- the helpdesk (it might work)
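The "write a program" approach might look roughly like the sketch below, which cuts a region out of a genome FASTA given chromosome, start and stop. It assumes 1-based inclusive coordinates (as in GFF), takes the first word of each FASTA header as the sequence name, and ignores strand; the function names and the toy sequence are illustrative only.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping sequence name to sequence."""
    sequences = {}
    name = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            header = line[1:].split()
            # Use the first word of the header as the sequence name.
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


def extract_region(sequences, chromosome, start, stop):
    """Return the forward-strand sequence for a 1-based inclusive region."""
    sequence = sequences[chromosome]
    if not 1 <= start <= stop <= len(sequence):
        raise ValueError(f"region {chromosome}:{start}-{stop} is out of range")
    return sequence[start - 1:stop]


if __name__ == "__main__":
    # Toy data for illustration, not a real S. mansoni sequence.
    toy_genome = parse_fasta(">chr_toy example\nACGTACGTAC\nGGTTAACC\n")
    for chromosome, start, stop in [("chr_toy", 3, 6), ("chr_toy", 9, 12)]:
        print(chromosome, start, stop,
              extract_region(toy_genome, chromosome, start, stop))
```

Genes on the reverse strand would additionally need reverse-complementing, and holding a whole genome in memory is only reasonable for modest genome sizes.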

37 BioMart Example 2 Using BioMart to generate a protein FASTA file from a list of gene IDs

38 Select filter(s).

39 Paste in gene IDs.

40 In output attributes, select “Retrieve sequence”

41 Select the type of sequence we’re interested in.
Select the information we’d like in the FASTA header.

42 Preview and download output.

43 Upcoming webinars
See the full list of upcoming webinars at
Don’t forget! Please fill in the survey that launches after the webinar – thanks!

