EuPathDB: an integrated resource and tool for eukaryotic pathogen bioinformatics Aurrecoechea C., Heiges M., Warrenfeltz S. for the EuPathDB team CTEGD,


EuPathDB: an integrated resource and tool for eukaryotic pathogen bioinformatics
Aurrecoechea C., Heiges M., Warrenfeltz S. for the EuPathDB team
CTEGD, University of Georgia, Athens, GA USA

ABSTRACT: EuPathDB is an integrated bioinformatics database covering several eukaryotic pathogens. The genera represented are Cryptosporidium, Encephalitozoon, Entamoeba, Enterocytozoon, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma, and the newly added Theileria and Babesia. Each of these groups is supported by a taxon-specific database and web interface that can be accessed independently of EuPathDB. EuPathDB provides a portal to all these databases and the opportunity to leverage orthology for searches across genera. The databases are updated and expanded about every 2 months, providing online access to the latest genome-scale datasets, including complete genome sequences, annotations, and functional genomics data such as proteomics, microarray, RNA-Seq, ChIP-chip, SAGE and EST data.

The specific advantage of the EuPathDB databases lies in the graphical search interface that allows users to combine datasets while building a search strategy. Multistep search strategies are built one step at a time, choosing from more than 100 searches. The latest EuPathDB release debuts a search for DNA motifs and a method of combining searches based on relative genomic location. This new operation allows the results of successive steps to be combined based on each feature's location relative to other features. Parameters defining upstream/downstream distances and gene overlap restrict the search results in a way that highlights biologically relevant relationships such as antisense transcription and promoter sharing. The merger of EuPathDB's user-friendly search strategy system with full and up-to-date databases offers researchers a powerful tool for data mining during computational experiments.

Species covered:
E. dispar, E. histolytica, E. invadens
C. hominis, C. muris, C. parvum
G. lamblia, G. assemblage_B, G. assemblage_E
E. cuniculi, E. intestinalis, E. bieneusi, N. parisii, O. bayeri
B. bovis, T. annulata, T. parva
P. berghei, P. chabaudi, P. falciparum, P. gallinaceum, P. knowlesi, P. vivax, P. yoelii
N. caninum, T. gondii
T. vaginalis
C. fasciculata, L. braziliensis, T. cruzi, L. infantum, L. major, L. mexicana, T. vivax, L. tarentolae, T. brucei, T. congolense

● Quick access to ID and text search options, login, contact, twitter, etc.
● Portal to EuPathDB databases by clicking on icons.
● Main Header Tab Bar: mouse over 'New Search' to initiate searches; click 'My Strategies' to enter your workspace.
● Initiate searches from center panels. Over 100 search types available.
● Identify Genes by: look for genes based on a variety of datasets, including whole genome sequence, coding vs non-coding genes, transcript evidence (microarray, EST), exon count, etc.
● Identify Other Data Types: look for ESTs, SNPs or DNA motifs.
● Tools: access tools like Blast and PubMed from any EuPathDB home page.

Taxon-specific databases provide access to the latest available genome-scale datasets. Because they are built with the same web architecture, search types and functions are the same across all databases.

Searches return a list of IDs (genes, ESTs, SNPs, proteins) that satisfy the conditions of your query parameters. This gene search for protein coding genes in P. falciparum returned 5418 gene IDs.
● Graphical representation of your search strategy. Each step can be revised by clicking on the step name.
● Filter table showing the distribution of gene IDs across all species in the database.
● Results table with ID as the first column. Columns can be added, changed, deleted or sorted. The entire table can be downloaded as Excel or other formats.
● Click on the ID name to access details in that ID's record page.

New Search Type: DNA Motif Pattern
Search for DNA motifs such as restriction enzyme sites or transcription factor binding sites.
1. Choose Genomic segments, DNA Motif Pattern.
2. Initiate the search. It will find all occurrences of GAATTC in the genome.
3. The search generates a step, and the results below it list the genomic segment IDs corresponding to the locations of the EcoRI site: one segment ID for each occurrence of GAATTC in the genome.

Building search strategies:
1. Run a query, choosing from more than 100 searches.
   ● Build strategies for several data types: genes, ESTs, SNPs, ORFs, etc.
2. Add a step: run a second query, combining its results with previous searches.
   ● Query the results of Step 1 based on functional genomics.
   ● Nest strategies to build complexity.
3. Add more steps…

New way of combining searches based on relative genomic location:
1. Run a search. This search for all protein coding genes in P. falciparum returned 5418 genes.
2. Add a step. The second search here, based on DNA motif, searches for the EcoRI restriction enzyme site.
3. Combine search results using the co-location function.
4. Carefully consider the 5 user-defined parameters in the logic statement of the co-location function:
   i. Return IDs from either step 1 or 2.
   ii. Define the relative location ("Region") of the returned data type. Search the exact region, upstream, or downstream of the returned data type.
   iii. Define the relationship between the step 1 and step 2 results' regions: contains, overlaps, or is contained in.
   iv. Define the relative location ("Region") of the other (non-returned) step result.
   v. Define the strand to be considered in the operation: either, same or both.
5. View results. The results table lists 214 IDs of genes whose upstream 500 bp region contains the EcoRI site. The column 'Matched Regions' gives the genomic location of the EcoRI site matched to each gene.

The graphical search interface motivates users to prioritize search results based on a variety of data types. The search strategy system provides the opportunity to explore and identify biologically meaningful relationships.
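The motif search and the co-location step described above can be illustrated with a small, self-contained sketch. This is not EuPathDB's implementation or API. The names `find_motif`, `Gene`, `upstream_region` and `genes_with_upstream_motif`, the toy sequence, and the gene coordinates are all hypothetical, chosen only to show the logic. The sketch assumes 0-based half-open coordinates, the "contains" relationship, and the "either" strand setting.

```python
from dataclasses import dataclass

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str) -> list[tuple[int, int, str]]:
    """Return (start, end, strand) for every occurrence of motif in seq.

    Coordinates are 0-based, half-open. Overlapping hits are reported.
    A palindromic site such as GAATTC reads the same on both strands,
    so it is searched once and reported on the '+' strand only.
    """
    patterns = {"+": motif}
    rc = reverse_complement(motif)
    if rc != motif:
        patterns["-"] = rc
    hits = []
    for strand, pattern in patterns.items():
        pos = seq.find(pattern)
        while pos != -1:
            hits.append((pos, pos + len(pattern), strand))
            pos = seq.find(pattern, pos + 1)
    return sorted(hits)


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record: 0-based half-open span plus strand."""
    gene_id: str
    start: int
    end: int
    strand: str


def upstream_region(gene: Gene, distance: int, seq_len: int) -> tuple[int, int]:
    """Region of `distance` bp upstream of the gene, clipped to the sequence."""
    if gene.strand == "+":
        return max(0, gene.start - distance), gene.start
    return gene.end, min(seq_len, gene.end + distance)


def genes_with_upstream_motif(genes, hits, distance, seq_len):
    """Map gene ID -> motif hits fully contained in the gene's upstream region."""
    matched = {}
    for gene in genes:
        lo, hi = upstream_region(gene, distance, seq_len)
        inside = [(s, e) for s, e, _ in hits if lo <= s and e <= hi]
        if inside:
            matched[gene.gene_id] = inside
    return matched


# Toy 60 bp "genome" with two EcoRI sites and three made-up genes.
genome = (
    "AAGAATTCAA"          # EcoRI site upstream of geneA
    "ATGCCCTTTGGGTAA"     # geneA, '+' strand
    "CCCCCCCCCC"          # spacer containing geneC
    "TTACCCAAAGGGCAT"     # geneB, '-' strand
    "TTGAATTCTT"          # EcoRI site upstream of geneB
)
genes = [
    Gene("geneA", 10, 25, "+"),
    Gene("geneB", 35, 50, "-"),
    Gene("geneC", 30, 34, "+"),
]

hits = find_motif(genome, "GAATTC")                                   # step 2
matched = genes_with_upstream_motif(genes, hits, 10, len(genome))     # step 3
for gene_id, regions in matched.items():
    print(gene_id, regions)
```

In this sketch the returned data type is the gene (parameter i), its region is the upstream window (ii), the relationship is "contains" (iii), and the other step's region is the exact motif span (iv). A 10 bp window stands in for the 500 bp window used on the poster so that the toy sequence stays short.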