SEQUENCE RETRIEVAL SYSTEM SRS Ashwin Sivakumar, 02/12/03 Hands on Workshop on Protein Analysis.

Presentation transcript:

SEQUENCE RETRIEVAL SYSTEM SRS Ashwin Sivakumar, 02/12/03 Hands on Workshop on Protein Analysis (HOW)

Temporary session Permanent session Database Information - which are present -when indexed Documentation List of public servers

What is SRS? Central resource for molecular biology data Data retrieval system - more than 250 databanks have been indexed. More than 35 SRS servers over the WWW Data analysis applications server - 11 protein applications - 6 nucleic acid applications - Uniform query interface on the web

Data Jungle Structural biology molecular biology genetics medicine Sequencing information physiology toxicology gene expression

History of SRS Main author Dr. Thure Etzold –Development started in EMBL, Heidelberg 1997 –Moved to EBI in Cambridge. Development work was supported by various grants, amongst others from the EMBnet –Etzold and his group join Lion Bioscience

Why SRS? Information retrieval –Easy way to retrieve information from sequence and sequence-related databases –Possibility to search for multiple words/other criteria Linkage between different databases –E.g. Find all primary structures with known three-dimensional structure... and much more

Philosophy of SRS Original database file -plain text, html, xml Data Retrieval Searchable links between database entries Index file parsed

Temporary Projects Queries and views are stored by the project manager temporarily Temporary sessions last 24 hours Useful when you: –Do not need to keep your results –look something up quickly –Run an occasional application Click on ‘Start’ paw on SRS start page

Permanent Projects Queries and views are stored by the project manager in a single location They are available for use in the future Useful when: –You want to return to a session –Want to have many projects in the same session Begin by clicking ‘Permanent session’ paw on SRS start page –Just need to enter an SRS user name and re-enter this to return to same session again later

The Library Select Page (screenshot labels: Workbenches, Query Forms, Library groups, Libraries)

SRS main toolbar tabs Top Page: displays databases in different database groups Query: displays either the standard or extended query form Results or “the query manager”: maintains a history of all the results obtained during a session Projects or “the project manager”: maintains a history of all queries and views used during a session Views: allows a user to define a user specific view for one or more databases Databanks: contains a list and some facts about the databases available in the system

Search terms in SRS SRS indexed fields can be searched using any of the following: –Single word search –Multiple word phrases –Numbers and dates –Regular expressions –Wildcards

Search methods Quick search button: –Works by searching all datafields of type text –The quickest way to generate query results –For very general/broad searches Example: get all mouse and mouse related proteins in SWISS-PROT All Entries button: –Returns all entries in the database selected Search forms : allow you to specify your area of interest in more detail –Standard query form –Extended query form

Standard query form Enter up to 4 separate search terms against up to 4 datafields simultaneously Combine entries with logical operators ( and & or | butnot ! ) Choose the number of entries to display per page Retrieve entries of type (entry or subentry(name)) Choose a view –use an SRS predefined view –create one of your own by selecting specific fields from a dropdown menu (and choose whether to view a list or table in SRS7)

Query Fields Predefined views User defined views The Standard Query Page

Extended query form Can enter search terms for as many fields as you want Combine searches with logical operators ( and & or | butnot ! ) Choose how many results to display per page Choose view and sequence format to use –Can choose an SRS predefined view –Define your own view by clicking the boxes next to the fields that you want to have displayed (list or table option in SRS7) Each field name has a hyperlink to the description page for that field Form provides a less-than comparison for numerical fields Choose what type of entries to retrieve (entry, subentry (name)) – on extended form if you query a subentry field, it defaults to returning results of type subentry

Extended query page Fields Predefined views User defined view

Differences in these 2 forms Ranges – standard must use ‘:’ –extended provides a less-than comparison for numerical fields Type retrieval – standard defaults to retrieving entries of type ‘entry’ –extended defaults to retrieving entries of type entry unless you query a subentry field in which case the default is the subentry type Controlled vocabulary fields –standard does not provide you with a list for these fields –extended provides a drop down menu for these fields allowing you to select an option

Wildcards These are useful when: –Searching for a group of words (eg. Words starting ‘cell’ and ending ‘ase’ : cell*ase) –If unclear about how a word is spelt in a database Two types: –* one or more characters of any value –? Single character of any value Any number of wildcards can be placed anywhere in a search word Placing a wildcard at the start of a word or string may increase response time because all words in the index have to be checked against the string
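
The wildcard rules on this slide can be sketched in Python by translating a wildcard pattern into a regular expression. This is only an illustration and not SRS code: the function name wildcard_to_regex and the term list are made up, and '*' is translated as "one or more characters" only because that is how the slide defines it.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled, case-insensitive regex.

    Assumption taken from the slide: '*' is one or more characters,
    '?' is exactly one character. Everything else is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".+")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

# Hypothetical index terms, used only to exercise the pattern.
terms = ["cellulase", "cellobiase", "cell", "kinase", "cellase"]
matcher = wildcard_to_regex("cell*ase")
hits = [t for t in terms if matcher.fullmatch(t)]
print(hits)
```

Under this reading, 'cellase' is not a hit because nothing sits between 'cell' and 'ase'. A leading wildcard such as '*ase' cannot use any prefix of the word, which is consistent with the slide's remark that every index term then has to be checked.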

Regular expressions NB: Must appear within forward slashes (/)
Some operators:
^ marks the start of a string: /^glu/ begins with ‘glu’
$ marks the end of a string: /ase$/ ends with ‘ase’
. (dot) is any single character
[…] characters in square brackets are regarded as a set, any of which can be matched
[0-9] specifies a range, here the digits 0 to 9
* the preceding group may be repeated zero or more times
+ the preceding group may be repeated one or more times
? the preceding character/group occurs one or zero times

Some examples
/^glu/ will find terms beginning with ‘glu’
/ase$/ will find terms ending with ‘ase’
/c.t/ will find the words cat, cot, cut…
/c.*t/ will find terms beginning with ‘c’, then any number of characters, and ending with ‘t’
/sm[iy]th/ will find the words ‘smith’ or ‘smyth’
/rho[1-9]/ will find the word ‘rho’ followed by a digit from 1 to 9
/mue?ller/ will find ‘muller’ or ‘mueller’
NB. The ‘*’ symbol has two meanings:
- within forward slashes ‘/’ it means the preceding group may be repeated zero or more times
- outside forward slashes it is the wildcard for any characters
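
Most of these examples can be tried out with Python's re module, which uses the same operators. The sketch below is an illustration under assumptions: the term list is made up, find_terms is a hypothetical helper, and the SRS regular expression engine may differ in details from Python's.

```python
import re

# Made-up index terms, used only to exercise the patterns.
index_terms = ["glucose", "glutamate", "kinase", "cat", "cot", "cut",
               "smith", "smyth", "rho1", "rho", "muller", "mueller"]

def find_terms(srs_regex, terms):
    """Strip the surrounding forward slashes and return the matching terms."""
    pattern = re.compile(srs_regex.strip("/"))
    return [t for t in terms if pattern.search(t)]

for expr in ["/^glu/", "/ase$/", "/c.t/", "/sm[iy]th/", "/rho[1-9]/", "/mue?ller/"]:
    print(expr, find_terms(expr, index_terms))
```

Note that re.search matches anywhere inside a term unless the pattern is anchored with ^ or $. Whether SRS treats unanchored patterns the same way is an assumption here.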

Numerical ranges
In a numerical index it is possible to search numerical ranges - sequence lengths, mol. weights, dates….
The ‘:’ is used for specifying ranges and ‘!’ for excluding values
–400:500 all seq. with length between 400 and 500
–400: all seq. with lengths greater than 400
–:500 all seq. with lengths less than 500
–400:!500 all seq. with lengths between 400 and 500, excluding 500
Can combine ranges using logical operators
–300:!400 | !500:600 or 300:600 ! 400:500
Dates in SRS have 2 formats:
–YYYYMMDD
–DD-MMM-YYYY, e.g. 05-Dec-2002
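
The range notation can be made concrete with a small parser. This is a sketch under assumptions, not SRS's implementation: NumRange and parse_range are hypothetical names, and bounds are treated as inclusive unless they carry a '!', which is one reading of the slide.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NumRange:
    low: Optional[float]
    high: Optional[float]
    low_exclusive: bool = False
    high_exclusive: bool = False

    def contains(self, value):
        """Return True if value lies inside the range."""
        if self.low is not None:
            if value < self.low or (self.low_exclusive and value == self.low):
                return False
        if self.high is not None:
            if value > self.high or (self.high_exclusive and value == self.high):
                return False
        return True

def parse_range(text):
    """Parse 'low:high'; either bound may be empty or prefixed with '!'."""
    low_text, sep, high_text = text.replace(" ", "").partition(":")
    if not sep:
        raise ValueError("a range needs a ':' separator")

    def bound(part):
        exclusive = part.startswith("!")
        digits = part.lstrip("!")
        return (float(digits) if digits else None), exclusive

    low, low_exclusive = bound(low_text)
    high, high_exclusive = bound(high_text)
    return NumRange(low, high, low_exclusive, high_exclusive)

def in_either(value):
    """300:!400 | !500:600, with '|' written as Python's 'or'."""
    return (parse_range("300:!400").contains(value)
            or parse_range("!500:600").contains(value))

def in_butnot(value):
    """300:600 ! 400:500, with '!' (butnot) written as 'and not'."""
    return (parse_range("300:600").contains(value)
            and not parse_range("400:500").contains(value))

# The two combined expressions on the slide select the same lengths.
print(all(in_either(n) == in_butnot(n) for n in range(250, 650)))
```

Under the inclusive-bounds assumption, the check confirms that the two spellings on the slide are interchangeable for whole-number lengths.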

Some examples
–Find entries with sequences having length between 300 and 400 excluding 400 and between 500 and 600 excluding 500: 300:!400 | !500:600 or 300:600 ! 400:500
–Find entries that were created in the first half of 2001: 01-jan-2001:30-jun-2001, or the same range written in YYYYMMDD form
–Find all entries updated since May this year: 01-may-2002:, or the same date written in YYYYMMDD form
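
Since SRS accepts dates in two formats, it can help to see how a DD-MMM-YYYY range maps onto YYYYMMDD. The helper names below (to_compact, convert_range) are hypothetical and the code is only a sketch of the format conversion, not anything SRS provides.

```python
# English three-letter month abbreviations, matched case-insensitively.
MONTHS = {"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
          "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12}

def to_compact(date_text):
    """Convert 'DD-MMM-YYYY' to 'YYYYMMDD'."""
    day, month, year = date_text.strip().split("-")
    return f"{int(year):04d}{MONTHS[month.lower()]:02d}{int(day):02d}"

def convert_range(range_text):
    """Convert both sides of a 'start:end' date range; empty sides stay empty."""
    start, _, end = range_text.partition(":")
    return ":".join(to_compact(part) if part else "" for part in (start, end))

print(convert_range("01-jan-2001:30-jun-2001"))
print(convert_range("01-may-2002:"))
```

An open-ended range keeps its trailing colon, so "everything since a date" converts without special handling.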

SRS Indexing SRS indexes database records using a ‘word by word’ approach. - DE Human glutathione transferase -The SRS description index will contain terms ‘human’, ‘glutathione’ and ‘transferase’. (&) AND : ‘human & glutathione & transferase’ (|) OR : ‘human | glutathione | transferase’ (!) BUTNOT : ‘human ! glutathione ! transferase’
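
The word-by-word indexing and the three logical operators can be illustrated with a tiny inverted index built from Python sets. Everything here is an assumption for illustration: the entry IDs and descriptions are made up, build_index is a hypothetical helper, and a real SRS index is of course not a Python dictionary.

```python
from collections import defaultdict

# Made-up description lines keyed by made-up entry IDs.
entries = {
    "E1": "Human glutathione transferase",
    "E2": "Mouse glutathione transferase",
    "E3": "Human serum albumin",
    "E4": "Human transferase family member",
}

def build_index(records):
    """Map each lower-cased word to the set of entries whose text contains it."""
    index = defaultdict(set)
    for entry_id, description in records.items():
        for word in description.lower().split():
            index[word].add(entry_id)
    return index

index = build_index(entries)
human = index["human"]
glutathione = index["glutathione"]
transferase = index["transferase"]

and_hits = human & glutathione & transferase   # AND: all three words present
or_hits = human | glutathione | transferase    # OR: at least one word present
butnot_hits = (human & transferase) - glutathione  # human & transferase ! glutathione

print(sorted(and_hits), sorted(or_hits), sorted(butnot_hits))
```

Python's set operators & and | happen to use the same symbols as SRS, while BUTNOT corresponds to set difference.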

EMBL HUMAN glutathione transferase human & glutathione & transferase human & transferase ! glutathione glutathione & transferase ! human

Databanks information page Lists the databases available in the system and a summary about them: –Number of entries in the database –Date it was indexed –Group it belongs to –Its availability status Hyperlinks to information page specific to each database

Databanks Information Page

Database information page Provides a detailed description about the database contents, source, ftp site, literature… Lists information about the fields that are present in the database including: –Name of field –Short name for field –Type of field index : it is indexed num : indexed and a numerical field id: unique field show: not indexed, just for display –Number of keys for that field –Date it was indexed Lists databases that it is linked to and how many entries are linked respectively

PROSITE information page

Browsing indices This gives information on what is being indexed for a particular field –Single words, multiple words, controlled vocabulary….. To browse an index go to the information page for a particular field from a certain database –If you want to look at all indexed terms use ‘*’ –If you want all terms beginning with trans use ‘trans*’ –If you want all terms containing the string trans use ‘*trans*’

Browsing the description field index for terms beginning with ‘trans’……...

Query manager Found under the results tab Saves a history of results obtained in the session Page allows you to return to previous results and: –Combine them using logical operators – thus allowing you to perform a multi-step query –Use a different view to display them –Perform further actions link, save, delete

The Query Manager My Queries Combine Operators

Project manager Found under the projects tab Saves a history of queries performed in the session Can upload/download SRS session files from a desktop In a permanent session, the project manager can also: –Manage numerous SRS projects at the same time –Move queries/views between projects –Upload/download projects to desktop –Delete projects

Project manager page

User owned databanks Found in the category ‘user owned databanks’ on top page User can upload their own nucleotide or protein sequence data into a user owned database –sequences must be in fasta format –any number of sequences can be uploaded –database is specific to the individual and to the session Can launch applications on database sequences
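
Because uploaded sequences must be in FASTA format, a quick local check before uploading can save a failed attempt. The sketch below is a minimal FASTA reader; parse_fasta is a hypothetical helper, the sequences are made up, and SRS's own validation may be stricter or looser.

```python
def parse_fasta(text):
    """Return {sequence_id: sequence} from FASTA-formatted text.

    A record starts with a '>' header line; the first word after '>' is
    taken as the ID, and the following lines are joined into one sequence.
    """
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("header line without an identifier")
            current_id = header[0]
            records[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[current_id].append(line)
    return {seq_id: "".join(chunks) for seq_id, chunks in records.items()}

# Made-up protein sequences for illustration only.
example = """>seq1 hypothetical example protein
MKTAYIAKQR
QISFVKSHFS
>seq2 another made-up entry
MSTNPKPQRK
"""
sequences = parse_fasta(example)
print({seq_id: len(seq) for seq_id, seq in sequences.items()})
```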

User owned data

Paste or upload a file Fasta formatted files Any number of sequences Maintained throughout user session

Operations on results Linking : link results to other databases Saving: save results in different formats to the browser or a file Viewing: view results using different formats Sequence analysis: launch applications on the results SRS6 – 11 protein applications, 6 nucleic acid apps. SRS7 – more than 100 applications available

The Results Page Operations

SRS6 versus SRS7 SRS7 provides over 100 applications while SRS6 provides 17. You can retrieve results in either list or table format in SRS7. In SRS6 only the table format is available. Current EBI version 7.1.1

SRS6 -- first view Start a new session by clicking here.

Top page  Select one or more databases by ticking the corresponding box  Select type of query form

Different types of database in SRS Sequence & structure –DNA, protein, three-dimensional structures Sequence-related Gene-related –Genome, mapping, mutations, transcription factors –SNP Bibliographic –Medline, enzyme User-defined

Standard query form  Type text to search for  Select AND or OR if multiple search items are used  Select number of results to show at a time  Select field to search  Submit query

Query result -- table mode  Link sequences to other databases  Accession number, description and sequence length  Mode of viewing can be changed  Possibility to analyse sequences with other tools, e.g. FastA and ClustalW  Hypertext links  Tick boxes to select/deselect sequences for further analyses

Example query Use SRS to answer the following question: For which short-chain dehydrogenases/reductases (SDR) is the three-dimensional structure known in PDB?

Example, Query form  Enter the search term ”sdr”  Enter in which field to search

Example, Query result  Press the button Link in order to get to the Link page

Link page  You can link in three different ways  In this case, we select to link to PDB  Then we select chunk size and view mode  Finally, we press the ”Submit link” button
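
Conceptually, linking follows the cross-references stored in one set of entries to entries in another database. The sketch below illustrates that idea for the SDR example; it does not reproduce the link options on the SRS page, and the IDs, the cross_refs table and both helper functions are made up for illustration.

```python
# Made-up cross-reference table: protein entry ID -> structure entry IDs.
cross_refs = {
    "sdr_protein_a": ["structure_0001", "structure_0002"],
    "sdr_protein_b": [],
    "sdr_protein_c": ["structure_0003"],
}

def link_to_targets(source_ids, xrefs):
    """Return the target entries reachable from the given source entries."""
    targets = set()
    for source_id in source_ids:
        targets.update(xrefs.get(source_id, []))
    return targets

def sources_with_links(source_ids, xrefs):
    """Return the source entries that have at least one cross-reference."""
    return {source_id for source_id in source_ids if xrefs.get(source_id)}

query_result = ["sdr_protein_a", "sdr_protein_b", "sdr_protein_c"]
print(sorted(link_to_targets(query_result, cross_refs)))
print(sorted(sources_with_links(query_result, cross_refs)))
```

The first function answers "which structures exist for these proteins", the second answers "which of these proteins have a structure", which is the form of the example question.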

Link results

Example of a Swissprot entry

Example of a Swissprot entry, cont.  Click this link to get to the corresponding Medline entry (in PubMed)

PubMed entry  By clicking this link, you have the possibility to download the electronic version of the article.

The Top page tab

The Query tab

The Results tab

The Sessions tab

The Views tab

The Databanks tab

Acknowledgements ¤ Bengt Persson, MBB, Karolinska institutet (demos) ¤ 2can tutorial on SRS at EBI ¤ http://downloads.lionbio.co.uk/publi csrs.html (The latest SRS server list)

server breakup srs.sanger.ac.uk (5) srs.ebi.ac.uk (5) srs.csc.fi (5) titanic.thep.lu.se/srs71/ (5) If you think the load on a server is slowing your query, choose an alternative server to practice on.