SEQUENCE RETRIEVAL SYSTEM (SRS), Tuomas Hätinen


1 SEQUENCE RETRIEVAL SYSTEM SRS Tuomas Hätinen

2 Motivation: structural biology, molecular biology, genetics, medicine, sequencing information, physiology, toxicology, gene expression

3 Motivation
There are 3 main sequence retrieval systems:
- SRS (highly recommended)
- Entrez (easier to use but more limited)
- DBGet (less recommended)
This is a workshop on using SRS. Start one of the servers below:
- http://srs.ebi.ac.uk
- http://csc-fserve.hh.med.ic.ac.uk/srs71
- http://walnut.bioc.columbia.edu/srs7/
- http://emb2.bcc.univie.ac.at:8080/srs/
- http://oryx.ulb.ac.be:8080/srs
Full list of SRS servers available from: http://downloads.lionbio.co.uk/publicsrs.html

4 What is SRS?: Introduction
- Central resource for molecular biology data
- Data retrieval system: more than 250 databanks have been indexed
- More than 35 SRS servers on the WWW
- Data analysis applications server: 11 protein applications, 6 nucleic acid applications
- Uniform query interface on the web

5 What is SRS?: History
- 1990: Development started at EMBL, Heidelberg. Main author: Dr. Thure Etzold.
- 1997: Moved to EBI in Cambridge. Development work was supported by various grants, among others from EMBnet.
- 1998: Etzold and his group join LION Bioscience.

6 Why SRS?
Information retrieval:
- Easy way to retrieve information from sequence and sequence-related databases
- Possibility to search for multiple words or other criteria
- Linkage between different databases, e.g. find all primary structures with a known three-dimensional structure
... and much more

7 Why SRS?

8 SRS construction

9 Comments
SRS is both a simple and a complicated tool, with a large number of features. It can take a few days to get accustomed to. We will run through some important features during the lecture, then apply these and other new ones in the practical session.

10 What can you do in SRS that you can’t do in UniProt?
- Sophisticated searches, e.g. wildcard searches and regexp searches
- SRS consolidates multiple databases
- Many tools are available in SRS
- Saving of projects
Why bother with UniProt? Speed.

11 Temporary Projects
Queries and views are stored temporarily by the project manager. Temporary sessions last 24 hours.
Useful when you:
- Do not need to keep your results
- Look something up quickly
- Run an occasional application
Click on the ‘Start’ paw on the SRS start page.

12 Some examples
- /^glu/ will find terms beginning with ‘glu’
- /ase$/ will find terms ending with ‘ase’
- /c.t/ will find the words cat, cot, cut, ...
- /c.*t/ will find terms beginning with ‘c’, followed by any number of characters, and ending with ‘t’
- /sm[iy]th/ will find the words ‘smith’ or ‘smyth’
- /rho[1-9]/ will find the word ‘rho’ followed by a digit from 1 to 9
- /mue?ller/ will find ‘muller’ or ‘mueller’
NB. The ‘*’ symbol has two meanings:
- within forward slashes ‘/’ it means the preceding group may be repeated zero or more times
- outside forward slashes it is a wildcard that stands for characters of any value
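The patterns above can be tried outside SRS. The sketch below uses Python's `re` module as a stand-in. SRS's own regular-expression engine may differ in details such as case handling, and the term list and the `find_terms` helper are invented for illustration.

```python
import re

# Hypothetical index terms, invented for illustration (not from any SRS databank).
TERMS = ["glutathione", "glucose", "transferase", "kinase", "cat", "cot", "cut",
         "cart", "smith", "smyth", "rho1", "rho", "muller", "mueller"]

def find_terms(pattern, terms=TERMS):
    """Return the terms in which the pattern matches somewhere, like /pattern/."""
    rx = re.compile(pattern)
    return [t for t in terms if rx.search(t)]

print(find_terms(r"^glu"))       # terms beginning with 'glu'
print(find_terms(r"ase$"))       # terms ending with 'ase'
print(find_terms(r"^c.t$"))      # 'c', any single character, 't' (anchored to the whole term)
print(find_terms(r"sm[iy]th"))   # 'smith' or 'smyth'
print(find_terms(r"rho[1-9]"))   # 'rho' followed by a digit from 1 to 9
print(find_terms(r"mue?ller"))   # 'muller' or 'mueller'
```

The `^` and `$` anchors are added to the `c.t` pattern here because `re.search` would otherwise accept the pattern anywhere inside a longer term.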

13 SRS Query syntax
SRS indexes database records using a ‘word by word’ approach. For example, given the description line
DE Human glutathione transferase
the SRS description index will contain the terms ‘human’, ‘glutathione’ and ‘transferase’.
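To make the ‘word by word’ idea concrete, here is a minimal sketch of such an index in Python. It is not SRS's actual implementation: the `build_index` function, the record IDs and the second description are hypothetical, and lower-casing the words is an assumption.

```python
import re
from collections import defaultdict

def build_index(records):
    """Map each lower-cased word to the set of record IDs whose description contains it."""
    index = defaultdict(set)
    for rec_id, description in records.items():
        for word in re.findall(r"[A-Za-z0-9]+", description.lower()):
            index[word].add(rec_id)
    return index

# Hypothetical record IDs; the first description is the DE line from the slide.
records = {
    "REC1": "Human glutathione transferase",
    "REC2": "Mouse glutathione peroxidase",
}
index = build_index(records)
print(sorted(index))           # every indexed term
print(index["glutathione"])    # records containing 'glutathione'
```

A query for a single word then becomes a dictionary lookup, which is why word-based searches are fast.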

14 Boolean operators
- (&) AND : ‘human & glutathione & transferase’
- (|) OR : ‘human | glutathione | transferase’
- (!) BUTNOT : ‘human ! glutathione ! transferase’
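One way to picture these operators is as set operations on the record sets returned for each term. The sketch below uses Python sets. The index contents are invented, and evaluating BUTNOT from left to right is an assumption rather than documented SRS behaviour.

```python
# Hypothetical index: term -> set of record IDs (invented for illustration).
index = {
    "human": {"R1", "R2", "R3"},
    "glutathione": {"R1", "R4"},
    "transferase": {"R1", "R2", "R5"},
}

h, g, t = index["human"], index["glutathione"], index["transferase"]

and_hits = h & g & t       # human & glutathione & transferase: records with all three terms
or_hits = h | g | t        # human | glutathione | transferase: records with at least one term
butnot_hits = h - g - t    # human ! glutathione ! transferase: 'human' records lacking the other two

print(sorted(and_hits))
print(sorted(or_hits))
print(sorted(butnot_hits))
```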

15 Wildcards
These are useful when:
- Searching for a group of words (e.g. words starting with ‘cell’ and ending with ‘ase’: cell*ase)
- Unclear about how a word is spelt in a database
Two types:
- * one or more characters of any value
- ? single character of any value
Any number of wildcards can be placed anywhere in a search word. Placing a wildcard at the start of a word or string may increase response time, because all words in the index have to be checked against the string.
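The two wildcards can be mimicked by translating them into a regular expression. This is a sketch that follows the definitions on this slide (‘*’ as one or more characters, ‘?’ as exactly one), not SRS's real matching code. The `wildcard_to_regex` helper and the term list are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into an anchored regex, using the slide's
    definitions: '*' = one or more characters, '?' = exactly one character."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".+")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # treat every other character literally
    return re.compile("^" + "".join(parts) + "$")

# Hypothetical index terms, invented for illustration.
terms = ["cellulase", "cellobiase", "cell", "kinase", "smith", "smyth"]

cell_hits = [t for t in terms if wildcard_to_regex("cell*ase").match(t)]
smith_hits = [t for t in terms if wildcard_to_regex("sm?th").match(t)]
print(cell_hits)
print(smith_hits)
```

The loop over `terms` also shows why a leading wildcard is slow: with no fixed prefix to look up, every term in the index has to be tested.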

16 Regular expressions

17 SRS Regular expressions
NB: Must appear within forward slashes (/)
Some operators:
- ^ marks the start of a string: /^glu/ begins with ‘glu’
- $ marks the end of a string: /ase$/ ends with ‘ase’
- . (dot) is any single character
- […] characters in square brackets are regarded as a set, any of which can be matched
- [0-9] specifies a range of 0 to 9
- * the preceding group may be repeated zero or more times
- + the preceding group may be repeated one or more times
- ? the preceding character/group occurs one or zero times
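The difference between the three repetition operators is easiest to see side by side. The sketch below again uses Python's `re` module as a stand-in for SRS, so details may differ. The `matches` helper and the test words are invented for illustration.

```python
import re

def matches(pattern, word):
    """True if the whole word matches the pattern."""
    return re.fullmatch(pattern, word) is not None

# '*' = zero or more, '+' = one or more, '?' = zero or one of the preceding set [0-9]
for word in ["rho", "rho1", "rho12"]:
    print(word,
          matches(r"rho[0-9]*", word),
          matches(r"rho[0-9]+", word),
          matches(r"rho[0-9]?", word))
```

Only `rho[0-9]*` accepts all three words: `+` rejects the bare ‘rho’, and `?` rejects ‘rho12’ because it allows at most one digit.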

