Multiple Sequence Alignments (Aidan Budd, EMBL Heidelberg)
Build a Sequence Alignment: Sequences are usually aligned automatically (MUSCLE, PRANK, CLUSTAL etc.). Manual alignment is also possible using tools such as JalView. Hopefully, these demonstrations will highlight that alignment is "trivial" (at one level, at least): it only involves putting gap characters in the right places.
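The point that alignment only adds gap characters can be illustrated with a small sketch. The sequences below are invented toy examples, and `strip_gaps` is a hypothetical helper name, not part of any alignment tool: removing the gaps from each aligned row should give back the original, unaligned sequence.

```python
def strip_gaps(aligned_row: str, gap_chars: str = "-.") -> str:
    """Remove gap characters from one row of an alignment."""
    return "".join(ch for ch in aligned_row if ch not in gap_chars)


# Toy (made-up) unaligned sequences and one possible alignment of them.
unaligned = ["MKTAYIAK", "MKAYIK", "MTAYIAK"]
aligned = ["MKTAYIAK", "MK-AYI-K", "M-TAYIAK"]

# Every row of an alignment has the same length...
same_length = all(len(row) == len(aligned[0]) for row in aligned)

# ...and stripping the gaps recovers the input sequences unchanged.
recovered = [strip_gaps(row) for row in aligned]

print(same_length, recovered == unaligned)
```

The hard part, which this sketch does not address, is deciding where the gaps should go.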
Build an Automatic MSA:
  1. Search the Internet for "EBI Muscle".
  2. Copy and paste the sequences in FASTA format, then click "Submit".
  3. Wait for the result to be returned, then click "Download Alignment File" to reach the plain-text version of the alignment.
  4. Download the file, or copy and paste the text into a text editor, to store the alignment on your local computer.
  5. View the alignment in an MSA viewer (e.g. JalView) etc.
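Once an alignment has been saved as plain text, it can also be read programmatically. The following is a minimal sketch of a FASTA reader, assuming the alignment was stored in FASTA format (header lines starting with ">", sequence possibly wrapped over several lines). The function name `parse_fasta` and the example records are illustrative only.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {name: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# Toy (made-up) aligned FASTA text.
example = """>seq1 toy sequence
MKTAY
IAK
>seq2
MK-AYI-K
"""

alignment = parse_fasta(example)
for seq_name, row in alignment.items():
    print(seq_name, row)
```

For real work, an established parser such as the one in Biopython is likely a better choice than a hand-written one.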
Choosing an MSA tool: Different tools are designed for different tasks.
  CLUSTALX, MUSCLE, PROBCONS: divergent protein sequences
  NAST: multiple alignment of 16S rRNA genes
  PRANK: multiple alignment of relatively similar DNA sequences in an evolutionary context
  EXPRESSO (3DCoffee): multiple alignment of protein sequences, some of which have 3D structural information
  MAUVE, Enredo: multiple alignment of genomes
  and many others...
Examining MSAs: Recognising Patterns
CLUSTALX Colouring Scheme (extract from an alignment of p53 proteins): Only one of many possible colouring schemes. Good at highlighting variation in conservation. Designed with red/green colour-blindness in mind.
CLUSTALX Colouring Scheme (extract from an alignment of p53 proteins): Amino acids with similar properties are drawn with the same colour, e.g. the basic residues arginine (R) and lysine (K).
CLUSTALX Colouring Scheme (extract from an alignment of p53 proteins): Residues are only coloured if some proportion of the residues in the column have the same property, EXCEPT for P and G, which are always coloured. E.g. lysine (K) is uncoloured in columns with only "a few" other basic residues, and coloured in columns with "many" other basic residues.
CLUSTALX Colouring Scheme:
  Hydrophobic: L V I M F W A C
  Polar: N T S Q
  Acidic: D E
  Basic: K R
  Secondary-structure breaking: G P
  Large aromatic polar: H Y
  (The CLUSTALX help file fully describes the default colouring rules.)
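The colouring idea can be sketched in a few lines. This is a deliberately simplified illustration, not the actual CLUSTALX rules (those are in the CLUSTALX help file): it uses the property groups listed above, always colours G and P, and colours any other residue only when the fraction of residues in the column sharing its property reaches a threshold. The single threshold of 0.6 and the name `coloured_groups` are assumptions made for this example.

```python
from collections import Counter

# Property groups as listed on the slide.
GROUPS = {
    "hydrophobic": "LVIMFWAC",
    "polar": "NTSQ",
    "acidic": "DE",
    "basic": "KR",
    "breaker": "GP",
    "aromatic_polar": "HY",
}
RESIDUE_TO_GROUP = {aa: g for g, aas in GROUPS.items() for aa in aas}


def coloured_groups(column: str, threshold: float = 0.6) -> list:
    """For each residue in a column, return its group if it would be
    coloured under this simplified rule, otherwise None.

    The threshold is an assumed value; gaps count towards the column size.
    """
    if not column:
        return []
    column = column.upper()
    counts = Counter(RESIDUE_TO_GROUP.get(aa) for aa in column)
    result = []
    for aa in column:
        group = RESIDUE_TO_GROUP.get(aa)
        if group is None:
            result.append(None)       # gap or unknown symbol
        elif group == "breaker":
            result.append(group)      # G and P are always coloured
        elif counts[group] / len(column) >= threshold:
            result.append(group)      # enough residues share the property
        else:
            result.append(None)
    return result


print(coloured_groups("KKRKA"))   # a mostly basic column
print(coloured_groups("KAVLLG"))  # a mostly hydrophobic column
```

In the first column the lysines are coloured because most residues are basic; in the second the single lysine is left uncoloured.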
Common Patterns - Buried Beta-Strand: Response regulator receiver domain (bin/homstrad/showpage.cgi?family=response_reg&disp=str)
Common Patterns - Amphipathic Partially-Buried Alpha-Helices: Response regulator receiver domain (bin/homstrad/showpage.cgi?family=response_reg&disp=str)
Common Patterns - Amphipathic Beta Strands: ubiquitin conjugating enzyme
Common Patterns - Non-Globular Sequence: Different, more strongly biased sequence composition (biased away from equal representation of each of the 20 amino acids). Sometimes more variable sequence (more substitutions and more gaps than globular/structured regions).
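One simple way to put a number on "biased composition" is the Shannon entropy of the amino-acid frequencies. This is a sketch of one possible measure, not a method named in the slides. Equal representation of all 20 amino acids gives the maximum of log2(20), about 4.32 bits, and a strongly biased region gives a lower value. The function names and the test sequences are invented for illustration.

```python
import math
from collections import Counter


def composition(seq: str) -> dict:
    """Fraction of each residue in a sequence, ignoring gaps and symbols."""
    residues = [ch for ch in seq.upper() if ch.isalpha()]
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}


def composition_entropy(seq: str) -> float:
    """Shannon entropy (in bits) of the residue composition."""
    return -sum(p * math.log2(p) for p in composition(seq).values())


# Toy (made-up) examples: one unbiased, one strongly biased.
unbiased = "ACDEFGHIKLMNPQRSTVWY"
biased = "SSPSSGSSPSSSGSSPSS"

print(round(composition_entropy(unbiased), 2))
print(round(composition_entropy(biased), 2))
```

In practice such a measure would probably be computed over a sliding window, since composition bias is usually a property of a region and not of the whole protein.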
Common Patterns - Short Linear Motifs: e.g. LIG_CYCLIN_1, [RK].L.{0,1}[FYLIVMP]. Mostly occur in disordered protein regions. Often show greater conservation than neighbouring sequence.
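The LIG_CYCLIN_1 pattern is an ordinary regular expression, so it can be searched for directly. The sketch below uses Python's `re` module; the function name `find_motifs` and the test sequence are made up for illustration. Gap characters are stripped first because the pattern is meant to match the ungapped sequence.

```python
import re

# The LIG_CYCLIN_1 pattern as given on the slide.
LIG_CYCLIN_1 = re.compile(r"[RK].L.{0,1}[FYLIVMP]")


def find_motifs(seq: str, pattern: re.Pattern = LIG_CYCLIN_1) -> list:
    """Return (start, matched_text) for each non-overlapping match.

    Gap characters are removed first, so positions refer to the
    ungapped sequence, not to alignment columns.
    """
    ungapped = seq.replace("-", "").upper()
    return [(m.start(), m.group()) for m in pattern.finditer(ungapped)]


# Toy (made-up) sequence containing one candidate match.
print(find_motifs("MSPAKRRLFGEDAA"))
```

A regular-expression match on its own is weak evidence, since short patterns like this occur often by chance. That is where the conservation and disorder context noted above comes in.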
Identifying Mis-Aligned Regions: Identify a region of a sequence that you think is misaligned. Decide how you would "fix" this misalignment. Look at patterns of conservation, and sequences which
Unusual Sequences: Examples. With CLUSTALX "Quality" -> "Show Low-Scoring Segments" switched on. Short/fragmented sequences. Unusual pattern of "conservation".
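Short or fragmented sequences can also be flagged with a very simple check on the alignment itself. The sketch below is not what the CLUSTALX low-scoring-segments option does; it only reports rows that are mostly gaps. The 0.5 cut-off, the function names and the example rows are assumptions made for illustration.

```python
def gap_fraction(row: str, gap_chars: str = "-.") -> float:
    """Fraction of alignment columns in which this row has a gap."""
    if not row:
        return 0.0
    return sum(ch in gap_chars for ch in row) / len(row)


def flag_fragments(alignment: dict, max_gap_fraction: float = 0.5) -> list:
    """Names of rows whose gap fraction exceeds the (assumed) cut-off."""
    return [
        name
        for name, row in alignment.items()
        if gap_fraction(row) > max_gap_fraction
    ]


# Toy (made-up) alignment with one fragmentary row.
toy_alignment = {
    "full_length": "MKTAYIAKQR",
    "fragment": "----YIA---",
}

print(flag_fragments(toy_alignment))
```

Flagged rows are candidates for closer inspection, not automatic removal: a fragment may still be correctly aligned.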
Using MSAs to Improve Prediction of Linear Motifs: Demonstration and Exercise