Aligning Kinases: Applying MSA Analysis to the CDK Family

Building A Multiple Sequence Alignment

Potential Uses of A Multiple Sequence Alignment?

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
      ***. :::.:... :.. *. *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM
mouse AKDDRIRYDNEMKSWEEQMAE
      * :.*. :

Extrapolation – Motifs/Patterns – Phylogeny – Profiles – Struc. Prediction

Multiple Alignments Are CENTRAL to MOST Bioinformatics Techniques.
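The `*` marks under the alignment flag columns where every sequence carries the same residue. A minimal sketch of that check, using the first block of the alignment shown on this slide, could look like the following. The function name `conserved_columns` is hypothetical, and the `:` and `.` similarity marks are not reproduced because they depend on a substitution-group table.

```python
def conserved_columns(rows):
    """Return 0-based indices of columns where all sequences share one residue."""
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("all aligned rows must have the same length")
    hits = []
    for i in range(length):
        column = {r[i] for r in rows}
        # A column counts as conserved only if it holds a single residue and no gap.
        if len(column) == 1 and "-" not in column:
            hits.append(i)
    return hits


# First block of the four-sequence alignment from the slide.
block = {
    "chite": "---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD",
    "wheat": "--DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE",
    "trybr": "KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP",
    "mouse": "-----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP",
}

cols = conserved_columns(list(block.values()))
# Rebuild the identity line: '*' under conserved columns, blank elsewhere.
stars = "".join("*" if i in cols else " " for i in range(len(block["chite"])))
print(stars)
```

The six conserved columns correspond to the six `*` symbols in the first consensus line.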

1 Organizing a Family Gathering: The CDK Example

Choosing the Right Sequences: Swiss-Prot – Literature – Other Databases

Organizing the Data SRS Public Data IGS Data Aventis CDK Genecard Manual Automatic

Accessing the Data: The Fischer Server. Fischer will contain – A collection of flat files – A secure SRS server – File formats. The server is a Technology Pipeline – Can be adapted in real time – Can be transferred

Our CDK Data CDKs and CDK-like – Protein Information Functional Features Structural Information – Genomic Information Genes Variant SNPs

Our MSA dataset: 29 amino acid sequences (CDK and Aurora families, stemming from primary transcripts) – 2 isoforms of a CDK member. 4 PDB structures: – 1MUO (AUR A) – 1BLX (CDK 6) – 1B38 (CDK 2) – 1H4L (CDK 5). Use of T-Coffee release 1.78 with integration of the structural information contained in PDB files

2 Aligning The Sequences

Building A Multiple Sequence Alignment ClustalW T-Coffee Muscle Hand Editing Combination Comparison

Using Structural Information 3D-Coffee Struct Vs Struct Seq Vs Struct Thread Superpose Seq Vs Seq Local Global

Method

Accessing the Methods: Fischer. Public 3D-Coffee server – igs-server.cnrs-mrs.fr/TCoffee/ Fischer – Latest version of T-Coffee – Customised parameters – Cocktails of MSA methods

3 Dressing Up a Multiple Sequence Alignment

Feature Dressing: -25 Binding site, -20 Phospho, -40 nsSNP, -50 Splice Site … ESPript

Feature Dressing

4 How Good Is The Alignment?

T-Coffee CORE Evaluation

CORE index: Specificity and Sensitivity
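One way to read this slide is that the CORE index is treated as a predictor of correctly aligned residues, and its specificity and sensitivity are measured against a reference alignment. A minimal sketch under that assumption follows. The standard definitions are used (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP)), which may differ from the ones behind the original plot, and the scores and labels are made-up toy values.

```python
def sens_spec(scores, correct, threshold):
    """Sensitivity and specificity of 'score >= threshold' as a predictor of correctness."""
    pairs = list(zip(scores, correct))
    tp = sum(1 for s, c in pairs if s >= threshold and c)
    fn = sum(1 for s, c in pairs if s < threshold and c)
    tn = sum(1 for s, c in pairs if s < threshold and not c)
    fp = sum(1 for s, c in pairs if s >= threshold and not c)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical per-residue CORE scores (0-9) and whether each residue
# is aligned as in a trusted reference alignment.
scores = [9, 8, 7, 3, 2, 6, 1, 9]
correct = [True, True, True, False, False, False, True, True]

sens, spec = sens_spec(scores, correct, threshold=5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sweeping `threshold` from 0 to 9 would give one (sensitivity, specificity) pair per cut-off.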

Feature Based Evaluation

Feature mapping on multiple alignments (T-Coffee vs ClustalW): ATP binding site, Glycine loop, Non-synonymous SNP

Structure Based Evaluation APDB

Include Sequences with Known Structures – Do not use structural information: Score 1 – Use structural information: Score 2. If Score 1 ~ Score 2 – Structural information does not help much – The alignment is of reasonable quality
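The decision rule on this slide can be sketched as a small comparison. The function name `structure_helps` and the 5% tolerance standing in for "~" are assumptions made for illustration; the slide does not say how close the two scores must be.

```python
def structure_helps(score_without, score_with, tolerance=0.05):
    """Return True when the two evaluation scores differ by more than `tolerance`.

    score_without: score of the alignment built without structural information (Score 1)
    score_with:    score of the alignment built with structural information (Score 2)
    tolerance:     hypothetical relative difference below which the scores count as equal
    """
    reference = max(abs(score_without), abs(score_with))
    if reference == 0:
        return False
    return abs(score_with - score_without) / reference > tolerance


# Toy values, not measured scores.
print(structure_helps(70.0, 71.0))  # scores are close: structure adds little
print(structure_helps(60.0, 75.0))  # scores differ: structure changes the alignment
```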

Evaluating a Multiple Sequence Alignment T-Coffee CORE index Feature Based Library APDB

Manipulating and Comparing Alignments. Reformatting/Processing – seq_reformat – extract_from_pdb. Coloring – seq_reformat – ESPript. Comparing – aln_compare

5 Thinking Large?

T-Coffee_dpa: T-Coffee is limited to a small number of sequences. T-Coffee_dpa: Double Progressive Algorithm – Able to handle large datasets – 1000 sequences and more – Able to use structural information

Using A Multiple Sequence Alignment

1 Exploring The Alignment

CDK signature – CDK T-loop (orange) and Aurora activation loop – Substrate recognition motif

2 Using The Alignment: Does My Sequence Make Sense?

Identifying Abnormalities within an MSA Insertion within the Nuc Binding Site…
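An insertion of this kind shows up in the alignment as a run of columns where only one sequence carries residues and all others are gapped. A minimal sketch of that scan follows. The function name `private_insertions`, the `min_len` cut-off, and the toy sequences are all hypothetical.

```python
def private_insertions(aln, min_len=3):
    """Find runs of columns where exactly one sequence has residues.

    Returns {sequence_name: [(start, end), ...]} with 0-based, half-open column ranges.
    """
    names = list(aln)
    length = len(aln[names[0]])
    runs = {}
    current_owner, start = None, 0
    for i in range(length + 1):
        owner = None
        if i < length:
            carriers = [n for n in names if aln[n][i] != "-"]
            if len(carriers) == 1:
                owner = carriers[0]
        if owner != current_owner:
            # Close the run that just ended, keeping it only if it is long enough.
            if current_owner is not None and i - start >= min_len:
                runs.setdefault(current_owner, []).append((start, i))
            current_owner, start = owner, i
    return runs


# Toy alignment: seqC carries a four-residue insertion after a glycine-rich stretch.
toy = {
    "seqA": "GEGTYG----VVYKA",
    "seqB": "GEGAYG----TVFKA",
    "seqC": "GQGTFGPSLRKVYRA",
}
found = private_insertions(toy)
print(found)
```

A sequence flagged this way is a candidate for closer inspection, for example a sequencing error, a mispredicted exon, or a genuine family-specific insert.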

Identifying Abnormalities within an MSA

Activation loop (orange)

Identifying Abnormalities within an MSA Retinoblastoma

2 Using The Alignment: Analysing the Structure with The Alignment

The Evolutionary Trace

3 Using The Alignment: Spotting Differences

What makes a CDK not an Aurora
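One simple way to approach this question is to look for alignment columns that are invariant within each family but carry a different residue in each. The sketch below does this on toy data. The function name `discriminating_columns` and the sequences are hypothetical and are not taken from the CDK/Aurora alignment.

```python
def discriminating_columns(group_a, group_b):
    """Columns conserved inside each group but differing between the groups.

    Returns a list of (column_index, residue_in_a, residue_in_b), 0-based.
    """
    length = len(group_a[0])
    out = []
    for i in range(length):
        a = {s[i] for s in group_a}
        b = {s[i] for s in group_b}
        if len(a) == 1 and len(b) == 1 and a != b and "-" not in (a | b):
            out.append((i, next(iter(a)), next(iter(b))))
    return out


# Toy aligned rows for two families.
family_1 = ["GKLTE", "GRLTE"]
family_2 = ["GKLSD", "GRLSD"]

diffs = discriminating_columns(family_1, family_2)
print(diffs)
```

Requiring strict invariance is a deliberate simplification; with 29 real sequences a tolerance for a few exceptions per family would probably be needed.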

4 Clustering and Correlating

Function Trees Vs Lead Trees: 1-Select functionally important positions 2-Make a tree based on these positions 3-Compare the tree with the lead tree. PROBLEMS: – Choosing the right positions – Describing the leads with the right determinants
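The three steps can be sketched with distance matrices standing in for the trees. This is a simplification: the sketch computes pairwise distances over the selected positions and correlates them with lead-based distances, rather than building and comparing actual trees. The sequences, selected positions, and lead distances are hypothetical toy values.

```python
from itertools import combinations
from math import sqrt


def p_distance(a, b, positions):
    """Fraction of the selected alignment positions at which two sequences differ."""
    return sum(1 for i in positions if a[i] != b[i]) / len(positions)


def pairwise(aln, positions):
    """Distance for every pair of sequences, keyed by the sorted name pair."""
    return {(m, n): p_distance(aln[m], aln[n], positions)
            for m, n in combinations(sorted(aln), 2)}


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Step 1: toy alignment and hypothetical functionally important columns.
aln = {"k1": "ACDEF", "k2": "ACDEY", "k3": "GCHEY"}
positions = [0, 2, 4]

# Step 2: distances restricted to those columns (input for a function tree).
func_dist = pairwise(aln, positions)

# Step 3: compare with hypothetical lead-based distances between the same kinases.
lead_dist = {("k1", "k2"): 0.2, ("k1", "k3"): 0.9, ("k2", "k3"): 0.7}
keys = sorted(func_dist)
r = pearson([func_dist[k] for k in keys], [lead_dist[k] for k in keys])
print(f"correlation between function and lead distances: {r:.2f}")
```

A high correlation would suggest the chosen positions capture what distinguishes the leads, which is exactly where the two listed problems come in.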