PolyQ and neurodegenerative diseases


ProRepeat: a comprehensive directory of exact tandem repeats in proteins

PolyQ and neurodegenerative diseases
Nine diseases are caused by polyQ repeats: HD, DRPLA, SCA 1, 2, 3, 6, 7 and 17, and Kennedy’s disease (SBMA)

Androgen receptor (AR)
Transcription factor mediating the effect of androgens on gene expression
Gene located on the X chromosome, divided into three functional regions: one variable, the N-terminal transregulation domain (NTD), and two highly conserved, the DNA-binding domain and the C-terminal ligand-binding domain (LBD)
Involved in differentiation between male and female phenotype
Responsible for SBMA (spinal and bulbar muscular atrophy), or Kennedy’s disease
Céline Poux, RU

Androgen receptor (AR)
Polymorphic polyQ repeat in the NTD ranges between 9 and 35 residues, with an average of 20 to 25 depending on ethnic origin
Transcriptional activity depends on the intramolecular interaction between the NTD and the LBD, and is inversely correlated with the length and flexible structure of the polyQ tract
Differences in polyQ tract length can have important consequences:
■ longer tracts: feminization syndromes
■ shorter tracts: prostate cancer susceptibility
■ repeat exceeds 40 residues: SBMA
NTD can contain other repeat tracts in mammals, such as polyP, polyG or polyQ
Céline Poux, RU

Androgen receptor (AR)
[Diagram of the AR transcription factor, NH3- to -COOH. Labels: transcriptional regulation, DNA binding, hormone binding; T1, T2, T3; Region 1, Region 2, Region 3]
polyQ tract length has important consequences:
■ shorter tracts: prostate cancer susceptibility
■ longer tracts: feminization syndromes
■ over 40 residues: SBMA (spinal and bulbar muscular atrophy), or Kennedy’s disease
Normal range: 9-35 residues, average of 20-25 depending on ethnic origin

I will present one of these genes: the androgen receptor gene. We chose this gene because it was one of the two polyQ genes involved in a neurodegenerative disease for which the function of the protein was known. The androgen receptor is a transcription factor that mediates the effect of androgens on gene expression. Its action is important in the differentiation between male and female phenotypes. The protein is encoded by a gene on the X chromosome and is divided into three main regions: the transregulation domain, encoded by the first exon, the DNA-binding domain and the ligand-binding domain. No deleterious mutation has been recorded in the first exon other than in the CAG repeat domain. The length of the repeat seems to modulate the activity of the protein: the shorter the tract, the stronger the effect of the protein. When the number of repeats increases beyond the normal length, individuals have, besides feminization problems, a high chance of developing spinal and bulbar muscular atrophy (Kennedy’s disease). A few poly-amino-acid repeats were already known in the first exon: the polyQ responsible for the disease, a polyproline, and a polyglycine at the end of the exon. After sequencing more than 30 mammalian species, it turned out that the androgen receptor is a real “slippery protein”, with many repeats that are often short and occur in only a few species, such as a polyalanine repeat in the kangaroo.
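
As an illustration of the tract-length ranges quoted for the androgen receptor, the following is a minimal Python sketch that measures the longest uninterrupted glutamine run in a protein sequence and reports which quoted range it falls in. The function names and the toy sequence are hypothetical and not part of the presentation, and the sketch is not a diagnostic tool: lengths that the slides do not cover (below 9, or 36 to 40) are simply reported as such.

```python
import re


def longest_polyq(protein: str) -> int:
    """Return the length of the longest uninterrupted run of glutamine (Q)."""
    runs = re.findall(r"Q+", protein.upper())
    return max((len(run) for run in runs), default=0)


def describe_tract(length: int) -> str:
    """Map a tract length onto the ranges quoted on the slides (assumed cut-offs)."""
    if length > 40:
        return "above 40 residues: range associated with SBMA"
    if 9 <= length <= 35:
        return "within the polymorphic range of 9 to 35 residues"
    return "outside the ranges given on the slides"


# Hypothetical toy sequence, NOT the real androgen receptor sequence.
toy = "MEVQLGLGRV" + "Q" * 22 + "ETSPRQQQ"
print(longest_polyq(toy), "->", describe_tract(longest_polyq(toy)))
```

Only pure runs of Q are counted here; tracts interrupted by other residues would need a more tolerant definition.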

PolyQ in AR
Collection of polyQ repeats:
792 human individuals available from earlier study (Edwards, 1992)
26 armadillo individuals sequenced by CP
77 mammals and marsupials from protein database
Céline Poux, RU

What about repeats in other proteins? ProRepeat database
Data sources: UniProt and RefSeq
Limited to exact tandem repeats
Standard, linear-time suffix tree algorithm
Stored in Oracle 10g
Interface in PHP5
Minimum number of repetitions per unit length:
unit length    repetitions
1              ≥ 5
2              ≥ 4
3              ≥ 3
4 .. N         ≥ 2
Maarten van den Bosch, WUR
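
The repetition thresholds above can be illustrated with a short Python sketch. Note that this is a naive quadratic scan written for readability, not the linear-time suffix tree algorithm named on the slide; the function names, the `max_unit` parameter and the toy sequence are assumptions made for this example.

```python
def min_repetitions(unit_length: int) -> int:
    """Thresholds from the slide: 1 -> 5, 2 -> 4, 3 -> 3, 4..N -> 2."""
    return {1: 5, 2: 4, 3: 3}.get(unit_length, 2)


def is_primitive(unit: str) -> bool:
    """False if the unit is itself a repeat of a shorter unit (e.g. 'QQ', 'DEDE')."""
    return (unit + unit).find(unit, 1) == len(unit)


def find_exact_tandem_repeats(seq: str, max_unit: int = 10):
    """Return (start, unit, repetitions) tuples; start is a 0-based index."""
    hits = []
    for u in range(1, max_unit + 1):
        i = 0
        while i + u <= len(seq):
            unit = seq[i:i + u]
            reps = 1
            # Count how many times the unit follows itself without a gap.
            while seq[i + reps * u:i + (reps + 1) * u] == unit:
                reps += 1
            if reps >= min_repetitions(u) and is_primitive(unit):
                hits.append((i, unit, reps))
                i += reps * u  # continue after the reported repeat
            else:
                i += 1
    return hits


# Hypothetical toy protein with a polyQ run, a DE repeat and a PGS repeat.
toy = "MK" + "Q" * 6 + "A" + "DE" * 4 + "K" + "PGS" * 3
for hit in find_exact_tandem_repeats(toy):
    print(hit)
```

Skipping non-primitive units keeps a run such as QQQQQQ from being reported a second time with unit QQ or QQQ.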

Simple query syntax: e.g. “Q” or “DE”
DE is equivalent to ED; DEF is equivalent to EFD and FDE
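
The equivalence of DE and ED, or of DEF, EFD and FDE, amounts to treating repeat units as equal when one is a cyclic rotation of the other. One way to sketch this in Python is to reduce each unit to a canonical rotation before comparing. Whether ProRepeat implements the equivalence this way is not stated on the slides; the function names are hypothetical.

```python
def canonical_unit(unit: str) -> str:
    """Return the alphabetically smallest cyclic rotation of a repeat unit."""
    if not unit:
        return unit
    return min(unit[i:] + unit[:i] for i in range(len(unit)))


def same_repeat(a: str, b: str) -> bool:
    """True if the two units are cyclic rotations of each other."""
    return len(a) == len(b) and canonical_unit(a) == canonical_unit(b)


print(same_repeat("DE", "ED"))    # rotations of each other
print(same_repeat("DEF", "FDE"))  # rotations of each other
print(same_repeat("DEF", "DFE"))  # a different order, not a rotation
```

Storing or indexing the canonical form would let a query for “ED” retrieve repeats recorded as “DE” with a plain equality lookup.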

Or use ProSite syntax: e.g. “[DE]-{P}-X(0,1).”
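
To show what such a pattern matches, here is a minimal Python sketch that translates a PROSITE-style pattern into a regular expression. It handles only the basic elements (`-` separators, `[..]` allowed residues, `{..}` excluded residues, `x` for any residue, `(n)` or `(n,m)` repetition, `<` and `>` anchors, and the trailing period) and is an assumption-laden simplification, not the parser used by ProRepeat.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE pattern into a Python regular expression."""
    regex = pattern.strip().rstrip(".")    # drop the terminating period
    regex = regex.replace("-", "")         # element separators
    regex = regex.replace("{", "[^").replace("}", "]")  # excluded residues
    regex = regex.replace("(", "{").replace(")", "}")   # repetition counts
    regex = regex.replace("x", ".").replace("X", ".")   # any residue
    regex = regex.replace("<", "^").replace(">", "$")   # terminal anchors
    return regex


regex = prosite_to_regex("[DE]-{P}-X(0,1).")
print(regex)
print(bool(re.fullmatch(regex, "DA")), bool(re.fullmatch(regex, "DP")))
```

For the pattern on the slide, the result reads as: D or E, then any residue except P, then at most one further residue of any kind.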

Taxonomic distributions of hits

Sorting/grouping options: Identifier, Repeat unit, Repetitions, Unit length, Length, Start location, End location, Protein, Taxonomy, Ontology

Link to DNA data
DNA coding sequences of available repeats also stored in the database
Extracted from EMBL and/or RefSeq
Hong Luo, WUR

Link to DNA data / errors
Approximately 3% of corresponding nucleotide sequences cannot be retrieved
Errors caused by:
■ No links to nucleotide database (35%): NO_ANNOTATED_CDS, no EMBL links
■ Annotation errors in the nucleotide database (65%), e.g. error type III: join of the complements of exon1, exon2, exon3 etc. instead of the complement of the join
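
Why the order of “join” and “complement” matters for a reverse-strand gene can be shown with a small Python sketch. It assumes that the error described is a location that lists the per-exon complements in forward exon order; the toy genome, coordinates and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna: str) -> str:
    """Reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def extract(genome: str, start: int, end: int) -> str:
    """Slice with 1-based inclusive coordinates, as in feature-table locations."""
    return genome[start - 1:end]


def complement_of_join(genome, exons):
    """complement(join(e1,e2,...)): join first, then reverse-complement."""
    return revcomp("".join(extract(genome, s, e) for s, e in exons))


def join_of_complements(genome, exons):
    """join(complement(e1),complement(e2),...): exons kept in the given order."""
    return "".join(revcomp(extract(genome, s, e)) for s, e in exons)


genome = "ATGGCCTTTAAACGA"   # hypothetical toy sequence
exons = [(1, 3), (7, 9)]     # hypothetical exon coordinates

print(complement_of_join(genome, exons))         # AAACAT
print(join_of_complements(genome, exons))        # CATAAA
print(join_of_complements(genome, exons[::-1]))  # AAACAT
```

The two forms agree only when the per-exon complements are joined in reverse exon order, so a location written in forward order yields a scrambled coding sequence.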

Guido Kappé, RU

[Figure: chart of amino acids T, S, Q, G, P, A, E]

Distribution of amino acids in subgroups of peptides within a proteome. The subgroups examined are single amino acid repeats (SAA), all repeats minus the SAAs, and all proteins (composition) minus the repeats.

For comparison, Arabidopsis, where Ser is the most abundant amino acid in SAAs.

Current work
Annotation of repeats versus function
Adding imperfect tandem repeats, a.k.a. approximate tandem repeats (ATR), to the database
Offering remote access via web services (WSDL and BioMoby)
Expansion of the analysis capabilities of the interface

PolyQ in AR (reprise)
Impure tracts (interrupted mainly by CAA, CCG, and CGG) are longer and more variable than pure CAG tracts
Presence of other codons is better explained by codon duplication than by multiple point mutations, suggesting that interrupting codons are part of the elongation process, rather than hampering its dynamics as proposed previously
Negative correlation between the lengths of the different CAG tracts, suggesting a maximal expansion length that the protein can handle without being deleterious
Céline Poux, RU

Acknowledgements
Wageningen University and Research Centre: Maarten van den Bosch, Hong Luo, Mark Kramer, Harm Nijveen
Radboud University, Nijmegen: Guido Kappé, Céline Poux, Wilfried W. de Jong
This work was supported in part by project grants from NWO/BMI (GK, CP) and the NBIC/BioAssist program (HN)

Thank you for your attention! See also our posters on phylogenetic domain visualisation (TreeDomViewer) and microarray (re)annotation at the ISMB Post-doc positions available: contact Jack.Leunissen@wur.nl or jack@bioinformatics.nl