Evidence for Widespread Reticulate Evolution within Human Duplicons

Evidence for Widespread Reticulate Evolution within Human Duplicons. Michael S. Jackson, Karen Oliver, Jane Loveland, Sean Humphray, Ian Dunham, Mariano Rocchi, Luigi Viggiano, Jonathan P. Park, Matthew E. Hurles, Mauro Santibanez-Koref. The American Journal of Human Genetics, Volume 77, Issue 5, Pages 824-840 (November 2005). DOI: 10.1086/497704. Copyright © 2005 The American Society of Human Genetics.

Figure 1 Examples of reticulate and bimutational quartets. See description of quartet classification in the “Material and Methods” section.

Figure 2 Estimate of reticulation-event density. A, Cladogram of c9orf36 alignment. Sequences are defined by their RPCI11 BAC clone names. The three partitions that support the tree (18, 19, and 1D) are indicated. B, Partition matrix of proximal 8.2 kb of c9orf36 alignment. Sites support 16 different partitions; the two sequence groups that define each partition are indicated by black and white circles above the matrix, and the partitions that support the tree are to the left of the vertical dashed line. Each informative position is represented by a separate row of squares (numbered on the right). The specific partition defined by each informative site is indicated by a white square containing a black dot. All partitions compatible with this partition are shown as white squares, and all partitions incompatible with it are shown in black. Positions that support alternative partitions are assumed to be the result of reticulation. The four reticulation events inferred from the data are numbered 1–4, and the maximal extents of the sequences affected are indicated by dashed horizontal lines.
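The compatibility shading described for the partition matrix can be illustrated with a short sketch. It assumes that each informative site has exactly two states and that two partitions are compatible when at least one of the four pairwise intersections of their sequence groups is empty (the usual split-compatibility criterion). The function names are hypothetical, and this is not the software used to produce the figure.

```python
from itertools import product


def site_partition(column):
    """Return the bipartition induced by a two-state column, or None.

    The partition is stored as the frozenset of sequence indices that
    share a state with sequence 0. Columns that are not two-state, or in
    which either group has fewer than two sequences, return None
    (treated here as uninformative; an assumption of this sketch).
    """
    if len(set(column)) != 2:
        return None
    group = frozenset(i for i, c in enumerate(column) if c == column[0])
    if min(len(group), len(column) - len(group)) < 2:
        return None
    return group


def compatible(p, q, n):
    """True if partitions p and q of n sequences can sit on one tree."""
    everyone = frozenset(range(n))
    sides_p = (p, everyone - p)
    sides_q = (q, everyone - q)
    # Compatible when at least one of the four intersections is empty.
    return any(not (a & b) for a, b in product(sides_p, sides_q))


def compatibility_matrix(columns):
    """Pairwise compatibility of the informative sites in `columns`."""
    n = len(columns[0])
    parts = [p for p in map(site_partition, columns) if p is not None]
    return [[compatible(p, q, n) for q in parts] for p in parts]


# Toy alignment columns for five sequences (illustrative data only).
matrix = compatibility_matrix(["AAGGG", "AAAGG", "AGAGG"])
```

In this toy example the first and third columns conflict, which is the kind of pattern the legend attributes to reticulation when sites support a partition other than those of the tree.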

Figure 3 Identification of reticulation events by use of phylogenetic profiling. A, Control and observed profiles of 21-kb section of 15q25 alignment created using a window size of 30 parsimony-informative sites. The extent of gene-related sequences is indicated. The X-axis shows position within alignment (in kb); the Y-axis shows correlation. B, NJ trees generated using subalignments from regions 1 and 2. The clades indicated with an asterisk (*) are supported by bootstrap values of 99%–100%. The scale (F84 distance) is the same for both trees. All sequences are indicated by the last three digits of their accession numbers. Sequences included are AC044860, AC127482, AC135735, AC135995, AC005630, and AC010725. AC127482 contains two copies of the duplication, A and B. C, Schematic structure of both SMA alleles (Var1 and Var2) adapted from Schmutz et al. (2004). The positions of the SMN1 and SMN2 genes are indicated. The extent of duplicated sequence is shown in gray, with the position of the most abundant duplicated segments (V1.1–V2.3) indicated. The gap in the sequences is represented by a pair of dashed lines. The scale is in megabases. D, Control and observed profiles spanning the ∼85-kb SMA-1 alignment, created using a window of 20 parsimony-informative sites. The X-axes show informative sites; the Y-axes show correlation. E, Parsimony networks of all six repeats within allele 1 (left) and all nine repeats within both alleles (right). Scale is in nucleotide differences. Sequences aligned (in order from V1.1 to V2.3) are AC138957, AC131392, AC138866, AC138959, AC138911, AC140139, AC139500, AC108108, and AC138930. Examples of alignments of informative sites used to generate the profiles are provided in figure 4.

Figure 4 Examples of sequence alignments used to generate profiles. Partial sequence alignments stripped of all invariant and uninformative sites are shown, to highlight changes in phylogenetic signal within the profiles presented in figures 3 and 5. Each alignment is shaded with respect to a reference sequence shown in gray, with all identities to the sequence shown in black. A, 15q24. B, SMA-1. C, 22qter. D, chAB4-2 minima 1. E, chAB4-2 minima 2. F, 22q11.1.

Figure 5 Reticulations identified by phylogenetic profiling. For ease of presentation, only parsimony-informative positions are plotted, with a window size of 80, 50, and 40 parsimony-informative positions in panels A, B, and C, respectively. The number of positions identical to a reference sequence within the alignment (used to calculate the correlation) is shown for both windows flanking the numbered minima. Thus, in panel A, AP006327 is identical to the reference at 66 of 80 parsimony-informative positions to the left of the minimum at position 257, but at only 15 of 80 sites to the right of the minimum. All numbered minima are >2 times lower than any observed in the control profiles (not shown). The chromosome 22q11.1 alignment (C) has >7 minima that exceed this control threshold.
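The flanking-window counts quoted in this legend (for example, 66 of 80 sites versus 15 of 80) can be reproduced in principle with a small counting routine. The sketch below is illustrative only: the function name, the 0-based indexing, and the convention that the right-hand window starts at the given position are assumptions, and it does not compute the correlation itself, which the legend does not define.

```python
def flank_identity(query, reference, position, window):
    """Count sites identical to the reference on each side of `position`.

    `query` and `reference` are equal-length strings holding only the
    parsimony-informative sites. The left window covers the `window`
    sites before `position`; the right window starts at `position`
    (0-based; this boundary convention is an assumption).
    """
    if len(query) != len(reference):
        raise ValueError("sequences must have the same number of sites")
    if position - window < 0 or position + window > len(query):
        raise ValueError("window does not fit on both sides of position")

    def matches(start, stop):
        pairs = zip(query[start:stop], reference[start:stop])
        return sum(a == b for a, b in pairs)

    left = matches(position - window, position)
    right = matches(position, position + window)
    return left, right


# Toy data: the query matches the reference up to site 5, then diverges.
left, right = flank_identity("AAAAACCCCA", "AAAAAAAAAA", position=5, window=4)
```

A sharp drop from a high left-hand count to a low right-hand count is the signature the legend describes at each numbered minimum.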

Figure 6 Delineation of putative hotspot in SMA-1 region. A, Output from SimPlot (version 3.2) developed by S. Ray (Lole et al. 1999) shows identity of all sequences within the SMA-1 alignment to AC138959 (500-bp window with a 20-bp step size). All nine sequences share ∼99.96% identity within the region of 59–67 kb. B, Detailed view, showing landmarks within the 56–70-kb region. The region of maximal identity between the sequences is defined by an L1PA3 fragment (∼58 kb) and a highly variable AT dinucleotide repeat (68 kb). A further L1PA3 repeat distal to this AT dinucleotide creates a flanking direct repeat (both L1PA3s span positions 5721–6155 of the consensus L1 sequence).
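The windowed identity plot described here can be approximated with a simple sliding-window calculation. The following is a minimal sketch, not SimPlot's own implementation: it assumes pre-aligned sequences of equal length, uses the 500-bp window and 20-bp step from the legend as defaults, and skips columns containing a gap character (an assumption). The function name is hypothetical.

```python
def window_identity(query, reference, window=500, step=20):
    """Percent identity to `reference` in sliding windows.

    Returns a list of (window midpoint, percent identity) tuples.
    Columns where either sequence has a gap ('-') are ignored, and
    windows made up entirely of gaps are skipped.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        pairs = [
            (a, b)
            for a, b in zip(query[start:start + window],
                            reference[start:start + window])
            if a != "-" and b != "-"
        ]
        if pairs:
            identity = 100.0 * sum(a == b for a, b in pairs) / len(pairs)
            points.append((start + window // 2, identity))
    return points


# Toy example with a 4-bp window and 2-bp step (illustrative data only).
profile = window_identity("ACGTACGTAC", "ACGTACGTTT", window=4, step=2)
```

Plotting one such profile per sequence against a common reference gives a picture comparable in spirit to panel A, where a plateau of near-complete identity marks the putative hotspot.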

Figure 7 Quartet analysis of multiple alignments. A, Reticulate quartets in CpG-positive data expressed as a percentage of all informative quartets. B, Bimutational quartets in CpG-positive data expressed as a percentage of all informative quartets. The insert shows the same data at a higher resolution. C, Reticulate quartets in CpG-negative data expressed as a percentage of all informative quartets. Bars on observed data show 95% bootstrap values, and bars on simulated data show 95% CIs. In the 22q11.1 CpG-negative alignment, reticulate quartets represent >50% of all informative quartets. This is a result of low bootstrap values within the NJ tree.

Figure 8 Reticulation in relation to sequence identity. Linear regression of log-transformed data is shown as a solid line (r² = 0.599), and 95% CIs are shown as dashed lines.
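For readers unfamiliar with this kind of fit, the sketch below shows an ordinary least-squares regression on log-transformed values together with its r². The legend does not say which variables were transformed or which logarithm was used, so transforming both axes with log10 is an assumption here, as is the function name.

```python
import math


def loglog_fit(xs, ys):
    """Least-squares line through (log10 x, log10 y).

    Returns (slope, intercept, r2). All inputs must be positive.
    """
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    syy = sum((y - mean_y) ** 2 for y in ly)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, r2 is the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Synthetic power-law data y = 2 * x**3 (not data from the paper).
xs = [1, 10, 100, 1000]
slope, intercept, r2 = loglog_fit(xs, [2 * x ** 3 for x in xs])
```

On this scale an r² of 0.599 would mean that roughly 60% of the variance in the transformed response is accounted for by the fitted line.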

Figure 9 Tract length increase in duplicons. The ratio of observed to expected tract lengths is shown for control and duplicon alignments.

Figure 10 Reticulation-event density in duplicons. Analysis of 11 alignments for which the expected frequency of reticulate quartets is negligible. All show a >20-fold excess of reticulate quartets relative to the expectation, with expected frequencies in 100 control alignments <0.5% of the observed value at the 50th percentile and <2.0% of the observed value at the 95th percentile. Analyses were performed on CpG-negative data, and the minimum number of events was estimated as shown in figure 2.

Figure 11 Distribution of sites indicating suboptimal trees. Partimatrix output shows parsimony-informative sites from the central region of the 15q25 alignment (11.0–33.12 kb). The three partitions that support the tree are to the left of the red line (7, 3, and 17), and the clustering of sites supporting each partition is indicated. If the tree is an accurate representation of the phylogenetic relationships, then the positions supporting each partition should be randomly distributed. For explanation of output, see figure 2 legend.