Journal Club Jenny Gu October 24, 2006. Introduction Defining the subset of Superfamilies in LUCA Examine adaptability and expansion of particular superfamilies.


1 Journal Club Jenny Gu October 24, 2006

2 Introduction Defining the subset of Superfamilies in LUCA Examine adaptability and expansion of particular superfamilies of LUCA related to function and genome size. Challenged Woese’s Annealing hypothesis.

3 3-D Structural Comparison. Domain similarity defined by: SSAP, a dynamic-programming-based structure comparison algorithm; CORA, comparison to 3D templates for each superfamily; manual inspection. Profile-based approaches: detect sequence patterns between relatives. Functional information: public resources (COGs, GO, KEGG) and literature; expert curators. Methods

4 Genome Structural Annotation and Occurrence Profiles. Dataset: 114 complete genomes: 100 prokaryotic genomes (85 bacterial and 15 archaeal species) and 14 eukaryotic genomes. Structural annotation: CATH HMMs -> Gene3D database. Superfamily domain occurrence profiles (prokaryotes): 940 of 1278 CATH domain superfamilies present in at least one genome. Annotation coverage: 50% of genes. Methods

5 Ancestral Superfamily Set Selection Defined by: Present in at least 90% of species from all kingdoms. Present in at least 70% archaeal and eukaryotic species. Definition avoids selection of superfamilies overrepresented in Bacteria but poorly represented in smaller groups. Flexibility for considering false-negative prediction error with sequence based approach. Guarantee selection of families in LUCA. Eliminate error introduced by horizontal gene transfer. Methods
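
The selection rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the presenter's code: it assumes the rule means "present in at least 90% of all species pooled together, and in at least 70% of archaeal species and at least 70% of eukaryotic species taken separately". The function names, the kingdom labels, and the toy species are all hypothetical.

```python
def fraction_present(presence, species):
    """Fraction of the given species in which the superfamily is present."""
    species = list(species)
    if not species:
        return 0.0
    return sum(1 for s in species if presence.get(s, False)) / len(species)


def is_ancestral(presence, kingdom_of, overall_min=0.90, minority_min=0.70):
    """Apply the two thresholds from the slide (interpretation is an assumption).

    presence:   dict species -> bool (superfamily detected in that genome)
    kingdom_of: dict species -> "Bacteria" | "Archaea" | "Eukaryota"
    """
    # Threshold 1: at least 90% of all species, kingdoms pooled.
    if fraction_present(presence, kingdom_of) < overall_min:
        return False
    # Threshold 2: at least 70% within each of the smaller groups, so that
    # a superfamily common only in Bacteria is not selected.
    for kingdom in ("Archaea", "Eukaryota"):
        members = [s for s, k in kingdom_of.items() if k == kingdom]
        if fraction_present(presence, members) < minority_min:
            return False
    return True


# Toy data (hypothetical): 6 bacteria, 2 archaea, 2 eukaryotes.
kingdom_of = {f"b{i}": "Bacteria" for i in range(1, 7)}
kingdom_of.update({"a1": "Archaea", "a2": "Archaea",
                   "e1": "Eukaryota", "e2": "Eukaryota"})

everywhere = {s: True for s in kingdom_of}
missing_one_archaeon = dict(everywhere, a2=False)
```

With the toy data, `everywhere` passes both thresholds, while `missing_one_archaeon` fails the second one because only half of the archaeal species carry the superfamily.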

6 Functional Annotation Automatic Functional Annotation for 940 structural superfamilies annotated in 100 prokaryotic species with COG. Superfamily functionally classified according to statistically most represented functional COG subcategory. 726/940 superfamilies annotated in COG (5% or more of species, at least 5 genes) For ancestral superfamily, further annotation with Pfam and literature. Methods
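
A rough Python sketch of the classification step follows. It is not the method used in the paper: the slide says "statistically most represented", whereas this sketch uses a plain plurality vote and no significance test. It also assumes the thresholds mean "COG hits in at least 5% of species and at least 5 annotated genes". The function name and the input layout are hypothetical.

```python
from collections import Counter


def classify_superfamily(hits, n_species_total,
                         min_species_frac=0.05, min_genes=5):
    """Return the most frequent COG category for one superfamily, or None.

    hits: list of (species, cog_category) pairs, one per annotated gene.
    Returns None when the superfamily does not meet the coverage thresholds.
    """
    if len(hits) < min_genes:
        return None
    species_seen = {species for species, _ in hits}
    if len(species_seen) / n_species_total < min_species_frac:
        return None
    counts = Counter(category for _, category in hits)
    # Plurality vote; ties resolve to the category seen first.
    category, _ = counts.most_common(1)[0]
    return category


# Toy input (hypothetical): six genes from six of 100 species.
toy_hits = [("sp1", "Metabolism"), ("sp2", "Metabolism"),
            ("sp3", "Translation"), ("sp4", "Metabolism"),
            ("sp5", "Metabolism"), ("sp6", "Translation")]
```

Here `classify_superfamily(toy_hits, 100)` returns "Metabolism"; a superfamily with fewer than five annotated genes is left unclassified.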

7 Definition of the Superfamily Functional Groups. COG has six functional groups: Translation, Replication, Metabolism, Cellular Process, Transcription, Poorly Characterized. Not considered: RNA processing and modification; Chromatin structure and dynamics. Methods

8 Superfamily Functional Distribution in the Ancestral Domain Set 140 superfamilies found in all organisms of the three main kingdoms (Bacteria, Archaea, and Eukaryotes) 15% of Superfamilies, 55% of all domains in bacterial genes, and 18% of all domains in eukaryotes. Results and Discussion

9 Superfamily Functional Distribution in the Ancestral Domain Set (cont..) Representatives in all six COG functional groups. Translation (48 superfamilies) and Metabolic (46 superfamilies) comprise majority of ancestral domains. Metabolism (385 superfamilies) has undergone a higher expansion than translation (90 superfamilies). Results and Discussion

10 Analysis of the Cellular Functions of Ancestral CATH Superfamilies in the LUCA Two issues in defining ancestry: Domain ubiquity through all species. Probable functions such domains could have performed in LUCA. Results and Discussion

11 Analysis of the Cellular Functions of Ancestral CATH Superfamilies in the LUCA Results and Discussion

12 Analysis of the Cellular Functions of Ancestral CATH Superfamilies in the LUCA Results and Discussion Interconversion of sugars and synthesis of polysaccharides. Synthesis of ATP and partial equilibrium of NAD/NADH Part of the Calvin Cycle Pentose phosphate pathway Acetyl-CoA for cholesterol and/or steroids and synthesis and degradation of fatty acids. Part of the Krebs Cycle

13 Analysis of the Cellular Functions of Ancestral CATH Superfamilies in the LUCA Results and Discussion Nucleotide metabolism incomplete. Two alternatives for LUCA: synthesized nucleotides by de novo pathways, or incorporated them from the surrounding soup. Enzymes for interconversion of nucleoside monophosphates are present.

14 Analysis of the Cellular Functions of Ancestral CATH Superfamilies in the LUCA Results and Discussion DNA synthesis, repair, ligation, and modification are represented. Synthesis of RNA and DNA transcription represented. Domains related to the ribosomal particle and protein synthesis are abundant. Methyl transfer proteins.

15 Analysis of the Cellular Functions of Ancestral CATH Superfamilies in the LUCA Results and Discussion Membrane and cell wall biogenesis. Transduction of protein-protein signals and gene regulation. Protein signal recognition for protein transport. Cell division. Electron transport and ATP synthase.

16 Universal Distribution Percentage of Superfamilies Universal Distribution Percentages Superfamily occurrence profiles derived from the prokaryotic sample (Archaea and Bacteria) 100% = Superfamily present in all species. 0% = Superfamily has highly specific distribution in just a few species. Methods
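
As a minimal illustration of the percentage described on this slide, the sketch below counts the species in which a superfamily occurs at least once. The function name and the dict layout of the occurrence profile are assumptions made for the example.

```python
def universal_distribution_percentage(occurrence_profile):
    """Percentage of species in which the superfamily occurs at least once.

    occurrence_profile: dict species -> number of domain copies in that genome.
    """
    if not occurrence_profile:
        raise ValueError("occurrence profile is empty")
    present = sum(1 for copies in occurrence_profile.values() if copies > 0)
    return 100.0 * present / len(occurrence_profile)


# Toy profile (hypothetical): present in two of four species.
toy_profile = {"sp1": 3, "sp2": 0, "sp3": 1, "sp4": 0}
```

A value of 100 corresponds to a superfamily found in every sampled species; values near 0 correspond to a superfamily restricted to a few species.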

17 Ancestry and Evolutionary Temperature Results and Discussion

18 Ancestry and Evolutionary Temperature Results and Discussion

19 Superfamily Duplication Rates and Functional Diversification Another measure to gauge evolutionary temperature. Number of homologues within a superfamily. Observed high correlation with duplication and functional diversification. Results and Discussion

20 Superfamily Duplication Rates and Functional Diversification High universality spans across more function subcategories. Metabolism has a higher duplication rate and functional diversification than translation. Results and Discussions

21 Genome Size Correlation and the Coefficient of Interspecies Gene Variation (CIGV) of Superfamilies. Domain occurrence profiles from the 100-genome prokaryotic sample. Correlation coefficients between occurrence and genome size (compared to a randomly generated null model). CIGV calculated by dividing the standard deviation by the mean of all values of the occurrence profile for a given superfamily. Methods
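
The two quantities on this slide can be sketched with the Python standard library. Assumptions: the correlation is taken to be Pearson's r (the slide does not name the coefficient), CIGV is treated as a coefficient of variation (standard deviation divided by the mean), and the population standard deviation is used. The random null model is not reproduced here. Function names are hypothetical.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def cigv(occurrences):
    """Coefficient of variation of an occurrence profile (assumed CIGV form)."""
    mean = statistics.fmean(occurrences)
    if mean == 0:
        raise ValueError("CIGV undefined when the superfamily never occurs")
    return statistics.pstdev(occurrences) / mean


# Toy data (hypothetical): domain copies per genome and genome sizes in genes.
copies = [2, 4, 6, 8]
genome_sizes = [1000, 2000, 3000, 4000]
```

In this toy case the copy number scales exactly with genome size, so the correlation is 1; a profile with the same copy number in every genome has a CIGV of 0.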

22 Statistical Analysis of Superfamily Distributions Kolmogorov-Smirnov two-sample test in the two- tailed version for large samples. Compared pairs of distribution between different functional groups. Methods
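
For readers unfamiliar with the test, here is a small self-contained sketch of a two-sample, two-tailed Kolmogorov-Smirnov comparison using the large-sample (asymptotic) p-value. It is an illustration only, not the analysis code behind the slides; the function name is hypothetical, and in practice an established routine such as `scipy.stats.ks_2samp` would normally be preferred.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b):
    """Return (D, p) for the two-sample KS test with an asymptotic p-value.

    D is the largest absolute difference between the two empirical CDFs.
    The p-value uses the Kolmogorov series and is only reliable for
    large samples.
    """
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so check those.
    for v in set(a) | set(b):
        fa = bisect_right(a, v) / n
        fb = bisect_right(b, v) / m
        d = max(d, abs(fa - fb))
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and the p-value is 1 to many digits.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))
```

Applied to this analysis, `a` and `b` would be the per-superfamily values (for example CIGV or genome-size correlation) of two functional groups, and a small p-value would suggest that the two groups are distributed differently.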

23 Superfamily Occurrence Profiles and Genome Size Correlation Results and Discussions

24 Superfamily Occurrence Profiles and Genome Size Correlation Results and Discussions

25 Superfamily Occurrence Profiles and Genome Size Correlation Results and Discussions

26 Superfamily Coefficient of Interspecies Gene Variation Results and Discussions High CIGV values = more adaptable (hotter evolutionary temperature). Low CIGV values = less adaptable.

27 Superfamily Coefficient of Interspecies Gene Variation Results and Discussions

28 Rates of Superfamily Innovation in the Functional Groups Results and Discussions Poor Innovation High Innovation

29 Conclusions A more realistic distribution of superfamilies in distant species. Life achieved modern cellular status long before separation of the three kingdoms. Woese’s annealing hypothesis called into question. A function of specific features and adaptabilities versus time.

