1
Sequence and structure databanks can be divided into many different categories. One of the most important distinctions is whether a gatekeeper supervises the entries.

Supervised databanks with a gatekeeper. Examples: Swiss-Prot, RefSeq (at NCBI). Entries are checked for accuracy.
+ more reliable annotations
-- frequently out of date

Repositories without a gatekeeper. Examples: GenBank, EMBL, TrEMBL. Everything is accepted.
+ everything is available
-- many duplicates
-- poor reliability of annotations
2
Description of the Group B Streptococcus (GBS) Pan-genome
Genome comparisons of 8 closely related GBS strains
Tettelin, Fraser et al., PNAS 2005 Sep 27;102(39)
3
Method
4
Bacterial Core
Genes that are shared among all Bacteria
Bit score cutoff 50.0 (~10E-4)
$f(x) = A_1 e^{-k_1 x} + A_2 e^{-k_2 x} + A_3 e^{-k_3 x} + \text{Plateau}$
5
Genes without homologs
$f(x) = A_1 e^{-k_1 x} + A_2 e^{-k_2 x} + A_3 e^{-k_3 x} + A_4 e^{-k_4 x} + A_5 e^{-k_5 x} + \text{Plateau}$
6
Decomposed function
7
Core: essential genes (replication, energy, homeostasis). ~116 gene families.
Extended Core: set of genes that define groups or species (symbiosis, photosynthesis). ~17,060 gene families.
Accessory Pool: genes that can be used to distinguish strains or serotypes (mostly genes of unknown function). ~114,800 gene families uncovered so far.
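As a quick arithmetic check, the family counts on this slide can be turned into shares of all gene families found so far. This is only a sketch: the counts are the approximate figures quoted above, and the dictionary name `families` is illustrative. These shares describe the pooled set of families, which is a different quantity from the gene frequency within an individual genome.

```python
# Approximate gene family counts quoted on the slide.
families = {
    "Core": 116,
    "Extended Core": 17060,
    "Accessory Pool": 114800,
}

total = sum(families.values())

# Share of each category among all gene families uncovered so far.
shares = {name: 100.0 * count / total for name, count in families.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}% of {total} gene families")
```

The accessory pool dominates the pooled family count even though each individual family in it occurs in few genomes.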
8
Gene frequency in individual genomes (chart). Values shown: 76.6%, 3.8%, 19.6%. Categories: Core, Extended Core, Accessory Pool.
9
$f(x) = A_1 e^{-k_1 x} + A_2 e^{-k_2 x} + A_3 e^{-k_3 x} + A_4 e^{-k_4 x} + A_5 e^{-k_5 x} + \text{Plateau}$

Time constants: $1/k_1 = 0.48$, $1/k_2 = 2.3$, $1/k_3 = 10.16$, $1/k_4 = 31.40$, $1/k_5 = 162.6$
Amplitudes: $A_1 = 939.4$, $A_2 = 731.1$, $A_3 = 455.2$, $A_4 = 328.6$, $A_5 = 385.5$
x-axis: number of genomes added
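To make the five-component function concrete, the following minimal sketch evaluates it with the amplitudes and time constants listed on this slide. The slide does not state the plateau value, so `PLATEAU = 230.0` is an assumption borrowed from the "~230 novel genes per genome" figure on the Kézdy-Swinbourne slide. The function names are illustrative, and whether x starts at 0 or 1 in the original fit is not known.

```python
import math

# Amplitudes A_i and time constants 1/k_i as listed on the slide.
AMPLITUDES = [939.4, 731.1, 455.2, 328.6, 385.5]
TIME_CONSTANTS = [0.48, 2.3, 10.16, 31.40, 162.6]  # 1/k_i, in units of genomes

# ASSUMPTION: the plateau is not given on this slide; 230 is taken from the
# "~230 novel genes per genome" estimate on the Kezdy-Swinbourne plot slide.
PLATEAU = 230.0


def components(x):
    """Return the five decaying terms A_i * exp(-k_i * x) separately."""
    return [a * math.exp(-x / tau) for a, tau in zip(AMPLITUDES, TIME_CONSTANTS)]


def novel_genes(x, plateau=PLATEAU):
    """Evaluate f(x) = sum_i A_i * exp(-k_i * x) + plateau."""
    return sum(components(x)) + plateau


for x in (0, 1, 10, 100, 1000):
    print(x, round(novel_genes(x), 1))
```

Printing `components(x)` for increasing x shows the fast terms vanishing first, so that only the slowest component (1/k_5 = 162.6) and the plateau matter once many genomes have been added.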
10
Kézdy-Swinbourne Plot
x-axis: novel genes after looking in x genomes
y-axis: novel genes after looking in x + ∆x genomes
Estimate: ~230 novel genes per genome
11
A Kézdy-Swinbourne plot can be used to estimate the value that a decay function approaches as time goes to infinity.

Assume the simple decay function $f(x) = K + A e^{-kx}$; then $f(x + \Delta x) = K + A e^{-k(x + \Delta x)}$.

Through elimination of A: $f(x + \Delta x) = e^{-k \Delta x} f(x) + K'$, where $K' = K (1 - e^{-k \Delta x})$.

For the plot of $f(x + \Delta x)$ against $f(x)$ the slope is $e^{-k \Delta x}$.

For $x \to \infty$ both $f(x)$ and $f(x + \Delta x)$ approach the same constant: $f(x) \to K$, $f(x + \Delta x) \to K$ (see the definition of the decay function). The fitted line therefore crosses the diagonal $f(x + \Delta x) = f(x)$ at the limiting value K.

The Kézdy-Swinbourne plot is rather insensitive to deviations from a simple single-component decay function.

More at Hiromi K: Kinetics of Fast Enzyme Reactions. New York: Halsted Press (Wiley); 1979
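The estimate described above can be sketched in a few lines. Because the line through the points (f(x), f(x+∆x)) has slope e^(-k∆x) and intercept K(1 - e^(-k∆x)), the limit follows as K = intercept / (1 - slope). The example below uses synthetic data rather than the genome counts from the presentation; the parameter values, the `lag` argument, and the function name `kezdy_swinbourne_limit` are all illustrative assumptions.

```python
import numpy as np


def kezdy_swinbourne_limit(f_values, lag=1):
    """Estimate the limit K of a decay curve from equally spaced samples.

    Fits a straight line to f(x + lag) plotted against f(x) and returns
    (K, slope), where K = intercept / (1 - slope).
    """
    f = np.asarray(f_values, dtype=float)
    earlier = f[:-lag]   # f(x)
    later = f[lag:]      # f(x + lag)
    slope, intercept = np.polyfit(earlier, later, 1)
    return intercept / (1.0 - slope), slope


# Synthetic single-exponential decay (hypothetical values, not the slide data).
K_TRUE, A, RATE = 230.0, 1500.0, 0.15
x = np.arange(0, 40)
f = K_TRUE + A * np.exp(-RATE * x)

K_est, slope = kezdy_swinbourne_limit(f, lag=2)
print("estimated limit:", K_est)
print("slope:", slope, "expected:", np.exp(-RATE * 2))
```

For noise-free single-exponential data the points lie exactly on a line, so the estimate reproduces the true limit. With real counts the fit is approximate, and a multi-component decay bends the points slightly away from a straight line.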