
2 Sequence Specific DNA Uptake
Genetic exchange & bacterial evolution
DNA uptake is a primitive form of genetic exchange
Some important human pathogens have DNA uptake systems
–Haemophilus influenzae, Neisseria meningitidis, N. gonorrhoeae, etc.
H. influenzae and N. meningitidis preferentially take up homologous DNA by recognizing an uptake-specific sequence (USS)

3 USS in H. influenzae
1.86 Mbp H. influenzae genome has 1471 copies of the 29-base USS
USS has a 9-base oligo, AAGTGCGGT, that is 100% conserved
1471 copies: 100 times the statistical average
–Occupies 2.4% of whole genome
Questions:
–Why so many?
–How did this evolve?
–What is the cost?
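
The "100 times statistical average" figure can be sanity-checked with a short calculation. This is only a sketch: it assumes all four bases are equally likely and that occurrences are counted on both strands, neither of which is stated on the slide. The function name `expected_kmer_count` is illustrative.

```python
# Assumptions (not from the slides): uniform base composition,
# occurrences counted on both strands of the genome.
GENOME_BP = 1_860_000   # genome size quoted on the slide
CORE_LEN = 9            # length of the conserved core AAGTGCGGT
USS_LEN = 29            # full USS length quoted on the slide
OBSERVED = 1471         # observed USS copies quoted on the slide


def expected_kmer_count(genome_bp, k, strands=2):
    """Expected number of occurrences of one specific k-mer by chance."""
    return strands * genome_bp / 4 ** k


expected = expected_kmer_count(GENOME_BP, CORE_LEN)
enrichment = OBSERVED / expected
genome_fraction = OBSERVED * USS_LEN / GENOME_BP

print(f"expected copies by chance: {expected:.1f}")
print(f"observed / expected:       {enrichment:.0f}x")
print(f"fraction of genome:        {genome_fraction:.1%}")
```

Under these assumptions the chance expectation is about 14 copies, so 1471 copies is roughly a hundredfold excess, in line with the slide. The same arithmetic puts the genome fraction near 2.3%, close to the 2.4% quoted; the small difference may come from the exact genome or USS length used.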

4 Distribution of USS in H. influenzae
68% (975/1471) of USS are in 38% (656/1378) of genes.
433 genes have one USS, 152 have two, 56 have three, 8 have four, 6 have five, and one (HI1685) has eight USS.
–Focus on genes with a single USS
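
As a quick consistency check, the per-gene counts on this slide can be tallied to recover the two totals it quotes (656 genes, 975 USS). The dictionary name `uss_per_gene` is illustrative.

```python
# Number of genes (value) carrying a given number of USS (key), from the slide.
uss_per_gene = {1: 433, 2: 152, 3: 56, 4: 8, 5: 6, 8: 1}

genes_with_uss = sum(uss_per_gene.values())
uss_in_genes = sum(n_uss * n_genes for n_uss, n_genes in uss_per_gene.items())

print(genes_with_uss, uss_in_genes)
```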

5 USS and UEP
USS: 9-base oligo embedded in a gene (DNA)
UEP: USS-encoded peptide (amino acids); when the gene is expressed as protein, the USS is translated into a 3- or 4-residue peptide
–3-residue UEPs: 60% (39/618) are TAL
–4-residue UEPs: 63% (269/426) contain SAV
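
Why a UEP has 3 or 4 residues, and where TAL and SAV come from, can be illustrated by translating the 9-base core in different reading frames. The sketch below assumes the standard genetic code and looks only at the 9-base core, not the full 29-base USS; the helper names (`translate`, `spanning_peptides`) are illustrative, not taken from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

USS_CORE = "AAGTGCGGT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons of seq, starting at position 0."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def spanning_peptides(core, offset):
    """All peptides when the first `offset` (1 or 2) core bases end a codon.

    The unknown flanking bases are filled with every possibility, so the
    9-base core is spread over 4 codons.
    """
    left_pad = 3 - offset
    right_pad = -(len(core) + left_pad) % 3
    peptides = set()
    for left in product("ACGT", repeat=left_pad):
        for right in product("ACGT", repeat=right_pad):
            peptides.add(translate("".join(left) + core + "".join(right)))
    return peptides


# In frame, the core fills exactly 3 codons.
print("forward, in frame:", translate(USS_CORE))
print("reverse, in frame:", translate(reverse_complement(USS_CORE)))

# Out of frame, the core touches 4 codons.
print("forward, offset 1:", sorted(spanning_peptides(USS_CORE, 1)))
```

In frame, the reverse-complement strand reads ACC GCA CTT, which is TAL. With the forward strand shifted by one base, the core reads ..A AGT GCG GT., so every possible 4-residue peptide ends in SAV regardless of the flanking bases.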

6 Methods for studying cost of embedding USS
1. Conservation of UEP sites in homologs
–study conservation of UEP sites in host sequence and corresponding sites in homologs
2. Conservation of segment containing UEP
–study conservation of segment containing UEP relative to other segments within same protein

7 Conservation of UEP sites in homologs
Compute matching scores between:
–Whole sequences: query & matches; matches & matches
–UEP sites: UEP & m_i; m_i & m_j (m_i = corresponding site in match i)
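
A minimal sketch of these pairwise comparisons is given below. The slides do not say how the matching score is defined, so fractional identity over pre-aligned sequences is used here purely as a stand-in. The labels qm, mm, QM and MM are borrowed from the plot axes on slide 9; reading the capitals as whole-sequence averages is an assumption. The toy sequences are hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of aligned, gap-free columns at which a and b agree."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def uep_scores(query, matches, start, end):
    """Average site-level (qm, mm) and whole-sequence (QM, MM) scores.

    query and matches are assumed to be already aligned (equal length);
    start:end is the UEP position in alignment coordinates.
    """
    return {
        "qm": mean(identity(query[start:end], m[start:end]) for m in matches),
        "mm": mean(identity(a[start:end], b[start:end])
                   for a, b in combinations(matches, 2)),
        "QM": mean(identity(query, m) for m in matches),
        "MM": mean(identity(a, b) for a, b in combinations(matches, 2)),
    }


# Hypothetical toy alignment; the UEP "SAV" sits at columns 5..7.
query = "MKTALSAVGH"
matches = ["MKTALSAVGH", "MRTALSGVGH"]
scores = uep_scores(query, matches, 5, 8)
print(scores)
```

Comparing qm with mm (or qm/QM with mm/MM) then asks whether the UEP site in the host is any less conserved than the corresponding sites are among the homologs themselves.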

8 BLAST search for homologs. First red line is query sequence, rest are matches. Present case has two high-similarity matches.

9 Each point: one UEP in one protein
(a) Ave. qm score (y) vs. mm score (x)
(b) Ave. qm/QM score (y) vs. mm/MM score (x)
UEP sites in protein are not less conserved than corresponding sites in homologs.

10 Conservation of segment containing UEP
Segmentation of protein sequence containing UEP. XXX is position of UEP (USS encoded peptide).

11 Relative segment similarity scores in gene HI0027. Yellow bar is segment containing UEP.
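
The segment comparison can be sketched as follows. The slides give neither the segment length nor the scoring function, so fixed-length segments and fractional identity against a single aligned homolog are assumptions made for illustration; the sequences and the helper names are hypothetical.

```python
def segment_scores(query, match, seg_len):
    """Identity of each fixed-length segment of two aligned sequences."""
    scores = []
    for start in range(0, len(query), seg_len):
        q = query[start:start + seg_len]
        m = match[start:start + seg_len]
        scores.append(sum(x == y for x, y in zip(q, m)) / len(q))
    return scores


def relative_scores(scores):
    """Each segment score divided by the average over all segments."""
    avg = sum(scores) / len(scores)
    return [s / avg if avg else 0.0 for s in scores]


def uep_segment_index(uep_start, seg_len):
    """Index of the segment that contains the start of the UEP."""
    return uep_start // seg_len


# Hypothetical toy alignment; the UEP "SAV" starts at column 5.
query = "MKTALSAVGHWE"
match = "MKTALSGVGHWE"
SEG_LEN = 4  # assumed segment length

scores = segment_scores(query, match, SEG_LEN)
relative = relative_scores(scores)
uep_idx = uep_segment_index(5, SEG_LEN)
print(scores, relative, uep_idx)
```

A relative score below 1 for the UEP-containing segment means it is less conserved than the protein's average segment, which corresponds to the yellow bar falling below the others in the plot.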

12 Summary of relative sector scores for 473 proteins.
Each point: one protein.
Y: score of UEP-containing sector.
X: (a) ave. of all sectors; (b) lowest quartile; (c) 3rd quartile; (d) 1st quartile.
UEP almost never in most conserved sites.

13 Summary
UEP is neither more nor less conserved in the protein than corresponding sites in homologs of the protein.
–If less conserved, this would imply some disruption of the protein, at some cost.
–Result implies the cost is not detectable by this method.
Segment containing the UEP is almost always in, or close to, the least conserved sections of the protein.
–Suggests that UEPs embedded in highly conserved sections of a protein were eliminated by evolution
–Explanation for result in first test
Multiplication of USS did interact with evolution.

