1
Mapping Influenza A Virus Transmission Networks with Whole Genome Comparisons (Methods) Adrienne Breland TTGTGGATTCTTGATCGTCTTTTCTTCAAATGTAT TTATCGTCGCCTTAAATACGGA AAGAACCTTTATGACAAGGTTCGACTACA GCTTAGGGATAATGCAAAGGAGCTGGT
2
Goal: to characterize global Influenza A Virus transmission as a complex network
3
Proposed global H3N2 circulation. Russell (2008) The global circulation of seasonal influenza A (H3N2) viruses
5
Outline: Motivation, Major Questions, Data, Genome Comparison Method
7
Motivation: Delineating real disease networks is difficult
– Infection tracing: Detecting exact transmission links
– Contact tracing: All potential transmission contacts
– Diary Based: Subject records all contacts
8
Motivation. [Figure panels: Infection tracing; Contact tracing; Diary Based] Keeling M & K Eames (2005) Networks and epidemic models. J. R. Soc. Interface 2:295-307
9
Motivation: Delineating real disease networks is very useful
10
Motivation: Delineating real disease networks is very useful
– targeting an attack
11
Motivation: Delineating real disease networks is very useful. Error and attack tolerance of complex networks. Réka Albert, Hawoong Jeong and Albert-László Barabási
12
Motivation: Delineating real disease networks is very useful. [Image: http://prblog.typepad.com/strategic_public_relation/images/2007/06/22/simple_social_network.png]
13
Motivation: Delineating real disease networks is very useful
– correlation coefficients
14
Motivation: Delineating real disease networks is very useful
– detecting more probable global routes
15
Motivation: Global routes
16
Motivation: Global routes. Breland A, S Nasser, K Schlauch, M Nicolescu, F Harris (2008) Efficient Influenza A Virus Origin Detection. Journal of Electronics and Computer Science, 10:1-12
17
Motivation: Delineating real disease networks is very useful
– examine with other spatial data
18
Motivation: Spatial data
19
Motivation: Spatial data (VEGETATION)
20
Motivation: Spatial data (POPULATION)
21
Motivation: Spatial data (CLIMATE CHANGE)
22
Outline: Motivation, Major Questions, Data, Genome Comparison Method
23
Major questions
– Location and degree of host jumping
– Underlying structure (small world, power law, ...)
– Subtype independence
– Re-assortment
– Geographic routes
24
Outline: Motivation, Major Questions, Data, Genome Comparison Method
25
Data: http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html
26
Data
– ≈ 4000 sequences, 1999-2009
– Global regions (e.g. China, U.S., Africa, India, ...)
– All subtypes (e.g. H5N1, H1N1, ...)
– All host species (Domestic Avian, Wild Avian, etc.)
27
Data: ≈ 374 per year
28
Data: Multiple host types
29
Data: Multiple subtypes
30
Outline: Motivation, Major Questions, Data, Genome Comparison Method
31
Genome Comparisons: Similarity matrix, N sequences: $N(N-1)/2$ comparisons. [Figure: upper-triangular N × N similarity matrix with example pairwise values]
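The pair count on this slide can be checked with a short sketch. The helper names and the toy scoring function below are hypothetical and serve only to fill the upper triangle; they are not the comparison method used in the talk.

```python
from itertools import combinations

def pairwise_count(n):
    """Number of unordered pairs among n sequences: n(n-1)/2."""
    return n * (n - 1) // 2

def similarity_matrix(items, score):
    """Fill only the upper triangle (i < j); other cells stay None."""
    n = len(items)
    matrix = [[None] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = score(items[i], items[j])
    return matrix

def toy_score(a, b):
    """Placeholder score (assumption): fraction of matching positions."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))

seqs = ["ACGT", "ACGA", "TCGA"]
m = similarity_matrix(seqs, toy_score)
filled = sum(v is not None for row in m for v in row)
print(filled, pairwise_count(len(seqs)))
print(pairwise_count(4000))  # pairs for roughly 4000 sequences
```

For a data set of about 4000 sequences this gives 7,998,000 pairs per matrix, which is why the cost of a single comparison matters.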
32
Romanova,J (2006) The fight against new types of influenza virus. Biotechnology J, 1:1381-1392
33
Genome Comparisons: 8 segments. [Figure: eight upper-triangular similarity matrices, one per segment] HA ≈ 1750 bp, NS ≈ 900 bp, M ≈ 1000 bp, NA ≈ 1300 bp, NP ≈ 1500 bp, PA ≈ 2100 bp, PB1 ≈ 2200 bp, PB2 ≈ 2300 bp
34
Genome Comparisons: 8 segments
35
Genome Comparisons: Alignment, $O(n^2)$, n = max sequence length. [Figure: two example sequences, ...AAAACTTGAACC... and ...GGACTTGACCT...]
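To make the quadratic cost concrete, here is a minimal sketch of a global alignment score computed by dynamic programming (Needleman-Wunsch style). The scoring values (match +1, mismatch -1, gap -1) are assumptions for illustration; the slide does not say which aligner or scoring scheme was considered.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of a and b.

    The nested loops visit every (i, j) cell once, so the running time
    is proportional to len(a) * len(b), i.e. O(n^2) for length-n inputs.
    Only two rows of the table are kept in memory.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + step,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]

print(global_alignment_score("ACGT", "AGT"))
```

Repeating a computation of this size for every pair in the similarity matrix is the cost that motivates looking at alignment-free alternatives.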
36
Genome Comparisons: Alignment-free k-mers, $O(n)$. $\Sigma = \{A, C, G, T/U\}$; $4^k$ possible k-mers, $k \ge 0$. [Table: k-word (AA, AC, AG, ..., TG, TT) against frequency]
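The k-word table can be built with a single pass over the sequence. The sketch below is an illustration with hypothetical function names; it restricts itself to k ≥ 1 and maps U to T, both of which are choices made here rather than details from the slides.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"

def count_kmers(seq, k):
    """Count overlapping k-mers with one sliding window: linear in len(seq) for fixed k."""
    if k < 1:
        raise ValueError("this sketch assumes k >= 1")
    seq = seq.upper().replace("U", "T")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def all_kmers(k):
    """All 4**k possible k-mers over the alphabet, in lexicographic order."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]

seq = "TTATCGTCGCCTTAAATACGGA"
counts = count_kmers(seq, 2)
for word in all_kmers(2):
    print(word, counts[word])
```

A sequence of length n contains n - k + 1 overlapping windows, so the counts always sum to that value.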
37
Genome Comparisons: Feature Frequency Profiles (FFP). $C_k = \langle c_{k,1}, \ldots, c_{k,K} \rangle$, the counts of each of the $K = 4^k$ possible k-mers; $F_k = C_k / \sum_i c_{k,i}$. Sims GE, Jun SR, Wu GA, Kim SH (2009) Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proc Natl Acad Sci U S A., 106(8):2677-82.
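As a rough illustration of the idea behind a feature frequency profile, the sketch below counts every k-mer in a sequence and divides by the total so the profile sums to 1. The function name `ffp` and its signature are assumptions made here; refinements described in the cited paper are not reproduced.

```python
from collections import Counter
from itertools import product

def ffp(seq, k, alphabet="ACGT"):
    """Return k-mer frequencies in a fixed (lexicographic) k-mer order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[w] for w in kmers)
    if total == 0:
        raise ValueError("sequence contains no k-mers over the alphabet")
    # Every profile has length 4**k, so two profiles can be compared entry by entry.
    return [counts[w] / total for w in kmers]

profile = ffp("AAAC", 2)
print(len(profile), sum(profile))
```

Using the same k-mer order for every sequence is what lets two profiles be treated as probability distributions over the same set of k-words.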
38
Genome Comparisons: Jensen-Shannon Divergence (JS), compare($s_1$, $s_2$). $P_k = \mathrm{FFP}(s_1)$, $Q_k = \mathrm{FFP}(s_2)$, $M_k = (P_k + Q_k)/2$. $JS(P_k, Q_k) = \tfrac{1}{2} KL(P_k, M_k) + \tfrac{1}{2} KL(Q_k, M_k)$, where $KL(P_k, M_k) = \sum_i p_i \log (p_i / m_i)$ and $p_i$, $m_i$ are the entries of $P_k$ and $M_k$.
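A minimal sketch of the Jensen-Shannon computation between two frequency profiles follows, with the mixture taken as the average of the two profiles. The base-2 logarithm is an assumption made here (it bounds the result between 0 and 1), and the function names are hypothetical.

```python
import math

def kl_divergence(p, m):
    """Kullback-Leibler divergence sum_i p_i * log2(p_i / m_i); zero entries of p contribute 0."""
    return sum(pi * math.log2(pi / mi) for pi, mi in zip(p, m) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence of two profiles of equal length."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    # m_i > 0 wherever p_i > 0 or q_i > 0, so the ratios above are always defined.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

print(js_divergence([0.5, 0.5], [0.5, 0.5]))
print(js_divergence([1.0, 0.0], [0.0, 1.0]))
```

Identical profiles give 0 and profiles with no k-mers in common give 1 under the base-2 assumption, so the value can be placed directly in a pairwise matrix.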
39
Genome Comparisons: k = ?
40
Genome Comparisons: k = ? k s.t. $N(k) \ge N(k+1)$; k ≈ 4
41
Genome Comparisons: [Figure: Actual & Predicted times]
42
Questions/Comments? Thanks