[Bejerano Aut08/09] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean.


1 http://cs273a.stanford.edu [Bejerano Aut08/09] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean

2 Lecture 7: Genes, Paralogy & Orthology, Chains & Nets

3 Meet Your Genome Continues [Human Molecular Genetics, 3rd Edition]

4

5 The Gene-ome makes up < 2% of the H.G. [Human Molecular Genetics, 3rd Edition]

6 Gene Finding – The Practice
Challenge: “The genes, the whole genes, and nothing but the genes”
Problems:
–spliced ESTs -> legitimate gene isoform?
–predicting gene isoforms
–tissue/condition-specific genes / gene isoforms
–single exon genes
–pseudogenes
Practice:

7 The Human Gene Set [HGC, 2001]

8 [Celera, 2001]

9 wrong!

10 Signal Transduction

11 Ancient Origins of Important Gene Families

12 Multigene families due to:
–Single gene duplication;
–Segment duplication: tandem duplication or duplication transposition
  a b c d e f g -> a b c d e f b c d g
–Horizontal gene transfer;
–Genome-wide doubling event

13 Horizontal Gene Transfer

14 Horizontal Gene Transfer in the H.G. [HGC, 2001] …

15 Or is it? [Kurland et al., 2003]

16 HGT between fish & their parasites

17 Retroposed Genes and Pseudogenes

18 Chromosome Mutations May Involve: –Changing the structure of a chromosome –The loss or gain of part of a chromosome

19 Chromosome Mutations Five types exist: –Deletion –Inversion –Translocation –Nondisjunction –Duplication

20 Deletion Due to breakage A piece of a chromosome is lost

21 Inversion Chromosome segment breaks off Segment flips around backwards Segment reattaches

22 Duplication Occurs when a gene sequence is repeated

23 Translocation Involves two chromosomes that aren’t homologous Part of one chromosome is transferred to another chromosome

24 Translocation

25 Nondisjunction Failure of chromosomes to separate during meiosis Causes gametes to have too many or too few chromosomes Disorders: –Down Syndrome – three copies of chromosome 21 –Turner Syndrome – single X chromosome –Klinefelter’s Syndrome – XXY chromosomes

26

27 Chromosome Mutation Animation

28

29 Chaining Alignments Chaining bridges the gulf between syntenic blocks and base-by-base alignments. Local alignments tend to break at transposon insertions, inversions, duplications, etc. Global alignments tend to force non-homologous bases to align. Chaining is a rigorous way of joining together local alignments into larger structures. [Jim Kent’s slides]

30 Chains join together related local alignments Protease Regulatory Subunit 3

31 Chains
–a chain is a sequence of gapless aligned blocks, where there must be no overlaps of blocks' target or query coords within the chain.
–Within a chain, target and query coords are monotonically non-decreasing. (i.e. always increasing or flat)
–double-sided gaps are a new capability (blastz can't do that) that allow extremely long chains to be constructed.
–not just orthologs, but paralogs too, can result in good chains. but that's useful!
–chains should be symmetrical -- e.g. swap human-mouse -> mouse-human chains, and you should get approx. the same chains as if you chain swapped mouse-human blastz alignments.
–chained blastz alignments are not single-coverage in either target or query unless some subsequent filtering (like netting) is done.
–chain tracks can contain massive pileups when a piece of the target aligns well to many places in the query. Common causes of this include insufficient masking of repeats and high-copy-number genes (or paralogs).
[Angie Hinrichs, UCSC wiki]
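The chain definition above can be checked mechanically. The following is a minimal sketch, not UCSC code: the `Block` type, its half-open coordinate convention, and both function names are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """A gapless aligned block (hypothetical type); coordinates are half-open."""
    t_start: int  # start in the target sequence
    q_start: int  # start in the query sequence
    size: int     # block length, identical in target and query because gapless


def is_valid_chain(blocks: List[Block]) -> bool:
    """True if blocks are non-empty, ordered, and non-overlapping in both sequences."""
    if any(b.size <= 0 for b in blocks):
        return False
    for prev, cur in zip(blocks, blocks[1:]):
        # Each block must start at or after the end of the previous block,
        # in target AND in query coordinates.
        if cur.t_start < prev.t_start + prev.size:
            return False
        if cur.q_start < prev.q_start + prev.size:
            return False
    return True


def has_double_sided_gap(prev: Block, cur: Block) -> bool:
    """True when unaligned bases separate the two blocks in both target and query."""
    t_gap = cur.t_start - (prev.t_start + prev.size)
    q_gap = cur.q_start - (prev.q_start + prev.size)
    return t_gap > 0 and q_gap > 0
```

A chain such as `[Block(0, 0, 10), Block(15, 12, 5)]` passes the check and has a double-sided gap between its two blocks, since 5 target bases and 2 query bases are skipped at the same point.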

32 Affine penalties are too harsh for long gaps Log count of gaps vs. size of gaps in mouse/human alignment, correlated with sizes of transposon relics. Affine gap scores model the red/blue plots as straight lines.

33 Before and After Chaining
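To see why a straight-line (affine) model over-penalizes long gaps, it may help to compare it with a penalty that grows more slowly. This is only an illustrative sketch: the parameter values and the logarithmic alternative are assumptions, not the gap costs used by any particular chaining tool.

```python
import math


def affine_gap_cost(length: int, gap_open: float = 400.0, gap_extend: float = 30.0) -> float:
    """Affine penalty: a fixed opening cost plus a constant cost per gapped base."""
    return gap_open + gap_extend * length


def log_gap_cost(length: int, gap_open: float = 400.0, scale: float = 300.0) -> float:
    """Hypothetical sub-linear penalty that grows with the log of the gap length."""
    return gap_open + scale * math.log(length)


# Print both penalties side by side for increasingly long gaps.
for length in (1, 10, 100, 1000, 10000):
    print(length, affine_gap_cost(length), round(log_gap_cost(length)))
```

Under the affine model the penalty keeps growing in proportion to gap length, so a gap the size of a transposon insertion costs far more than under the slower-growing alternative.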

34 Chaining Algorithm
Input - blocks of gapless alignments from blastz
Dynamic program based on the recurrence relationship:
$$\mathrm{score}(B_i) = \max_{j<i}\bigl(\mathrm{score}(B_j) + \mathrm{match}(B_i) - \mathrm{gap}(B_i, B_j)\bigr)$$
Uses Miller’s KD-tree algorithm to limit which parts of the dynamic programming graph are traversed.
Timing is O(N log N), where N is the number of blocks (which is in the hundreds of thousands)
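The recurrence can be sketched directly as a quadratic-time dynamic program. This is a simplified illustration, not the KD-tree algorithm named on the slide: it examines every earlier block, so it runs in O(N^2). The `Block` type, the `simple_gap_cost` function, and the base case (a chain consisting of one block scores match(B_i)) are assumptions.

```python
from typing import Callable, List, NamedTuple, Tuple


class Block(NamedTuple):
    t_start: int
    q_start: int
    size: int
    score: float  # plays the role of match(B_i)


def simple_gap_cost(prev: Block, cur: Block) -> float:
    """Hypothetical gap cost: proportional to the larger of the two gaps."""
    t_gap = cur.t_start - (prev.t_start + prev.size)
    q_gap = cur.q_start - (prev.q_start + prev.size)
    return 0.5 * max(t_gap, q_gap)


def best_chain(
    blocks: List[Block],
    gap_cost: Callable[[Block, Block], float] = simple_gap_cost,
) -> Tuple[float, List[Block]]:
    """Return the score and blocks of the highest-scoring chain."""
    if not blocks:
        return 0.0, []
    # Sorting by target start ensures every legal predecessor has index j < i.
    order = sorted(blocks, key=lambda b: (b.t_start, b.q_start))
    best = [float(b.score) for b in order]  # base case: the block on its own
    back = [-1] * len(order)                # predecessor index, -1 for none
    for i, cur in enumerate(order):
        for j in range(i):
            prev = order[j]
            # A predecessor must end before cur starts in target and in query.
            if prev.t_start + prev.size > cur.t_start:
                continue
            if prev.q_start + prev.size > cur.q_start:
                continue
            cand = best[j] + cur.score - gap_cost(prev, cur)
            if cand > best[i]:
                best[i], back[i] = cand, j
    # Trace back from the best-scoring end block.
    k = max(range(len(order)), key=best.__getitem__)
    top = best[k]
    chain = []
    while k != -1:
        chain.append(order[k])
        k = back[k]
    chain.reverse()
    return top, chain
```

The inner loop over all `j < i` is the part that a spatial index would prune, by restricting the search to earlier blocks that could plausibly be good predecessors.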

35 Netting Alignments Commonly multiple mouse alignments can be found for a particular human region, particularly for coding regions. Net finds the best mouse match for each human region. Highest scoring chains are used first. Lower scoring chains fill in gaps within chains, inducing a natural hierarchy.
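The "highest scoring chains first, lower scoring chains fill the gaps" idea can be sketched as a greedy pass over target coordinates. This is a heavily simplified illustration, not the actual netting program: it ignores hierarchy levels, query coordinates, and any size or score thresholds, and the `Chain` type and function names are assumptions.

```python
from typing import List, NamedTuple, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in target coordinates


class Chain(NamedTuple):
    name: str
    score: float
    blocks: List[Interval]  # target intervals of the chain's aligned blocks


def subtract(interval: Interval, covered: List[Interval]) -> List[Interval]:
    """Return the parts of `interval` not overlapped by the disjoint `covered` set."""
    start, end = interval
    pieces = []
    for c_start, c_end in sorted(covered):
        if c_end <= start or c_start >= end:
            continue  # no overlap with this covered interval
        if c_start > start:
            pieces.append((start, c_start))
        start = max(start, c_end)
        if start >= end:
            break
    if start < end:
        pieces.append((start, end))
    return pieces


def net(chains: List[Chain]) -> List[Tuple[int, int, str]]:
    """Assign each target base to at most one chain, best-scoring chain first."""
    covered: List[Interval] = []
    assigned: List[Tuple[int, int, str]] = []
    for chain in sorted(chains, key=lambda c: c.score, reverse=True):
        for block in chain.blocks:
            pieces = subtract(block, covered)
            covered.extend(pieces)
            assigned.extend((s, e, chain.name) for s, e in pieces)
    return sorted(assigned)
```

With a top chain covering target positions 0-100 and 200-300, a lower-scoring chain spanning 90-210 contributes only the 100-200 piece, which is the gap left by the top chain.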

36 Net Focuses on Ortholog

37 Nets
–a net is a hierarchical collection of chains, with the highest-scoring non-overlapping chains on top, and their gaps filled in where possible by lower-scoring chains, for several levels.
–a net is single-coverage for target but not for query.
–because it's single-coverage in the target, it's no longer symmetrical.
–the netter has two outputs, one of which we usually ignore: the target-centric net in query coordinates. The reciprocal best process uses that output: the query-referenced (but target-centric / target single-cov) net is turned back into component chains, and then those are netted to get single coverage in the query too; the two outputs of that netting are reciprocal-best in query and target coords. Reciprocal-best nets are symmetrical again.
–nets do a good job of filtering out massive pileups by collapsing them down to (usually) a single level.
[Angie Hinrichs, UCSC wiki]

38 "LiftOver chains" are actually chains extracted from nets, or chains filtered by the netting process. [Angie Hinrichs, UCSC wiki]

39 Before and After Netting

