[Bejerano Fall09/10] 1 This Friday 10am Beckman B-200 Introduction to the UCSC Browser.


[Bejerano Fall09/10] 2 Lecture 6: Genome Evolution. Topics: Chromosomal Mutations; Paralogy & Orthology; Chains & Nets

[Bejerano Fall09/10] 3 One Cell, One Genome, One Replication
Every cell holds a copy of all its DNA = its genome.
The human body is made of ~10^13 cells.
All originate from a single cell through repeated cell divisions.
Figure labels: cell; genome = all DNA; DNA strings = chromosomes; egg, cell division, chicken; chicken ≈ copies (DNA) of egg (DNA)

4 Mutation Rate: per base pair (bp) per cell division
This refers to mutations that are not repaired.
Thus, there are at least six new mutations in each kid that were not present in either parent.
Mutations range from the smallest possible (a single base pair change) to the largest (whole genome duplication).
Selection does not tolerate all of these mutations, but it certainly tolerates some.

5 Example: Human-Chimp Genomic Differences
Number of events, by type:
– Nucleotide substitutions
– Indels < 10 Kb
– Microinversions < 100 Kb
– Deletions/Duplications
– Microinversions > 100 Kb
– Pericentric inversions
– Fusion
Figure annotations: 1%; 3%; "Open question.."

Chromosomal (i.e. Big) Mutations May Involve:
– Changing the structure of a chromosome
– The loss or gain of part of a chromosome

Chromosome Mutations
Five types exist:
– Deletion
– Inversion
– Translocation
– Nondisjunction
– Duplication

Deletion
– Due to breakage
– A piece of a chromosome is lost

Inversion
– Chromosome segment breaks off
– Segment flips around backwards
– Segment reattaches

Duplication
– Occurs when a gene sequence is repeated

[Bejerano Fall09/10] 11 Whole Genome Duplication at the Base of the Vertebrate Tree. Figure labels: Xen. laevis; WGD

Translocation
– Involves two chromosomes that aren’t homologous
– Part of one chromosome is transferred to another chromosome

Nondisjunction
– Failure of chromosomes to separate during meiosis
– Causes a gamete to have too many or too few chromosomes
Disorders:
– Down Syndrome: three copies of chromosome 21
– Turner Syndrome: a single X chromosome
– Klinefelter’s Syndrome: XXY chromosomes
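The structural mutation types from the preceding slides can be illustrated on a toy chromosome written as a string of gene labels. This is only a sketch under stated assumptions: the function names and the half-open index convention are made up for illustration, duplication is modeled as a tandem copy, and inversion reverses label order only (a real inversion also reverse-complements the DNA). Nondisjunction is left out because it changes the number of chromosomes rather than their structure.

```python
# Toy model: a chromosome is a string of gene labels, e.g. "ABCDE".
# All names and the [start, end) slice convention are illustrative assumptions.

def deletion(chrom: str, start: int, end: int) -> str:
    """A piece of the chromosome is lost."""
    return chrom[:start] + chrom[end:]

def inversion(chrom: str, start: int, end: int) -> str:
    """A segment breaks off, flips around backwards, and reattaches."""
    return chrom[:start] + chrom[start:end][::-1] + chrom[end:]

def duplication(chrom: str, start: int, end: int) -> str:
    """A segment is repeated (modeled here as a tandem copy)."""
    return chrom[:end] + chrom[start:end] + chrom[end:]

def translocation(donor: str, recipient: str, start: int, end: int,
                  insert_at: int) -> tuple[str, str]:
    """Part of one chromosome is transferred to a non-homologous one."""
    segment = donor[start:end]
    new_donor = donor[:start] + donor[end:]
    new_recipient = recipient[:insert_at] + segment + recipient[insert_at:]
    return new_donor, new_recipient

if __name__ == "__main__":
    print(deletion("ABCDE", 1, 3))
    print(inversion("ABCDE", 1, 4))
    print(duplication("ABCDE", 1, 3))
    print(translocation("ABCDE", "VWXYZ", 3, 5, 0))
```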

Chromosome Mutation Animation

15 The Species Tree. Figure labels: sampled genomes; S = speciation

16 A Gene tree evolves with respect to a Species tree. Figure labels: species tree; gene tree; speciation; duplication; loss

[Bejerano Fall09/10] 17 Terminology
Orthologs: Genes related via speciation (e.g. C, M, H3)
Paralogs: Genes related through duplication (e.g. H1, H2, H3)
Homologs: Genes that share a common origin (e.g. C, M, H1, H2, H3)
Figure labels: species tree; gene tree; speciation; duplication; loss; single ancestral gene

[Bejerano Fall09/10] 18 Gene trees and even species trees are figments of our (scientific) imagination
Species trees and gene trees can be wrong. All we really have are extant observations, and fossils.
Figure labels: species tree; gene tree; speciation; duplication; loss; single ancestral gene; observed; inferred
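One common way to apply these definitions is to look at the event at the most recent common ancestor of two genes in the gene tree: a speciation node makes them orthologs, a duplication node makes them paralogs. The sketch below assumes a small hypothetical tree (genes human_a, mouse_a, human_b, mouse_b); it is not the tree drawn on the slide, and the class and function names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # For leaves: the gene name. For internal nodes: "speciation" or "duplication".
    label: str
    children: list = field(default_factory=list)

def leaf_names(node: Node) -> set:
    """Collect the gene names under a node."""
    if not node.children:
        return {node.label}
    names = set()
    for child in node.children:
        names |= leaf_names(child)
    return names

def lca_event(node: Node, a: str, b: str) -> str:
    """Return the event label of the deepest node that contains both genes."""
    for child in node.children:
        names = leaf_names(child)
        if a in names and b in names:
            return lca_event(child, a, b)
    return node.label

def relation(tree: Node, a: str, b: str) -> str:
    """Classify two distinct genes of the tree as orthologs or paralogs."""
    if a == b or not {a, b} <= leaf_names(tree):
        raise ValueError("need two distinct genes that are in the tree")
    return "orthologs" if lca_event(tree, a, b) == "speciation" else "paralogs"

# Hypothetical gene tree: an ancient duplication followed by a
# human/mouse speciation in each of the two resulting copies.
tree = Node("duplication", [
    Node("speciation", [Node("human_a"), Node("mouse_a")]),
    Node("speciation", [Node("human_b"), Node("mouse_b")]),
])

if __name__ == "__main__":
    for pair in [("human_a", "mouse_a"), ("human_a", "human_b"), ("human_a", "mouse_b")]:
        print(pair, relation(tree, *pair))
```

All four genes in this toy tree are homologs, since they descend from a single ancestral gene.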

Gene Families 19

Gu et al. Age distribution of human gene families shows significant roles of both large-scale and small-scale duplication in vertebrate evolution (2002) Nature Genetics 31; http://cs273a.stanford.edu [Bejerano Fall09/10]

21 Chaining Alignments
Chaining highlights homologous regions between genomes (it bridges the gulf between syntenic blocks and base-by-base alignments).
Local alignments tend to break at transposon insertions, inversions, duplications, etc.
Global alignments tend to force non-homologous bases to align.
Chaining is a rigorous way of joining together local alignments into larger structures.

[Bejerano Fall09/10] 22 “Raw” Blastz track (no longer displayed) Protease Regulatory Subunit 3 Alignment = homologous regions

23 Chains & Nets: How they’re built
1: Blastz one genome to another
– Local alignment algorithm
– Finds short blocks of similarity
Hg18: AAAAAACCCCCAAAAA
Mm8: AAAAAAGGGGG
Aligned blocks: Hg AAAAAA / Mm AAAAAA; Hg CCCCC / Mm CCCCC; Hg AAAAA / Mm AAAAA

Chains & Nets: How they’re built
2: “Chain” alignment blocks together
– Links blocks that preserve order and orientation
– Not single coverage in either species
Hg18: AAAAAACCCCCAAAAA
Mm8: AAAAAAGGGGGAAAAA
Figure: Hg18 AAAAAACCCCCAAAAA with the Mm8 chains drawn beneath it

25 Another Chain Example
Figure: ancestral sequence (A B C D E); human sequence (A B C D E); mouse sequence (A B C D E, with B’)
In Human Browser: implicit human sequence; mouse chains (B’ … D E)
In Mouse Browser: implicit mouse sequence; human chains (… D E)

[Bejerano Fall09/10] 26 Chains join together related local alignments Protease Regulatory Subunit 3 likely ortholog likely paralogs

[Bejerano Fall09/10] 27 Chains
– A chain is a sequence of gapless aligned blocks, where there must be no overlaps of blocks' target or query coords within the chain.
– Within a chain, target and query coords are monotonically non-decreasing (i.e. always increasing or flat).
– Double-sided gaps are a new capability (blastz can't do that) that allow extremely long chains to be constructed.
– Not just orthologs, but paralogs too, can result in good chains. But that's useful!
– Chains should be symmetrical: e.g. swap human-mouse -> mouse-human chains, and you should get approx. the same chains as if you chain swapped mouse-human blastz alignments.
– Chained blastz alignments are not single-coverage in either target or query unless some subsequent filtering (like netting) is done.
– Chain tracks can contain massive pileups when a piece of the target aligns well to many places in the query. Common causes of this include insufficient masking of repeats and high-copy-number genes (or paralogs).
[Angie Hinrichs, UCSC wiki]
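The first two properties above (no overlapping blocks, coordinates that never go backwards) can be checked mechanically. The following is a minimal sketch, assuming each gapless block is given as a hypothetical (target_start, query_start, size) tuple on a single strand; it does not parse the real UCSC chain file format.

```python
def is_valid_chain(blocks):
    """Check the chain invariants for a list of gapless blocks.

    Each block is a (target_start, query_start, size) tuple; this
    representation is an assumption made for illustration. A chain is
    valid here if every block has positive size and each block starts at
    or after the end of the previous one in both target and query.
    """
    if any(size <= 0 for _, _, size in blocks):
        return False
    for (pt, pq, psize), (ct, cq, _) in zip(blocks, blocks[1:]):
        if ct < pt + psize or cq < pq + psize:
            return False  # overlap or backwards step in target or query
    return True

if __name__ == "__main__":
    # Gaps on both sides between blocks ("double-sided gaps") are allowed.
    print(is_valid_chain([(0, 0, 6), (10, 8, 5), (20, 30, 4)]))
    # The second block steps backwards in the query, so this is not a chain.
    print(is_valid_chain([(0, 50, 6), (10, 8, 5)]))
```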

[Bejerano Fall09/10] 28 Before and After Chaining

[Bejerano Fall09/10] 29 Chaining Algorithm
Input: blocks of gapless alignments from blastz.
Dynamic program based on the recurrence relationship: score(B_i) = max_{j<i} ( score(B_j) + match(B_i) - gap(B_i, B_j) )
Uses Miller’s KD-tree algorithm to minimize which parts of the dynamic programming graph to traverse.
Timing is O(N log N), where N is the number of blocks (which is in the hundreds of thousands).
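The recurrence can be sketched directly as a quadratic-time dynamic program. This is not the KD-tree version mentioned on the slide, and several details are assumptions made for illustration: blocks are hypothetical (t_start, q_start, length) records, match(B) is taken to be the block length, gap(B_i, B_j) is one point per unaligned base in target plus query, and a block may also start a new chain on its own.

```python
from typing import NamedTuple

class Block(NamedTuple):
    t_start: int  # start in target
    q_start: int  # start in query
    length: int   # gapless block length

def gap_cost(prev: Block, cur: Block) -> int:
    """Assumed linear gap penalty: 1 per skipped base in target and query."""
    t_gap = cur.t_start - (prev.t_start + prev.length)
    q_gap = cur.q_start - (prev.q_start + prev.length)
    return t_gap + q_gap

def best_chain(blocks):
    """O(N^2) version of score(B_i) = max_{j<i}(score(B_j) + match(B_i) - gap(B_i, B_j))."""
    blocks = sorted(blocks)                 # order by target, then query start
    if not blocks:
        return 0, []
    score = [b.length for b in blocks]      # base case: block starts a chain
    back = [-1] * len(blocks)               # predecessor index for traceback
    for i, cur in enumerate(blocks):
        for j in range(i):
            prev = blocks[j]
            # A predecessor must end before cur starts in target and query.
            if (prev.t_start + prev.length <= cur.t_start
                    and prev.q_start + prev.length <= cur.q_start):
                cand = score[j] + cur.length - gap_cost(prev, cur)
                if cand > score[i]:
                    score[i], back[i] = cand, j
    i = max(range(len(blocks)), key=score.__getitem__)
    best = score[i]
    chain = []
    while i != -1:
        chain.append(blocks[i])
        i = back[i]
    return best, chain[::-1]

if __name__ == "__main__":
    blocks = [Block(0, 0, 6), Block(6, 200, 5), Block(8, 8, 5)]
    print(best_chain(blocks))
```

In this toy input the block at query position 200 is left out because the gap penalty for joining it outweighs its match score.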

[Bejerano Fall09/10] 30 Netting Alignments
Commonly, multiple mouse alignments can be found for a particular human region, particularly for coding regions.
Net finds the best mouse match for each human region.
Highest-scoring chains are used first.
Lower-scoring chains fill in gaps within chains, inducing a natural hierarchy.

31 Chains & Nets: How they’re built
3: “Net” the chains heuristically to find “best guess” of orthologs
– Pick highest-scoring chains that do not overlap chains already added to net
– Single coverage in target (reference), not in query
– Not symmetrical
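The first bullet of this netting step can be sketched as a greedy selection over chains sorted by score. This is a simplified top level only: it assumes each chain is summarized by a hypothetical name, score, and half-open target interval, and it omits the gap filling by lower-scoring chains that gives real nets their hierarchy.

```python
from typing import NamedTuple

class Chain(NamedTuple):
    name: str
    score: int
    t_start: int  # target (reference) start
    t_end: int    # target end, half-open

def overlaps(a: Chain, b: Chain) -> bool:
    """True if the two chains cover any common target position."""
    return a.t_start < b.t_end and b.t_start < a.t_end

def greedy_net(chains):
    """Keep highest-scoring chains that do not overlap chains already kept."""
    net = []
    for chain in sorted(chains, key=lambda c: c.score, reverse=True):
        if not any(overlaps(chain, kept) for kept in net):
            net.append(chain)
    return sorted(net, key=lambda c: c.t_start)  # report in target order

if __name__ == "__main__":
    chains = [
        Chain("likely_ortholog", 900, 0, 100),
        Chain("likely_paralog", 400, 20, 80),
        Chain("elsewhere", 300, 150, 200),
    ]
    print([c.name for c in greedy_net(chains)])
```

Because selection is driven by target coordinates only, the result is single coverage in the target but not in the query, which is why the procedure is not symmetrical.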

[Bejerano Fall09/10] 32 Net Focuses on Ortholog

[Bejerano Fall09/10] 33 Nets
– A net is a hierarchical collection of chains, with the highest-scoring non-overlapping chains on top, and their gaps filled in where possible by lower-scoring chains, for several levels.
– A net is single-coverage for target but not for query.
– Because it's single-coverage in the target, it's no longer symmetrical.
– The netter has two outputs, one of which we usually ignore: the target-centric net in query coordinates. The reciprocal best process uses that output: the query-referenced (but target-centric / target single-cov) net is turned back into component chains, and then those are netted to get single coverage in the query too; the two outputs of that netting are reciprocal-best in query and target coords. Reciprocal-best nets are symmetrical again.
– Nets do a good job of filtering out massive pileups by collapsing them down to (usually) a single level.
[Angie Hinrichs, UCSC wiki]

[Bejerano Fall09/10] 34 Before and After Netting