Positional cloning: the rest of the story
Today: So you have a map location … now what?
Mapped Mutant → Cloned Gene
Mapping: Ultimate Goal
Map Distance = (# of recombinants) / (# of meioses) = 0
- Screen MANY markers on FEW meioses: LOW resolution = Potentially HIGH distance. Great for “Which Marker is Linked?”
- Screen NEARBY markers on MANY (1000’s) meioses: HIGH resolution = Potentially ZERO distance. Great for “Where is the Mutation?”
High-Resolution Mapping
Basic strategies:
- More markers: refine boundaries
  - SSLPs: likely polymorphic, no sequence needed
  - SNPs: require sequence data
- More mutants: increase resolution
One fancy strategy: NextGen sequencing of pooled WT and pooled mutants
- RNA-seq => focus on exons
- “Homozygosity Mapping”: define the homozygous region in mutants
- Find the actual mutation? How to be sure...?
- Generate more SNPs = more markers to map on more mutants
Data so far: mutant with defects in slow muscle specification
Initial mapping, out of 16 meioses:
- 1 recombinant: Z3057, Z4999, Z…
- … recombinants: Z8693, Z…
- … recombinants: Z13936
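
The map-distance arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `map_distance_cm` is made up here, and it uses the simple definition of distance as percent recombinant meioses.

```python
def map_distance_cm(recombinants: int, meioses: int) -> float:
    """Map distance in centimorgans = percent of meioses that are recombinant."""
    if meioses <= 0:
        raise ValueError("meioses must be positive")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    return 100.0 * recombinants / meioses


# Initial mapping: a marker with 1 recombinant out of 16 meioses
print(map_distance_cm(1, 16))
```

With only 16 meioses, every additional recombinant moves the estimate by more than 6 cM, which is why the initial map position is coarse.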
From mutant map position to cloned gene
- Refining the map location with high-resolution mapping
- Trolling for candidate genes
- Testing candidates
2013
What’s near Z15270? Goal: obtain its sequence so we can localize it in the genome
NCBI Nucleotide Query
Sequence Search at Ensembl Genome Browser Start close and move out both ways
Sequence Search at Ensembl Genome Browser Find More Markers To Test...
Find More Polymorphisms Find More Markers To Test...
Additional validated Polymorphisms
Simple Repeats: UCSC genome browser
Designing PCR primers
Where do we go from here?
- Can get sequences and test each of these
- Not all will be useful
- “Informative” = polymorphic = PCR amplicons of different lengths from WT and mutants
- Markers you’ve seen already
Testing for informative SSLPs “Informative” = polymorphic = PCR amplicons of different lengths from WT and mutants
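
A small sketch of the "informative" test, assuming amplicon sizes have already been read off a gel. The marker names, sizes, and the 10 bp gel-resolution cutoff below are all hypothetical placeholders, not data from the slides.

```python
def is_informative(wt_bp: int, mut_bp: int, min_diff_bp: int = 10) -> bool:
    """An SSLP is informative if WT and mutant amplicons differ enough to resolve on a gel.

    min_diff_bp is an assumed resolution limit; adjust it for your gel system.
    """
    return abs(wt_bp - mut_bp) >= min_diff_bp


# Hypothetical amplicon sizes (bp) as (WT, mutant) for three made-up markers
amplicons = {
    "markerA": (180, 204),
    "markerB": (152, 152),
    "markerC": (230, 236),
}

informative = sorted(
    name for name, (wt, mut) in amplicons.items() if is_informative(wt, mut)
)
print(informative)
```

Only markers that pass this test are worth running on the full mapping panel.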
More fish = refine the map
More fish (i.e. embryos / larvae) = more recombinants = higher resolving power
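
Why more fish helps can be put in numbers. The sketch below assumes independent meioses and treats the recombination fraction as distance in cM divided by 100, which is a reasonable approximation only for short distances. The function names are illustrative.

```python
def smallest_nonzero_distance_cm(meioses: int) -> float:
    """Distance corresponding to a single recombinant: the resolution limit of the panel."""
    return 100.0 / meioses


def prob_at_least_one_recombinant(distance_cm: float, meioses: int) -> float:
    """Chance of seeing one or more recombinants at a marker this far from the mutation.

    Assumes independent meioses and recombination fraction = distance_cm / 100.
    """
    r = distance_cm / 100.0
    return 1.0 - (1.0 - r) ** meioses


for n in (16, 1156):
    print(n, smallest_nonzero_distance_cm(n), prob_at_least_one_recombinant(0.1, n))
```

A marker 0.1 cM from the mutation will almost never show a recombinant in 16 meioses, but has a good chance of showing one in 1156.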
Narrowing the critical interval
More fish = more better
5/1156 Z… …/1156 Z11119
Defining the BOUNDARIES in the genome: Z11119 and Z15270
Now what?
- Identify more markers and do more high-res mapping
  - Key point = continually refine boundaries by recombination
- Look in genome for potential candidates
  - What’s nearby in genome?... a [very good] MODEL of reality
- No luck in genome sequence? (rare)
  - misassembly or gaps
  - conserved synteny with other fish
  - Physical map: BAC clones
  - genetic or RH maps
What’s nearby in the genome?
Good candidate?
calca at ZFIN
calca expression: motor neuron expression
Mutants lack slow muscle fibers
What if... a secreted signal from motor neurons to developing muscle?!
calca expression: RNA-SEQ
What’s known about calca?
What’s known about calca? Cool new biology: it’s a secreted peptide with a novel role in directing slow muscle specification! Alert Cell, Science, and Nature!
How to test if this is the right gene?
Is calca the right gene?
- High-resolution mapping: no recombinants between mutation and gene in lots of meioses
- Phenocopy with new mutant (or MO injection), or noncomplementation with another allele
- Rescue with mRNA injection
- Find mutation in coding sequence
Picking the right strategy often is determined by balance of...
- Available Resources
- Number of Candidates
These are often determined by size of candidate interval
Now what? Test potential candidates:
- Turn the candidate into a new map marker
  - could it be the right gene?
  - even if not, can it narrow your interval?
- How to turn it into a map marker?
- What’s a good candidate?
Single nucleotide polymorphisms
SNPs = ~1 / 250 bp in genome
[Figure: Forward and Reverse primers amplify a 200 bp product spanning an A/G SNP; one allele is cut into 60 bp and 140 bp fragments]
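
The 200 bp versus 60 bp + 140 bp pattern can be reproduced with a toy digest. Everything below is an illustrative assumption: the sequences are synthetic, EcoRI (G^AATTC) is picked only as a familiar enzyme, and the choice that the G allele carries the site is arbitrary.

```python
def digest_fragment_lengths(seq: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the number of bases of the site left of the cut (1 for G^AATTC).
    Only the given strand is searched, which is enough for palindromic sites.
    """
    cuts = []
    start = 0
    while True:
        idx = seq.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Synthetic 200 bp amplicons differing at one base (G vs A) inside an EcoRI site
left = "AC" * 29 + "A"    # 59 bp
right = "CA" * 67 + "C"   # 135 bp
allele_g = left + "GAATTC" + right   # site present
allele_a = left + "AAATTC" + right   # site destroyed by the SNP

print(digest_fragment_lengths(allele_g, "GAATTC", 1))
print(digest_fragment_lengths(allele_a, "GAATTC", 1))
```

On a gel, the two alleles are then told apart by band pattern instead of by sequencing every fish.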
Generating map markers from ESTs/Genes/other sequences
- Find or design primers for PCR (from gDNA)
- Sequence PCR product on WT and mut
- Find RE polymorphism
or use your huge list of markers from NextGen sequencing of pooled WT and pooled mutants: which regions are differentially homozygous?
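
One way to ask "which regions are differentially homozygous?" is to compare allele frequencies between the two pools at each SNP. The sketch below is a simplified assumption of how that could look: the input format, the cutoffs, and the example positions and frequencies are all hypothetical, and a real analysis would also account for read depth and sequencing error.

```python
def differentially_homozygous(snps, hom_cutoff=0.95, het_low=0.2, het_high=0.8):
    """Return positions where the mutant pool is fixed for one allele but the WT pool is mixed.

    snps: iterable of (position_bp, mutant_pool_alt_freq, wt_pool_alt_freq).
    The cutoffs are assumed values, not recommendations.
    """
    hits = []
    for pos, mut_freq, wt_freq in snps:
        mut_fixed = mut_freq >= hom_cutoff or mut_freq <= 1.0 - hom_cutoff
        wt_mixed = het_low <= wt_freq <= het_high
        if mut_fixed and wt_mixed:
            hits.append(pos)
    return hits


# Hypothetical SNPs along one chromosome: (position, mutant pool freq, WT pool freq)
snps = [
    (100_000, 0.55, 0.48),
    (200_000, 0.80, 0.40),
    (300_000, 0.99, 0.35),
    (400_000, 1.00, 0.33),
    (500_000, 0.98, 0.36),
    (600_000, 0.70, 0.45),
]

hits = differentially_homozygous(snps)
print(hits, (min(hits), max(hits)))
```

The span between the first and last hit is a rough candidate region, and each SNP inside it is a potential new map marker.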
Obtaining gDNA from cDNA sequence: exporting from genome
BLAT Result
Good vs. Questionable Regions
Beware of shotgun assembly (i.e. sequence not built from BACs or other large clones)
- Shotgun-only regions: Here there be Monsters
- Clone-based regions: Safe Sailing (mostly)
Obtaining gDNA from cDNA sequence: exporting from genome
Designing PCR primers
PCR primers Amplify from WT and mut, sequence...
Locating a SNP to map... run on your mapping panel
- still a candidate? (0 recombinants)
- narrow the candidate interval?
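
The two questions above can be answered by comparing which fish are recombinant at the new marker with which fish are recombinant at the current flanking markers. This is a sketch under simplifying assumptions: double crossovers within the interval are ignored, and the fish IDs and function name are made up.

```python
def classify_new_marker(new_recs: set, left_recs: set, right_recs: set) -> str:
    """Place a new marker relative to the mutation using recombinant fish IDs.

    A marker between the left flank and the mutation can only be recombinant in
    fish that are also recombinant at the left flank (and likewise on the right).
    """
    if not new_recs:
        return "candidate"       # 0 recombinants: could be the gene
    if new_recs <= left_recs and not (new_recs & right_recs):
        return "left"            # on the left side: possible new left boundary
    if new_recs <= right_recs and not (new_recs & left_recs):
        return "right"           # on the right side: possible new right boundary
    return "unresolved"          # recheck genotypes or marker placement


# Hypothetical recombinant fish at the current flanking markers
left_flank = {"fish03", "fish17", "fish42"}
right_flank = {"fish08", "fish51"}

print(classify_new_marker(set(), left_flank, right_flank))
print(classify_new_marker({"fish17"}, left_flank, right_flank))
print(classify_new_marker({"fish08"}, left_flank, right_flank))
```

A "left" or "right" marker with fewer recombinants than the current flank narrows the interval even when the candidate gene itself is ruled out.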
Identifying a restriction enzyme to map your SNP
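
Finding an enzyme that distinguishes the two alleles amounts to scanning both sequences for recognition sites and keeping the enzymes whose site count differs. The sketch below uses a short table of well-known 6-cutters; the sequences are synthetic and the table is far from complete. When no enzyme distinguishes the alleles, a dCAPS primer that introduces a site is the usual fallback, which this sketch does not attempt.

```python
# A few common enzymes and their recognition sequences (all palindromic 6-cutters)
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "PstI": "CTGCAG",
    "EcoRV": "GATATC",
}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of site in seq, including overlapping ones."""
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def diagnostic_enzymes(wt_seq: str, mut_seq: str, enzymes=ENZYMES) -> list[str]:
    """Enzymes that cut WT and mutant amplicons a different number of times."""
    wt_seq, mut_seq = wt_seq.upper(), mut_seq.upper()
    return sorted(
        name
        for name, site in enzymes.items()
        if count_sites(wt_seq, site) != count_sites(mut_seq, site)
    )


# Synthetic sequences differing by one base (C in WT, T in mutant)
wt = "TTGACGGATCCAGTCA"
mut = "TTGACGGATTCAGTCA"
print(diagnostic_enzymes(wt, mut))
```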
dCAPS results
Striking the right balance in positional cloning
Mapping:
- lots of fish, lots of PCR, lots of gels
- should always give you an unambiguous answer
Functional:
- Sequencing => often done concomitantly with mapping
- mRNA rescue, CRISPR allele, Morpholinos => time, money
- Ambiguous, easy to make up lots of stories
Follow-up: Map? Or Biology?
Mapping can do it all!
What if ZF genome turns out to be a dead end (RARE!)?
- Check other fish genomes
  - more candidate genes?
  - fix a gap in the ZF data
- RNA-SEQ or HMFSeq?
- Start a chromosome walk
  - iterative BAC screening
What if ZF genome turns out to be a dead end?
Check other fish genomes: Pufferfish (Tetraodon, Fugu)
- smaller, more compact genome
- good for getting enhancer regions
Tetraodon calca region
More candidates to test: find and map zebrafish orthologs
Tomorrow’s bioinformatics practical:
0) Virtual Positional Cloning
1) Navigate genome browsers for information related to expression, loss-of-function, rescue
2) Zebrafish orthologs of your favorite human genes; identification of enhancer elements; transgenic lines
3) BLAST on your own computer, and BLAST parsing via Python script
4) From transcriptome profiling: identify genes, download upstream sequences, visualize overrepresented motifs
0) Virtual Positional Cloning: a review of what we did today, with some extra stuff
1) Navigate genome browsers for information related to expression, loss-of-function, rescue
Your favorite zebrafish gene =>
- sequence / exon-intron boundaries, conservation
- expression
- morpholino design
- obtaining mRNA clones for rescue
2) Zebrafish orthologs of your favorite human genes; identification of enhancer elements; transgenic lines
- Human gene
- ZF ortholog
- location in genome
- putative promoter / enhancer => conservation of noncoding DNA from other fish
3) BLAST on your own computer, and BLAST parsing via Python script
- All human proteins associated with HH signaling
- Identification of ALL putative ZF orthologs of these proteins via local BLAST
- Parse BLAST to get top result and genome location for each ZF protein
- Determine if genome location matches map position of mutant
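
A rough idea of what the parsing step might look like, assuming BLAST was run with the standard 12-column tabular output (`-outfmt 6`). This is not the practical's actual script: the IDs, scores, the location lookup table, and the mapped interval are all hypothetical.

```python
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def top_hits(tabular_text: str) -> dict:
    """Return {query_id: (subject_id, bitscore)}, keeping the best bitscore per query."""
    best = {}
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        row = dict(zip(BLAST_COLUMNS, line.split("\t")))
        score = float(row["bitscore"])
        query = row["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (row["sseqid"], score)
    return best


def in_interval(chrom: str, pos: int, interval: tuple) -> bool:
    """True if (chrom, pos) falls inside interval = (chrom, start, end)."""
    i_chrom, start, end = interval
    return chrom == i_chrom and start <= pos <= end


# Hypothetical tabular BLAST output: two queries, three hits
rows = [
    ["humanP1", "zfP1", "61.0", "300", "110", "3", "1", "300", "5", "304", "1e-80", "210"],
    ["humanP1", "zfP2", "72.5", "310", "80", "2", "1", "310", "1", "310", "1e-95", "250"],
    ["humanP2", "zfP3", "55.0", "200", "88", "4", "10", "209", "12", "211", "1e-40", "120"],
]
text = "\n".join("\t".join(r) for r in rows)

# Hypothetical lookup of genome locations and a hypothetical mapped interval
locations = {"zfP2": ("chr7", 1_500_000), "zfP3": ("chr12", 900_000)}
mapped_interval = ("chr7", 1_000_000, 2_000_000)

hits = top_hits(text)
in_region = sorted(
    q for q, (subject, _) in hits.items()
    if in_interval(*locations[subject], mapped_interval)
)
print(hits, in_region)
```

Any query whose top zebrafish hit lands inside the mapped interval becomes a candidate worth a closer look.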
4) From transcriptome profiling: identify genes, download upstream sequences, visualize overrepresented motifs
- List of short unidentified sequences
- Assign to Ensembl ID via BLAST and parsing
- Download 5’UTR and 2 kb upstream sequences for batch of Ensembl IDs
- Search through these for enriched motifs
- Visualize locations of enriched motifs
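
The motif search can be approximated very crudely by asking which k-mers occur in more of the target upstream sequences than in a background set. This is a toy sketch, not the method used in the practical: the sequences are invented, the pseudocount and ratio cutoff are arbitrary, and dedicated motif-discovery tools do this far more rigorously.

```python
from collections import Counter


def kmer_presence(seqs, k: int) -> Counter:
    """Count in how many sequences each k-mer occurs at least once."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enriched_kmers(target, background, k: int = 6, min_ratio: float = 2.0):
    """Return (kmer, ratio) pairs where the k-mer is more frequent in target than background.

    A pseudocount of 1 is added to the background so unseen k-mers do not divide by zero.
    """
    t_counts = kmer_presence(target, k)
    b_counts = kmer_presence(background, k)
    results = []
    for kmer, n in t_counts.items():
        t_frac = n / len(target)
        b_frac = (b_counts.get(kmer, 0) + 1) / (len(background) + 1)
        ratio = t_frac / b_frac
        if ratio >= min_ratio:
            results.append((kmer, ratio))
    return sorted(results, key=lambda item: -item[1])


# Invented upstream sequences: every target shares the 6-mer GATAAG
target = ["AAGATAAGCC", "TTGATAAGTT", "CCGATAAGAA"]
background = ["ACGTACGTAC", "TTTTCCCCGG", "GGGCCCAAAT"]

enriched = enriched_kmers(target, background)
print(enriched)
```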
Tomorrow’s Informatics Practical