You have worked for 2 years to isolate a gene involved in axon guidance. You sequence the cDNA clone that contains axon guidance activity. What do you do next?
BLAST
Methods to look for similarity
-BLAST (and derivatives)/BLAT
-Zoo blots
-Degenerate PCR against conserved regions
-Functional complementation
-Protein structure
BLAST WHAT?
Proteins or DNA?
-Which is more likely to reflect a real relationship: a match of 10 amino acids or a match of 10 nucleotides?
-4 possible bases vs. 20 amino acids.
-The genetic code is degenerate: several codons can encode the same amino acid, so a protein sequence can stay conserved while the underlying DNA diverges.
-If we see a run of similar amino acids, it is less likely to have occurred by chance.
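To put rough numbers on the 4 bases vs. 20 amino acids comparison, here is a minimal sketch. It assumes every residue is equally likely and independent of its neighbours, which real sequences are not, so treat the values as an order-of-magnitude illustration only. The function name is made up for this example.

```python
def chance_match_probability(alphabet_size: int, length: int) -> float:
    """Probability that two random sequences agree at every one of
    `length` positions, assuming uniform, independent residues."""
    return (1.0 / alphabet_size) ** length


# 10 matching nucleotides (alphabet of 4) vs. 10 matching amino acids (alphabet of 20)
p_nucleotide = chance_match_probability(4, 10)
p_protein = chance_match_probability(20, 10)

print(f"10 nt match by chance: {p_nucleotide:.2e}")
print(f"10 aa match by chance: {p_protein:.2e}")
print(f"The nucleotide match is ~{p_nucleotide / p_protein:,.0f}x more likely by chance")
```

Under these simplifying assumptions the ratio is 5 to the power of 10, which is why a short protein match carries more weight than a nucleotide match of the same length.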
What steps would you take to blast the amino acid sequence if you start with the nucleic acid sequence? -tblastn vs blastx?
Program   Description
BLASTP    Compares an amino acid query sequence against a protein sequence database.
BLASTN    Compares a nucleotide query sequence against a nucleotide sequence database.
BLASTX    Compares a nucleotide query sequence translated in all reading frames against a protein sequence database.
TBLASTN   Compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
TBLASTX   Compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. Computationally intensive.
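The program table can be read as a lookup from (query type, database type) to program name. The sketch below encodes just that; the dictionary and function names are hypothetical and only restate the table, they are not part of any BLAST software.

```python
# Hypothetical lookup restating the program table: (query type, database type) -> programs.
BLAST_PROGRAMS = {
    ("protein", "protein"): ["blastp"],
    ("nucleotide", "nucleotide"): ["blastn", "tblastx"],  # tblastx compares translations
    ("nucleotide", "protein"): ["blastx"],
    ("protein", "nucleotide"): ["tblastn"],
}


def blast_programs(query_type: str, database_type: str) -> list:
    """Return the BLAST programs that accept this query/database combination."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"Unknown combination: {key}")
    return BLAST_PROGRAMS[key]


# A cDNA sequence searched against a protein database:
print(blast_programs("nucleotide", "protein"))
```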
What steps would you take to blast the amino acid sequence if you start with the nucleic acid sequence?
-tblastn vs blastx
-Look for ORF …how?
When might you need to blast nucleic acid?
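One common way to look for an ORF is to translate the cDNA in all six reading frames and keep the longest stretch that starts with a methionine and contains no stop codon. The sketch below does this with the standard genetic code. It is an illustration, not a substitute for a dedicated ORF finder: the function names are made up here, and it does not require a terminal stop codon, so an ORF running off the end of a partial clone is still reported.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Map frame labels '+1'..'+3' and '-1'..'-3' to their translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def longest_orf(dna: str) -> tuple:
    """Return (frame, protein) for the longest Met-initiated stretch without a stop."""
    best_frame, best_protein = "", ""
    for frame, protein in six_frame_translations(dna).items():
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best_protein):
                best_frame, best_protein = frame, segment[start:]
    return best_frame, best_protein


# Toy sequence (not a real gene) with a short ORF in the third forward frame.
print(longest_orf("CCATGGCTAAATGACC"))
```

The predicted protein can then be used as a blastp or tblastn query, whereas blastx performs the six-frame translation of the nucleotide query itself.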
DEFINITIONS
Homologs - sequences that have shared ancestry. Note that homology is either true or not: there is no such thing as high homology or low homology. It is usually inferred from significant similarity.
Orthologs - homologs separated by a speciation event. May be functionally equivalent.
Paralogs - homologs derived from a duplication event, often within the same species. E.g. members of a human gene family share sequence similarity, but may have distinct functions.
-Paralogy or orthology is inferred from significant sequence similarity.
Examples:
-Mouse smad2 and frog smad2 are 98% identical (orthologs).
-The Activin Receptor has many isoforms (IA, IB, IIA, IIB, etc.) that are very similar at the protein level (paralogs).
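A figure such as "98% identical" comes from counting matching positions in an alignment. The sketch below computes percent identity for two sequences that are already aligned, with '-' marking gaps. The function name and the toy sequences are invented for illustration, and the choice to skip gapped columns is one convention among several; alignment tools may use a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("Aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments (not real smad2 sequences): 5 of 6 ungapped columns match.
print(round(percent_identity("MKT-LLV", "MKSALLV"), 1))
```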
You blast the protein sequence... and there is nothing like it in the database. Now what?
-motifs/domains
DOMAINS
Definition: A contiguous segment of the primary sequence of a molecule that, in isolation, displays a significant property of the intact molecule. It is usually structurally stable and associated with a function, including providing a structural element to the protein.
There is similarity over certain regions to several molecules containing kinase domains. What does this tell you? - it’s a kinase! - location?
What knowing domains gives:
Clues to function:
-what it interacts with
-what its biology is known for
-signaling pathway
-is it like a molecule in a more tractable and studied system?
Use of similarities
-Looking for new proteins with similarities to known proteins with interesting activities: serotonin receptors, tyrosine kinases, Hedgehogs, TGF-βs, ...
-Domain similarity: RING fingers (E3 ligases), kinase domains, DNA binding domains (bHLH, homeobox)
-Localization: TM domains, signal sequences, NLS, NES, signal peptide
Localization motifs
-Nuclear localization signal (NLS), targets the nucleus. Stretch of basic residues: P-P-K-K-K-R-K-V
-Nuclear export sequence (NES), targets the cytoplasm. Hydrophobic helix: L-X(2,3)-[LIVFM]-X(2,3)-L-X-[LI]
-Signal peptide, for transmembrane or secreted proteins. Hydrophobic helix in the N-terminal region, often followed by a cleavage site for a protease.
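Motif patterns written in this style translate almost directly into regular expressions: X(2,3) becomes `.{2,3}` and a bracketed set stays a character class. The sketch below scans a protein sequence for the NES pattern and for a run of basic residues. The NLS rule used here (four or more consecutive K/R) is an assumption standing in for "stretch of basic residues", not an established consensus, and the function name and test sequence are made up. Matches to short motifs like these are hints, not proof of localization.

```python
import re

# NES pattern from the slide: L-X(2,3)-[LIVFM]-X(2,3)-L-X-[LI]
NES_PATTERN = re.compile(r"L.{2,3}[LIVFM].{2,3}L.[LI]")

# Assumed stand-in for "stretch of basic residues": 4 or more K/R in a row.
NLS_PATTERN = re.compile(r"[KR]{4,}")


def find_localization_motifs(protein: str) -> dict:
    """Return the first NES-like and NLS-like match as (start, matched text), or None."""
    protein = protein.upper()
    hits = {}
    for name, pattern in (("NES", NES_PATTERN), ("NLS", NLS_PATTERN)):
        match = pattern.search(protein)
        hits[name] = (match.start(), match.group()) if match else None
    return hits


# Toy sequence (hypothetical) containing the slide's NLS example and an NES-like stretch.
toy_protein = "MSLALKLAGLDINKGGPPKKKRKVGS"
print(find_localization_motifs(toy_protein))
```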
Enter your favorite protein here
Results