Performing BlastP

Amino acids

Based on the nature of the side chains:
Aliphatic amino acids: G, A, V, L, I, P
Aromatic amino acids: F, Y, W
Polar amino acids: S, T, N, Q
Sulfur-containing amino acids: C, M
Charged amino acids: D, E, H, K, R

Based on hydrophilicity:
Hydrophilic: N, G, Q, R, H, K
Hydrophobic: V, I, L, M, P

Based on charge:
Positively charged: K, R
Negatively charged: D, E
Amino acid       Code     Amino acid       Code
Alanine          Ala/A    Aspartic acid    Asp/D
Phenylalanine    Phe/F    Histidine        His/H
Lysine           Lys/K    Methionine       Met/M
Proline          Pro/P    Arginine         Arg/R
Threonine        Thr/T    Tryptophan       Trp/W
Cysteine         Cys/C    Glutamic acid    Glu/E
Glycine          Gly/G    Isoleucine       Ile/I
Leucine          Leu/L    Asparagine       Asn/N
Glutamine        Gln/Q    Serine           Ser/S
Valine           Val/V    Tyrosine         Tyr/Y

Table: Amino acids and their three-letter and single-letter codes
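The code table can be used directly as a lookup. The following is a minimal sketch, not part of the original slides: the function name to_one_letter and the hyphen-separated input format are assumptions made for illustration.

```python
# Three-letter to one-letter codes, taken from the table of amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq, sep="-"):
    """Convert a sequence such as 'Lys-Glu-Thr' to 'KET'.

    Hypothetical helper: the separator and input format are assumptions.
    Raises KeyError for a code that is not in the table.
    """
    residues = three_letter_seq.split(sep)
    return "".join(THREE_TO_ONE[res.strip().capitalize()] for res in residues)


print(to_one_letter("Lys-Glu-Thr-Ala"))  # KETA
```

BlastP expects the one-letter form, so a conversion like this is only needed when a sequence arrives in three-letter notation.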
BlastP: HPR and BPR

An alignment of two closely related protein sequences, such as human pancreatic ribonuclease (HPR) and bovine pancreatic ribonuclease (BPR), shows a high degree of similarity.

HPR: KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGRCKPVNTF
BPR: KETAAAKFERQHMDSSTSAASSSNYCNQMMKSRNLTKDRCKPVNTF

Note: The ‘+’ sign indicates a conservative replacement: a substitution by an amino acid with similar properties. For example, serine (S) with threonine (T), arginine (R) with lysine (K), etc.
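The middle line of a BlastP alignment can be approximated with a short script. This is a simplified sketch under stated assumptions: the similarity groups below are chosen only for illustration, whereas BLAST itself prints ‘+’ wherever the substitution matrix score for the pair is positive. The function name match_line is hypothetical, and the two sequences are the HPR and BPR fragments from the slide, hard-coded without whitespace.

```python
# Groups of residues treated as "similar". This grouping is an ASSUMPTION
# for illustration; BLAST decides '+' from its substitution matrix instead.
SIMILAR_GROUPS = [set("ST"), set("KR"), set("DE"), set("NQ"), set("ILVM"), set("FYW")]


def match_line(seq1, seq2):
    """Return a BLAST-style middle line for two pre-aligned, equal-length sequences.

    Identical residues are echoed, similar residues give '+', anything else a space.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must already be aligned to equal length")
    out = []
    for a, b in zip(seq1, seq2):
        if a == b:
            out.append(a)
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            out.append("+")
        else:
            out.append(" ")
    return "".join(out)


hpr = "KESRAKKFQRQHMDSDSSPSSSSTYCNQMMRRRNMTQGRCKPVNTF"
bpr = "KETAAAKFERQHMDSSTSAASSSNYCNQMMKSRNLTKDRCKPVNTF"

mid = match_line(hpr, bpr)
identities = sum(a == b for a, b in zip(hpr, bpr))

print("HPR:", hpr)
print("    ", mid)
print("BPR:", bpr)
print(f"Identities: {identities}/{len(hpr)}")
```

Because the grouping is hand-picked, the ‘+’ positions may differ slightly from real BlastP output.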
Analyzing BlastP output
BlastX (finding a protein from a DNA sequence)
OMIM

Online Mendelian Inheritance in Man
A catalogue of all known diseases and their genetic associations.
The information in this database was collected and processed under the leadership of Dr. McKusick at Johns Hopkins University.
Every disease and gene is assigned a six-digit number, of which the first digit classifies the method of inheritance.
First digit    Range of MIM code    Method of inheritance
1              100000-199999        Autosomal dominant loci
2              200000-299999        Autosomal recessive loci
3              300000-399999        X-linked loci
4              400000-499999        Y-linked loci
5              500000-599999        Mitochondrial loci
6              600000 and above     Autosomal loci

Table: The MIM code for the method of inheritance

An asterisk (*) before an entry number in the output indicates that the mode of inheritance is known.
A hash (#) before an entry number in the output means that the phenotype can be caused by mutation in any of two or more genes.
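The first-digit rule is easy to express as a lookup. The sketch below is illustrative only: the function name inheritance_from_mim is hypothetical, and the digit-to-class mapping assumes the standard OMIM numbering scheme (1 = autosomal dominant through 6 = autosomal loci).

```python
# First digit of the six-digit MIM number -> method of inheritance.
# Assumes the standard OMIM numbering scheme.
INHERITANCE_BY_FIRST_DIGIT = {
    "1": "Autosomal dominant loci",
    "2": "Autosomal recessive loci",
    "3": "X-linked loci",
    "4": "Y-linked loci",
    "5": "Mitochondrial loci",
    "6": "Autosomal loci",
}


def inheritance_from_mim(mim_number):
    """Classify a MIM number by its first digit (hypothetical helper).

    Accepts an int or a string; a leading '*' or '#' prefix is ignored.
    """
    code = str(mim_number).strip().lstrip("*#")
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"not a six-digit MIM number: {mim_number!r}")
    if code[0] not in INHERITANCE_BY_FIRST_DIGIT:
        raise ValueError(f"first digit {code[0]} has no assigned class")
    return INHERITANCE_BY_FIRST_DIGIT[code[0]]


print(inheritance_from_mim(164160))     # Autosomal dominant loci
print(inheritance_from_mim("#300000"))  # X-linked loci
```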
Example: leptin, which is associated with obesity, is listed as autosomal dominant.
ORF (Open Reading Frame)

The region of a nucleotide sequence from the start codon (ATG) to a stop codon (TAA, TGA, TAG) is called the open reading frame.
Gene finding in organisms, especially prokaryotes, starts with a search for open reading frames (ORFs).
Eukaryotic gene finding is a different task, as eukaryotic genes are not continuous but are interrupted by intervening noncoding sequences called ‘introns’.
Depending on the starting point, there are six possible ways of translating any nucleotide sequence into an amino acid sequence according to the genetic code: three on each strand. These are called reading frames.
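The six-frame search described above can be sketched in a few lines. This is a minimal illustration, not the algorithm used by the ORF finder tool: the names find_orfs and reverse_complement and the min_codons parameter are hypothetical, and the scan only reports ORFs that run from the first ATG in a frame to the next in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=1):
    """Yield (strand, frame, start, end, orf) for ORFs in all six reading frames.

    frame is 1, 2 or 3; start and end are 0-based offsets on the scanned
    strand (end exclusive), and the ORF includes its stop codon.
    min_codons is the minimum number of codons before the stop codon.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Step through the strand one codon at a time in this frame.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, frame + 1, start, i + 3, s[start:i + 3]
                    start = None


for hit in find_orfs("AAATGGCCTAAGG"):
    print(hit)  # ('+', 3, 2, 11, 'ATGGCCTAA')
```

In practice a larger min_codons value would be used to filter out short ORFs that occur by chance.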
Difference between CDS and ORF

The coding sequence (CDS) is the actual region of DNA that is translated to form proteins, while the ORF may contain introns as well.
In prokaryotes, the ORF and the CDS are the same.
ORF finder webpage
ORF output
Read the gel to identify the sequence

Lanes: ddGTP, ddATP, ddTTP, ddCTP
The sequence is: 5’-TAATGTACG-3’
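Reading the gel amounts to sorting the bands by fragment length: the shortest fragments run furthest, so reading from the shortest band to the longest gives the newly synthesized strand in the 5’ to 3’ direction. The sketch below illustrates this. The function name read_gel is hypothetical, and the band positions are assumed values chosen to be consistent with the sequence given above, since the gel image itself is not reproduced here.

```python
def read_gel(lanes):
    """Reconstruct the synthesized strand (5'->3') from a Sanger sequencing gel.

    lanes maps each base to the fragment lengths (band positions) observed
    in the lane that contained the corresponding ddNTP.
    """
    # Collect every band as (fragment length, base) and order by length.
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


# Assumed band positions, consistent with the sequence on the slide.
lanes = {
    "G": [5, 9],     # ddGTP lane
    "A": [2, 3, 7],  # ddATP lane
    "T": [1, 4, 6],  # ddTTP lane
    "C": [8],        # ddCTP lane
}

print("5'-" + read_gel(lanes) + "-3'")  # 5'-TAATGTACG-3'
```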