Bacterial Genetics - Assignment and Genomics Exercise

Aims
– To provide an overview of the past and future development of our understanding of microbial genetic systems and their evolution
– To initiate a review of the information in the course and bring topics together
– To give you a chance to practise essay writing, with assistance in planning and construction
– To produce a short report on an on-line sequence search

Your objectives
– To write an essay, with the sequence search report, on time
– To discuss the seminar topics with your group
– To prepare one overhead (OH) sheet of the topics for discussion
– To gain some help in essay writing and planning
Essay title
"The expansion of microbial genome and metagenomic data in recent years has had a profound impact on our understanding of microorganisms and their interactions with their environment"
– Deadline: last day of this term
– Planning tutorial to be arranged
– 4 to 5 pages of A4, with references cited

Seminar topic
Four groups will meet nearer the time to prepare a list.
Task: prepare FIVE research applications or experiments arising from genome sequencing that should be pursued in the area of microbiology and the environment in the next 5 to 10 years. Prioritise these and discuss them in the final seminar.
– One-page report on the sequence searches from the on-line exercise
– Add one page of notes on the discussion at the seminar
The BLAST programs (Basic Local Alignment Search Tool)
A set of sequence comparison algorithms, introduced in 1990, that are used to search sequence databases for optimal local alignments to a query.
Program   Description
blastp    Compares an amino acid query sequence against a protein sequence database.
blastn    Compares a nucleotide query sequence against a nucleotide sequence database.
blastx    Compares a nucleotide query sequence translated in all reading frames against a protein sequence database. You could use this option to find potential translation products of an unknown nucleotide sequence.
tblastn   Compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
tblastx   Compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.

Please note that the tblastx program cannot be used with the nr database on the BLAST Web page because it is computationally intensive.
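To make "translated in all reading frames" concrete, the following is a minimal Python sketch of a six-frame translation, the kind of preprocessing that blastx, tblastn and tblastx imply. It is an illustration only, not BLAST's own implementation. The function names are hypothetical, the standard genetic code is assumed, and any codon containing a character other than A, C, G or T is reported as X.

```python
# Illustrative sketch only: not how BLAST itself is implemented.
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # unknown codons become X
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return a dict of the three forward (+1..+3) and three reverse (-1..-3) frames."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCC").items():
        print(name, protein)
```

For the short sequence ATGGCC, frame +1 reads ATG GCC and gives MA, while frame -1 reads the reverse complement GGC CAT and gives GH. Comparing every frame against every frame of a database is why tblastx is the most expensive of the five programs.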
Calculating alignment scores
The raw score S for an alignment is calculated by summing the scores for each aligned position and the scores for gaps. In this figure, a DNA alignment is shown. In amino acid alignments, the score for an identity or a substitution is given by the specified substitution matrix (e.g. BLOSUM62).
BLAST 2.0 and PSI-BLAST use "affine gap costs", which charge the score -a for the existence of a gap and the score -b for each residue in the gap. A gap of k residues therefore receives a total score of -(a+bk), and a gap of length 1 receives the score -(a+b). The gap creation and extension costs a and b are inherent to the scoring system in use (BLAST 2.0 defaults).
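The affine gap rule can be checked with a small Python sketch that scores an already-aligned pair of DNA sequences. The function name is hypothetical, and the numerical values (match +1, mismatch -3, a = 5, b = 2) are illustrative assumptions rather than values taken from these notes; substitute whatever scoring system your search actually reports.

```python
def raw_score(row1, row2, match=1, mismatch=-3, gap_open=5, gap_extend=2):
    """Raw score S of a gapped DNA alignment given as two equal-length rows.

    '-' marks a gap. A gap of k residues costs gap_open + gap_extend * k,
    i.e. it contributes -(a + b*k) to the score. All default values are
    illustrative assumptions.
    """
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    score = 0
    gap_row = None  # which row the currently open gap is in, if any
    for x, y in zip(row1, row2):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if x == "-" or y == "-":
            current = 1 if x == "-" else 2
            if gap_row != current:      # a new gap starts here: charge -a once
                score -= gap_open
                gap_row = current
            score -= gap_extend         # charge -b for every gapped residue
        else:
            gap_row = None
            score += match if x == y else mismatch
    return score


if __name__ == "__main__":
    # 6 matches (+6) and one gap of length 2: -(5 + 2*2) = -9, so S = -3
    print(raw_score("ACGTACGT", "ACG--CGT"))
```

Because the opening cost a is charged once per gap, one gap of length 2 costs 9 under these values, whereas two separate gaps of length 1 would cost 14. This is the sense in which affine costs favour a few long gaps over many short ones.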
P value
The probability of an alignment occurring by chance with the score in question or better. The P value is calculated by relating the observed alignment score, S, to the expected distribution of HSP scores from comparisons of random sequences of the same length and composition as the query to the database. The most highly significant P values are those close to 0. P values and E values are different ways of representing the significance of the alignment.

E value
Expectation value. The number of different alignments with scores equivalent to or better than S that are expected to occur in a database search by chance. The lower the E value, the more significant the score.
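The two measures can be converted into one another. The relation used below, P = 1 - exp(-E), is not stated in these notes; it comes from the standard assumption that the number of chance alignments scoring at least S follows a Poisson distribution with mean E. The Python sketch, with hypothetical function names, shows the conversion in both directions.

```python
import math


def p_from_e(e_value):
    """P value (chance of at least one alignment this good) from an E value.

    Assumes the count of chance hits is Poisson-distributed with mean E,
    so P = 1 - exp(-E). expm1 keeps precision when E is very small.
    """
    return -math.expm1(-e_value)


def e_from_p(p_value):
    """Inverse conversion: E = -ln(1 - P), valid for 0 <= P < 1."""
    return -math.log1p(-p_value)


if __name__ == "__main__":
    for e in (1e-85, 0.01, 1.0, 10.0):
        print(f"E = {e:g}  ->  P = {p_from_e(e):g}")
```

For small values (roughly E below 0.01) P and E are almost identical, so either can be quoted. They diverge for weak hits: E can exceed 1, whereas P can never exceed 1.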
Lambda ratio
To convert a raw score S into a normalized score S' expressed in bits, one uses the formula S' = (lambda*S - ln K)/(ln 2), where lambda and K are parameters dependent upon the scoring system (substitution matrix and gap costs) employed [7-9]. For determining S', the more important of these parameters is lambda. The "lambda ratio" quoted here is the ratio of the lambda for the given scoring system to that for one using the same substitution scores, but with infinite gap costs [8]. This ratio indicates what proportion of information in an ungapped alignment must be sacrificed in the hope of improving its score through extension using gaps. We have found empirically that the most effective gap costs tend to be those with lambda ratios in the range 0.8 to 0.9.

K
A statistical parameter used in calculating BLAST scores that can be thought of as a natural scale for the search space size. The value K is used in converting a raw score (S) to a bit score (S').

lambda
A statistical parameter used in calculating BLAST scores that can be thought of as a natural scale for the scoring system. The value lambda is used in converting a raw score (S) to a bit score (S').
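The bit-score formula and the lambda ratio can be tried out with the short Python sketch below. The function names are hypothetical. The parameter values are assumptions for illustration: they are the figures commonly printed at the foot of BLAST reports for BLOSUM62 with and without gaps, so check them against your own output before relying on them. The E value relation E = m*n*2^(-S'), with m and n the query and database lengths, comes from standard Karlin-Altschul statistics rather than from these notes.

```python
import math


def bit_score(raw, lam, k):
    """Normalized score in bits: S' = (lambda*S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def lambda_ratio(lam_gapped, lam_ungapped):
    """Ratio of lambda with the chosen gap costs to lambda with infinite gap costs."""
    return lam_gapped / lam_ungapped


def expect_from_bits(bits, m, n):
    """E value for a bit score in a search space of m * n (assumed relation)."""
    return m * n * 2.0 ** (-bits)


if __name__ == "__main__":
    # Assumed illustrative parameters (BLOSUM62; check your own BLAST report).
    LAM_UNGAPPED = 0.3176
    LAM_GAPPED, K_GAPPED = 0.267, 0.041

    print(f"lambda ratio = {lambda_ratio(LAM_GAPPED, LAM_UNGAPPED):.2f}")
    s_bits = bit_score(100, LAM_GAPPED, K_GAPPED)  # raw score of 100 is hypothetical
    print(f"bit score    = {s_bits:.1f}")
    print(f"E value      = {expect_from_bits(s_bits, m=170, n=1e8):.2g}")
```

With these assumed values the lambda ratio is about 0.84, inside the 0.8 to 0.9 range described above. Bit scores are convenient because lambda and K are already folded in, so scores from searches with different scoring systems can be compared directly.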
Sequences producing significant alignments:                           Score (bits)   E value
1. sp|Q57997|Y577_METJA PROTEIN MJ0577 >gi| |pir||A                        314        e-85
2. pdb|1MJH| Structure-Based Assignment Of The Biochemical F               272        e-72
3. dbj|BAA29916| (AP000003) 170aa long hypothetical protein [P                        e-23
4. sp|Q57951|Y531_METJA HYPOTHETICAL PROTEIN MJ0531 >gi|                    91        e-18
5. gi| (AE000872) conserved protein [Methanobacterium t                               e-16
6. gi| (AE000865) conserved protein [Methanobacterium t                               e-15
7. gi| (AE000803) conserved protein [Methanobacterium t                               e-15