Teacher’s Guide: Computer Lab on Bioinformatics


1 Teacher’s Guide: Computer Lab on Bioinformatics
Introduction
This lab introduces you to the how and why of bioinformatics. You will learn how to use databases to investigate particular nucleotide and protein sequences.
Content standards:
1. Scientific progress is made by asking meaningful questions and conducting careful investigations. As a basis for understanding this concept and addressing the content in the other four strands, students should develop their own questions and perform investigations. Students will:
a. Select and use appropriate tools and technology (such as computer-linked probes, spreadsheets, and graphing calculators) to perform tests, collect data, analyze relationships, and display data.
4. Genes are a set of instructions encoded in the DNA sequence of each organism that specify the sequence of amino acids in proteins characteristic of that organism. As a basis for understanding this concept:
a. Students know the general pathway by which ribosomes synthesize proteins, using tRNAs to translate genetic information in mRNA.

2 Key Terms
Gene- sequence of DNA which codes for one protein
Query sequence- the sequence you submit to search a database
Genome- all the genetic material, including all the genes, an organism has
Nucleotide- building block of DNA or RNA: a base, a sugar, and a phosphate group
Protein- polymer made of amino acids
Database- computerized collection of data for retrieving or manipulating
Gene neighborhood- all the genes in the physical vicinity of a query gene
Phylogenetic tree- a tree which shows genetic relatedness and evolutionary relationships
Hits- data which match criteria you set
Alignment- a residue-by-residue matching of two or more sequences; the % alignment is the % of the gene in question which lines up with the query gene
Homology- similarity between two sequences that reflects shared ancestry; high homology means common structure or common function or both
E value- expectation value; the number of hits this good you would expect to see by chance; the lower this is, the better the homology
Paralogs- homologous genes within the same species (the same genome) that arose by gene duplication
Orthologs- homologous genes in different species that descend from a single gene in their common ancestor
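The idea behind percent identity can be illustrated with a short Python sketch. This is only an illustration, not how BLAST or IMG compute their reported values: the function name `percent_identity` and the two toy sequences are made up for this example, and the sketch assumes the two sequences have already been aligned (same length, with "-" marking gaps).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences have the same residue.

    Assumes both inputs are already aligned (equal length, '-' for gaps).
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy example (hypothetical sequences): 5 of 6 columns match.
print(round(percent_identity("MKT-LV", "MKTALV"), 2))  # 83.33
```

Students can change one letter at a time and watch the percentage drop, which connects the number on the BLAST results page to the letters in the alignment.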

3 Opening questions:
1. What does DNA code for?
Proteins
2. What are the 4 nucleotides that make up DNA?
A, T, C, G plus deoxyribose and a phosphate group (dAMP, dTMP, dCMP, dGMP)
3. What are the 4 nucleotides that make up RNA?
A, U, C, G plus ribose and a phosphate group (rAMP, rUMP, rCMP, rGMP)
4. Where is DNA in bacteria located?
In the main circular chromosome of genomic DNA or in plasmids
5. How does DNA relate to traits?
DNA codes for proteins, which are the basis of traits, such as visible things like hair color and non-visible things like enzymes and hormones.
6. What would happen to a bacterial cell if a gene for an enzyme became mutated?
It may or may not be able to function properly. If the mutation does not affect the structure or function in a major way, homeostasis is not in jeopardy. If the mutation does cause a vital change in the protein, it could cause death or reduced fitness to the organism.
7. How are genes, proteins, and amino acids related to each other? Write one or two sentences showing the connection.
Genes code for proteins, which are, in turn, made of a string of amino acids.
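Questions 6 and 7 can be made concrete with a small Python sketch that reads a DNA coding sequence three letters at a time and looks up each codon in the standard genetic code. The function name `translate` and the 12-base example gene are invented for this illustration; real genes are much longer, and the sketch skips the mRNA step by reading the coding strand directly.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_dna: str) -> str:
    """Translate a DNA coding strand into one-letter amino acids, stopping at a stop codon."""
    protein = []
    for start in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[start:start + 3].upper()]
        if amino_acid == "*":  # stop codon: the ribosome releases the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


original = "ATGGCCAAATAA"   # hypothetical mini-gene
print(translate(original))          # MAK
print(translate("ATGGCTAAATAA"))    # MAK  (silent mutation: GCC -> GCT)
print(translate("ATGGACAAATAA"))    # MDK  (missense mutation: GCC -> GAC)
print(translate("ATGGCCTAATAA"))    # MA   (nonsense mutation: AAA -> TAA)
```

The three mutated versions show why a mutation "may or may not" matter: one leaves the protein unchanged, one swaps a single amino acid, and one cuts the protein short.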

4 Go to this site: http://bioinformatics.org/faq
1. According to this website, what kind of science is bioinformatics?
A biological and computer science
3. Look at the careers link. What kind of courses does the author recommend for someone who wants to “get involved” in bioinformatics?
Math, computer programming, biochemistry, evolutionary biology
4. Go to www.dnai.org
a. Click on Timeline: Find what Kary Mullis invented in 1983 that earned him the Nobel Prize.
PCR
What does this piece of technology do?
PCR amplifies a segment of DNA, making millions of copies of a particular sequence, using forward and reverse primers.
b. Click on “Genome” and “The Project.” On the menu tab at the top of that page, click on The Problem: What was the problem?
How do we map a genome containing billions of base pairs?
Click on “The Project” and “Pieces of the Puzzle.” View any two puzzle pieces. Describe a major contribution of the Human Genome Project to the area of bioinformatics.
Examples: bacterial artificial chromosomes carrying cloned human sequences about 150,000 base pairs in length; the building of a mathematical model of what a gene looks like.
Go to http://blast.ncbi.nlm.nih.gov/Blast.cgi
What does BLAST stand for?
Basic Local Alignment Search Tool
The BLAST site is maintained by what agency?
NCBI (the National Center for Biotechnology Information), part of the NIH
Take the BLAST tutorial at http://www.digitalworldbiology.com/BLAST/index.html
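To show students why PCR yields "millions of copies," the doubling arithmetic can be sketched in Python. This is an idealized model, not a description of a real reaction: the function name `pcr_copies` and the `efficiency` parameter are assumptions for this example, and perfect doubling every cycle (efficiency 1.0) is rarely reached in practice.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of DNA copies after a number of PCR cycles.

    efficiency = 1.0 means every template is copied in every cycle (perfect doubling).
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Starting from a single DNA molecule with perfect doubling:
print(f"{pcr_copies(1, 20):,.0f}")  # 1,048,576
print(f"{pcr_copies(1, 30):,.0f}")  # 1,073,741,824
```

Under this model, twenty cycles already pass one million copies, because the number of copies is 2 raised to the number of cycles.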

5 Part 1: Go to IMG Home and look up ribulose bisphosphate carboxylase.
Is this a gene, a protein, an amino acid, or a nucleotide?
A protein
Look at the first hit.
–Gene OID: 637011688
–Name of genome that has this sequence: Synechocystis sp. PCC 6803
–Number of amino acids in length: 113 amino acids
Taxonomy (Organism) Information:
–Type of organism (Domain): Bacteria (cyanobacteria)
–Cell shape: coccus-shaped
–What does “mesophile” mean? An organism that survives best in moderate temperatures: 15-40 °C
Scroll down to Gene Neighborhood. Switch from the ribulose bisphosphate carboxylase small subunit to the ribulose bisphosphate carboxylase large subunit by clicking on the large gene 2 genes upstream of the small subunit. It should now be red.
Click IMG Genome BLAST. BLAST this sequence against all the Cyanobacteria species.
What is the E value set at?
1 × 10^-2
Which is better, a higher or a lower E value? Why?
Lower. The lower the E value, the smaller the number of hits one could expect to see by chance, and the stronger the evidence of homology.
The genomes in red are called “hits.” What are “hits”?
Data which meet the criteria you set
Look at the column marked “T” (Type of Homolog). Select all the genomes marked “O.” What are orthologs?
Orthologs are homologous sequences between the query sequence and a genome from another species.
Which genome is the top ortholog in the list?
Cyanothece sp. ATCC 51142
What is the percent identity (what % of the query matched exactly with the database)?
96.18%
Select all orthologs with a low E value and high % of alignment and add them to the gene cart.
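The final selection step (keep hits with a low E value and a high percent identity) can be sketched in Python. Everything in this example is hypothetical: the `Hit` record, the `filter_hits` function, the genome names, and all the numbers are invented for illustration and are not IMG output. The E value cutoff of 1e-2 matches the setting above; the 90% identity cutoff is an assumption chosen for the example.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    genome: str
    e_value: float
    percent_identity: float


def filter_hits(hits, max_e_value=1e-2, min_identity=90.0):
    """Keep hits at or below the E value cutoff and at or above the identity cutoff.

    Results are sorted best first: lowest E value, then highest percent identity.
    """
    kept = [h for h in hits
            if h.e_value <= max_e_value and h.percent_identity >= min_identity]
    return sorted(kept, key=lambda h: (h.e_value, -h.percent_identity))


# Made-up hits for illustration only.
example_hits = [
    Hit("genome_B", 3e-80, 91.5),
    Hit("genome_A", 1e-120, 96.2),
    Hit("genome_C", 5e-3, 42.0),   # passes the E value cutoff, fails identity
    Hit("genome_D", 0.5, 30.1),    # fails both cutoffs
]

selected = filter_hits(example_hits)
print([h.genome for h in selected])  # ['genome_A', 'genome_B']
```

Changing `max_e_value` or `min_identity` lets students see how stricter or looser criteria change which genomes would go into the gene cart.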

6 Click on: “Look at the gene neighborhood”
Identify 2 genes which are commonly seen in the near vicinity of ribulose bisphosphate carboxylase.
Rbc small subunit, carboxysome shell peptide, chl B
What is the significance of genes which are in the same neighborhood as your RuBisCO gene?
Their close proximity may mean that their products take part in related chemical reactions that occur at a similar time and place in the cell.
Go to: Do Clustal Alignment:
–What does each row of letters represent? One amino acid sequence (coded by one gene)
–What does each letter represent? An amino acid
–What does the color of each letter represent? Type of amino acid (polar, acidic, etc.)
–Look at the ends of the sequences. What do the dashed lines represent? Gaps: positions where a sequence is missing residues relative to the others (for example, deletions)
–If each row represents one gene, why are there so many differences between samples of the same gene? Genes vary between species and between individuals of the same species. Mutations occur, but not all at the same time and place.
–Looking at all the amino acids at residue #20 (counting from far left to right), from the top to the bottom-most genome, what do you notice about the amino acids? There is a lot of variation at this residue. This part of the amino acid sequence is not well conserved; having the same amino acid at this position is not imperative for the function of the protein.
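The residue #20 question asks students to judge conservation by eye. The same idea can be expressed as a small Python sketch that measures how often the most common amino acid appears in one column of an alignment. The function name `column_conservation` and the four-row toy alignment are invented for this example; this is a simple majority fraction, not the conservation score Clustal itself reports.

```python
from collections import Counter


def column_conservation(alignment, column):
    """Fraction of rows that share the most common amino acid in one column.

    `column` is 0-based, so residue #20 on the slide would be column 19.
    Gap characters ('-') never count as the shared residue.
    """
    residues = [row[column] for row in alignment]
    counts = Counter(r for r in residues if r != "-")
    if not counts:
        return 0.0
    return counts.most_common(1)[0][1] / len(residues)


# Toy alignment (hypothetical sequences), one row per genome.
toy_alignment = [
    "MKTAY",
    "MKSAY",
    "MKGA-",
    "MKTAY",
]

print(column_conservation(toy_alignment, 0))  # 1.0  (fully conserved)
print(column_conservation(toy_alignment, 2))  # 0.5  (variable position)
print(column_conservation(toy_alignment, 4))  # 0.75 (one row has a gap)
```

A value near 1.0 suggests a position where the same amino acid is needed for the protein to work; a low value matches the "not well conserved" answer above.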

7 Phylogenetic Tree: Ribulose bisphosphate carboxylase- large subunit

8 Analysis of Phylogenetic Tree: Look at your phylogram.
Which represents the ancestral sequences, the nodes or the branches?
The nodes
What does a short distance between two groups indicate?
There is a close evolutionary relationship and genetic relatedness.
Chlorella and Volvox are in this phylogenetic tree along with cyanobacteria. How is this significant? How can you explain the close genetic similarity of this gene in both domains? (Hint: think about the theory of endosymbiosis or horizontal gene transfer.)
Chlorella and Volvox are both eukaryotic algae (Domain Eukarya), while cyanobacteria are bacteria (Domain Bacteria). So, these organisms are in different domains.
Both algae and cyanobacteria carry out photosynthesis, which uses RuBisCO (ribulose bisphosphate carboxylase) to convert RuBP and carbon dioxide into 3-PGA in the Calvin Cycle.
It is thought that algae acquired chloroplasts through endosymbiosis. According to this theory, chloroplasts, which contain DNA, were once bacteria. These bacteria were taken up by eukaryotic cells and later became organelles (chloroplasts and mitochondria).
Another way that genes “cross over” between different species is through horizontal gene transfer. Examples of this include bacterial conjugation, viral transduction, and bacterial transformation.

9 The Calvin Cycle

