Bioinformatics
– What is a genome?
– How are databases used?
– What is a phylogenetic tree?
The Genome
– Every organism, including humans, is specified by a genome
– A genome is made of DNA; it includes segments called genes
– Genes code for proteins
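To make "genes code for proteins" concrete, here is a minimal Python sketch that reads a DNA sequence three letters (one codon) at a time and looks up the amino acid each codon stands for. The function name `translate` and the example sequence are made up for illustration, and the codon table is deliberately partial: it holds only the five codons used here, not the full set of 64.

```python
# Partial codon table: only the codons used in this example.
# The full genetic code has 64 codons.
CODON_TABLE = {
    "ATG": "M",  # methionine, also the usual start codon
    "GCC": "A",  # alanine
    "AAA": "K",  # lysine
    "TGG": "W",  # tryptophan
    "TAA": "*",  # stop codon
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Translation stops at the first stop codon. Trailing letters that do not
    fill a whole codon are ignored.
    """
    protein = []
    usable_length = len(dna) - len(dna) % 3
    for start in range(0, usable_length, 3):
        codon = dna[start:start + 3].upper()
        # Raises KeyError for codons missing from the partial table above.
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCAAATGGTAA"))  # prints MAKW
```

The result is a protein sequence of the kind you can paste into BLAST as a query.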
Bioinformatics is used for…
– Determining evolutionary relationships between organisms
– Locating particular genes
– Crop and pharmaceutical bioengineering
– Medical research
Bioinformatics is also used for…
– Making new drugs
– Gene sequencing
– Making phylogenetic trees
The Human Genome Project
– Began in 1990; the draft sequence was published in 2001 (Nature) and the project was declared complete in 2003
– Goal: to sequence the entire human genome and identify all 20,000-25,000 genes on the human chromosomes
New research is focusing on bioenergy:
– developing plant feedstocks (fast-growing plants bred for use in producing electricity or liquid fuels)
– using microorganisms (like bacteria) to break down cellulose in plant cell walls
– converting sugars into biofuels
Databases are… organized, indexed storehouses of computerized data
Databases are used to…
– locate a gene within a sequence
– predict protein structure and/or function
– cluster protein sequences into families of related sequences
– view and analyze data on many genomes
Step 1: Go to the IMG home page, then go to Find Genes
Step 2: Click on BLAST
Step 3: Copy and paste a nucleotide or protein sequence. This is your query sequence.
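Before pasting, it can help to check whether your query is a nucleotide or a protein sequence, since BLAST treats the two differently. The Python sketch below is only a rough heuristic: the function names and the 90% threshold are assumptions made for this example, not anything IMG or BLAST defines.

```python
def clean_sequence(text):
    """Drop FASTA header lines (those starting with '>') and all line breaks."""
    lines = [line.strip() for line in text.strip().splitlines()]
    return "".join(line for line in lines if not line.startswith(">")).upper()


def guess_sequence_type(sequence):
    """Guess 'nucleotide' or 'protein' from the letters in the sequence.

    Heuristic (an assumption for this sketch): if at least 90% of the letters
    are A, C, G, T, U or N, call it a nucleotide sequence.
    """
    if not sequence:
        raise ValueError("sequence is empty")
    nucleotide_letters = set("ACGTUN")
    matches = sum(1 for letter in sequence if letter in nucleotide_letters)
    return "nucleotide" if matches / len(sequence) >= 0.9 else "protein"


# Made-up example query in FASTA format.
query = """>example_query
ATGGCCAAATGG
TAA"""
print(guess_sequence_type(clean_sequence(query)))
```

A short protein made mostly of alanine, cysteine, glycine and threonine would fool this check, so treat the answer as a hint only.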
Step 4: BLAST the sequence. Take the top hit and click on it; this gives you a list of the genomes most similar to your query sequence.
Step 5: Click on the top hit
Step 6: Look at Gene Detail. Note the OID number and the name of the protein.
Look at the “gene neighborhood”. The query gene is shown in red.
Step 7: Click on IMG Genome BLAST
Step 8: Choose a phylum or organism to BLAST against
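Step 9: Set the maximum E-value, then run BLAST. The E-value is the number of alignments scoring at least this well that you would expect to find by chance. The lower the number, the more significant the alignment between the query sequence and the genomes you are comparing it against. Hits with low E-values are the most likely to be homologs of the query.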
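To show what the maximum E-value setting does, the Python sketch below computes an E-value from a bit score using the standard relation E = m × n × 2^(−S'), where m is the query length, n is the total database length and S' is the bit score, and then keeps only the hits at or below a cutoff. The function names and the hit list are hypothetical, invented for this example; real BLAST also adjusts the lengths before applying the formula, so its reported values will differ somewhat.

```python
def expected_value(bit_score, query_length, database_length):
    """E-value from a bit score: E = m * n * 2 ** (-S')."""
    return query_length * database_length * 2.0 ** (-bit_score)


def filter_hits(hits, max_e_value):
    """Keep hits with an E-value at or below the cutoff, most significant first."""
    kept = [hit for hit in hits if hit["e_value"] <= max_e_value]
    return sorted(kept, key=lambda hit: hit["e_value"])


# Hypothetical hits; the names and numbers are made up for illustration.
hits = [
    {"name": "hit_1", "e_value": 2e-50},
    {"name": "hit_2", "e_value": 0.5},
    {"name": "hit_3", "e_value": 1e-8},
]

for hit in filter_hits(hits, max_e_value=1e-5):
    print(hit["name"], hit["e_value"])
```

Because the bit score sits in the exponent, each extra bit of score halves the E-value, which is why strong hits have E-values very close to zero.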
Look at the Genome BLAST results
BLAST hits on a particular sequence
Look at: Alignment and E-values
– O = orthologs: similar genes from different species
– P = paralogs: similar genes from the same species
For a detailed view of the alignment, click on Do Alignment
– Letters denote amino acids; colors denote the type of amino acid
– Dashes indicate gaps (insertions or deletions)
– Look down this column: what do you notice?
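"Looking down a column" can also be done in code. The Python sketch below takes aligned sequences (all the same length, with "-" marking gaps) and reports the columns where every sequence has the same amino acid. The function name and the three short sequences are made up for illustration.

```python
def conserved_columns(aligned_sequences):
    """Return the indexes of columns where all sequences share one residue.

    Columns that contain a gap ('-') in any sequence are not counted.
    """
    lengths = {len(sequence) for sequence in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    # zip(*...) walks the alignment column by column.
    for index, column in enumerate(zip(*aligned_sequences)):
        if "-" not in column and len(set(column)) == 1:
            conserved.append(index)
    return conserved


# Made-up alignment of three short protein fragments.
alignment = [
    "MKV-LA",
    "MRVGLA",
    "MKV-LS",
]
print(conserved_columns(alignment))  # prints [0, 2, 4]
```

Positions that stay the same across many species are often the ones most important to the protein's function.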
Categories of amino acids
Homologs
Genes whose sequences are similar because they descend from a common ancestral gene
Two types of homologs:
– Paralogs: similar gene sequences within the same species (they arise by gene duplication)
– Orthologs: similar gene sequences in two different species (they arise when species diverge)
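The O and P labels in the BLAST results follow the same-species rule given above, which can be written as a small Python function. This is a deliberate simplification with made-up names: it assumes the hit is already known to be a homolog, and real ortholog detection uses more evidence than the species name alone.

```python
def label_homolog(query_species, hit_species):
    """Label a homologous hit 'P' (paralog) or 'O' (ortholog).

    Simplified rule from the slide: same species means paralog,
    different species means ortholog.
    """
    same_species = query_species.strip().lower() == hit_species.strip().lower()
    return "P" if same_species else "O"


# Example species names chosen for illustration.
print(label_homolog("Nostoc punctiforme", "Nostoc punctiforme"))
print(label_homolog("Nostoc punctiforme", "Anabaena variabilis"))
```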
Step 10: Click on the hits with the best alignment
Step 11: Add Selections to Gene Cart
Step 12: Click Show Gene Neighborhoods
Step 13: The gene of interest is shown in red. Look at the colored genes next to the gene of interest.
Step 14: To make a phylogenetic tree, go to Phylogeny.fr. Copy and paste the amino acid sequence, then click submit.
Step 15: Click Blast Explorer. Paste the sequence, then click submit.
A phylogenetic tree of cyanobacteria with homology for a particular gene sequence
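A tree like this groups sequences by how similar they are. As a rough sketch of that idea, the Python below measures the fraction of aligned positions that differ between two sequences (often called the p-distance) and finds the closest pair, which would sit on neighboring branches. The function names, organism labels and sequences are all made up, and tree-building tools such as those behind Phylogeny.fr use more sophisticated methods than this.

```python
def p_distance(seq_a, seq_b):
    """Fraction of compared columns that differ between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differences / compared


def closest_pair(aligned):
    """Return (distance, name_1, name_2) for the most similar pair."""
    names = sorted(aligned)
    pairs = [
        (p_distance(aligned[first], aligned[second]), first, second)
        for i, first in enumerate(names)
        for second in names[i + 1:]
    ]
    return min(pairs)


# Made-up aligned fragments from three hypothetical organisms.
aligned = {
    "organism_A": "MKVLA",
    "organism_B": "MKVLS",
    "organism_C": "MRIGS",
}
print(closest_pair(aligned))  # prints (0.2, 'organism_A', 'organism_B')
```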