Bioinformatics for biomedicine Seminar: Sequence analysis of a favourite gene Lecture 5, 2006-10-17 Per Kraulis

Slides:



Advertisements
Similar presentations
Basic Genomic Characteristic  AIM: to collect as much general information as possible about your gene: Nucleotide sequence Databases ○ NCBI GenBank ○
Advertisements

Ontology annotation: mapping genomic regions biological function Paul D Thomas, Huaiyu Mi and Suzanna Lewis.
Basics of Comparative Genomics Dr G. P. S. Raghava.
DNA Chip Scanning DNA Chips are scanned with a laser to excite the fluorescein dye that is attached to the target cDNA. Only those probe spots where target.
Bioinformatics for biomedicine Summary and conclusions. Further analysis of a favorite gene Lecture 8, Per Kraulis
Structural bioinformatics
Intro to Bioinformatics Summary. What did we learn Pairwise alignment – Local and Global Alignments When? How ? Tools : for local blast2seq, for global.
Biological Databases Notes adapted from lecture notes of Dr. Larry Hunter at the University of Colorado.
Today’s menu: -UniProt - SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7.
Identifying functional residues of proteins from sequence info Using MSA (multiple sequence alignment) - search for remote homologs using HMMs or profiles.
Protein Modules An Introduction to Bioinformatics.
Today’s menu: -UniProt - SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7.
Today’s menu: -SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7.
The Poor Beginners’ Guide to Bioinformatics. What we have – and don’t have... a computer connected to the Internet (incl. Web browser) a text editor (Notepad.
Topic 2 Adam Godzik. JCSG approach: no model archives, building models “on the fly”
Protein Structure Prediction II
Protein and Function Databases
Today’s menu: -UniProt - SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7.
Protein Structures.
Bioinformatics for biomedicine More annotation, Gene Ontology and pathways Lecture 6, Per Kraulis
Protein Sequence Analysis - Overview Raja Mazumder Senior Protein Scientist, PIR Assistant Professor, Department of Biochemistry and Molecular Biology.
Predicting Function (& location & post-tln modifications) from Protein Sequences June 15, 2015.
Bioinformatics for biomedicine Protein domains and 3D structure Lecture 4, Per Kraulis
Protein Tertiary Structure Prediction
Automatic methods for functional annotation of sequences Petri Törönen.
Erice 2008 Introduction to PDB Workshop From Molecules to Medicine: Integrating Crystallography in Drug Discovery Erice, 29 May - 8 June Peter Rose
Protein domains. Protein domains are structural units (average 160 aa) that share: Function Folding Evolution Proteins normally are multidomain (average.
Bioinformatics for biomedicine
1 Orthology and paralogy A practical approach Searching the primaries Searching the secondaries Significance of database matches DB Web addresses Software.
Multiple Alignment and Phylogenetic Trees Csc 487/687 Computing for Bioinformatics.
Module 3 Sequence and Protein Analysis (Using web-based tools) Working with Pathogen Genomes - Uruguay 2008.
Functional Annotation of Proteins via the CAFA Challenge Lee Tien Duncan Renfrow-Symon Shilpa Nadimpalli Mengfei Cao COMP150PBT | Fall 2010.
Biological Databases Biology outside the lab. Why do we need Bioinfomatics? Over the past few decades, major advances in the field of molecular biology,
BIOINFORMATIK I UEBUNG 2 mRNA processing.
Biological Networks. Can a biologist fix a radio? Lazebnik, Cancer Cell, 2002.
Construction of Substitution Matrices
Bioinformatics for Human Biologists Rasmus Wernersson, Associate Professor Center for Biological Sequence Analysis, DTU [ -
Mining Biological Data. Protein Enzymatic ProteinsTransport ProteinsRegulatory Proteins Storage ProteinsHormonal ProteinsReceptor Proteins.
Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:
Other biological databases and ontologies. Biological systems Taxonomic data Literature Protein folding and 3D structure Small molecules Pathways and.
Protein Sequence Analysis - Overview - NIH Proteomics Workshop 2007 Raja Mazumder Scientific Coordinator, PIR Research Assistant Professor, Department.
Motif discovery and Protein Databases Tutorial 5.
Orthology & Paralogy Alignment & Assembly Alastair Kerr Ph.D. [many slides borrowed from various sources]
P HYLO P AT : AN UPDATED VERSION OF THE PHYLOGENETIC PATTERN DATABASE CONTAINS GENE NEIGHBORHOOD Presenter: Reihaneh Rabbany Presented in Bioinformatics.
Genome annotation and search for homologs. Genome of the week Discuss the diversity and features of selected microbial genomes. Link to the paper describing.
Bioinformatics and Computational Biology
Orthology & Paralogy Alignment & Assembly Alastair Kerr Ph.D. WTCCB Bioinformatics Core [many slides borrowed from various sources]
Alignment & Secondary Structure You have learned about: Data & databases Tools Amino Acids Protein Structure Today we will discuss: Aligning sequences.
Mini course to understand the biological pathway Yong Jiang(Frank)
March 28, 2002 NIH Proteomics Workshop Bethesda, MD Lai-Su Yeh, Ph.D. Protein Scientist, National Biomedical Research Foundation Demo: Protein Information.
Exercises Pairwise alignment Homology search (BLAST) Multiple alignment (CLUSTAL W) Iterative Profile Search: Profile Search –Pfam –Prosite –PSI-BLAST.
David Wishart February 18th, 2004 Lecture 3 BLAST (c) 2004 CGDN.
Copyright OpenHelix. No use or reproduction without express written consent1.
Tools in Bioinformatics Genome Browsers. Retrieving genomic information Previous lesson(s): annotation-based perspective of search/data Today: genomic-based.
V diagonal lines give equivalent residues ILS TRIVHVNSILPSTN V I L S T R I V I L P E F S T Sequence A Sequence B Dot Plots, Path Matrices, Score Matrices.
HANDS-ON ConSurf! Web-Server: The ConSurf webserver.
The Biologist’s Wishlist A complete and accurate set of all genes and their genomic positions A set of all the transcripts produced by each gene The location.
Bioinformatics Computing 1 CMP 807 – Day 4 Kevin Galens.
Basics of Comparative Genomics
Neurons and other cells express a conserved death program
How to use a bioinformatics website!
Predicting Active Site Residue Annotations in the Pfam Database
Genome organization and Bioinformatics
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Dr Tan Tin Wee Director Bioinformatics Centre
Identify D. melanogaster ortholog
Protein Structures.
Molecular Modeling By Rashmi Shrivastava Lecturer
Basics of Comparative Genomics
Problems from last section
Presentation transcript:

Bioinformatics for biomedicine Seminar: Sequence analysis of a favourite gene Lecture 5, Per Kraulis

Course design 1.What is bioinformatics? Basic databases and tools 2.Sequence searches: BLAST, FASTA 3.Multiple alignments, phylogenetic trees 4.Protein domains and 3D structure 5.Seminar: Sequence analysis of a favourite gene 6.Gene expression data, methods of analysis 7.Gene and protein annotation, Gene Ontology, pathways 8.Seminar: Further analysis of a favourite gene

From the previous lecture 3D structure –Relation to function –Methods of determination –Databases Domains –Definition –Relation to function –Databases

3D structure and sequence, 1 Sequence determines structure Anfinsen’s experiments: folding 3D structure prediction should work! –First principles (physics) Extremely difficult Only special cases –By similarity Works reasonably Depends on many factors

3D structure and sequence, 2 Similar sequence -> similar structure –General statement –No “real” counterexamples But a designed, extreme case exists A single 3D structure represents a family –Depending on sequence similarity Many sequences “allowed” for a structure

3D structure and sequence, 3 Many sequences “allowed” for a structure –Sequence divergence possible over time Structure more conserved that sequence –Sequence may diverge beyond recognition –Structure may still be similar –Apparently unrelated sequences may form similar structures!

3D structure and sequence, 5 Protein design –Target: A given structure –Specify a sequence that folds into it Some success in recent years –First principles approach Computational prediction –Evolutionary approach In vitro evolution experiments

3D structure and sequence, 4 Example: TIM barrel –TIM = Triose isomerase –Example in PDB: 1TRI –Symmetrical, 8 alpha-beta units –Enzymes Wide range of reactions, substrates Some TIM barrels have no significant sequence similarity –Sequence divergence? –Structure convergence?

Structure database PDB database –At RCSB: –Via UniProt: Many different 3D viewers –KiNG (uses Java) –RasMol (requires installation)

Unknown sequence? Given unknown sequence, what to do? –Set up a check list Never optimal Note: Unrealistic exercise –“Premier tool of analytic chemistry: the phone” –Consider known facts before analysis –Consider the underlying problem

Check list, outline Where to search first? –UniProt –Pfam –Ensembl Function? –Annotation –Similarities –Domains, structure

Unk1 File “unk1-seq.txt” Human protein No other info

Unk1 = Caspase 8 Protease –Cysteine peptidase C14 –Cascade, activation of others Hormone-triggered –TNF (tumor necrosis factor) Multicellular organisms –Apoptosis (programmed cell death) –Development, cancer

Unk2 File “unk2-seq.txt” From C. elegans –Result of genetic screen Orthologue in human?

Unk2 = nhr-25 Nuclear receptor (NR) family –Transcription factor –DNA-binding domain –Hormone-receptor domain Orphan NR –Unknown ligand; maybe none Multicellular organisms –Developmental processes Moulting in insects

Unk3 File “unk3-seq.txt” Mouse protein No other info

Unk3 = GPR141 GPCR –G-protein coupled receptor –7TM, seven transmembrane helices Orphan GPCR –Unknown ligand Human ortholog, Ensembl search – w/BLA_4CBTgZfhKhttp:// w/BLA_4CBTgZfhK

Task for this week Perform analysis of sequences – iomed/ iomed/ –unk4-seq.txt (Danio rerio, zebrafish) –unk5-seq.txt (human, fragment) Organism? Protein family? Structure and function? Human medical relevance?