Marie-Adèle Rajandream, The Pathogen Sequencing Unit, The Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom

The Sanger Institute
- Principally funded by the Wellcome Trust (about 96%)
- 60,000,000 bases of raw data per day
- 600 employees
- Sequencing of human, mouse, zebrafish and pathogen genomes
- Manual and automatic genome annotation (Ensembl, Artemis)
- Identification of cancer-causing mutations (recently the BRAF gene mutation)
- Sequence variation and disease association

The Pathogen Sequencing Unit

Sequencing
- Small genomes (bacterial and model organisms)
-  projects
- Current capacity: 4 M reads per year, sufficient for 100 Mb of finished sequence
- Mainly whole-genome/chromosome shotguns, including finishing
- Many are international collaborations
- Larger, more complex genomes ( Mb) on the horizon

Informatics
- Automatic analysis
- Manual annotation by expert biologists
- Tools: finishing (Cyclops), annotation (Artemis), comparative analysis (ACT)
- Data dissemination
- Database resources

Functional Genomics
- S. pombe
- Bacterial genomes
- D. discoideum
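The capacity figure quoted for the Pathogen Sequencing Unit (4 M reads per year for 100 Mb of finished sequence) can be turned into a rough reads-per-megabase and coverage estimate. The sketch below is only illustrative arithmetic: the two slide figures are taken as given, while the average read length of 500 bp is an assumption that does not appear in the talk.

```python
# Figures quoted on the slide.
READS_PER_YEAR = 4_000_000
FINISHED_BASES_PER_YEAR = 100_000_000  # 100 Mb

# ASSUMPTION: average usable read length in base pairs (not stated in the talk).
ASSUMED_READ_LENGTH_BP = 500


def reads_per_finished_mb(reads, finished_bases):
    """Number of reads spent per megabase of finished sequence."""
    return reads / (finished_bases / 1_000_000)


def implied_raw_coverage(reads, read_length_bp, finished_bases):
    """Total raw bases sequenced divided by finished bases."""
    return reads * read_length_bp / finished_bases


if __name__ == "__main__":
    per_mb = reads_per_finished_mb(READS_PER_YEAR, FINISHED_BASES_PER_YEAR)
    cover = implied_raw_coverage(
        READS_PER_YEAR, ASSUMED_READ_LENGTH_BP, FINISHED_BASES_PER_YEAR
    )
    print(f"Reads per finished Mb: {per_mb:,.0f}")
    print(f"Implied raw coverage at the assumed read length: {cover:.1f}x")
```

The coverage value scales linearly with the assumed read length, so it should be read as an order-of-magnitude figure that also absorbs failed reads and finishing work.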

GeneDB

[Diagram labels: GeneDB, project pages, annotation, sequences, analysis, FTP site, BLAST, curation]

What is GeneDB?
- A generic organism database
- Annotated sequences as well as functional data
- Visualisation in a user-friendly environment
- Annotation and analysis of data by biologists
- Flexible enough to incorporate new data types
- Linked to external databases
- Fully curated

The GeneDB project
- Started in 2001
- Funded by the Wellcome Trust for a period of 5 years
- Initially for 3 organisms: S. pombe, Leishmania & Trypanosome
- 2 full-time programmers, 1 part-time programmer
- One curator for each organism
- One helpdesk person / programmer
- Prototype now done and in use

Technical Outline: Prototype
[Diagram labels, grouped by heading]
- “Java”: biojava, data, gui, minelet, mining, test, utils, web
- Web: jsp, cgi, blast, ominblast, asp, common, cerevisiae, pombe, malaria, leish, tryp
- Data: asp, images, serialise, indices, cerevisiae, images, serialise, indices, pombe, malaria, tryp, leish
- EMBL

Broad specifications for production version
- Relational database
- Curator / annotator interface incorporating functionality of Artemis (MESS)
- Facility for doing more complex queries
- For comprehensive, detailed specs see our Functional Specifications document
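To make the "relational database" and "more complex queries" items concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the GeneDB schema: the table names, column names and rows are all hypothetical placeholders, chosen only to show how a cross-table question (which genes of one organism carry an annotation with a given evidence code) becomes a single join.

```python
import sqlite3

# HYPOTHETICAL schema for illustration only; not taken from GeneDB.
SCHEMA = """
CREATE TABLE organism (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE gene (
    id            INTEGER PRIMARY KEY,
    organism_id   INTEGER NOT NULL REFERENCES organism(id),
    systematic_id TEXT NOT NULL UNIQUE
);
CREATE TABLE annotation (
    id       INTEGER PRIMARY KEY,
    gene_id  INTEGER NOT NULL REFERENCES gene(id),
    term     TEXT NOT NULL,
    evidence TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO organism (id, name) VALUES (?, ?)",
        [(1, "pombe"), (2, "leish")],
    )
    conn.executemany(
        "INSERT INTO gene (id, organism_id, systematic_id) VALUES (?, ?, ?)",
        [(1, 1, "demo_gene_1"), (2, 1, "demo_gene_2"), (3, 2, "demo_gene_3")],
    )
    conn.executemany(
        "INSERT INTO annotation (gene_id, term, evidence) VALUES (?, ?, ?)",
        [(1, "demo term X", "ISS"), (2, "demo term Y", "IMP"), (3, "demo term X", "ISS")],
    )
    conn.commit()
    return conn


def genes_with_evidence(conn, organism, evidence):
    """Return systematic IDs of genes in `organism` annotated with `evidence`."""
    rows = conn.execute(
        """
        SELECT DISTINCT g.systematic_id
        FROM gene AS g
        JOIN organism AS o ON o.id = g.organism_id
        JOIN annotation AS a ON a.gene_id = g.id
        WHERE o.name = ? AND a.evidence = ?
        ORDER BY g.systematic_id
        """,
        (organism, evidence),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(genes_with_evidence(demo, "pombe", "ISS"))
```

A query of this kind is awkward to express over per-organism flat files, which is one plausible reading of why a relational store appears in the production specification.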

P. falciparum chr. 14

“biotin carboxylase” Inferred by Sequence Similarity with a yeast sequence SGD:S (which was originally annotated based on a published mutant phenotype)
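The annotation above is a chain: the P. falciparum product name rests on similarity to a yeast sequence, whose own annotation rests on a published mutant phenotype. A minimal sketch of how such a chain could be recorded and traced follows. The record layout, the function name and the gene keys are hypothetical; the evidence codes ISS and IMP are standard Gene Ontology codes, and mapping the slide's wording onto them is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Annotation:
    gene: str                        # HYPOTHETICAL key, not a real database ID
    evidence: str                    # GO evidence code, e.g. "ISS" or "IMP"
    with_gene: Optional[str] = None  # gene the similarity inference was made with


def trace_to_source(gene: str, annotations: Dict[str, Annotation]) -> List[Annotation]:
    """Follow ISS 'with' links until reaching evidence not based on similarity."""
    chain: List[Annotation] = []
    seen = set()
    current: Optional[str] = gene
    while current is not None and current not in seen:
        seen.add(current)  # guards against circular references
        record = annotations.get(current)
        if record is None:
            break  # the chain points at a gene we hold no record for
        chain.append(record)
        current = record.with_gene if record.evidence == "ISS" else None
    return chain


# Placeholder records mirroring the example on the slide.
ANNOTATIONS = {
    "pf_chr14_gene": Annotation("pf_chr14_gene", "ISS", with_gene="yeast_gene"),
    "yeast_gene": Annotation("yeast_gene", "IMP"),
}

if __name__ == "__main__":
    for step in trace_to_source("pf_chr14_gene", ANNOTATIONS):
        print(step.gene, step.evidence)
```

Keeping the "with" link alongside the evidence code is what lets a curator see that a similarity-based name ultimately depends on one experimental observation in another organism.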

Pathogen Sequencing Unit

Analysis: Martin Aslett, Steven Bentley, Matthew Berriman, Ana Cerdeno, Christiane Hertz-Fowler, Matthew Holden, Keith James, Rachel Lyne, Arnab Pain, Chris Peacock, Mohammed Sebaihia, Nick Thomson, Valerie Wood

Project Management: Bart Barrell, Julian Parkhill, Marie-Adele Rajandream, Al Ivens, Neil Hall

Programming: Rob Davies, David Harper, Arnaud Kerhornou, Paul Mooney, Kim Rutherford, Adrian Tivey, Ed Zuiderwijk, Karen Mungall, Theresa Feltwell, Ian Goodhead, Zahra Hance, Heidi Hauser, Mandy Sanders, Mark Simmonds, Danielle Walker, Barbara Harris, Becky Atkin, Andrew Barron, Carol Chillingworth, Louise Clarke, Craig Corton, Jonathan Doggett, Nicola Lennard, Alexandra Line, Doug Ormand, David Harris, Matthew Collins, Nigel Fosker, Arlette Goble, Lee Murphy, Susan O’Neil, Simon Rutter, David Saunders, Kathy Seeger, Robert Squares, Steven Squares, Carol Churcher, Karen Brooks, Inna Cherevach, Tracey Chillingworth, Kay Clarke, Paul Davies, Nancy Hamlin, Kay Jagels, Sharon Moule, Brian White, Sally Whitehead

Subcloning: Ann Cronin, Audrey Fraser, David Johnson, Mike Quail, Claire Price, Ester Rabbinowitsch, Sarah Sharp

Mapping: Maria Fookes, John Woodward

Sequencing

Wellcome Trust Sanger Institute

Administration: Yvonne Shaw