Genome Browser The Plot Deepak Purushotham Hamid Reza Hassanzadeh Haozheng Tian Juliette Zerick Lavanya Rishishwar Piyush Ranjan Lu Wang.

Presentation transcript:

Genome Browser The Plot Deepak Purushotham Hamid Reza Hassanzadeh Haozheng Tian Juliette Zerick Lavanya Rishishwar Piyush Ranjan Lu Wang

The Outline The Need & The Requirement The Options The Chosen One The New Age

THE NEED Why one should develop a Genome Browser

Why A Genome Browser? I want to analyze this organism

Why A Genome Browser? I want to analyze this organism Gene Functions Protein Domains Metabolic Pathways Comparative Analysis Synteny

THE REQUIREMENT What is expected out of a Genome Browser

A Genome Browser? I want something manageable

A Genome Browser!

The Genome Browser “Genome browsers facilitate genomic analysis by presenting alignment, experimental and annotation data in the context of genomic DNA sequences.” (Melissa S Cline & James W Kent, 2009) Genome browsers aggregate data. Taken from Andy Conley’s slides without permission.

THE OPTIONS A Short Survey of the Available Genome Browser Modules

A Brief Time Travel FlyBase, SGD, MGD, and WormBase Setting up an MOD is expensive and time-consuming. The four MODs agreed in the fall of 2000 to pool their resources and to make reusable components available to the community free of charge under an open source license. The goal of this NIH-funded project, christened GMOD, is “…to generate a model organism database construction set that would allow a new model organism to be assembled by mixing and matching various components.”

GMOD

Who uses GMOD?

GMOD Components

Visualization - GBrowse

Visualization

JBrowse

GBrowse Synteny

CMAP

DATA MANAGEMENT

Chado

Tripal

TableEdit

BioMart

InterMine

ANNOTATION

MAKER

DIYA

Galaxy

Ergatis

Apollo

REALLY EXCITING OPTION!

JBrowse Smooth, fast navigation (think Google Maps for genomes)

JBrowse
– Smooth, fast navigation (think Google Maps for genomes)
– Supports BED, GFF, Bio::DB::*, Chado, WIG, BAM, UCSC (intron/exon structure, name lookups, quantitative plots)
– Relies on pre-indexing to minimize security exposure and runtime bandwidth/CPU load on the server (future versions more likely to do some server work at runtime)
– Has an API for customized track/glyph extensions
– Is stably funded by NHGRI, with many interesting innovations implemented & pending integration

Smoother UI

Most Genome browsers

How is JBrowse different?

First look: Live Demo A couple of JBrowses around the web

Types of Tracks

Pros Fast and smooth! User Friendly Works nicely on an iPad/iPhone too

Cons No user-uploaded data support Slow for big numbers of reference seqs (e.g. 5,000 annotated contigs) Few glyph options, feature tracks are limited by the facts of

What to pick?

? Tried and tested Fancy concept

THE CHOSEN ONE GBrowse and its Features

GBrowse Most popular web-based genome browser Visualize genome features along a reference sequence Open Source Highly customizable Excellent usability Rich set of “glyphs” – Genome features – Quantitative Data – Sequence Alignments

GBrowse Header Main Browser Window Track Menu

Under The Hood Client-Server Architecture GBrowse Architecture Installation Issues Input Data Configuration File Customization

Client Server Architecture 1. The user types in the URL: browser2012.biology.gatech.edu

Client Server Architecture 2. Browser interprets and sends the request to HTTP Server

Client Server Architecture 3. Web Server receives the request and “serves” the client, i.e., starts GBrowse

Client Server Architecture 4. In case of success, the relevant hypertext and multimedia are generated by accessing the database

Client Server Architecture 5. The output traverses the same path back

Client Server Architecture 6. The whole process repeats each time the user interacts with the browser
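The request/response cycle in steps 1 to 6 can be sketched with Python's standard library. This is only an illustration of the client-server pattern, not how GBrowse itself is implemented: `BrowserHandler`, `fetch_once`, and the `/gbrowse/demo` path are hypothetical names, and the "page" is a one-line HTML string rather than rendered tracks.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class BrowserHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the web server plus the browser application behind it."""

    def do_GET(self):
        # Steps 3-4: the server receives the request and generates hypertext for it.
        body = f"<html><body>Requested: {self.path}</body></html>".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        # Step 5: the output travels back to the client.
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once(path):
    """Start a throwaway local server, send one GET request, return (status, html)."""
    server = HTTPServer(("127.0.0.1", 0), BrowserHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        # Steps 1-2: the client turns the address into an HTTP request and sends it.
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


status, html = fetch_once("/gbrowse/demo")
print(status, html)
```

Step 6 corresponds to calling `fetch_once` again with a different path: every click or scroll in the browser triggers a new round trip of the same kind.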

How you see what you see Juxtaposed Images

How are so many images generated?

How you see what you see + Hyper Text files

How you see what you see Multimedia files + Hyper Text

©2002 by Cold Spring Harbor Laboratory Press Stein L D et al. Genome Res. 2002;12: GBrowse Architecture

The Bio::DB::SeqFeature database Schema

[Schema diagram: tables Feature, Name, Attribute, Attribute List, Type List, Location List and Parent2Child, linked by 1-to-n and n-to-n relationships]

Data file (.gff3) columns:
– Reference Sequence (Chr/Clone/Contig)
– Source, e.g. Prodigal/Glimmer
– Type (Sequence Ontology (SO) terms)
– Start
– End
– Score, e.g. E-value
– Strand
– Phase (0/1/2)
– Attributes, format: tag=value
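A minimal sketch of how one feature line with these nine columns could be read in Python. It assumes tab-separated fields with "." standing for a missing score or phase, as in GFF3; `GffRecord`, `parse_gff3_line`, and the example feature are made up for illustration.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    seqid: str              # reference sequence (chromosome/clone/contig)
    source: str             # e.g. Prodigal or Glimmer
    type: str               # Sequence Ontology term
    start: int
    end: int
    score: Optional[float]  # e.g. an E-value; None when the column is "."
    strand: str
    phase: Optional[int]    # 0, 1 or 2; None when the column is "."
    attributes: str         # raw tag=value string, left unparsed here


def parse_gff3_line(line: str) -> GffRecord:
    """Split one tab-separated feature line into its nine columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    seqid, source, ftype, start, end, score, strand, phase, attributes = fields
    return GffRecord(
        seqid, source, ftype, int(start), int(end),
        None if score == "." else float(score),
        strand,
        None if phase == "." else int(phase),
        attributes,
    )


# Invented example feature, not taken from the course data.
record = parse_gff3_line("contig1\tProdigal\tCDS\t101\t400\t1e-20\t+\t0\tID=cds1;Name=geneA")
print(record.type, record.start, record.end, record.score)
```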

Attributes (Data file) Different tags have predefined meanings:
– ID: Gives the feature a unique identifier. Useful when grouping features together (such as all the exons in a transcript).
– Name: Display name for the feature. This is the name to be displayed to the user.
– Alias: A secondary name for the feature. It is suggested that this tag be used whenever a secondary identifier for the feature is needed, such as locus names and accession numbers.
– Note: A descriptive note to be attached to the feature. This will be displayed as the feature's description.
Alias and Note fields can have multiple values separated by commas. For example: Alias=M19211,gna-12,GAMMA-GLOBULIN
Other good stuff can go into the attributes field.
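The attributes column can be unpacked along these lines. The sketch assumes the GFF3 convention that tag=value pairs are separated by semicolons and that reserved characters are URL-escaped; `parse_attributes` and the example string (apart from the Alias values quoted above) are illustrative.

```python
from urllib.parse import unquote


def parse_attributes(column: str) -> dict:
    """Turn 'tag=value;tag=v1,v2' into {'tag': ['value'], 'tag': ['v1', 'v2']}."""
    attributes = {}
    for pair in column.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        tag, _, value = pair.partition("=")
        # Commas separate multiple values; each value may be URL-escaped.
        attributes[tag.strip()] = [unquote(v) for v in value.split(",")]
    return attributes


attrs = parse_attributes(
    "ID=gene01;Name=gna-12;Alias=M19211,gna-12,GAMMA-GLOBULIN;Note=example%20note"
)
print(attrs["Alias"])
```

Every tag maps to a list, so single-valued tags such as ID and multi-valued tags such as Alias can be handled the same way.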

GBrowse Configuration File Global Website Settings Additional HTML Pages JavaScript jQuery Global Database Settings Data Source Definitions

Customizations

Configuration file (.conf)

Making a new Track
### TRACK CONFIGURATION ###
[ExampleFeatures]
feature = remark
glyph = generic
stranded = 1
bgcolor = orange
height = 10
key = Example Features
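Because a track stanza is a bracketed name followed by option = value lines, its structure can be inspected with Python's `configparser`, for example to sanity-check a track before deploying it. This is a sketch under that assumption only: real GBrowse configuration files may contain constructs that `configparser` does not handle, and GBrowse reads the file with its own parser.

```python
import configparser

# The stanza from the slide, one option per line.
TRACK_STANZA = """\
### TRACK CONFIGURATION ###
[ExampleFeatures]
feature  = remark
glyph    = generic
stranded = 1
bgcolor  = orange
height   = 10
key      = Example Features
"""

parser = configparser.ConfigParser(interpolation=None)
parser.optionxform = str  # keep option names exactly as written
parser.read_string(TRACK_STANZA)

# Each [section] is one track; its options control how features are drawn.
track = dict(parser["ExampleFeatures"])
for option, value in track.items():
    print(f"{option:10s} {value}")
```

Here `feature` selects which feature type from the data file is drawn, `glyph` picks the drawing style, and `key` is the label shown to the user.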

Adding Multiple Tracks Data: Configuration: Result UI: Searchable Links Popup balloons with links

Searching for Features Gene symbols Gene IDs Sequence IDs Genetic markers Relative nucleotide coordinates Absolute nucleotide coordinates etc.

Viewing Multiple Tracks Low Magnification

Viewing Multiple Tracks High Magnification

In short… Main features (determination of protein-coding and non-coding genes, …) Quantitative data (E-value, identity percentage) Other evidence (InterPro, COGs, etc.) GC content and other useful measurements Protein and DNA sequences

THE NEW AGE Value-Added Additions

RICHER ANNOTATION What’s New

INCREASED ANNOTATION INFO Richer Annotation

INTEGRATED QUALITY SCORE Richer Annotation

Origin of Database Matches

Quality Value Integration

Quality Scores Origin of Database Matches

Different E-values shown with different shades of colors
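One way such shading could be computed is to map the E-value onto a log scale and darken the colour as the E-value shrinks. The function below is a hypothetical sketch: the blue palette, the 1e-100 cutoff, and the name `evalue_to_shade` are assumptions, not the scheme used in the browser shown on the slide.

```python
import math


def evalue_to_shade(evalue: float, min_exponent: float = -100.0) -> str:
    """Return a hex colour: light blue for weak hits, saturated blue for strong ones."""
    if evalue <= 0:
        strength = 1.0  # an E-value reported as 0 is treated as the strongest hit
    else:
        # log10(1) = 0 maps to strength 0; log10(1e-100) = -100 maps to strength 1.
        strength = min(1.0, max(0.0, math.log10(evalue) / min_exponent))
    level = round(230 * (1.0 - strength))  # red/green channel: 230 (light) down to 0
    return f"#{level:02x}{level:02x}ff"


for e in (1.0, 1e-50, 0.0):
    print(e, evalue_to_shade(e))
```

The resulting string could then be used wherever a track accepts a colour value.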

What’s New MORE LINK-OUTS

COGs KEGG ID

PATHWAYS What’s New

KEGG ID KEGG Genes KEGG Compound KEGG Pathway

ORGANISM SPECIFIC PAGES Synthesis!

Organism Summary Page At this point of the course, we have gathered a lot of information for the strains we are dealing with Not all of this information could be represented inside the genome browser We propose a separate section in the browser containing strain-wise summarized information

Organism Summary Page Conceptually, the page could contain:
– Biological information
– Assembly information: Genome size, number of contigs, N50, sequencing platform
– Gene prediction information: Number of protein-coding and non-protein-coding genes, links to 16S rRNA gene
– Annotation information: Percent annotation, function distribution pie chart
– Comparative information: Unique protein clusters, etc.
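The assembly figures on such a page are simple to derive from the contig lengths. As a sketch, N50 is the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the assembly size. The contig lengths below are invented for illustration.

```python
def n50(contig_lengths):
    """Length of the contig at which the cumulative sum reaches half the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length


contigs = [100, 80, 50, 30, 20, 20]  # invented contig lengths
summary = {
    "genome_size": sum(contigs),
    "num_contigs": len(contigs),
    "n50": n50(contigs),
}
print(summary)
```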

Organism Summary Page

OPERONS Adding more values

Operons Operon “…is a functioning unit of genomic DNA containing a cluster of genes under the control of a single regulatory signal or promoter” ~70% of the genes have been assigned a unique OperonID OperonID will provide an additional browsing mechanism for biologists, connecting co-transcribed and co-regulated genes.
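Browsing by OperonID amounts to grouping genes on a shared identifier. A minimal sketch, assuming each gene carries at most one OperonID (genes without one are skipped); the gene and operon names are made up, and `group_by_operon` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_operon(genes):
    """Map each OperonID to the list of gene IDs assigned to it."""
    operons = defaultdict(list)
    for gene_id, operon_id in genes:
        if operon_id is not None:  # some genes have no operon assignment
            operons[operon_id].append(gene_id)
    return dict(operons)


genes = [
    ("geneA", "operon_1"),
    ("geneB", "operon_1"),
    ("geneC", None),
    ("geneD", "operon_2"),
]
operons = group_by_operon(genes)
print(operons)
```

Looking up a gene's OperonID in this mapping gives the other genes that are expected to be co-transcribed with it.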

Operons

Incorporating Operon Information

BRIG PATTERN More with Comparison

BRIG Patterns Concept: Either generate BRIG images at run time or load static images when the user requests a BRIG pattern between two species

BRIG Patterns

That’s All Folks! Questions? Comments? Concerns? If you have any suggestions, we would love to hear from you! (There is a page on Wiki for it!)