Genomics and Personalized Care in Health Systems Lecture 5 Genome Browser Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management

Genome Browser
A genome browser is a computer program that displays gene maps, lets you browse chromosomes, and aligns genes or gene models with ESTs, contigs, etc.
Big Three:
– UCSC Genome Browser
– NCBI Map Viewer
– Ensembl

UCSC Genome Browser:

NCBI Map Viewer

Ensembl

The UCSC Genome Browser
Slides adapted from OpenHelix training materials

UCSC Genome Browser

Genome Browser Gateway
Use this Gateway to search by:
– Gene names, symbols, IDs
– Chromosome number: chr7, or region: chr11:
– Keywords: kinase, receptor
See the lower part of the page for help with format
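
The position search above accepts either a whole chromosome or a region on it. As a minimal sketch of how such a string could be split into its parts, assuming the common "chrN:start-end" layout with optional thousands separators (the function name, the pattern, and the example coordinates are illustrative and not taken from the slides):

```python
import re

# Assumed pattern: "chrN" alone, or "chrN:start-end" with optional commas in the numbers.
_POSITION = re.compile(r"^(chr[0-9A-Za-z_]+)(?::([\d,]+)-([\d,]+))?$")


def parse_position(text):
    """Return (chrom, start, end); start and end are None for a whole chromosome."""
    match = _POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome or region: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return chrom, None, None
    start_i = int(start.replace(",", ""))
    end_i = int(end.replace(",", ""))
    if start_i > end_i:
        raise ValueError("start must not exceed end")
    return chrom, start_i, end_i


# Hypothetical inputs, chosen only to exercise both forms.
print(parse_position("chr7"))
print(parse_position("chr11:1,000-2,000"))
```

Gene names and keywords would not match this pattern; the Gateway resolves those through its own search, which this sketch does not attempt to model.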

Genome Browser Gateway

The Genome Browser Gateway
Make your Gateway choices:
1. Select Clade
2. Select genome = species: search 1 species at a time
3. Assembly: the official backbone DNA sequence
4. Position: location in the genome to examine
5. Image width: how many pixels in display window; 5000 max
6. Configure: make fonts bigger + other choices

The Genome Browser Gateway
Sample search: human, March 2006 assembly, tp53
Select from the results list
An ID search may go right to a viewer page, if unique

Sample Genome Viewer Image, TP53 Region
Tracks labeled in the image: base position, UCSC genes, RefSeq genes, mRNAs & ESTs, repeats, many species compared, SNPs, single species compared, MGC clones

Visual Cues on the Genome Browser
Track colors may have meaning. For example, in the UCSC Genes track:
– Corresponding PDB entry = black
– Corresponding reviewed/validated sequence = dark blue
– Non-RefSeq sequence = lightest blue
Tick marks: a single location (STS, SNP)
For some tracks (e.g., the conservation track), a taller bar indicates an increased likelihood of an evolutionary relationship
Introns are drawn with arrowheads (<<< or >>>) that show the direction of transcription; the figure also labels an exon, the 5' UTR, and the 3' UTR
Alignment indications (Conservation pairs: “chain” or “net” style): Alignments = boxes, Gaps = lines

Options for Changing Images: Upper Section
Change your view or location with the controls at the top:
– Walk left or right
– Zoom in, zoom out
– Use “base” to get right down to the nucleotides
– Specify a position
– Click to zoom 3x and re-center
– Configure: change font, window size, more… (next item, next exon navigation assistance can be turned on)

Annotation Track Display Options
Some data is ON or OFF by default
Menu links lead to info about the tracks (content, methods) and/or filters
Change the track view with the pulldown menus
After making changes, click REFRESH to enforce the change

Annotation Track Options Defined
– Hide: removes a track from view
– Dense: all items collapsed into a single line
– Squish: each item on a separate line, but at 50% height
– Pack: each item separate, but efficiently stacked (full height)
– Full: each item on a separate line
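
To make the difference between the display modes concrete, here is a small sketch that lays out items as (start, end) intervals. It is not the browser's actual layout code: `pack_rows` uses a simple greedy first-fit rule as a stand-in for "efficiently stacked", and `dense_row` merges overlapping items onto one line. Both function names and the half-open coordinate convention are assumptions made for this illustration.

```python
def pack_rows(items):
    """Assign half-open (start, end) intervals to rows so no two items in a row overlap.

    Greedy first fit: items are visited in order of start position and each one
    goes into the first row whose last item ends at or before the item's start.
    """
    row_ends = []  # end coordinate of the last item placed in each row
    rows = []      # rows[i] is the list of intervals drawn on row i
    for start, end in sorted(items):
        for i, last_end in enumerate(row_ends):
            if last_end <= start:
                rows[i].append((start, end))
                row_ends[i] = end
                break
        else:
            # No existing row has room, so open a new one.
            rows.append([(start, end)])
            row_ends.append(end)
    return rows


def dense_row(items):
    """Collapse all items into one line by merging intervals that touch or overlap."""
    merged = []
    for start, end in sorted(items):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical items on a track.
items = [(0, 10), (5, 15), (10, 20), (30, 40)]
print(pack_rows(items))  # two rows are enough for these four items
print(dense_row(items))  # one line with two merged blocks
```

In these terms, "full" corresponds to one row per item, and "squish" uses separate lines like "full" but draws them at half height.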

Mid-page Options to Change Settings
You control the views:
– Use the pulldown menus
– Configure options page
– Reset, back to defaults
– Start from scratch
– Enforce any changes (hide, full, squish…)
– Flip the display to genomic 3’ to 5’

Cookies and Sessions
Your browser remembers where you were (cookies)
To clear your “cart” or parameters, click default tracks or reset
OR save your setup as “sessions” and store/share them

Click Any Viewer Object for Details
Example: click your mouse anywhere on the TP53 line
A new description web page opens
Many details and links to more data about TP53

Get DNA, with Extended Case/Color Options
Use the DNA link at the top
Plain or Extended options
Change colors, fonts, etc.
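
One way to picture a case option is as a marker for which bases fall inside annotated items. The sketch below upper-cases exon bases and lower-cases everything else; the function name, the 0-based end-exclusive coordinates, and the example sequence are assumptions for illustration and do not reproduce the browser's own output format.

```python
def mark_exons(sequence, exons):
    """Return the sequence with exon bases in upper case and all other bases in lower case.

    exons: list of (start, end) pairs, 0-based and end-exclusive, relative to the sequence.
    """
    chars = list(sequence.lower())
    for start, end in exons:
        # Clamp to the sequence so out-of-range coordinates are ignored.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].upper()
    return "".join(chars)


# Hypothetical 10-base region with one exon covering positions 2 to 4.
print(mark_exons("ACGTACGTAC", [(2, 5)]))  # acGTAcgtac
```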

Base Level and Protein Sequences

BLASTX Search to Confirm the Protein

Get Sequence from Details Pages
Click a track item, then go to the Sequence section of the details page

Accessing the BLAT Tool
BLAT = BLAST-like Alignment Tool
– Rapid searches by INDEXING the entire genome
– Works best with high similarity matches

BLAT
BLAT on DNA is designed to quickly find sequences of 95% and greater similarity of length 25 bases or more. It may miss more divergent or shorter sequence alignments. It will find perfect sequence matches of 25 bases, and sometimes find them down to 20 bases.
BLAT on proteins finds sequences of 80% and greater similarity of length 20 amino acids or more.
In practice, DNA BLAT works well on primates, and protein BLAT on land vertebrates.
BLAT works by keeping an index of the entire genome in memory. The index consists of all non-overlapping 11-mers except for those heavily involved in repeats.
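
The indexing idea described above can be sketched in a few lines. This is a toy illustration only, not BLAT's implementation: it keeps the non-overlapping k-mers of the target in a dictionary, looks up every overlapping k-mer of the query, and reports the seed hits. It omits the repeat filtering mentioned above as well as any extension or scoring of the hits, and the function names and test sequences are made up for the example.

```python
from collections import defaultdict


def build_index(genome, k=11):
    """Map each non-overlapping k-mer of the genome to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_seeds(query, index, k=11):
    """Look up every overlapping k-mer of the query; return (query_pos, genome_pos) hits."""
    hits = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], ()):
            hits.append((q, g))
    return hits


# Tiny hypothetical example with k=3 so the result is easy to follow by hand.
genome = "ACGTTGCAAGCT"
index = build_index(genome, k=3)  # indexed 3-mers: ACG, TTG, CAA, GCT
print(find_seeds("GTTGCA", index, k=3))  # [(1, 3)]
```

Indexing only non-overlapping k-mers keeps the index roughly k times smaller than indexing every position, while scanning all overlapping k-mers of the query means any exact match of length 2k-1 or more still contains at least one indexed k-mer.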

BLAT Search
Make choices
Limits shown on the form: DNA limit bases, Protein limit aa, 25 total sequences
Paste one or more sequences, or upload
Submit

BLAT Results with Hyperlinks
Results with demo sequences, default settings; sort = Query, Score
– Score is a count of matches: higher number, better match
Click browser to go to the Genome Browser image location (next slide)
Click details to see the alignment to genomic sequence (2nd slide)

BLAT Results: Browser
After the browser click in the BLAT results, a new line with Your Sequence from BLAT Search appears!
Base position = “full” menu and zoomed in enough to see amino acids in 3-frame translation
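
The three-frame translation shown at base level can be reproduced by hand with the standard genetic code. The sketch below builds the codon table from the usual TCAG ordering and translates a forward-strand sequence in each of its three reading frames; the function names and the example sequence are illustrative, and unknown codons (for example those containing N) are shown as X by assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons from the start of dna; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # X for codons not in the table
        for i in range(0, len(dna) - 2, 3)
    )


def three_frames(dna):
    """Return the translations of the three forward reading frames."""
    return [translate(dna[offset:]) for offset in range(3)]


# Hypothetical 9-base sequence: start codon, alanine codon, stop codon.
print(three_frames("ATGGCCTAA"))  # ['MA*', 'WP', 'GL']
```

For a gene on the reverse strand, the same function would be applied to the reverse complement of the displayed sequence.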

BLAT Results, Alignment Details
– Your query
– Genomic match, with color cues
– Side-by-side alignment (yours, genomic)

Summary
UCSC Genome Browser:
– Visual cues and genomic context
– Many ways to alter your views
– Access to deeper data
– Access and use sequence data

UCSC Table Browser

The Table Browser Open browser

Table Browser

Many Other Databases Use UCSC Genome Browser Mirror and Software
– Malaria
– Arabidopsis
– Archaea
– GSID HIV Browser
– GEP Drosophila Genome Browser
– …

GEP Drosophila Genome Browser
UCSC Genome Browser, GEP version: parts of genomes and GEP data, used for annotation of Drosophila species
Figure: male Drosophila melanogaster

Drosophila melanogaster Chromosomes

Fruit Flies and Human Disease Research
About 75% of known human disease genes have a recognizable match in the genetic code of fruit flies, and 50% of fly protein sequences have mammalian homologues. An online database called Homophila is available to search for human disease gene homologues in flies and vice versa.
Drosophila is being used as a genetic model for several human diseases, including the neurodegenerative disorders Parkinson's, Huntington's, spinocerebellar ataxia, and Alzheimer's disease. The fly is also being used to study mechanisms underlying aging and oxidative stress, immunity, diabetes, and cancer, as well as drug abuse.

Homework 4
Read through the BLAST tutorial (IntroToBLAST.zip, A simple Introduction to NCBI BLAST) and follow the instructions to reproduce the results described in the tutorial. List the steps you have taken and indicate whether you find any differences from the results mentioned in the tutorial.
Using the sequence of the BRCA1 gene, run a BLAT search against the human genome (the most recent assembly, GRCh37), select the best sequence alignment result, and view the output in the genome browser. Provide a screenshot of the resulting page, which should include at least the gene, its homologous genes, other RefSeq genes, mRNA, the gene in other species, SNPs, and repeats.
– Obtain the mRNA-Genomic Alignments record from the browser
– Obtain the predicted protein sequence from the browser
– Obtain the precise location of one SNP record in the genome sequence
– Zoom in to the base level and determine the protein sequence corresponding to one well-conserved exon; get the DNA sequence of the exon and run a blastx search (do not apply low-complexity filtering) to confirm that the protein sequence you obtained is correct