ClueGene: An Online Search Engine for Querying Gene Regulation


ClueGene: An Online Search Engine for Querying Gene Regulation
David M. Ng
2008 January 16

System Overview
- Every operation generates a “working set” that can be modified and used as the query in the next search iteration
- Common structure for all search and test operations, with no dead ends

New Features
- Coexpression test
- Dataset ranking and heat map
- Heat map for expression data

Coexpression Test
- A coexpression search is performed using half of the working set, selected at random
- AUC is computed based on finding the held-out half of the working set
- The coexpression test score is the average of ten such searches
- The test score is displayed as a “thermometer”, in the context of representative pathways whose scores are computed the same way
- Precision-recall curves are also displayed
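The test procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions: the slides do not say which AUC is used, so the sketch assumes ROC AUC; the `search` callback, the function names, and the choice to drop the query genes from the ranking before scoring are all hypothetical and are not taken from the ClueGene code.

```python
import random
from typing import Callable, List, Optional, Set


def roc_auc(ranking: List[str], positives: Set[str]) -> float:
    """Fraction of (positive, negative) pairs in which the positive ranks higher."""
    positives_seen = 0
    pairs_won = 0
    for gene in ranking:  # best-scoring gene first
        if gene in positives:
            positives_seen += 1
        else:
            # Every positive seen so far outranks this negative.
            pairs_won += positives_seen
    negatives = len(ranking) - positives_seen
    if positives_seen == 0 or negatives == 0:
        raise ValueError("ranking needs at least one positive and one negative")
    return pairs_won / (positives_seen * negatives)


def coexpression_test_score(
    working_set: Set[str],
    search: Callable[[Set[str]], List[str]],  # hypothetical: query genes -> ranked genes
    n_trials: int = 10,
    rng: Optional[random.Random] = None,
) -> float:
    """Average AUC over n_trials random half/half splits of the working set."""
    rng = rng or random.Random()
    genes = sorted(working_set)
    scores = []
    for _ in range(n_trials):
        shuffled = genes[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        query, held_out = set(shuffled[:half]), set(shuffled[half:])
        # Assumption: query genes are excluded from the ranking before scoring.
        ranking = [g for g in search(query) if g not in query]
        scores.append(roc_auc(ranking, held_out))
    return sum(scores) / len(scores)
```

A working set whose held-out genes always come back at the top of the ranking scores 1.0, while a working set with no coexpression signal should score near 0.5 under this definition.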

Dataset Ranking and Heat Map
- Datasets are ranked by their contribution to the scores of the working set genes
- The ranking is displayed as a heat map
- Future work: allow the user to provide dataset feedback

Expression Data Heat Map
- Displays the expression data for a dataset
- For the following genes:
  - Result genes
  - Query genes
  - Contrast genes
  - Randomly selected non-query and non-result genes
    - Same number as the number of result genes

Expression Data Heat Map Script
- Generates a heat map as a Web page for specified query, result, and contrast genes for a given dataset
- Usage:
  - Invoke as a URL: http://sysbio.soe.ucsc.edu/cgi-bin/ClueGeneProd/cluegene_heatmap.pl
  - Specify parameters following a ?
  - Parameters are name-value pairs separated by ampersands

Expression Data Heat Map Script Parameters
- species=<species code>
- ds=<dataset name>
- transactionId=<transaction id>
- <result gene id>=resultGene
- <query gene id>=queryGene
- <contrast gene id>=contrastGene

Expression Data Heat Map Example
http://sysbio.soe.ucsc.edu/cgi-bin/ClueGeneProd/cluegene_heatmap.pl?ds=Segal03&species=sce&transactionId=1200474871417.4&YJR123W=resultGene&YLR340W=resultGene&YNL301C=resultGene&YJR123W=queryGene&YLR340W=queryGene&YBL072C=queryGene&YNL232W=contrastGene&YDL175C=contrastGene&YDL104C=contrastGene
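A URL of this shape can be assembled programmatically. The Python sketch below only builds the query string from the parameters listed on the slides; the function name `heatmap_url` and its argument names are hypothetical. A gene id can appear more than once as a parameter name (YJR123W is both a result gene and a query gene in the example), so the parameters are kept as a list of pairs rather than a dictionary.

```python
from urllib.parse import urlencode

BASE_URL = "http://sysbio.soe.ucsc.edu/cgi-bin/ClueGeneProd/cluegene_heatmap.pl"


def heatmap_url(species, dataset, transaction_id,
                result_genes=(), query_genes=(), contrast_genes=()):
    """Build a heat map URL; gene ids are parameter names, roles are values."""
    pairs = [
        ("ds", dataset),
        ("species", species),
        ("transactionId", transaction_id),
    ]
    # A list of pairs preserves repeated gene ids, which a dict would collapse.
    pairs += [(gene, "resultGene") for gene in result_genes]
    pairs += [(gene, "queryGene") for gene in query_genes]
    pairs += [(gene, "contrastGene") for gene in contrast_genes]
    return BASE_URL + "?" + urlencode(pairs)


url = heatmap_url(
    species="sce",
    dataset="Segal03",
    transaction_id="1200474871417.4",
    result_genes=["YJR123W", "YLR340W", "YNL301C"],
    query_genes=["YJR123W", "YLR340W", "YBL072C"],
    contrast_genes=["YNL232W", "YDL175C", "YDL104C"],
)
print(url)
```

With these arguments the function reproduces the example URL from the slide.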

Invoking ClueGene via URL
- ClueGene provides a GET interface

Future Work
- Dataset selection
- Reimplement
- Set-based user model

Reimplement ClueGene
- Current ClueGene
  - 10,000+ lines of Perl in 20 files
  - 800+ lines of HTML and JavaScript
  - Hard to maintain
  - Old CGI technology

Set-Based User Model
- Generalization of Greg’s Gene Sets and Gene Set Families
- Set members can be atomic or sets
- Set members have attributes
  - Intrinsic to the element
  - Dependent on the set under consideration
- Issue: combining duplicate attributes

Benefits of Set Model
- A single, consistent model for all aspects of gene search engines
- Easier understanding of inputs, operations, and results
- More straightforward user interface implementation
- More general manipulation of sets
  - Supports saving and loading of sets
  - Supports combining result sets via set operations such as intersection and union

ClueGene Sets
- Gene: atom
  - Attributes such as unique id, display name, aliases
- Cluster: set of genes
- Dataset: set of cluster sets
- Cluster compendium: set of dataset sets
- Query set: set of genes
- Expected set: set of genes
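The nesting described here (gene, cluster, dataset, cluster compendium) can be sketched with two small Python types. The class and field names are hypothetical, the cluster names are placeholders, and the gene ids are borrowed from the heat map example purely for illustration. The one idea taken from the slides is that a set member is either an atom or another set.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Gene:
    """Atom, carrying attributes intrinsic to the gene."""
    unique_id: str
    display_name: str = ""
    aliases: List[str] = field(default_factory=list)


@dataclass
class NamedSet:
    """A set whose members are atoms (genes) or other sets."""
    name: str
    members: List[Union[Gene, "NamedSet"]] = field(default_factory=list)


def all_gene_ids(s: NamedSet) -> List[str]:
    """Flatten any level of nesting into unique gene ids, in first-seen order."""
    seen: List[str] = []
    for member in s.members:
        ids = [member.unique_id] if isinstance(member, Gene) else all_gene_ids(member)
        for gene_id in ids:
            if gene_id not in seen:
                seen.append(gene_id)
    return seen


# Cluster: set of genes
cluster_a = NamedSet("cluster_a", [Gene("YJR123W"), Gene("YLR340W")])
cluster_b = NamedSet("cluster_b", [Gene("YLR340W"), Gene("YNL301C")])
# Dataset: set of cluster sets
dataset = NamedSet("Segal03", [cluster_a, cluster_b])
# Cluster compendium: set of dataset sets
compendium = NamedSet("compendium", [dataset])

gene_ids = all_gene_ids(compendium)
```

Because every level is the same `NamedSet` type, one recursive function works unchanged on a cluster, a dataset, or the whole compendium.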

ClueGene Query
- Inputs
  - Cluster compendium set
  - Query set
- Output
  - Set of all genes in the genome
  - Set-specific attributes for rank and score
- Computing AUC
  - Additional input: expected set
  - Result AUC: attribute of result set

Other Operations
- Known and Novel Motif Search
  - Input: Working set
  - Output: Set of {set for each result motif containing the genes with the motif}
- GO Category Search

Clustering
- Expression data: set of genes
  - Set-specific attributes for expression data for each gene
- Clustering
  - Input: expression data (set of genes)
  - Output: dataset (set of cluster sets)
- Issue: handling operations that take a really long time