Class Projects. Future Work and Possible Project Topic in Gene Regulatory network Learning from multiple data sources; Learning causality in Motifs; Learning.

Presentation transcript:

Class Projects

Future Work and Possible Project Topics in Gene Regulatory Networks
• Learning from multiple data sources
• Learning causality in motifs
• Learning GRNs with feedback loops

Learning from Multiple Data Sources
• We have gene expression data and topological ordering information.
• Incorporate other data sources as prior knowledge for the learning:
  - Transcription factor binding location data
  - …
• Example: partial regulatory network recovered using expression data and location data.
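One simple way to picture "location data as prior knowledge" is to blend an expression-based score with a binding-based prior for each candidate regulator-target edge. The sketch below is only an illustration under stated assumptions: the function names, the use of absolute Pearson correlation as expression evidence, the 0/1 prior, and the `prior_weight` parameter are all hypothetical choices, not the method used in the slides.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def score_edges(regulators, expression, bound_pairs, prior_weight=0.5):
    """Score candidate edges regulator -> target.

    expression:  dict mapping gene name -> list of expression values
    bound_pairs: set of (regulator, target) pairs seen in location data
    prior_weight: assumed mixing weight between prior and expression evidence
    """
    scores = {}
    for tf in regulators:
        for target in expression:
            if target == tf:
                continue
            evidence = abs(pearson(expression[tf], expression[target]))
            prior = 1.0 if (tf, target) in bound_pairs else 0.0
            scores[(tf, target)] = (1 - prior_weight) * evidence + prior_weight * prior
    return scores


# Toy data with hypothetical gene names.
expression = {"TF1": [1, 2, 3, 4], "G1": [2, 4, 6, 8], "G2": [1, 0, 1, 0]}
bound_pairs = {("TF1", "G1")}
scores = score_edges(["TF1"], expression, bound_pairs)
```

Edges supported by both data sources rank highest; a project could replace the linear blend with a proper Bayesian prior over network structures.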

Learning Causality in Motifs
• Network motifs are the simplest units of network architecture.
• They can be used to assemble a transcriptional regulatory network.

Learning GRN with feedback loops

Learning GRN with feedback loops (Cont'd)

Protein-Protein Interactions

Future Work and Possible Project Topics in Protein Interaction
• Learning from multiple data sources
• Disease-related protein-protein interactions
• Learning from different species

Learning from Multiple Data Sources
(a) Gene Neighbor: identifies protein pairs encoded in close proximity across multiple genomes.
(b) Rosetta Stone
(c) Phylogenetic Profile
(d) Gene Clustering: identifies closely spaced genes and assigns a probability P of observing a particular gap distance.
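As an illustration of method (c), a phylogenetic profile can be represented as a presence/absence vector of a protein across a set of genomes, and proteins with similar profiles are candidates for a functional link. The sketch below assumes binary profiles, Jaccard similarity, and an arbitrary threshold of 0.8; the protein names and these choices are hypothetical and only meant to show the shape of the computation.

```python
def profile_similarity(p, q):
    """Jaccard similarity of two binary presence/absence profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def linked_pairs(profiles, threshold=0.8):
    """Return (protein_a, protein_b, similarity) for pairs above the threshold."""
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            similarity = profile_similarity(profiles[a], profiles[b])
            if similarity >= threshold:
                pairs.append((a, b, similarity))
    return pairs


# Each position is one genome: 1 = homolog present, 0 = absent (toy data).
profiles = {
    "P1": [1, 0, 1, 1, 0],
    "P2": [1, 0, 1, 1, 0],
    "P3": [0, 1, 0, 0, 1],
}
candidates = linked_pairs(profiles)
```

A project on multiple data sources could compute one such score per method (gene neighbor, Rosetta Stone, phylogenetic profile, gene clustering) and then learn how to combine them.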

Disease-Related Protein-Protein Interactions
• How do we decide whether an interaction is disease related? Query the NCBI OMIM database.

Learning from different species

BioQA related projects

Projects for BioQA
1. Learning
• Given a set of relevant abstracts, what kinds of features can we obtain to enhance our queries?
• Given a set of questions from users, how can we identify keywords in the questions to form queries?
2. Answer Presentation
• Given a relevant abstract or article, how can we retrieve the passage relevant to the user's question?
• How do we extract answers?
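A baseline for the keyword-identification question in project 1 is plain stopword filtering, which a learned approach would then need to beat. The sketch below is only that baseline: the stopword list, the tokenization rule, and the boolean `AND` query form are assumptions made for illustration, not part of any BioQA system described here.

```python
import re

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "do", "does",
    "what", "which", "how", "of", "in", "on", "to", "for", "and", "or",
}


def question_to_keywords(question):
    """Return non-stopword tokens in order of first appearance."""
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", question)
    seen = set()
    keywords = []
    for token in tokens:
        key = token.lower()
        if key in STOPWORDS or key in seen:
            continue
        seen.add(key)
        keywords.append(token)
    return keywords


def keywords_to_query(keywords):
    """Join keywords into a conjunctive query string (assumed query syntax)."""
    return " AND ".join(keywords)


keywords = question_to_keywords("What is the role of CDC28 in the cell cycle?")
query = keywords_to_query(keywords)
```

Keeping the original casing of tokens matters here, because gene symbols such as CDC28 are case sensitive in some resources.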

Projects for BioQA
3. Automatic Extraction
• Extract gene-disease and gene-biological process relations (and their corresponding organisms).
• Uniquely identify the genes: a gene symbol can be associated with multiple gene identifiers. Which gene identifier is the right one?
• Can these extraction processes be generalized?
4. Sortal Resolution
• Given an abstract and a query, perform sortal resolution (but not on pronouns).
• Example: given the following abstract and a query about histones, perform resolution on "histones":
  “In this report, we show that virus infection of cells results in a dramatic hyperacetylation of histones H3 and H4 that is localized to the IFN-beta promoter. … Thus, coactivator-mediated localized hyperacetylation of histones may play a crucial role in inducible gene expression.” [PMID: ]
• Result: "histones" refers to H3 and H4.

Projects for BioQA
5. Semantics of Words
• Deal with the semantics of words to improve the retrieval of answers.
• Example: the semantic relation between “role” and “play”.
6. Gene Symbol Variants, Gene Symbol Disambiguation, Entity Recognition
• Generate gene symbol synonyms and variants given a gene symbol in a query.
• Example: variants of “CDC28” can be written as “Cdc28”, “Cdc28p”, “cdc-28”.
• “GSS” is a synonym of “PRNP”, but “GSS” itself is also a gene that is unrelated to “PRNP”.
• Improve the recognition of diseases and biological processes.
7. Extension of Ontology
• Capture biological processes and their possible relations to diseases.
• Examples:
  - Learning and/or memory can influence Alzheimer’s disease.
  - Degradation of the ubiquitin cycle can cause an extra long or short half-life of genes.
  - An extra long or short half-life of genes can cause cancer.
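The variant generation in project 6 can start from simple orthographic rules. The sketch below reproduces the three variants listed for “CDC28” under stated assumptions: it only handles symbols made of letters followed by digits, the function name is hypothetical, and the trailing "p" reflects a protein-naming convention that does not apply to every organism. Synonyms such as “GSS” for “PRNP” cannot be generated this way and would need a lookup resource.

```python
import re


def symbol_variants(symbol):
    """Generate simple orthographic variants of a letters+digits gene symbol.

    Symbols that do not match the letters+digits pattern are returned unchanged.
    """
    match = re.fullmatch(r"([A-Za-z]+)-?(\d+)", symbol)
    if match is None:
        return {symbol}
    letters, digits = match.groups()
    capitalized = letters.capitalize()
    return {
        symbol,
        letters.upper() + digits,        # CDC28
        capitalized + digits,            # Cdc28
        capitalized + digits + "p",      # Cdc28p (protein-style suffix)
        letters.lower() + "-" + digits,  # cdc-28
    }


variants = symbol_variants("CDC28")
```

Expanding a query with these variants raises recall, but each added form can also match unrelated genes, which is why the disambiguation part of the project matters.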

CBioC Class Projects
• Extraction of organism info for each entity in a relationship (high priority). Use existing software for extraction, but biological databases and algorithms are needed to deduce info that is not explicit, and users must be able to correct this info. Example, PMID. Example KALPESH.
• Image extension: extract images and information about images, and allow collaborative curation. Take PDFs and other structured documents, extract images with their captions and references within the text, then let users polish. Related.
• Use ontologies and some automated tools to ensure consistency and cross-link info (2 people). Information entered by users needs to be validated against existing DBs and ontologies. We also need to tag our data for cross-reference. Example.

Other projects

Build an Ontology
• Build an ontology for a domain for which we do not have an ontology yet.
• Verify its consistency.

Various Kinds of Text Extraction Systems
• TREC suggested ones
  - Which method/protocol is used in which experiment/procedure
  - Gene – disease – role
  - Gene – biological process – role
  - Gene – mutation type – biological impact
  - Gene – interaction – gene – function – organ
  - Gene – interaction – gene – disease – organ
• Protein Lounge inspired
  - Kinase-phosphatase transcription factor peptide antigen

Drug Classification in Pharmacogenetics
• Experimental data available: drug response on cell lines; gene expression data; gene copy data; mutation analysis data; RNAi data
• Data from literature: mutation data (Sanger lab); NCI-60 drug response data; mutation analysis data; pathway data (e.g. BIND); Gene Ontology
• Proprietary data: where does the drug physically interact? (600 kinases, IC50); gene expression data of patients after treatments
• Goal: given a patient, what kinds of data do we need in order to determine whether a drug is applicable to that patient? How do we develop a classifier using these kinds of data? Find the gene and protein interaction network (or its components) using these data.