TRANSFAC Project Roadmap Discussion



Transcription Factors
- Structure
  - DNA-binding domain (DBD): the portion (domain) of the transcription factor that binds DNA
  - Trans-activating domain (TAD)
  - An optional signal sensing domain (SSD)
- DNA binding domain
  - There are families

Transcription Factors  Classes by their (1) mechanism of action, (2) regulatory function, or (3) sequence homology

TRANSFAC Revisited  TRANSFAC efforts Thanks to Kent and Ricky ’ s efforts, we have a good view of the tables related to TFBSs TRANSFAC information can be queried with customized SQL commands TFBSs data can be extracted according to our needs

TRANSFAC Revisited  TRANSFAC suite TRANSCompel ® - specialized data on composite regulatory elements TRANSCompel ® TRANSPro ™ - a substantial collection of human, mouse and rat promoter sequences TRANSPro ™ PathoDB ® - a comprehensive database on pathologically relevant mutations in transcription factors or their binding sites PathoDB ® S/MARt DB ™ - widely-researched data on the scaffold or matrix attached regions (S/MARs) of eukaryotic genomes and their binding proteins S/MARt DB ™

TRANSFAC Revisited  TRANSFAC suite TRANSCompel ® TRANSCompel ® TRANSPro ™ TRANSPro ™ PathoDB ® PathoDB ® S/MARt DB ™ S/MARt DB ™ X X

What TRANSFAC has additionally
- TRANSPRO
  - Promoters!
- MATCH (the searching algorithm)
  - Matrix-based search
  - With tissue- (or state-) specific profiles
- GENE
  - (Direct) links between factor and target gene
  - Links between gene and encoded factor
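
As a rough illustration of what a matrix-based search does, the sketch below scores every window of a sequence against a log-odds position weight matrix. It is a generic scan, not MATCH's own scoring scheme, and the count matrix, pseudocount, uniform background, and threshold are all assumed toy values rather than TRANSFAC data.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring at or above threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][ch] for i, ch in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Toy counts for a TATA-like motif (illustrative only).
counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}, {"A": 8, "T": 2}, {"A": 10}]
pwm = log_odds_matrix(counts)
hits = scan("GGCTATAAAGGC", pwm, threshold=6.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

A tissue- or state-specific profile, as mentioned on the slide, could be modeled in this sketch as a chosen subset of matrices, each with its own threshold.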

Beyond TRANSFAC  CYTOMER (with links to TRANSFAC) Anatomical structures: organs, cells Developmental stages: physiological systems Description of expression patterns based on ESTs (1d)  TRANSPATH Pathways (gene regulation)  Many external links to other DBs

Preliminary Tasks  Data collection Mastering of the TRANSFAC data organization and structures Collection of external data, e.g. genomes, ESTs, from GeneBank, Ensembl, SCPD, etc. Data generation for different purposes, e.g. modeling, data mining, datasets

Preliminary Tasks  Data mining/analysis Statistics related to TFBSs, or more  Approximate matching, n-grams involved Global features discovery Understanding of the TFBS mechanisms  Database Curation Data cleansing Species specific Novel TFBSs retrieval

Preliminary Tasks  Motif discovery New models (statistical) New motif discovery algorithms  Gene Network Analysis Making use of the links between FACTOR and GENE, as well as the expression information in CYTOMER  More …

Task assignment (very preliminary)

Tasks             | Remarks                                                                                              | Members
Data Collection   | TRANSFAC structure; external data (bio knowledge); dataset generation                                | Ricky, Shaoke, Cyrus
Data Mining       | Statistics (largely); approximate matching                                                           | Peter, Ricky, David, Ni Bing
Database Curation | Bio knowledge involved: species, tissue-specific, …; indexing involved; application development      | Shaoke, David, Ricky
Motif Discovery   | Modeling; representations; algorithms                                                                | Peter, Li Gang, Cyrus
Gene Networks     | Incorporating TRANSFAC knowledge                                                                     | Li Gang, Cyrus
More              | e.g. PathoDB about SNPs                                                                              | Phoenix

Discussions  Assignment of tasks  Thanks