High-throughput Biological Data The data deluge

Slides:



Advertisements
Similar presentations
Prof. Carolina Ruiz Computer Science Department Bioinformatics and Computational Biology Program WPI WELCOME TO BCB4003/CS4803 BCB503/CS583 BIOLOGICAL.
Advertisements

Finding regulatory modules from local alignment - Department of Computer Science & Helsinki Institute of Information Technology HIIT University of Helsinki.
Bioinformatics Motif Detection Revised 27/10/06. Overview Introduction Multiple Alignments Multiple alignment based on HMM Motif Finding –Motif representation.
Ontology annotation: mapping genomic regions biological function Paul D Thomas, Huaiyu Mi and Suzanna Lewis.
Intelligent Database Systems Lab Presenter : YAN-SHOU SIE Authors : Christos Ferles ∗, Andreas Stafylopatis NN Self-Organizing Hidden Markov Model.
BIOINFORMATICS Ency Lee.
APRIL, Application of Probabilistic Inductive Logic Programming, IST Albert-Ludwigs-University, Freiburg, Germany & Imperial College of Science,
Systems Biology Existing and future genome sequencing projects and the follow-on structural and functional analysis of complete genomes will produce an.
Bioinformatics Dr. Aladdin HamwiehKhalid Al-shamaa Abdulqader Jighly Lecture 1 Introduction Aleppo University Faculty of technical engineering.
. Class 1: Introduction. The Tree of Life Source: Alberts et al.
Using Bioinformatics to Make the Bio- Math Connection The Confessions of a Biology Teacher.
Data-intensive Computing: Case Study Area 1: Bioinformatics B. Ramamurthy 6/17/20151.
Bioinformatics and Phylogenetic Analysis
Introduction to Computational Biology Topics. Molecular Data Definition of data  DNA/RNA  Protein  Expression Basics of programming in Matlab  Vectors.
Introduction to BioInformatics GCB/CIS535
Bio 465 Summary. Overview Conserved DNA Conserved DNA Drug Targets, TreeSAAP Drug Targets, TreeSAAP Next Generation Sequencing Next Generation Sequencing.
We are developing a web database for plant comparative genomics, named Phytome, that, when complete, will integrate organismal phylogenies, genetic maps.
Why microarrays in a bioinformatics class? Design of chips Quantitation of signals Integration of the data Extraction of groups of genes with linked expression.
BTN323: INTRODUCTION TO BIOLOGICAL DATABASES Day2: Specialized Databases Lecturer: Junaid Gamieldien, PhD
Cédric Notredame (30/08/2015) Chemoinformatics And Bioinformatics Cédric Notredame Molecular Biology Bioinformatics Chemoinformatics Chemistry.
9/30/2004TCSS588A Isabelle Bichindaritz1 Introduction to Bioinformatics.
1 Bio + Informatics AAACTGCTGACCGGTAACTGAGGCCTGCCTGCAATTGCTTAACTTGGC An Overview پرتال پرتال بيوانفورماتيك ايرانيان.
DOE Genomics: GTL Program IT Infrastructure Needs for Systems Biology David G. Thomassen Office of Biological and Environmental Research DOE Office of.
High-throughput Biological Data The data deluge and bioinformatics algorithms Introduction to bioinformatics 2005 Lecture 3.
Gene Regulatory Network Inference. Progress in Disease Treatment  Personalized medicine is becoming more prevalent for several kinds of cancer treatment.
CSCI 6900/4900 Special Topics in Computer Science Automata and Formal Grammars for Bioinformatics Bioinformatics problems sequence comparison pattern/structure.
Biological Signal Detection for Protein Function Prediction Investigators: Yang Dai Prime Grant Support: NSF Problem Statement and Motivation Technical.
Mining Biological Data. Protein Enzymatic ProteinsTransport ProteinsRegulatory Proteins Storage ProteinsHormonal ProteinsReceptor Proteins.
Introduction to Bioinformatics Dr. Rybarczyk, PhD University of North Carolina-Chapel Hill
AdvancedBioinformatics Biostatistics & Medical Informatics 776 Computer Sciences 776 Spring 2002 Mark Craven Dept. of Biostatistics & Medical Informatics.
BIOLOGICAL DATABASES. BIOLOGICAL DATA Bioinformatics is the science of Storing, Extracting, Organizing, Analyzing, and Interpreting information in biological.
Introduction to bioinformatics Lecture 3 High-throughput Biological Data -data deluge, bioinformatics algorithms- and evolution C E N T R F O R I N T.
Central dogma: the story of life RNA DNA Protein.
EB3233 Bioinformatics Introduction to Bioinformatics.
Bioinformatics and Computational Biology
Introduction to bioinformatics Lecture 3 High-throughput Biological Data -data deluge, bioinformatics algorithms- and evolution C E N T R F O R I N T.
Basic Overview of Bioinformatics Tools and Biocomputing Applications II Dr Tan Tin Wee Director Bioinformatics Centre.
GeWorkbench Overview Support Team Molecular Analysis Tools Knowledge Center Columbia University and The Broad Institute of MIT and Harvard.
High-throughput Biological Data -data deluge, bioinformatics algorithms- and evolution Introduction to bioinformatics 2005 Lecture 3.
Genomic Signal Processing Dr. C.Q. Chang Dept. of EEE.
Bioinformatics Research Overview Li Liao Develop new algorithms and (statistical) learning methods > Capable of incorporating domain knowledge > Effective,
BIOINFORMATICS Ayesha M. Khan Spring 2013 Lec-8.
Summer Bioinformatics Workshop 2008 BLAST Chi-Cheng Lin, Ph.D., Professor Department of Computer Science Winona State University – Rochester Center
Biotechnology and Bioinformatics: Bioinformatics Essential Idea: Bioinformatics is the use of computers to analyze sequence data in biological research.
High throughput biology data management and data intensive computing drivers George Michaels.
1 Survey of Biodata Analysis from a Data Mining Perspective Peter Bajcsy Jiawei Han Lei Liu Jiong Yang.
BME435 BIOINFORMATICS.
bacteria and eukaryotes
Genome Annotation (protein coding genes)
Bioinformatics Overview
Introduction to Bioinformatics Resources for DNA Barcoding
The Transcriptional Landscape of the Mammalian Genome
Networks and Interactions
Data-intensive Computing: Case Study Area 1: Bioinformatics
Bioinformatics Madina Bazarova. What is Bioinformatics? Bioinformatics is marriage between biology and computer. It is the use of computers for the acquisition,
Section 3: Gene Technologies in Detail
Mangaldai College, Mangaldai
Bioinformatics and BLAST
Genomes and Their Evolution
Bioinformatics: Buzzword or Discipline (???)
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Bioinformatics For MNW 2nd Year
LESSON 1 INTNRODUCTION HYE-JOO KWON, Ph.D /
Network Inference Chris Holmes Oxford Centre for Gene Function, &,
Data Mining.
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Introduction to Bioinformatics
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Biotechnology & Bioinformatics
Presentation transcript:

High-throughput Biological Data The data deluge Hidden in these data is information that reflects existence, organization, activity, functionality …… of biological machineries at different levels in living organisms Most effectively utilising this information will prove to be essential for Integrative Bioinformatics

Data Issues …… Data collection: getting the data Data representation: data standards, data normalisation ….. Data organisation and storage: database issues ….. Data analysis and data mining: discovering “knowledge”, patterns/signals, from data, establishing associations among data patterns Data utilisation and application: from data patterns/signals to models for bio-machineries Data visualization: viewing complex data …… Data transmission: data collection, retrieval, ….. ……

Often enough: developing ad hoc tools for each individual application Bio-Data Analysis and Data Mining Existing/emerging bio-data analysis and mining tools for DNA sequence assembly Genetic map construction Sequence comparison and database searching Gene finding …. Gene expression data analysis Phylogenetic tree analysis to infer horizontally-transferred genes Mass spec. data analysis for protein complex characterization …… Current mode of work: Often enough: developing ad hoc tools for each individual application

Bio-Data Analysis and Data Mining As the amount and types of data and their cross connections increase rapidly the number of analysis tools needed will go up “exponentially” blast, blastp, blastx, blastn, … from BLAST family of tools gene finding tools for human, mouse, fly, rice, cyanobacteria, ….. tools for finding various signals in genomic sequences, protein-binding sites, splice junction sites, translation start sites, …..

Bio-Data Analysis and Data Mining Many of these data analysis problems are fundamentally the same problem(s) and can be solved using the same set of tools: e.g. clustering or optimal segmentation by Dynamic Programming Developing ad hoc tools for each application (by each group of individual researchers) may soon become inadequate as bio-data production capabilities further ramp up

HOWEVER in biology one size does NOT fit all… Bio-data Analysis, Data Mining and Integrative Bioinformatics To have analysis capabilities covering wide range of problems, we need to discover the common fundamental structures of these problems; HOWEVER in biology one size does NOT fit all… Goal is development of a data analysis infrastructure in support of Genomics and beyond

Algorithms in bioinformatics • string algorithms • dynamic programming • machine learning (Neural Netsworks, k-Nearest Neighbour, Support Vector Machines, Genetic Algorithm, ..) • Markov chain models • hidden Markov models • Markov Chain Monte Carlo (MCMC) algorithms • stochastic context free grammars • EM algorithms • Gibbs sampling • clustering • tree algorithms • text analysis • hybrid/combinatorial techniques and more…

Sequence analysis and homology searching

Finding genes and regulatory elements

Expression data

Functional genomics • Monte Carlo

Protein translation