Genomic Signal Processing Dr. C.Q. Chang Dept. of EEE.


Outline
- Basic Genomics
- Signal Processing for Genomic Sequences
- Signal Processing for Gene Expression
- Resources and Co-operations
- Challenges and Future Work

Basic Genomics

Genome
- Every human cell contains 6 feet of double-stranded (ds) DNA.
- This DNA has 3,000,000,000 base pairs representing 50, ,000 genes.
- This DNA contains our complete genetic code, or genome.
- DNA regulates all cell functions, including response to disease, aging and development.
- Gene expression pattern: snapshot of DNA in a cell.
- Gene expression profile: DNA mutation or polymorphism over time.
- Genetic pathways: changes in genetic code accompanying metabolic and functional changes, e.g. disease or aging.

Gene: protein-coding DNA
- DNA: CCTGAGCCAACTATTGATGAA
- (transcription)
- mRNA: CCUGAGCCAACUAUUGAUGAA
- (translation)
- Protein: PEPTIDE
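The DNA to mRNA to protein example on this slide can be checked with a few lines of Python. This is a minimal sketch: the function names `transcribe` and `translate` are made up for illustration, and the codon table is deliberately reduced to the seven codons that occur in the slide's sequence (the values are taken from the standard genetic code).

```python
# Reduced codon table: only the codons that appear in the slide's example.
# A real tool would carry all 64 codons of the standard genetic code.
CODON_TABLE = {
    "CCU": "P", "GAG": "E", "CCA": "P", "ACU": "T",
    "AUU": "I", "GAU": "D", "GAA": "E",
}


def transcribe(dna: str) -> str:
    """Coding-strand DNA to mRNA: every T becomes U."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time, starting at position 0."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


dna = "CCTGAGCCAACTATTGATGAA"
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)     # CCUGAGCCAACUAUUGAUGAA
print(protein)  # PEPTIDE
```

The one-letter amino acid codes spell out PEPTIDE, which is why the slide uses this particular sequence.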

In more detail (color ~state)

Signal Processing for Genomic Sequences

The Data Set

The Problem
- Genomic information is digital: sequences of the letters A, T, C and G.
- Signal processing deals with numerical sequences, so character strings have to be mapped into one or more numerical sequences.
- Identification of protein-coding regions:
  - Prediction of whether or not a given DNA segment is part of a protein-coding region
  - Prediction of the proper reading frame
- Compared with traditional methods, signal processing methods are much quicker and can be even more accurate in some cases.

Sequence to signal mapping
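The figure for this slide did not survive, so the exact mapping used in the talk is unknown. One common choice, shown here purely as an assumption, is to turn the string into four binary indicator sequences, one per base. The function name `indicator_sequences` is hypothetical.

```python
def indicator_sequences(dna: str) -> dict:
    """Map a DNA string to four 0/1 sequences, one per base.

    Assumed mapping (not confirmed by the slides): position n of the
    sequence for base b is 1 if dna[n] == b, else 0.
    """
    dna = dna.upper()
    return {base: [1 if ch == base else 0 for ch in dna] for base in "ACGT"}


signals = indicator_sequences("ATGCA")
for base, values in signals.items():
    print(base, values)
```

At every position exactly one of the four sequences is 1, so no information is lost and each sequence can be fed to ordinary numerical tools.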

Signal Analysis
- Spectral analysis (Fourier transform, periodogram)
- Spectrogram
- Wavelet analysis
- HMT: wavelet-based Hidden Markov Tree
- Spectral envelope (using optimal string-to-numerical-value mapping)
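As an illustration of the first item, the sketch below evaluates the Fourier power of a DNA string at the period-3 frequency (DFT index k = N/3), summed over four binary indicator sequences. A strong period-3 component is widely reported as a marker of protein-coding regions, but the slides do not say which measure the talk used, so both the indicator mapping and the function names `dft_power` and `period3_power` are assumptions.

```python
import cmath


def dft_power(x, k):
    """Squared magnitude of the k-th DFT coefficient of the sequence x."""
    n = len(x)
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * i / n) for i, v in enumerate(x))
    return abs(coeff) ** 2


def period3_power(dna: str) -> float:
    """Total spectral power at frequency 1/3, summed over the four bases."""
    dna = dna.upper()
    n = len(dna) - len(dna) % 3  # truncate so that N is a multiple of 3
    dna = dna[:n]
    k = n // 3                   # DFT index corresponding to period 3
    return sum(
        dft_power([1 if ch == base else 0 for ch in dna], k)
        for base in "ACGT"
    )


periodic = "ATG" * 10   # perfectly period-3 toy sequence
constant = "A" * 30     # no period-3 structure at all
print(period3_power(periodic))
print(period3_power(constant))
```

For the toy inputs the periodic string gives a large value (300, up to rounding) while the constant string gives essentially zero. On real sequences this quantity is usually computed in a sliding window to localise candidate coding regions.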

Spectral envelope of the BNRF1 gene from the Epstein-Barr virus: (a) 1st section (1000 bp), (b) 2nd section (1000 bp), (c) 3rd section (1000 bp), (d) 4th section (954 bp). Conjecture: the 4th quarter is actually non-coding.

Signal Processing for Gene Expression

Microarray Life Cycle (taken from Schena & Davis): Biological Question, Sample Preparation, Microarray Reaction, Microarray Detection, Data Analysis & Modeling

cDNA microarray workflow
- cDNA clones (probes)
- PCR product amplification and purification
- Printing the microarray (0.1 nl/spot)
- Hybridise the mRNA target to the microarray
- Scanning: excitation by laser 1 and laser 2, emission
- Overlay images and normalise
- Analysis
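The "overlay images and normalise" step is not spelled out on the slide. A simple and common choice for two-channel arrays, given here only as an assumed example, is to take the log2 ratio of the two channel intensities per spot and subtract the array-wide median. The function name `normalised_log_ratios` and the intensity values are hypothetical.

```python
import math
import statistics


def normalised_log_ratios(red, green):
    """Per-spot log2(red/green), centred so the array median is zero.

    Assumes strictly positive, background-corrected intensities.
    Global median centring is one simple normalisation; the slides
    do not state which method the talk used.
    """
    ratios = [math.log2(r / g) for r, g in zip(red, green)]
    centre = statistics.median(ratios)
    return [m - centre for m in ratios]


red = [200.0, 400.0, 800.0]     # channel scanned with laser 1 (made-up values)
green = [100.0, 100.0, 100.0]   # channel scanned with laser 2 (made-up values)
print(normalised_log_ratios(red, green))  # [-1.0, 0.0, 1.0]
```

Centring on the median rests on the assumption that most genes are not differentially expressed between the two samples.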

Image Segmentation
- Simple way: fixed circle method
- Advanced: fast marching level set segmentation
[Figure: segmentation results, panels "Advanced" and "Fixed circle"]
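The fixed circle method can be sketched in a few lines: every pixel within a fixed radius of the spot centre counts as foreground and everything else in the patch as background. The function name `fixed_circle_intensity`, the patch layout and the choice to report mean foreground minus mean background are assumptions made for illustration.

```python
def fixed_circle_intensity(image, cx, cy, radius):
    """Background-corrected spot intensity using a fixed circular mask.

    image is a list of rows of pixel values covering one spot; (cx, cy)
    is the assumed spot centre in pixel coordinates. The patch must
    contain pixels both inside and outside the circle.
    """
    inside, outside = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                inside.append(value)
            else:
                outside.append(value)
    foreground = sum(inside) / len(inside)
    background = sum(outside) / len(outside)
    return foreground - background


patch = [
    [2, 2, 2, 2, 2],
    [2, 2, 10, 2, 2],
    [2, 10, 10, 10, 2],
    [2, 2, 10, 2, 2],
    [2, 2, 2, 2, 2],
]
print(fixed_circle_intensity(patch, cx=2, cy=2, radius=1))  # 8.0
```

The weakness is visible in the code: the mask ignores the actual spot shape, which is what adaptive methods such as level set segmentation try to address.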

Clustering and filtering methods
Principal approaches:
- Hierarchical clustering (kdb trees, CART, gene shaving)
- K-means clustering
- Self-organizing (Kohonen) maps
- Support vector machines
- Gene filtering via multiobjective optimization
- Independent Component Analysis (ICA)
Validation approaches:
- Significance analysis of microarrays (SAM)
- Bootstrapping cluster analysis
- Leave-one-out cross-validation
- Replication (additional gene chip experiments, quantitative PCR)
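Of the approaches listed, K-means is the easiest to show compactly. The sketch below is plain Lloyd's algorithm with squared Euclidean distance on toy expression profiles. The data, the function names and the deterministic choice of initial centroids are illustrative assumptions, not the configuration used in the talk.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm; returns (labels, final centroids)."""
    for _ in range(iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            [sum(col) / len(cluster) for col in zip(*cluster)] if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return [nearest(p, centroids) for p in points], centroids


# Toy profiles: rows are genes, columns are log-ratios in three experiments.
profiles = [
    [1.0, 1.1, 0.9],
    [0.9, 1.0, 1.0],
    [-1.0, -0.9, -1.1],
    [-1.1, -1.0, -1.0],
]
labels, centres = kmeans(profiles, centroids=[profiles[0], profiles[2]])
print(labels)  # [0, 0, 1, 1]
```

In practice the initial centroids are usually chosen at random and the run is repeated several times, since K-means only finds a local optimum.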

ICA for B-cell lymphoma data
- Data: 96 samples of normal and malignant lymphocytes.
- Results: scatter plots of 12 independent components.
- Comparison: closely related to the results of hierarchical clustering.

Resources and Co-operations
Resources:
- Databases on the internet such as GenBank and ProteinBank
- Some small databases of microarray data
Co-operation needed:
- First-hand microarray data
- Biological experiments for validation

Challenges and Future Work
- Genomic signal processing opens a new signal processing frontier.
- Sequence analysis: the signal is symbolic or categorical, so classical signal processing methods are not directly applicable.
- The increasingly high dimensionality of genetic data sets and the complexity involved call for fast, high-throughput implementations of genomic signal processing algorithms.
- Future work: spectral analysis of DNA sequences and clustering of microarray data; modify classical signal processing methods and develop new ones.