EE150a – Genomic Signal and Information Processing
Seminar series
–lectures in the first 3 meetings, followed by student presentations
–statistical signal processing basics
–background reading for each meeting
Location: Moore 080 (except today)
List of papers with links:
–minor modifications of the list are likely
Contact: Haris Vikalo, Moore 125
–Phone: –

Occasionally check the website for updates and a growing list of research-related links
Today’s handouts:
–basic course info and a list of papers
–R. Karp’s “Mathematical Challenges from Genomics and Molecular Biology”
–sign-up sheet
Next time: Prof. Vaidyanathan’s lecture on “Signal Processing Problems in Genomics”
In two weeks: lecture on DNA microarray technology and novel techniques for estimating gene expression levels
Today: introduction with a brief overview of the presentation topics

Central Dogma of Molecular Biology
Flow of information in a cell: DNA → RNA → protein
[Due to Francis Crick. It has recently been realized that the dogma requires modifications, but more about that later in the course.]
Recent development of high-throughput technologies that study the above flow
–requires interdisciplinary effort
–dealing with a huge amount of information

DNA Structure
[Figure source: genetics.gsk.com/ graphics/dna-big.gif]
Four nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T)
Bindings:
–A with T (weaker), C with G (stronger)
Forms a double helix
–nucleotides within each strand are linked via sugar-phosphate bonds (strong); the two strands are linked to each other via hydrogen bonds (weak)
A gene is a part of the DNA that encodes a protein:
–…AACTCGCATCGAACTCTAAGTC…

Sidenote: Sequence Alignment
Perhaps the most fundamental operation in bioinformatics
–used to decide if two genes or proteins are related by function, structure, or evolutionary history
–can identify patterns of conservation and variability
Performs pairwise matching between the characters of each sequence
One place where it is useful: SNP (single-nucleotide polymorphism) detection
–SNPs may indicate disease development (myocardial diseases, arthritis, etc. have been associated with SNPs)
Sequence alignment is the first student presentation topic in the series (HMM, dynamic programming, Bayesian methods)
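
The dynamic programming approach mentioned above can be sketched with the classic Needleman-Wunsch recursion for a global alignment score. This is a minimal illustration, not the method of any particular paper on the reading list; the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score (scoring values are assumed)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


# Two sequences differing at a single position (a SNP-like substitution)
print(global_alignment_score("AACTCG", "AACTAG"))
```

A single substitution lowers the score relative to a perfect match without introducing gaps, which is the kind of signal an SNP detection step would look for.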

Details of the information flow
Replication of DNA
–{A,C,G,T} to {A,C,G,T}
Transcription of DNA to mRNA
–{A,C,G,T} to {A,C,G,U}
Translation of mRNA to proteins
–{A,C,G,U} to {20 amino acids}
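
The three alphabet mappings above can be illustrated with a short sketch. The function names and the example sequence are hypothetical, and the codon table is deliberately a tiny subset of the standard genetic code (codons outside it are reported as "X"), so this is a toy model of the information flow rather than a usable translator.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Small subset of the standard genetic code, for illustration only
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCA": "Ala",
    "AAA": "Lys", "UGG": "Trp", "UAA": "Stop",
}


def complementary_strand(strand):
    """Replication: {A,C,G,T} -> {A,C,G,T}; returns the reverse complement."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def transcribe(coding_strand):
    """Transcription: the mRNA matches the coding strand with U replacing T."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Translation: read codons in frame from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino = CODON_TABLE.get(mrna[i:i + 3], "X")  # "X": not in toy table
        if amino == "Stop":
            break
        protein.append(amino)
    return protein


mrna = transcribe("ATGTTTGGCTAA")
print(mrna, translate(mrna))
```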

Genes can be turned on and off

Microarray Technology
A medium for matching known and unknown DNA samples based on hybridization (base-pairing)
Two major applications
–identification of a sequence (gene or gene mutation)
–determination of expression level (abundance) of genes
Enables massively parallel gene expression studies
Two types of molecules take part in the experiments:
–probes, orderly arranged on an array
–targets, the unknown samples to be detected

Types of Microarrays
“Traditionally”, there are two formats:
–probe cDNA immobilized on a solid surface using robotic spotting and exposed to a set of targets, and
–an array of oligonucleotide probes synthesized on the chip (via, e.g., photolithography)
Targets are typically fluorescently labeled cDNA molecules obtained from mRNA samples
–hybridize to their complementary probes
–image readout

Illustration: DNA microarray

Sample Microarray Readout

Some Design Issues
Hybridization is the binding of a target to its perfect complement
However, when a probe differs from the perfect complement of a target by a small number of bases, it may still bind
This non-specific binding (cross-hybridization) is a source of measurement noise
In special cases (e.g., arrays for gene detection), the designer has a lot of control over the landscape of the probes on the array
The second topic for presentations considers a combinatorial design of such arrays
[How to deal with cross-hybridization on arrays used for expression level measurements is the topic of the third lecture.]
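
As a rough illustration of the cross-hybridization issue, the sketch below counts how many bases of a probe deviate from the perfect complement of a target and flags near-matches. The function names and the threshold of two mismatches are assumptions for the example; actual binding depends on thermodynamics (sequence composition, temperature, position of the mismatch), which this simple count ignores.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Perfect complement of seq, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mismatches(probe, target):
    """Number of positions where probe differs from the target's complement."""
    if len(probe) != len(target):
        raise ValueError("this sketch assumes equal-length probe and target")
    perfect = reverse_complement(target)
    return sum(p != q for p, q in zip(probe, perfect))


def may_cross_hybridize(probe, target, max_mismatches=2):
    """True for imperfect near-matches (threshold is an assumed value)."""
    return 0 < mismatches(probe, target) <= max_mismatches


target = "AACTCGCATC"
print(mismatches("GATGCGAGTT", target))           # perfect complement
print(may_cross_hybridize("GATGCGAGTA", target))  # one base off
```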

Clustering Gene Expression Profiles
Microarrays measure expression levels of thousands of genes simultaneously
For instance, we might take samples at different times during a biological process
Cluster data in the expression level space
–relatedness in biological function often implies similarity in expression behavior (and vice versa)
–similar expression behavior indicates co-expression
Clustering of expression level data is one of the topics (traditional statistical methods, but also graph-theoretic approaches, information-theoretic approaches, etc.)

Example of Clustering
Rows: various gene expression levels
Columns: time progression
So-called hierarchical clustering
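
A hierarchical clustering of expression profiles can be sketched as follows, with genes as rows and time points as columns. The slide does not say which distance or linkage was used; this example assumes a correlation distance (1 minus the Pearson correlation) and average linkage, uses made-up gene names and values, and requires every profile to be non-constant.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def hierarchical_clusters(profiles, k):
    """Agglomerative clustering (average linkage) until k clusters remain."""
    clusters = [[gene] for gene in profiles]

    def linkage(c1, c2):
        # Mean correlation distance over all cross-cluster gene pairs
        dists = [1 - pearson(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


# Hypothetical expression levels at four time points
profiles = {
    "g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8.5],   # rising over time
    "g3": [4, 3, 2, 1], "g4": [8, 6, 4.5, 2],   # falling over time
}
print(hierarchical_clusters(profiles, 2))
```

Recording the order of merges instead of stopping at k clusters would give the tree (dendrogram) usually drawn next to such heat maps.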

Co-regulated genes
Co-expressed genes may be co-regulated
–a combination of transcription factors (activating or repressing proteins) regulates genes jointly
Finding binding sites (control regions) of co-regulated genes is another topic
–HMM, probabilistic methods (EM, Gibbs sampling)
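
The probabilistic methods named above (EM, Gibbs sampling) search for binding sites whose positions are unknown. The sketch below shows only the much simpler final step, under the assumption that the sites have already been found and aligned: it builds per-position base counts and reads off a consensus motif. The function names and the example sites are hypothetical.

```python
def position_counts(sites):
    """Per-position base counts for a list of aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("sites must already be aligned to equal length")
    return [
        {base: sum(site[i] == base for site in sites) for base in "ACGT"}
        for i in range(length)
    ]


def consensus(sites):
    """Most frequent base at each position (ties resolved in A, C, G, T order)."""
    return "".join(
        max("ACGT", key=lambda base: column[base])
        for column in position_counts(sites)
    )


# Hypothetical aligned binding sites upstream of co-regulated genes
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
print(consensus(sites))
```

Normalizing each column of counts into frequencies gives the position weight matrix that EM and Gibbs sampling approaches estimate iteratively.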

Genetic Regulatory Networks
Proteins take part in gene regulation
–feedback loop in the Central Dogma information flow
Thus, to fully understand gene regulation, we need to consider interactions
–DNA, RNA, proteins, small molecules
Requires network formalism
–directed graphs, Boolean networks, Bayesian networks, differential equations, etc.
Explore some of these models in the gene regulation context
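
Of the formalisms listed, a Boolean network is the easiest to sketch: each gene is on or off, and its next state is a logical function of the current states. The three-gene network below, including its update rules, is entirely hypothetical and only illustrates synchronous updating with a feedback loop.

```python
def step(state):
    """One synchronous update of a hypothetical three-gene Boolean network."""
    a, b, c = state
    return (
        not c,    # gene A is repressed by C (negative feedback)
        a,        # gene B is activated by A
        a and b,  # gene C needs both A and B
    )


def trajectory(state, steps):
    """List of states visited, starting with the initial state."""
    states = [state]
    for _ in range(steps):
        state = step(state)
        states.append(state)
    return states


for s in trajectory((True, False, False), 5):
    print(s)
```

With these rules the network returns to its initial state after five updates, a simple example of the cyclic behavior that feedback can produce.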

An Illustration of a Regulatory Network

Protein Translation/Folding
[Should time permit.]
The sequence-structure relationship will play a very important role in the postgenomic era
–potentially great impact on genetics, pharmaceutical chemistry, and protein design
–diseases such as Alzheimer’s are believed to be related to protein misfolding
Computationally very hard
–parallel, distributed computing

Genomic data fusion
Consider the problem of classifying a protein and assume that we know:
–the original gene sequence encoding the protein
–gene expression levels
–some of the protein-protein interactions
Question: how to combine the various types of data to classify the protein
The last (right now…) topic of the seminar will be data fusion of the various genomic data listed above
–efficient statistical learning algorithm based on convex optimization

Summary
Trying to understand gene regulation
Recent technologies have revolutionized research
–huge amount of data
Multidisciplinary; identify opportunities
Challenging problems, quite important:
–understanding information processes at the genetic level gives insights into phenotypic effects (disease)
–some of the ultimate goals are molecular diagnostics and personalized drugs