The BIG Goal “The greatest challenge, however, is analytical. … Deeper biological insight is likely to emerge from examining datasets with scores of samples.”


The BIG Goal: “The greatest challenge, however, is analytical. … Deeper biological insight is likely to emerge from examining datasets with scores of samples.” Eric Lander, “Array of hope”, Nat. Genet. volume 21 supplement, pp. 3-4 (1999). Bio-informatics: Provide methodologies for elucidating biological knowledge from biological data.

Central Paradigm of Bio-informatics: Genetic Information → Molecular Structure → Biochemical Function → Symptoms

Computer Science Tools are Crucial

New bio-technologies create huge amounts of data. It is impossible to analyze this much data by manual inspection. Novel mathematical, statistical, algorithmic and computational tools are necessary!

Automated Sequencing

What is Bio-Informatics ? A field of science in which Biology, Computer Science and Information Technology merge into a single discipline. Computers (& software tools) are used to collect, analyze and interpret biological information at the molecular level. Goal: To enable the discovery of new biological insights and create a global perspective for biologists.

Disciplines: Development of new algorithms and statistical methods to assess relationships among members of large data sets. Analysis and interpretation of various types of data. Development and implementation of tools to efficiently access and manage different types of information.

Why Use Bio-Informatics? An explosive growth in the amount of biological information necessitates the use of computers for cataloging and retrieval of data (> 3 billion bp, > 30,000 genes). The human genome project. Automated sequencing. GenBank has over 16 billion bases and is doubling every year!
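As a rough illustration of what "doubling every year" implies, the sketch below projects the database size forward from the 16 billion bases quoted on this slide. The function name and the assumption of a constant one-year doubling time are illustrative only; this is not a model of GenBank's actual growth.

```python
def projected_bases(start_bases: float, years: float,
                    doubling_time_years: float = 1.0) -> float:
    """Project database size under a constant doubling time (an assumption)."""
    if doubling_time_years <= 0:
        raise ValueError("doubling_time_years must be positive")
    return start_bases * 2 ** (years / doubling_time_years)


if __name__ == "__main__":
    start = 16e9  # "over 16 billion bases", the figure quoted on the slide
    for year in range(6):
        size = projected_bases(start, year)
        print(f"after {year} year(s): {size:.3g} bases")
```

Under this assumption the size grows by a factor of 2^n after n years, which is why manual cataloging quickly becomes impractical.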

New Types of Biological Data: Microarrays - gene expression. Multi-level maps: genetic, physical: sequence, annotation. Networks of protein-protein interactions. Cross-species relationships: Homologous genes. Chromosome organization.

Why Bio-Informatics? (cont.) A more global view of experimental design (from the “one scientist = one gene/protein/disease” paradigm to whole-organism consideration). Data mining: functional/structural information is important for studying the molecular basis of diseases, diagnostics, drug development (personalized medicine), evolutionary patterns, etc.

Why Bio Informatics ? (cont.)

Principal milestones in data mining and genome analysis: the Sanger method for sequencing, invented in 1977 (Nobel Prize in 1980), and the polymerase chain reaction (PCR), invented in 1983 (Nobel Prize in 1993). Future of Genomic Research

The next step: Locate all the genes and understand their function. This will probably take another years !

Disease Genes Discovered

One can efficiently find information: Using databases and software on the web. Question: How likely are you to use a free bio-informatics library of accessible software ? The job of biologists is changing…

Molecular Biology Analysis Software Tools - Freely Available on the Web. - Highlights

Broad Classification of Biological Databases

ENTREZ - PubMed NCBI

Post-genomic terms (Oct. 2002): table of Google search and PubMed hit counts for Genome, Proteome, Transcriptome, Gene function, Metabolome and Glycome. From: Computational Proteomics, Mark B Gerstein, Yale U.

Similarity / Analogy Examples: If it looks like an elephant, and smells like an elephant, it’s an elephant. If it walks like a duck, and quacks like a duck, it’s a duck.

Similarity Search in Databanks: Find similar sequences to a working draft. As databanks grow, homologies get harder, and quality is reduced. Alignment Tools: BLAST & FASTA (time-saving heuristics, approximations).

Pairwise alignment:

>gb|BE |BE BARC 5BOV Bos taurus cDNA 5'.
Length = 369
Score = 272 bits (137), Expect = 4e-71
Identities = 258/297 (86%), Gaps = 1/297 (0%)
Strand = Plus / Plus

Query: 17  aggatccaacgtcgctccagctgctcttgacgactccacagataccccgaagccatggca 76
           |||||||||||||||| | ||| | ||| || ||| | ||||  ||||| |||||||||
Sbjct: 1   aggatccaacgtcgctgcggctacccttaaccact-cgcagaccccccgcagccatggcc 59

Query: 77  agcaagggcttgcaggacctgaagcaacaggtggaggggaccgcccaggaagccgtgtca 136
           |||||||||||||||||||||||| | || ||||||||| | ||||||||||| ||| ||
Sbjct: 60  agcaagggcttgcaggacctgaagaagcaagtggagggggcggcccaggaagcggtgaca 119

Query: 137 gcggccggagcggcagctcagcaagtggtggaccaggccacagaggcggggcagaaagcc 196
            |||||||| | || | ||||||||||||||| ||||||||||| || ||||||||||||
Sbjct: 120 tcggccggaacagcggttcagcaagtggtggatcaggccacagaagcagggcagaaagcc 179

Query: 197 atggaccagctggccaagaccacccaggaaaccatcgacaagactgctaaccaggcctct 256
           ||||||||| | |||||||| |||||||||||||||||| ||||||||||||||||||||
Sbjct: 180 atggaccaggttgccaagactacccaggaaaccatcgaccagactgctaaccaggcctct 239

Query: 257 gacaccttctctgggattgggaaaaaattcggcctcctgaaatgacagcagggagac 313
           || || ||||| ||  ||||||||||| | |||||||||||||||||| ||||||||
Sbjct: 240 gagactttctcgggttttgggaaaaaacttggcctcctgaaatgacagaagggagac 296
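The "Identities" and "Gaps" figures in a report like this one can be recomputed from any pair of already-aligned rows. The sketch below does that for the first 60-column block of the alignment; the function name and the returned dictionary layout are illustrative assumptions, not part of BLAST itself, and the code only scores an existing alignment rather than searching for one.

```python
def alignment_stats(query: str, subject: str) -> dict:
    """Count identities and gap columns for two equal-length aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    columns = list(zip(query, subject))
    identities = sum(1 for q, s in columns if q == s and q != "-")
    gaps = sum(1 for q, s in columns if q == "-" or s == "-")
    # '|' marks an identical column, as in the report's middle line.
    match_line = "".join("|" if q == s and q != "-" else " " for q, s in columns)
    return {
        "length": len(columns),
        "identities": identities,
        "gaps": gaps,
        "percent_identity": 100.0 * identities / len(columns) if columns else 0.0,
        "match_line": match_line,
    }


if __name__ == "__main__":
    # First block of the pairwise alignment shown on the slide.
    q = "aggatccaacgtcgctccagctgctcttgacgactccacagataccccgaagccatggca"
    s = "aggatccaacgtcgctgcggctacccttaaccact-cgcagaccccccgcagccatggcc"
    stats = alignment_stats(q, s)
    print(q)
    print(stats["match_line"])
    print(s)
    print(stats["identities"], "identities,", stats["gaps"], "gap column(s)")
```

Summing these counts over all five blocks should reproduce the report's totals; finding the alignment in the first place is the part that BLAST and FASTA speed up with heuristics.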

Multiple Sequence Alignment Multiple alignment: find protein families and functional domains.
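One simple way to look for candidate functional domains in a multiple alignment is to measure how conserved each column is. The sketch below is a minimal illustration under stated assumptions: the sequences are already aligned to equal length, gap characters are counted like any other symbol, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def column_conservation(aligned: list[str]) -> list[tuple[str, float]]:
    """For each alignment column, return (most common symbol, its frequency)."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*aligned):
        symbol, count = Counter(column).most_common(1)[0]
        result.append((symbol, count / len(aligned)))
    return result


def consensus(aligned: list[str]) -> str:
    """Consensus string built from the most common symbol in each column."""
    return "".join(symbol for symbol, _ in column_conservation(aligned))


if __name__ == "__main__":
    toy_alignment = ["MKVLA", "MKLLA", "MRVLG"]  # made-up protein fragments
    for position, (symbol, freq) in enumerate(column_conservation(toy_alignment), 1):
        print(position, symbol, f"{freq:.2f}")
    print("consensus:", consensus(toy_alignment))
```

Runs of highly conserved columns are only a hint; real domain detection relies on profile-based methods rather than a plain majority vote.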

Structure - Function Relationships structure function sequence

Protein Structure (domains)

Phylogeny Evolution - a process in which small changes occur within species over time. These changes could be monitored today using molecular techniques. The Tree of Life: A classical, basic science problem, since Darwin’s 1859 “Origin of Species”.

Tree of Life: Searching Protein Sequence Databases - How far can we see back? Timeline: origin of the universe (?); formation of the solar system; first self-replicating systems; prokaryotes/eukaryotes; plants/animals; invertebrates/vertebrates; mammalian radiation.

The Human Genome Project (HGP): Write down all of human DNA on a single CD (“completed” 2001). Identify all genes, their location and function (far from completion).

Example for Gene Localization Bio-Tool (FISH).

FISH - Fluorescence In-Situ Hybridization: Fluorescently labeled probes hybridize to specific chromosomal locations. Example application: low-resolution localization of a gene.

Sequencing Genes & Gene Assembly Automated sequencing

Gene Finding: Only 2-3% of the human genome codes for functional genes. Genes are scattered among large non-coding DNA regions. Repeats, pseudo-genes, introns and vector contamination make gene finding confusing.

Gene Finding - cont. Find special gene patterns: Translation start and stop sites (open reading frames - ORF). Transcription factors, promoters. Intron splice sites. Etc…
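The first pattern on this slide, open reading frames, can be sketched in a few lines. The example below scans the three forward reading frames for an ATG start codon followed in-frame by a stop codon. It is a deliberately minimal illustration: the function name and the `min_codons` parameter are assumptions, and it ignores the reverse strand, introns, promoters and splice sites that real gene finders must handle.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for forward-strand ORFs.

    start is 0-based, end is exclusive, and the sequence includes the
    start and stop codons. Only the first ATG before each stop is used.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # close it and keep scanning this frame
    return sorted(orfs)


if __name__ == "__main__":
    example = "CCATGAAATAGGG"  # made-up sequence for illustration
    for start, end, seq in find_orfs(example):
        print(start, end, seq)
```

Raising `min_codons` filters out short ORFs that arise by chance, which matters because random sequence contains many of them.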

Micro Arrays (“DNA Chips”) New biotechnology breakthrough: measure RNA expression levels of thousands of genes (in one experiment).

The Idea Behind Micro Arrays

Clustering Analysis of Gene Expression Data DNA chips and personalized medicine (leading edge, future technologies).
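A minimal sketch of what clustering expression data can look like: genes whose expression profiles rise and fall together across samples are grouped. The example uses average-linkage agglomerative clustering with a distance of 1 minus the Pearson correlation. That choice of method, the function names and the toy expression values are all assumptions made for illustration; the slides do not say which algorithm was used.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if one is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def cluster_genes(profiles: dict[str, list[float]], n_clusters: int) -> list[set[str]]:
    """Merge the two closest clusters until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [{gene} for gene in profiles]

    def distance(c1: set[str], c2: set[str]) -> float:
        # Average linkage: mean (1 - correlation) over all cross-cluster pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        index_pairs = [(i, j) for i in range(len(clusters))
                       for j in range(i + 1, len(clusters))]
        i, j = min(index_pairs, key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    toy = {  # made-up expression levels across four samples
        "geneA": [1, 2, 3, 4],
        "geneB": [2, 4, 6, 8],
        "geneC": [4, 3, 2, 1],
        "geneD": [8, 6, 4, 2],
    }
    for cluster in cluster_genes(toy, 2):
        print(sorted(cluster))
```

Correlation-based distance groups genes by the shape of their profile rather than absolute level, which is why geneA and geneB land together despite differing in magnitude.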

Pharmaco-genomics: Use DNA information to measure and predict the reaction to drugs. Personalized medicine. Faster clinical trials: selected populations. Fewer drug side effects.

Protein and Other Arrays: Sequencing the human genome => finite problem. Studying the proteome => endless possible variations, dynamic. Protein array. Future fields of study: Proteins + Genomics = Proteomics; Lipids + Genomics = Lipomics; Sugars + Genomics = Glycomics.

Understanding Mechanisms of Disease EC number compound

Putting it all together: Bio-Informatics. SEQUENCES & LITERATURE: SEQUENCE ALIGNMENT; ORTHOLOG GENES (Taxonomy); CONSERVED DOMAINS; CODING REGIONS; 3-D STRUCTURE; GENE FAMILIES; MUTATIONS & POLYMORPHISM; GENOME MAPS; CELLULAR LOCATION; SIGNAL PEPTIDE.

Putting it all together: Bio-Informatics. GENE EXPRESSION, GENE FUNCTION, DRUG & PERSONAL THERAPY: CODING REGIONS; SEQUENCE ALIGNMENT; ORTHOLOG GENES (Taxonomy); CONSERVED DOMAINS; GENE FAMILIES; MUTATIONS & POLYMORPHISM; GENOME MAPS; CELLULAR LOCATION; SIGNAL PEPTIDE; 3-D STRUCTURE.