Genomics for Librarians
Stuart M. Brown, Ph.D.
Director, Research Computing, NYU School of Medicine


A Genome Revolution in Biology and Medicine
• We are in the midst of a "Golden Era" of biology
• The Human Genome Project has produced a huge storehouse of data that will be used to change every aspect of biological research and medicine
• The revolution is about treating biology as an information science, not about specific biochemical technologies

The Human Genome Project

The job of the biologist is changing
As more biological information becomes available and laboratory equipment becomes more automated...
– The biologist will spend more time using computers and on experimental design and data analysis (and less time doing tedious lab biochemistry)
– Biology will become a more quantitative science (think how the periodic table affected chemistry)

A review of some basic genetics

DNA
• 4 bases (G, C, T, A)
• base pairs: G--C, T--A
• genes
• non-coding regions
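The pairing rule on this slide (G with C, T with A) is enough to derive one DNA strand from the other. A minimal Python sketch, assuming plain strings of the four bases; the function name and the example sequence are illustrative and not from the slides:

```python
# Watson-Crick pairing from the slide: G--C and T--A.
PAIRS = {"G": "C", "C": "G", "T": "A", "A": "T"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'->3' direction."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


print(reverse_complement("GATTACA"))  # TGTAATC
```

Applying the function twice returns the original sequence, which is a quick sanity check on the pairing table.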

Decoding Genes

What is Bioinformatics?
• The use of information technology to collect, analyze, and interpret biological data.
• An ad hoc collection of computing tools that are used by molecular biologists to manage research data.
– Computational algorithms
– Database schema
– Statistical methods
– Data visualization tools

Genomics
• What is Genomics?
– An operational definition: The application of high-throughput automated technologies to molecular biology.
– A philosophical definition: A holistic or systems approach to the study of information flow within a cell.

Genomics makes LOTS of data!
• Investigators need complex databases just to manage their own experiments
• Biologists need to know how to do data mining to answer even simple questions in these huge data sets
• Librarians understand the challenges of storing and searching large amounts of data

New Biology => New Librarians?
How do Genomics and Bioinformatics overlap or interact with Library Science?
1. The NCBI (Natl. Center for Biotechnology Information), the home of GenBank, is part of the National Library of Medicine
2. We store and organize genes like journal articles: accession number, annotation, etc.
3. A big part of bioinformatics involves keyword searches and SQL queries in relational databases
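As a rough illustration of point 3, the sketch below runs a keyword search over annotations in a small relational table, using Python's built-in sqlite3 module. The table layout, the accession identifiers (DEMO0001 and so on), and the annotation text are all made up for this example; real sequence databases are far larger and organized differently.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences ("
    "accession TEXT PRIMARY KEY, organism TEXT, annotation TEXT)"
)
conn.executemany(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    [
        ("DEMO0001", "Bos taurus", "cDNA clone, 5' end"),
        ("DEMO0002", "Homo sapiens", "cystic fibrosis transmembrane regulator mRNA"),
        ("DEMO0003", "Homo sapiens", "hypothetical protein"),
    ],
)


def keyword_search(keyword: str) -> list[str]:
    """Return accession numbers whose annotation contains the keyword."""
    rows = conn.execute(
        "SELECT accession FROM sequences "
        "WHERE annotation LIKE ? ORDER BY accession",
        (f"%{keyword}%",),  # parameterized, so the keyword is never spliced into the SQL
    )
    return [accession for (accession,) in rows]


print(keyword_search("cystic fibrosis"))  # ['DEMO0002']
```

The accession number plays the same role as a call number: a stable key that retrieves one record, while the annotation is the free text that keyword searches run against.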

Bioinformatics is Not Library Science
• We are NOT cataloging a set of known information
• Programming and complex algorithms: pattern matching, string matching, biostatistics
• Data mining and multi-dimensional visualization tools
• Uncertainty of the data and constant revision of the “known”
– Genes are guesses based on complex algorithms, not books on the shelf

Raw Genome Data:

BLAST Similarity Search

>gb|BE |BE BARC 5BOV Bos taurus cDNA 5'.
Length = 369

Score = 272 bits (137), Expect = 4e-71
Identities = 258/297 (86%), Gaps = 1/297 (0%)
Strand = Plus / Plus

Query: 17   aggatccaacgtcgctccagctgctcttgacgactccacagataccccgaagccatggca 76
            |||||||||||||||| | ||| | ||| || ||| | ||||  ||||| |||||||||
Sbjct: 1    aggatccaacgtcgctgcggctacccttaaccact-cgcagaccccccgcagccatggcc 59

Query: 77   agcaagggcttgcaggacctgaagcaacaggtggaggggaccgcccaggaagccgtgtca 136
            |||||||||||||||||||||||| | || ||||||||| | ||||||||||| ||| ||
Sbjct: 60   agcaagggcttgcaggacctgaagaagcaagtggagggggcggcccaggaagcggtgaca 119

Query: 137  gcggccggagcggcagctcagcaagtggtggaccaggccacagaggcggggcagaaagcc 196
             |||||||| | || | ||||||||||||||| ||||||||||| || ||||||||||||
Sbjct: 120  tcggccggaacagcggttcagcaagtggtggatcaggccacagaagcagggcagaaagcc 179

Query: 197  atggaccagctggccaagaccacccaggaaaccatcgacaagactgctaaccaggcctct 256
            ||||||||| | |||||||| |||||||||||||||||| ||||||||||||||||||||
Sbjct: 180  atggaccaggttgccaagactacccaggaaaccatcgaccagactgctaaccaggcctct 239

Query: 257  gacaccttctctgggattgggaaaaaattcggcctcctgaaatgacagcagggagac 313
            || || ||||| ||  ||||||||||| | |||||||||||||||||| ||||||||
Sbjct: 240  gagactttctcgggttttgggaaaaaacttggcctcctgaaatgacagaagggagac 296
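The "Identities" and "Gaps" figures in a BLAST report are simple column counts over the aligned rows. The sketch below recomputes them for the first 60 columns of the alignment on this slide; the function name is illustrative, and this only counts columns in an alignment that already exists. It does not perform the search or the alignment itself.

```python
def alignment_stats(query: str, sbjct: str) -> tuple[int, int, int]:
    """Return (identical columns, alignment length, gap columns)."""
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have equal length")
    columns = list(zip(query, sbjct))
    matches = sum(q == s and q != "-" for q, s in columns)
    gaps = sum("-" in (q, s) for q, s in columns)
    return matches, len(columns), gaps


# First 60 columns of the alignment shown on the slide (Query 17-76).
query = "aggatccaacgtcgctccagctgctcttgacgactccacagataccccgaagccatggca"
sbjct = "aggatccaacgtcgctgcggctacccttaaccact-cgcagaccccccgcagccatggcc"

matches, length, gaps = alignment_stats(query, sbjct)
print(f"Identities = {matches}/{length} ({100 * matches / length:.0f}%), "
      f"Gaps = {gaps}/{length}")
# Identities = 48/60 (80%), Gaps = 1/60
```

Summing the same counts over all five blocks gives the 258/297 identities reported in the header.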

Multiple Alignment

Protein domains (Pattern analysis)

Clustering (Phylogenetics)

UCSC

The Challenge of New Data Types (Genomics)
• Gene expression microarrays
– thousands of genes, imprecise measurements
– huge images, private file formats
• Proteomics
– high-throughput Mass Spec
– protein chips: protein-protein interactions
• Genotyping
– thousands of alleles, thousands of individuals
• Regulatory Networks

Biological Information

Microarray Technology

Spot your own Chip (plans available for free from Pat Brown’s website)
• Robot spotter
• Ordinary glass microscope slide

cDNA spotted microarrays

Goal of Microarray experiments
• Microarrays are a very good way of identifying a bunch of genes involved in a disease process
– Differences between cancer and normal tissue
– Tuberculosis infected vs resistant lung cells
• Mapping out a pathway
– Co-regulated genes
• Finding function for unknown genes
– Involved in these processes
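Comparing cancer with normal tissue usually comes down to an expression ratio per gene. The sketch below is a deliberately simplified version of that idea: it computes log2(cancer/normal) and keeps genes that change at least 2-fold. The gene names, intensity values, and threshold are hypothetical; a real analysis would also need normalization, replicates, and statistical testing, since the measurements are imprecise.

```python
import math

# Hypothetical spot intensities in arbitrary units: (cancer, normal).
intensities = {
    "geneA": (1600.0, 200.0),
    "geneB": (510.0, 500.0),
    "geneC": (120.0, 960.0),
}


def differential_genes(data, min_abs_log2=1.0):
    """Return {gene: log2(cancer/normal)} for genes past the threshold.

    The default threshold of 1.0 corresponds to a 2-fold change
    in either direction.
    """
    result = {}
    for gene, (cancer, normal) in data.items():
        log_ratio = math.log2(cancer / normal)
        if abs(log_ratio) >= min_abs_log2:
            result[gene] = log_ratio
    return result


for gene, log_ratio in differential_genes(intensities).items():
    print(f"{gene}: log2 ratio {log_ratio:+.1f}")
```

Using a log ratio makes up- and down-regulation symmetric: an 8-fold increase and an 8-fold decrease come out as +3 and -3.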

Proteomics
• Identify all of the proteins in an organism
– Potentially many more than genes due to alternative splicing and post-translational modifications
• Quantitate in different cell types and in response to metabolic/environmental factors
• Protein-protein interactions

Yeast Proteome
Jeong H, Mason SP, Barabasi A-L. Nature 411 (2001) 40-41

Human Genetic Variation
• Every human has essentially the same set of genes
• But there are different forms of each gene, known as alleles
– blue vs. brown eyes
– genetic diseases such as cystic fibrosis or Huntington’s disease are caused by dysfunctional alleles

Alleles are created by mutations in the DNA sequence of one person, which are then passed on to that person's descendants

High-Throughput Genotyping

Relate genes to Organisms
• Diseases
– OMIM: Human Genetic Disease
• Metabolic and regulatory pathways
– KEGG
– Cancer Genome Project

Human Alleles
• The OMIM (Online Mendelian Inheritance in Man) database at the NCBI tracks all human mutations with known phenotypes.
• It contains a total of about 2,000 genetic diseases [and another ~11,000 genetic loci with known phenotypes, but not necessarily known gene sequences]
• It is designed for use by physicians:
– can search by disease name
– contains summaries from clinical studies

Training "computer savvy" scientists
• Know the right tool for the job
• Get the job done with tools available
• Network connection is the lifeline of the scientist
• Jobs change, computers change, projects change; scientists need to be adaptable

Why teach genomics in undergraduate (or Medical) education?
• Demand for trained graduates from the biomedical industry
• Bioinformatics is essential to understand current developments in all fields of biology
• We need to educate an entire new generation of scientists, health care workers, etc.
• Use bioinformatics to enhance the teaching of other subjects: genetics, evolution, biochemistry

Genomics in Medical Education
“The explosion of information about the new genetics will create a huge problem in health education. Most physicians in practice have had not a single hour of education in genetics and are going to be severely challenged to pick up this new technology and run with it.”
Francis Collins

Long Term Implications
• A "periodic table for biology" will lead to an explosion of research and discoveries: we will finally have the tools to start making systematic analyses of biological processes (quantitative biology).
• Understanding the genome will lead to the ability to change it: to modify the characteristics of organisms and people in a wide variety of ways

Stuart M. Brown, Ph.D.
• Bioinformatics: A Biologist's Guide to Biocomputing and the Internet
• Essentials of Medical Genomics