The Impact of Computer Technology in Molecular Biology and Genetics

The Impact of Computer Technology in Molecular Biology and Genetics By Abtahi TISHAd

Why this topic? I am a pre-medical student. I am a research assistant at the UNC School of Medicine (Neuroscience), where I work with PCR, DNA sequencing, and cryostat sectioning. I was mesmerized by the immense role of technology in data collection, storage, and analysis in the scientific world.

Overview: Computer Technology in Molecular Biology and Genetics. Topics to cover: sequencing DNA; genomic databases; BLAST and its basic algorithm; identifying genes; significance of genomic analysis and future prospects.

Sequencing DNA. What is DNA sequencing? It is taking a sample of DNA and determining its nucleotide base sequence so that it can be stored or analyzed. Advances in technology have made DNA sequencing faster, cheaper, and more efficient than ever before. The cost of sequencing a human genome was about $100 million in 2001; it was $1,245 in 2015 (“DNA Sequencing”, 2016).

How DNA Sequencing Works: Sanger/Original Method. Many copies of the DNA segment are extended from identical primers. Extension ends at selectively marked (chain-terminating) nucleotides, creating strands of many different lengths. Smallest strand = 1st nucleotide, 2nd smallest = 2nd nucleotide ... largest = last nucleotide. The final nucleotide of each strand is marked with a specific color-coded dye so that it can be recognized by the device. The entire process is automated by the machine. Example read-out: G, T, C, A.
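
The read-out step described above can be sketched in a few lines of Python. This is only an illustration, assuming each terminated strand is represented by a hypothetical (length, dye-labelled base) pair; the function name and the data are made up for the example and do not describe any real sequencer's software.

```python
def read_sanger_fragments(fragments):
    """Recover a base sequence from (length, terminal_base) pairs.

    Each pair stands for one terminated strand: its length and the
    dye-labelled base at its end. Sorting by length puts the bases
    in sequence order (smallest strand = 1st nucleotide).
    """
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(base for _, base in ordered)


# Hypothetical fragments, deliberately out of order.
fragments = [(3, "C"), (1, "G"), (4, "A"), (2, "T")]
print(read_sanger_fragments(fragments))  # GTCA
```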

Next Generation Sequencing. Technological innovation has led to methods that can sequence DNA orders of magnitude faster. Fluorescence imaging techniques capture nucleotide sequences at a rate that can be 200 times faster than the original method! Again 100% automated, but a lot faster. Many techniques are essentially the Sanger method, but even more compartmentalized for efficiency.

Storing the Data: GenBank. The massive amounts of information gathered from DNA sequencing must be stored. What is GenBank? It is a very large database of DNA sequences (Klug, Cummings, Spencer, & Palladino, 2013), run by the National Center for Biotechnology Information (NCBI). It contains more than 100 billion base pairs of DNA sequence, and it doubles in size every 14 to 18 months (Klug, Cummings, Spencer, & Palladino, 2013).
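
To get a feel for what a 14 to 18 month doubling time implies, here is a small Python sketch. It assumes the doubling time stays constant, which is an assumption made for illustration and not a claim from the source; the function name and the 36-month horizon are likewise arbitrary.

```python
def projected_size(current_bp, months_ahead, doubling_months):
    """Project database size assuming a constant doubling time.

    size(t) = current_bp * 2 ** (t / doubling_months)
    """
    return current_bp * 2 ** (months_ahead / doubling_months)


start_bp = 100e9  # 100 billion base pairs, the figure quoted above
for doubling_months in (14, 18):
    size = projected_size(start_bp, months_ahead=36, doubling_months=doubling_months)
    print(f"doubling every {doubling_months} months: {size:.3g} bp after 36 months")
```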

GenBank (cont.) Are the DNA sequences just stored? No: each sequence is annotated with the genes it codes for and is assigned an accession number. This number is unique to a particular sequence and is used to locate the sequence in the large database. Connection to class: the use of algorithms. Input: accession number. Output: target DNA sequence/gene of interest.
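
The input/output relationship above is essentially a keyed lookup. The sketch below uses a plain Python dictionary as a stand-in for the database; the accession numbers, gene labels, and sequences are all invented for the example and are not real GenBank records.

```python
# Toy stand-in for a sequence database. Every entry here is made up.
TOY_DATABASE = {
    "DEMO0001": {"gene": "example gene A", "sequence": "ATGGCCATTGTAATGGGCCGC"},
    "DEMO0002": {"gene": "example gene B", "sequence": "ATGAAACGCATTAGCACCACC"},
}


def fetch_by_accession(accession, database=TOY_DATABASE):
    """Input: accession number. Output: the stored DNA sequence."""
    try:
        return database[accession]["sequence"]
    except KeyError:
        raise KeyError(f"unknown accession number: {accession}") from None


print(fetch_by_accession("DEMO0001"))
```

Because the accession number is unique, the lookup returns exactly one sequence or fails outright.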

DNA vs. Genes. DNA is the molecule in which genetic information is stored. Genes are specific DNA sequences that code for proteins. Protein expression is what produces phenotypic traits (hair color, skin color, etc.). Only 2% of human DNA consists of genes (Klug, Cummings, Spencer, & Palladino, 2013).

How are the Genes Identified? Gene identification from DNA sequences is called “annotation”. Annotation relies heavily on computer technology to ensure that DNA sequences are identified quickly, efficiently, and correctly. This is the field of bioinformatics.

What is Bioinformatics? It is the use of computer hardware, software, and mathematics to analyze large quantities of genetic information. Examples: GenBank, BLAST. It is a mesh of biology, computer science, and mathematics.

Basic Local Alignment Search Tool: BLAST. What is BLAST? It is a software tool that compares DNA sequences in search of similarities. Why is this useful? Unknown DNA sequences can be identified by comparing them to already known sequences in different species. Example: using the known DNA sequence of Gene X in rats to find a similar Gene Y in mice. The sequences should have a high similarity.

Before Explaining BLAST: How DNA Works. DNA consists of repeating patterns of nucleotides with 4 distinct nitrogenous bases: adenine, thymine, cytosine, and guanine. The order and number of nitrogenous bases are what differentiate the DNA of different organisms. Closely related organisms have very similar DNA sequences, and may even contain some of the same genes.

How BLAST Works. BLAST aligns two different DNA sequences and produces an identity value. The identity value is a measure of similarity between the two sequences (Klug, Cummings, Spencer, & Palladino, 2013). Example: if DNA sequence X in mice contains the insulin gene, and DNA sequence Y in rats is 99% similar to DNA sequence X, then Y very likely contains the insulin gene, or a similar equivalent in rats. Thus, using BLAST, we have identified the insulin gene in rats from known data about mice. BLAST essentially uses the same mechanism as a primitive search engine.
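
As a rough illustration of an identity value, the Python sketch below computes the percentage of matching positions between two sequences that are assumed to be already aligned and of equal length. It is a simplification: BLAST reports identity over the local alignment it finds, and the sequences used here are made up.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned positions where both sequences agree.

    Both inputs must already be aligned (same length, '-' for gaps).
    Gap positions count toward the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 10-base sequences differing at one position.
print(percent_identity("ATGGCCATTG", "ATGGCTATTG"))  # 90.0
```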

The Basic Algorithm of BLAST. BLAST works similarly to most search engines. Users input a DNA sequence and/or the mRNA transcript of a specific gene within an organism. The software has access to an index of information (the GenBank database). It returns results from the index to the user, ranked by how similar they are to the initial search query. The user can adjust the search algorithm based on what they are looking for in particular.

BLAST Algorithm in CS-Speak. 1. Remove low-complexity regions or repeats from the query sequence; this prevents a false sense of high similarity. 2. Make a list of k-letter words from the query sequence (e.g. 3 letters); this compartmentalizes the query and speeds up the search. 3. Scan the database for matching words. 4. Assign numerical values to the results (e.g. perfect match = 7, mismatch = -4); the process is repeated for every word in the query. 5. Compile the matches together for statistical analysis based on matching score.

BLAST Algorithm (Cont.) 6. The matched sequences (high-scoring segment pairs, HSPs) that pass the statistical significance test are aligned together if they are continuous. 7. These sequences are subsequently further analyzed and reported. Source: http://onlinelibrary.wiley.com/doi/10.1002/9780470015902.a0005253.pub2/full
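
Steps 2 to 4 above can be sketched in Python as a toy seed-and-extend search. This is a teaching sketch and not the real BLAST implementation: it uses exact word matches only, a simple ungapped extension with a hypothetical drop-off threshold (`max_drop`), the example scores from the slide (+7 and -4), and it skips low-complexity filtering and the statistical significance test entirely. All function names are invented for the example.

```python
from collections import defaultdict

MATCH, MISMATCH = 7, -4  # example scores quoted in the slide


def index_words(query, k=3):
    """Step 2: map every k-letter word of the query to its positions."""
    words = defaultdict(list)
    for i in range(len(query) - k + 1):
        words[query[i:i + k]].append(i)
    return words


def find_seeds(query, subject, k=3):
    """Step 3: scan the subject for words that also occur in the query."""
    words = index_words(query, k)
    seeds = []
    for j in range(len(subject) - k + 1):
        for i in words.get(subject[j:j + k], []):
            seeds.append((i, j))  # (query position, subject position)
    return seeds


def extend_seed(query, subject, i, j, k=3, max_drop=8):
    """Step 4: score an ungapped extension of one seed in both directions.

    Returns (best_score, query_start, query_end) for the best segment.
    Extension stops once the score falls more than max_drop below the best.
    """
    best = score = k * MATCH
    q_start, q_end = i, i + k
    qi, sj = i + k, j + k
    while qi < len(query) and sj < len(subject):  # extend to the right
        score += MATCH if query[qi] == subject[sj] else MISMATCH
        qi += 1
        sj += 1
        if score > best:
            best, q_end = score, qi
        elif best - score > max_drop:
            break
    score = best
    qi, sj = i - 1, j - 1
    while qi >= 0 and sj >= 0:  # extend to the left
        score += MATCH if query[qi] == subject[sj] else MISMATCH
        if score > best:
            best, q_start = score, qi
        elif best - score > max_drop:
            break
        qi -= 1
        sj -= 1
    return best, q_start, q_end


def best_hit(query, subject, k=3):
    """Return the highest-scoring extended seed, or None if no word matches."""
    hits = [extend_seed(query, subject, i, j, k)
            for i, j in find_seeds(query, subject, k)]
    return max(hits, default=None)


# Made-up sequences: the whole query occurs inside the subject.
print(best_hit("GATTACA", "CCGATTACATT"))  # (49, 0, 7)
```

Indexing the query words first means the subject is scanned only once, which is the speed-up the word list is meant to provide.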

Demonstration: human actin gene mRNA sequence. Link 1: https://www.ncbi.nlm.nih.gov/gene Link 2: https://blast.ncbi.nlm.nih.gov/Blast.cgi

Potential Problems with the “old” BLAST. Previous versions of BLAST did not account for introns (non-coding sequences of DNA) or for the location of promoters, enhancers, and silencers. These are all needed to express a gene as a protein. Two genes could contain similar nucleotide sequences but code for different proteins. In addition, BLAST requires a template, an already identified gene, to compare the target gene to.

More Complex Organisms Require More Innovative Algorithms for Analysis. There are simply more factors to take into account when studying the DNA of a complex organism, such as a human, than when studying a less complex organism, such as a bacterium. How is this solved? Many annotation programs have adopted newer algorithms that incorporate many of the factors previously mentioned (Klug, Cummings, Spencer, & Palladino, 2013): introns, promoters, enhancers/silencers, termination regions, etc. BLAST now provides information on mRNA and protein sequences of various genes.

Let's Summarize: Computer Technology is Involved in Every Aspect of Genomic Analysis. DNA sequencing machines: sequence DNA. Online databases: store DNA sequences. Computer software: analyzes DNA sequences. Without computers, much of today’s knowledge in genetics would not be possible.

Significance of Genomic Analysis. Identifying evolutionary relationships between different organisms: BLAST comparisons. Basic gene-to-protein relationships: genomic analysis found that there are significantly more proteins than there are genes that code for them. Identifying the origin of genetic diseases and ways to cure or prevent them. Availability of GMOs: the Green Revolution in India saved the lives of millions of people through high-yielding cereal crops.

Development of New Areas of Research Due to Tech Advances. Nutrigenomics: the study of how diet affects the genes. It could allow scientists to give subjects healthy dietary recommendations based on the analysis of their genes (Klug, Cummings, Spencer, & Palladino, 2013). Gene analysis is done with the help of technology to locate specific target sequences pertaining to health. Stone-Age Genomics: the study of ancient DNA, primarily from extinct animals (Klug, Cummings, Spencer, & Palladino, 2013). It allows for a better understanding of the lives of organisms from the past and of the similarities they have to organisms of today.

The Future: The Encyclopedia of DNA Elements Project (ENCODE). Successor of the Human Genome Project. The goal is to identify all promoters, enhancers, silencers, and open reading frames within the human genome (Klug, Cummings, Spencer, & Palladino, 2013).

The Future: Personalized Genome Projects. The low cost of DNA sequencing brought about by newer technology has made it possible for individuals to get a readout of their own genome (Klug, Cummings, Spencer, & Palladino, 2013). This can lead to personalized medicine and better treatment of diseases.

Potential Concerns: Genetic Discrimination. Technology has enabled us to learn more about our genome than ever before, but this can lead to potential concerns such as genetic discrimination. Current laws do not protect workers within small businesses, nor anyone within the U.S. military (“Genetic Discrimination”, 2016). Another concern is the hacking of sensitive genetic information. With greater progress comes greater responsibility.

Works Cited
Altschul, S. F. (2014). BLAST Algorithm. ELS. doi:10.1002/9780470015902.a0005253.pub2
DNA Sequencing. (n.d.). Retrieved March 30, 2017, from https://www.khanacademy.org/science/biology/biotech-dna-technology/dna-sequencing-pcr-electrophoresis/a/dna-sequencing
Genetic Discrimination. (2016, May 2). Retrieved March 30, 2017, from https://www.genome.gov/10002077/
Klug, W., Cummings, M., Spencer, C., & Palladino, M. (2013). Chapters 18.3-18.5. In Essentials of Genetics (8th ed., pp. 362-372). Boston, MA: Pearson.