Application of Unstructured Learning in Computational Biology Tony C Smith Department of Computer Science University of Waikato

Presentation transcript:

Application of Unstructured Learning in Computational Biology Tony C Smith Department of Computer Science University of Waikato

Unstructured learning in computational biology Tony C Smith
Computability. Before computers were built, mathematicians knew what they could do: arithmetic (e.g. missile trajectories), search (e.g. keys for secret codes), sort (e.g. census information), … anything with a mathematical algorithm.

Artificial Intelligence. Computers do things only human brains can otherwise do. [Diagram built up over three slides; labels: expert, expert system, learning system.]

Machine learning
What is machine learning? Creating computer programs that get better with experience, that learn how to make expert judgments, and that discover previously hidden, potentially useful information (data mining).
How does it work? The user provides the learning system with examples of the concept to be learned. An induction algorithm infers a characteristic model of the examples. The model is used to predict whether or not future novel instances are also examples, and it does this very consistently, and very, very quickly!

Structured learning: Mushroom Data

Weight   Damage   Dirt    Firmness   Quality
heavy    high     mild    hard       poor
heavy    high     mild    soft       poor
normal   high     mild    hard       good
light    medium   mild    hard       good
light    clear    clean   hard       good
normal   clear    clean   soft       poor
heavy    medium   mild    hard       poor
...

[Decision tree figure. Nodes: weight, dirt, firmness. Branches: heavy / light / normal, mild / clean, hard / soft. Leaves: poor, good.]
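To make the structured-learning slide concrete, here is a minimal sketch of how an induction algorithm could turn the seven visible mushroom rows into a decision tree. It assumes an ID3-style learner (pick the attribute with the highest information gain); the slides do not say which algorithm produced the tree in the figure, so the names `build_tree`, `classify`, and `TREE` are illustrative only, and the tree found here may differ from the one drawn on the slide below its root.

```python
import math
from collections import Counter

ATTRIBUTES = ["weight", "damage", "dirt", "firmness"]

# The seven rows visible on the slide; the last field is the class (quality).
ROWS = [
    ("heavy", "high", "mild", "hard", "poor"),
    ("heavy", "high", "mild", "soft", "poor"),
    ("normal", "high", "mild", "hard", "good"),
    ("light", "medium", "mild", "hard", "good"),
    ("light", "clear", "clean", "hard", "good"),
    ("normal", "clear", "clean", "soft", "poor"),
    ("heavy", "medium", "mild", "hard", "poor"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, index):
    """Reduction in class entropy obtained by splitting rows on one attribute."""
    remainder = 0.0
    for value in set(row[index] for row in rows):
        subset = [row[-1] for row in rows if row[index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[-1] for row in rows]) - remainder


def build_tree(rows, indices):
    """Recursively split on the most informative attribute (ID3-style assumption)."""
    labels = [row[-1] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not indices:
        return majority  # leaf: a single class label
    best = max(indices, key=lambda i: information_gain(rows, i))
    remaining = [i for i in indices if i != best]
    branches = {
        value: build_tree([row for row in rows if row[best] == value], remaining)
        for value in sorted(set(row[best] for row in rows))
    }
    return {"attribute": best, "default": majority, "branches": branches}


def classify(tree, example):
    """Walk the tree; fall back to the node's majority class for unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(example[tree["attribute"]], tree["default"])
    return tree


TREE = build_tree(ROWS, list(range(len(ATTRIBUTES))))
print("root attribute:", ATTRIBUTES[TREE["attribute"]])
print("prediction:", classify(TREE, ("heavy", "clear", "clean", "hard")))
```

On these rows the learner splits on weight first, which agrees with the root of the tree on the slide. Because only seven rows are visible, the lower levels of the learned tree depend on tie-breaking and should not be read as the author's result.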

Unstructured learning. Data does not have fixed fields with specific values. Examples: images, continuous signals, expression data, text. Learning proceeds by correlating the presence or absence of any and all salient attributes.
Document classification. Given examples of documents covering some topic, learn a semantic model that can recognize whether or not other documents are relevant. Prioritize them, i.e. quantify “how relevant” documents are to the topic. Not limited to keywords (nor misled by them). Adapts to the user’s needs (ephemeral or long-term).
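The idea of scoring relevance from the presence or absence of attributes can be sketched with a small Bernoulli naive Bayes model. This is an assumption made for illustration: the slides do not name the algorithm behind the demo, and the class `RelevanceModel`, the tokenizer, and the four toy documents are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class RelevanceModel:
    """Bernoulli naive Bayes over word presence/absence (illustrative only)."""

    def __init__(self, relevant_docs, other_docs):
        self.doc_totals = {True: len(relevant_docs), False: len(other_docs)}
        self.counts = {True: Counter(), False: Counter()}
        for label, docs in ((True, relevant_docs), (False, other_docs)):
            for doc in docs:
                # Count each word once per document: presence, not frequency.
                self.counts[label].update(set(tokenize(doc)))
        self.vocabulary = set(self.counts[True]) | set(self.counts[False])

    def score(self, text):
        """Log-odds that the text is relevant; higher means more relevant."""
        present = set(tokenize(text))
        log_odds = math.log(self.doc_totals[True] / self.doc_totals[False])
        for word in self.vocabulary:
            # Laplace-smoothed probability that a document contains the word.
            p_rel = (self.counts[True][word] + 1) / (self.doc_totals[True] + 2)
            p_oth = (self.counts[False][word] + 1) / (self.doc_totals[False] + 2)
            if word in present:
                log_odds += math.log(p_rel / p_oth)
            else:
                log_odds += math.log((1 - p_rel) / (1 - p_oth))
        return log_odds


# Hypothetical training examples, not taken from the presentation.
RELEVANT = ["protein kinase binds the receptor",
            "gene expression regulates protein folding"]
OTHER = ["the team won the football match",
         "stock prices fell on the market"]

model = RelevanceModel(RELEVANT, OTHER)
queries = ["a kinase regulates gene expression", "football market prices"]
for query in sorted(queries, key=model.score, reverse=True):
    print(f"{model.score(query):+.2f}  {query}")
```

Sorting by the score gives the prioritization described on the slide: documents are ranked by how relevant they appear rather than simply accepted or rejected. Absent words contribute to the score as well as present ones, which is one way a model can avoid relying on keywords alone.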

Document classification demo

Bioinformatics: finding genes; determining gene roles; determining protein functions; empirical tests; sequence similarity comparison; literature.

GO-KDS demo

Amino Acid [figure labels: amino group, carboxyl group, R group]

Amino Acid [figure examples: glycine, tyrosine]

DNA encodes amino acids
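As a small illustration of how DNA encodes amino acids, the sketch below translates a coding sequence three bases (one codon) at a time using the standard genetic code. The function name `translate` and the example sequence are assumptions for illustration; the sequence was chosen so that it encodes glycine (G) and tyrosine (Y), the two amino acids shown earlier.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Reads codons from the first base, stops at the first stop codon,
    and ignores any trailing incomplete codon.
    """
    dna = dna.upper()
    protein = []
    for start in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[start:start + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# ATG (methionine, start), GGT (glycine), TAT (tyrosine), TAA (stop)
print(translate("ATGGGTTATTAA"))
```

The sketch handles only the simplest case: an uninterrupted coding strand read in a known frame. Finding where genes start and which frame to read is a separate problem.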

Rasmol demo

Biotechnology. Biologists know proteins; computer scientists know machine learning. Together, they can find out a lot of hidden information about genes and proteins. Biotechnology is a multi-billion dollar industry. Biotechnology is one of the best funded areas of scientific research.