Published by Dwain Farmer. Modified over 9 years ago.
1
Knowledge Discovery from Biological and Clinical Data: BASIC BACKGROUND
2
What is Data Mining?
Jonathan’s rules: Blue or Circle
Jessica’s rules: All the rest
Whose block is this?
[Figure: Jonathan’s blocks, Jessica’s blocks]
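The two rules on this slide can be written down as a tiny classifier. The sketch below is only an illustration: the `Block` type and its `colour` and `shape` attributes are assumptions, since the slide names only "Blue" and "Circle".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    # Hypothetical attributes; the slide only mentions colour and shape informally.
    colour: str
    shape: str


def owner(block: Block) -> str:
    """Jonathan's rule is 'blue or circle'; Jessica gets all the rest."""
    if block.colour == "blue" or block.shape == "circle":
        return "Jonathan"
    return "Jessica"


# Made-up blocks, purely for illustration.
blocks = [Block("blue", "square"), Block("red", "circle"), Block("red", "square")]
owners = [owner(b) for b in blocks]
```

Data mining runs this in the opposite direction: given blocks that are already sorted into two piles, it tries to recover a rule like the one inside `owner`.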
3
What is Data Mining?
Question: Can you explain how?
4
Knowledge Discovery from Biological and Clinical Data: MOTIVATION
5
Driving Forces: Genes, Proteins, Interactions, Diagnosis, & Cures
–Complete genomes are now available
–Knowing the genes is not enough to understand how biology functions
–Proteins, not genes, are responsible for many cellular activities
–Proteins function by interacting with other proteins and biomolecules
GENOME, PROTEOME, INTERACTOME
6
Benefits
If we figure out how these work, we get these:
–To the patient: Better drug, better treatment
–To the pharma: Save time, save cost, make more $
–To the scientist: Better science
7
To figure these out, we bet on...
“solution” = Data Mgmt + Knowledge Discovery
Data Mgmt = Integration + Transformation + Cleansing
Knowledge Discovery = Statistics + Algorithms + Databases
8
Knowledge Discovery from Biological and Clinical Data: ACCOMPLISHMENT
9
Predict Epitopes, Find Vaccine Targets
–Vaccines are often the only solution for viral diseases
–Finding & developing effective vaccine targets (epitopes) is a slow and expensive process
–Develop systems to recognize protein peptides that bind MHC molecules
–Develop systems to recognize hot spots in viral antigens
10
Recognize Functional Sites, Help Scientists
–Effective recognition of initiation, control, and termination of biological processes is crucial to speeding up and focusing scientific experiments
–Data mining of biological sequences to find rules for recognizing & understanding functional sites
–Dragon’s 10x reduction of false positives in TSS (transcription start site) recognition
11
Diagnose Leukaemia, Benefit Children
–Childhood leukaemia is a heterogeneous disease
–Treatment is based on subtype
–3 different tests and 4 different experts are needed for diagnosis
–Curable in USA, fatal in Indonesia
–A single platform diagnosis based on gene expression
–Data mining to discover rules that are easy for doctors to understand
12
Understand Proteins, Fight Diseases
–Understanding the function and role of a protein needs organised info on interaction pathways
–Such info is often reported in scientific papers but is seldom found in structured databases
–Knowledge extraction system to
  –process free text
  –extract protein names
  –extract interactions
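As a rough illustration of the three extraction steps listed above (process free text, extract protein names, extract interactions), here is a minimal pattern-matching sketch. It is not the system the slides describe: the `PROTEINS` lexicon, the `VERBS` list, and the sample sentences are all assumptions chosen for the example, and a practical system would need far more robust name recognition.

```python
import re

# Hypothetical mini-lexicon of protein names; a real system would not rely on a fixed list.
PROTEINS = {"MDM2", "p53", "BRCA1", "BARD1"}

# Hypothetical interaction phrases to look for between two names.
VERBS = ("interacts with", "binds", "phosphorylates")

# Matches "<token> <verb phrase> <token>"; tokens are filtered against the lexicon afterwards.
PATTERN = re.compile(
    r"\b([A-Za-z0-9]+)\s+(" + "|".join(map(re.escape, VERBS)) + r")\s+([A-Za-z0-9]+)\b"
)


def extract_interactions(text):
    """Return (protein, verb, protein) triples found in free text."""
    found = []
    for a, verb, b in PATTERN.findall(text):
        if a in PROTEINS and b in PROTEINS:
            found.append((a, verb, b))
    return found


# Illustrative sentences written for this example, not taken from any paper.
text = "We confirm that MDM2 binds p53. In addition, BRCA1 interacts with BARD1 in the nucleus."
pairs = extract_interactions(text)
```

The output is a list of structured triples, which is the kind of record that could then be loaded into a database of interactions.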
13
Knowledge Discovery from Biological and Clinical Data: OPPORTUNITY
14
Objectives
–Translate inspiration from biological systems into advancement of life and computing sciences
–Advance data mining technologies in decision systems for complex problems
Direction & Plan
To work on practical systems for
–data mining
–data cleansing
–knowledge extraction
Applied to
–gene regulation
–protein interaction
–clinical data analysis
–ligand-receptor interaction
15
E.g., How to Get More Out of the Same Experiments?
How to recognize false positives from two-hybrid and other types of high-throughput protein interaction experiments?
Some initial thoughts:
[Figure: two interaction configurations, labelled a and b]
It seems that configuration a is less likely than b. Can we exploit this?
16
E.g., How to Improve Classifier Algorithms?
SVM, ANN, etc.
–Good accuracy,
–but not easy to understand
C4.5, CART, etc.
–Clear rules,
–but lower accuracy
Why can’t we have a classifier algorithm that
–handles high dimension
–achieves high accuracy
–provides understandable rules
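To make the "clear rules, but lower accuracy" trade-off concrete, the sketch below learns a single-feature rule in the style of the 1R (OneR) method. It is a deliberately simpler stand-in, not C4.5 or CART, and not a method proposed in these slides. The toy data reuses the block-sorting example from the opening slides, and the rows themselves are made up.

```python
from collections import Counter, defaultdict


def one_rule(rows, labels):
    """Pick the single feature whose value -> majority-label rule makes the fewest training errors.

    Returns (feature_index, rule_dict, error_count).
    """
    best = None
    for f in range(len(rows[0])):
        # Count how often each label occurs for each value of feature f.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[f]][label] += 1
        # The rule predicts the majority label for every observed value.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[f]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (f, rule, errors)
    return best


# Made-up blocks as (colour, shape), labelled by the "blue or circle" rule.
rows = [
    ("blue", "square"), ("blue", "circle"), ("red", "circle"),
    ("red", "square"), ("green", "square"), ("green", "circle"),
]
labels = ["Jonathan", "Jonathan", "Jonathan", "Jessica", "Jessica", "Jonathan"]

feature, rule, errors = one_rule(rows, labels)
```

On this data the learner picks the shape feature and produces a rule that is easy to read (circle means Jonathan, square means Jessica), but it still misclassifies the blue square, because one feature cannot express "blue or circle". That is the readability-versus-accuracy tension the slide points at.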
17
Who will you be working with...
I2R: Limsoon Wong, See-Kiong Ng, Jinyan Li, Vladimir Bajic, Vladimir Brusic
SOC: Mong Li Lee, Ken Sung, Wynne Hsu