Latent Structure Models and Statistical Foundation for TCM Nevin L. Zhang Department of Computer Science & Engineering The Hong Kong University of Science & Technology

INCOB 2007 / Slide 2: Outline
- Hierarchical Latent Class (HLC) Models
- Motivation
- Empirical Results on TCM Data
- Empirical Results on Other Data
- Conclusions

Slide 3: Hierarchical Latent Class (HLC) Models
- Bayesian networks with
  - Rooted tree structure
  - Discrete random variables
  - Leaves observed (manifest variables)
  - Internal nodes latent (latent variables)
- Since renamed latent tree models

Slide 4: Example
- Manifest variables
  - Math Grade, Science Grade, Literature Grade, History Grade
- Latent variables
  - Analytic Skill, Literal Skill, Intelligence
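The example can be made concrete with a small sketch. The tree structure used here (Intelligence as root, with Analytic Skill above the Math and Science grades and Literal Skill above the Literature and History grades) is an assumption, since the transcript lists only the variables. The binary states and every probability value are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical binary states: 0 = low, 1 = high. All numbers are illustrative.
P_INTELLIGENCE = [0.5, 0.5]
# Conditional tables: row = parent state, column = child state.
P_SKILL_GIVEN_INTEL = [[0.8, 0.2], [0.2, 0.8]]
P_GRADE_GIVEN_SKILL = [[0.9, 0.1], [0.3, 0.7]]


def joint_manifest(math_g, science_g, literature_g, history_g):
    """P(Math, Science, Literature, History), summing out the three latent variables."""
    total = 0.0
    for intel, analytic, literal in product((0, 1), repeat=3):
        p = P_INTELLIGENCE[intel]
        p *= P_SKILL_GIVEN_INTEL[intel][analytic]
        p *= P_SKILL_GIVEN_INTEL[intel][literal]
        p *= P_GRADE_GIVEN_SKILL[analytic][math_g]
        p *= P_GRADE_GIVEN_SKILL[analytic][science_g]
        p *= P_GRADE_GIVEN_SKILL[literal][literature_g]
        p *= P_GRADE_GIVEN_SKILL[literal][history_g]
        total += p
    return total


def total_probability():
    """Sum of the joint over all 16 grade combinations (should be 1)."""
    return sum(joint_manifest(*grades) for grades in product((0, 1), repeat=4))
```

Under these made-up numbers, four high grades are more probable together than a mixed pattern such as high Math, low Science, high Literature, low History. That is the kind of co-occurrence the latent variables are meant to explain.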

Slide 5: Learning Latent Tree Models: The Problem
Given data on the manifest variables:
  Y1  Y2  …  Y6  Y7
  1   0   …  1   1
  1   1   …  0   0
  0   1   …  0   1
  …   …   …  …   …
Determine
- Number of latent variables
- Cardinality of each latent variable
- Model structure
- Conditional probability distributions
Two perspectives
- Latent structure discovery
- Multidimensional clustering
  - Generalizing latent class analysis

Slide 6: Learning Latent Tree Models: The Algorithms
- Model selection
  - Several scores examined: BIC, BICe, CS, AIC, holdout likelihood
  - BIC: best choice for the time being
- Model optimization
  - Double hill climbing (DHC), 2002: 7 manifest variables
  - Single hill climbing (SHC), 2004: 12 manifest variables
  - Heuristic SHC (HSHC), 2004: 50 manifest variables
  - EAST, 2007: as efficient as HSHC, and finds better models
  - EAST + Divide-and-Conquer: 100+ manifest variables
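As a rough illustration of BIC-based model selection, the sketch below uses the standard "larger is better" form of BIC: maximized log-likelihood minus half the number of free parameters times the log of the sample size. The slides do not state which exact form was used, so this form is an assumption. The function names and the candidate figures in the usage note are hypothetical, and the search procedures (DHC, SHC, HSHC, EAST) are not reproduced here.

```python
import math


def bic_score(log_likelihood, num_free_params, num_records):
    """BIC in 'larger is better' form: fit minus a complexity penalty."""
    return log_likelihood - 0.5 * num_free_params * math.log(num_records)


def select_by_bic(candidates, num_records):
    """Return the name of the candidate model with the highest BIC.

    candidates maps a model name to (maximized log-likelihood, free parameter count).
    """
    return max(
        candidates,
        key=lambda name: bic_score(*candidates[name], num_records),
    )
```

For example, with made-up candidates `{"simple": (-1000.0, 10), "complex": (-995.0, 30)}` and 2600 records, `select_by_bic` picks `"simple"`: the small gain in likelihood does not pay for the 20 extra parameters.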

Illustration of the search process

Slide 8: Motivation
- Latent structure discovery and multidimensional clustering are potentially useful in many applications.
- Our work is driven by research on traditional Chinese medicine (TCM).

Slide 9: What is there to be done?
- TCM statement:
  - Yang deficiency (阳虚): intolerance to cold (畏寒), cold limbs (肢冷), cold lumbus and back (腰背冷), and so on.
  - Regarded by many as not scientific, even groundless.
- Two aspects to the meaning
  1. Claim: there exists a class of patients who characteristically have the cold symptoms. The cold symptoms co-occur in a group of people.
  2. Explanation offered: the symptoms are due to deficiency of Yang, which fails to warm the body.
- What to do?
  - Previous work focused on 2.
  - New idea: do data analysis for 1.

Slide 10: Objectivity of the Claimed Pattern
- TCM claim: there exists a class of patients in whom symptoms such as 'intolerance to cold', 'cold limbs', 'cold lumbus and back', and so on co-occur.
- How to prove or disprove that such claimed TCM classes exist in the world?
  - Systematically collect data about symptoms of patients.
  - Perform cluster analysis to obtain natural clusters of patients.
  - If the natural clusters correspond to the TCM classes, then YES:
    1. Existence of TCM classes validated
    2. Descriptions of TCM classes refined and systematically expanded
    3. A statistical foundation for TCM established

Slide 11: Why Latent Tree Models?
- TCM uses multiple interrelated latent concepts to explain co-occurrence of symptoms
  - Kidney Yang deficiency (肾阳虚), Kidney Yin deficiency (肾阴虚), Kidney Essence insufficiency (肾精亏虚), …
- Need latent structure models
  - With multiple interrelated latent variables
- Latent tree models are the simplest such models

Slide 12: Empirical Results
- Can we find the claimed TCM classes using latent tree models?
  - We collected a data set about kidney deficiency (肾虚)
  - 35 symptom variables, 2600 records

Result of Data Analysis
- Y0-Y34: manifest variables from data
- X0-X13: latent variables introduced by data analysis
- The structure is interesting and supports TCM's theories about various symptoms.

Slide 14: Latent Clusters
- X1:
  - 5 states: s0, s1, s2, s3, s4
  - Samples grouped into 5 clusters
- Cluster X1=s4: {sample | P(X1=s4|sample) > 0.95}
  - Cold symptoms co-occur in these samples
- Class implicitly claimed by TCM found!
- Description of class refined
  - By math rather than by words
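The cluster definition above is a threshold on posterior probabilities, which can be sketched in a few lines. The function name `hard_cluster` and the input layout (one list of state probabilities per sample, assumed to have been computed already by inference in the learned model) are assumptions made for illustration; only the 0.95 threshold comes from the slide.

```python
def hard_cluster(posteriors, state, threshold=0.95):
    """Indices of samples whose posterior P(X = state | sample) exceeds threshold.

    posteriors: one sequence of state probabilities per sample.
    """
    return [i for i, p in enumerate(posteriors) if p[state] > threshold]
```

With hypothetical posteriors over the five states of X1, `hard_cluster([[0.01, 0.01, 0.01, 0.01, 0.96], [0.2, 0.2, 0.2, 0.2, 0.2]], state=4)` keeps only the first sample. Samples with no dominant state are left out of every cluster under this rule.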

Slide 15: Other TCM Data Sets
- From Beijing U of TCM, 973 project
  - Depression
  - Hepatitis B
  - Chronic Renal Failure
  - Other data to be analyzed
- China Academy of TCM
  - Subhealth
  - Type 2 Diabetes
  - More analysis to come under a new 973 project
- In all cases, claimed TCM classes
  - Validated
  - Quantified and refined

Slide 16: Results on a Marketing Data Set
- CoIL Challenge 2000
- Customer records of a Dutch insurance company
- 42 manifest variables, 5822 records

Slide 17: Results on a Danish Beer Data Set
- Market research
- 783 samples
- States of manifest variables
  1. Never heard of
  2. Heard of but not tasted
  3. Tasted but don't drink regularly
  4. Drink regularly

Slide 18: Result on a Survey Data Set
- Survey on corruption
- 31 manifest variables, records

Slide 19: Conclusions
- Latent tree models, and latent structure models in general,
  - Offer a framework for latent structure discovery and multidimensional clustering
  - Can play a fundamental role in modernizing TCM
  - Can be useful in many other areas, such as marketing and survey studies
- We have only scratched the surface. A lot of interesting research work is yet to be done.