Latent Tree Models & Statistical Foundation for TCM Nevin L. Zhang Joint Work with: Chen Tao, Wang Yi, Yuan Shihong Department of Computer Science & Engineering.

Presentation transcript:

Latent Tree Models & Statistical Foundation for TCM Nevin L. Zhang Joint Work with: Chen Tao, Wang Yi, Yuan Shihong Department of Computer Science & Engineering The Hong Kong University of Science & Technology

Learning Latent Tree Models & TCM ASEAN-China IBW: Page 2
Publications
- N. L. Zhang, S. H. Yuan, T. Chen, and Y. Wang (2008). Latent tree models and diagnosis in traditional Chinese medicine. Artificial Intelligence in Medicine, 42.
- N. L. Zhang, S. H. Yuan, T. Chen, and Y. Wang (2008). Statistical Validation of TCM Theories. Journal of Alternative and Complementary Medicine. Accepted.
- N. L. Zhang, S. H. Yuan, T. Chen, and Y. Wang (2007). Hierarchical Latent Class Models and Statistical Foundation for Traditional Chinese Medicine. 11th Conference on Artificial Intelligence in Medicine (AIME 07), 7-11 July 2007, Amsterdam, The Netherlands.

Latent Tree Models (LTM)
- Bayesian networks with
  - Rooted tree structure
  - Discrete random variables
  - Leaves observed (manifest variables)
  - Internal nodes latent (latent variables)
- Also known as hierarchical latent class (HLC) models
- Parameters: P(Y1), P(Y2|Y1), P(X1|Y2), P(X2|Y2), …
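
As a minimal sketch of how conditional distributions such as P(Y1), P(Y2|Y1), P(X1|Y2), P(X2|Y2) define a distribution over the observed leaves, the following uses a hypothetical four-node tree Y1 → Y2 → {X1, X2} with binary variables. The tree shape and all probability values are assumptions made up for illustration; they are not taken from the slides.

```python
from itertools import product

# Hypothetical CPTs for the tree Y1 -> Y2 -> {X1, X2}; all variables are binary.
p_y1 = [0.6, 0.4]                          # P(Y1)
p_y2_given_y1 = [[0.9, 0.1], [0.2, 0.8]]   # P(Y2 | Y1), rows indexed by Y1
p_x1_given_y2 = [[0.8, 0.2], [0.3, 0.7]]   # P(X1 | Y2), rows indexed by Y2
p_x2_given_y2 = [[0.7, 0.3], [0.1, 0.9]]   # P(X2 | Y2), rows indexed by Y2


def joint(y1, y2, x1, x2):
    """Joint probability as the product of CPT entries along the tree."""
    return (p_y1[y1] * p_y2_given_y1[y1][y2]
            * p_x1_given_y2[y2][x1] * p_x2_given_y2[y2][x2])


def manifest_marginal(x1, x2):
    """P(X1, X2): sum the joint over the latent variables Y1 and Y2."""
    return sum(joint(y1, y2, x1, x2)
               for y1, y2 in product(range(2), repeat=2))


# Distribution over the manifest variables only; the latent ones are summed out.
marginal = {(x1, x2): manifest_marginal(x1, x2)
            for x1, x2 in product(range(2), repeat=2)}
total = sum(marginal.values())
```

Only the leaves X1 and X2 appear in `marginal`; the latent variables influence it solely through the sums, which is what makes them "latent".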

Example
- Manifest variables
  - Math Grade, Science Grade, Literature Grade, History Grade
- Latent variables
  - Analytic Skill, Literal Skill, Intelligence

Learning Latent Tree Models: The Problem
Data:
  X1  X2  …  X6  X7
  1   0   …  1   1
  1   1   …  0   0
  0   1   …  0   1
  …   …   …  …   …
Determine
- Number of latent variables
- Cardinality of each latent variable
- Model structure
- Conditional probability distributions
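
One way to see how these unknowns fit together: once a structure and the cardinalities are fixed, the number of free parameters in the conditional probability distributions follows directly. The sketch below counts them for a hypothetical tree. The node names, structure, and cardinalities are illustrative assumptions, and the count is the standard one (for models with latent variables the effective dimension can be smaller).

```python
def count_free_parameters(parent, cardinality):
    """Count independent CPT entries in a tree-structured Bayesian network.

    parent maps each node to its parent (None for the root);
    cardinality maps each node to its number of states.
    """
    total = 0
    for node, par in parent.items():
        # One distribution per parent state (a single one for the root).
        rows = 1 if par is None else cardinality[par]
        # Each distribution sums to 1, so it has (states - 1) free entries.
        total += rows * (cardinality[node] - 1)
    return total


# Hypothetical structure: latent Y1 (root) and Y2, binary manifest X1..X4.
parent = {"Y1": None, "Y2": "Y1",
          "X1": "Y1", "X2": "Y1", "X3": "Y2", "X4": "Y2"}
cardinality = {"Y1": 2, "Y2": 3, "X1": 2, "X2": 2, "X3": 2, "X4": 2}

n_params = count_free_parameters(parent, cardinality)
```

Raising the cardinality of a latent variable enlarges both its own distribution and those of all its children, which is why cardinality has to be chosen together with the structure.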

Learning Latent Tree Models: The Algorithms
- Model selection
  - Several scores examined: BIC, BICe, CS, AIC, holdout likelihood
  - BIC: best choice for the time being
- Model optimization
  - Double hill climbing (DHC), 2002: 7 manifest variables
  - Single hill climbing (SHC), 2004: 12 manifest variables
  - Heuristic SHC (HSHC), 2004: 50 manifest variables
  - EAST, 2008: as efficient as HSHC, and finds better models
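
To make the model selection step concrete, here is a minimal sketch of comparing candidate models by BIC, using the usual definition: maximized log-likelihood minus (d / 2) · log N, where d is the number of free parameters and N the number of records. The candidate names, log-likelihoods, and parameter counts are hypothetical; the search procedures named above (DHC, SHC, HSHC, EAST) are not reproduced here.

```python
import math


def bic_score(log_likelihood, n_free_params, n_records):
    """BIC in the 'larger is better' form: fit minus a complexity penalty."""
    return log_likelihood - 0.5 * n_free_params * math.log(n_records)


# Hypothetical candidates: (maximized log-likelihood, number of free parameters).
candidates = {
    "simple": (-5200.0, 15),
    "complex": (-5190.0, 40),
}
n_records = 1000

scores = {name: bic_score(ll, d, n_records)
          for name, (ll, d) in candidates.items()}
best = max(scores, key=scores.get)  # model with the highest BIC
```

In this made-up comparison the more complex model fits slightly better, but its extra parameters cost more in penalty than they gain in likelihood, so the simpler model scores higher.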

Traditional Chinese Medicine (TCM)
- TCM statement:
  - Yang deficiency (阳虚): intolerance to cold (畏寒), cold limbs (肢冷), cold lumbus and back (腰背冷), and so on.
  - Regarded by many as not scientific, even groundless.
- Two aspects to the meaning
  1. Claim: There exists a class of patients who characteristically have the cold symptoms; that is, the cold symptoms co-occur in a group of people.
  2. Explanation offered: This is due to deficiency of Yang, which fails to warm the body.
- What to do?
  - Previous work focused on 2.
  - New idea: do data analysis for 1.

Objectivity of the Claimed Pattern
- TCM claim: there exists a class of patients in whom symptoms such as ‘intolerance to cold’, ‘cold limbs’, ‘cold lumbus and back’, and so on co-occur.
- How to prove or disprove that such claimed TCM classes exist in the world?
  - Systematically collect data about symptoms of patients.
  - Perform cluster analysis to obtain natural clusters of patients.
  - If the natural clusters correspond to the TCM classes, then YES:
    1. Existence of TCM classes validated
    2. Descriptions of TCM classes refined and systematically expanded
    3. A statistical foundation for TCM established

Why Latent Tree Models?
- TCM uses multiple interrelated latent concepts to explain co-occurrence of symptoms
  - Yang deficiency (肾阳虚), Yin deficiency (肾阴虚), Essence insufficiency (肾精亏虚), …
- TCM theories are latent structure models in natural language.
- Need latent structure models
  - With multiple interrelated latent variables
- Latent tree models are the simplest such models

Empirical Results
- Can we find the claimed TCM classes using latent tree models?
  - We collected a data set about kidney deficiency (肾虚)
  - 35 symptom variables, 2600 records

Result of Data Analysis
- Y0-Y34: manifest variables from data
- X0-X13: latent variables introduced by data analysis
- The structure is interesting and supports TCM’s theories about various symptoms.
(Zhang et al. 2008, AI in Medicine)

Latent Clusters
- X1:
  - 5 states: s0, s1, s2, s3, s4
  - Samples grouped into 5 clusters
- Cluster X1=s4: {sample | P(X1=s4|sample) > 0.95}
  - Cold symptoms co-occur in samples
- Class implicitly claimed by TCM found!
- Description of class refined
  - By math vs. by words
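
The cluster definition {sample | P(X1=s4|sample) > 0.95} can be sketched for the simplest case, in which a single latent variable is the parent of every symptom (a latent class model). The two-state latent variable, the three binary symptoms, and all probabilities below are hypothetical; in the full latent tree model the posterior would instead come from inference over the whole tree.

```python
def posterior(sample, prior, cond):
    """P(latent state | sample) for a latent class model with binary symptoms.

    prior[s] is P(X=s); cond[s][j] is P(symptom j = 1 | X=s).
    """
    weights = []
    for s, p_s in enumerate(prior):
        w = p_s
        for j, value in enumerate(sample):
            p1 = cond[s][j]
            w *= p1 if value == 1 else 1.0 - p1
        weights.append(w)          # unnormalized P(X=s, sample)
    z = sum(weights)               # P(sample)
    return [w / z for w in weights]


def hard_cluster(samples, prior, cond, state, threshold=0.95):
    """Samples whose posterior probability for `state` exceeds the threshold."""
    return [x for x in samples
            if posterior(x, prior, cond)[state] > threshold]


# Hypothetical parameters: state 1 plays the role of the "cold symptoms" class.
prior = [0.7, 0.3]
cond = [[0.05, 0.10, 0.05],    # P(symptom = 1 | state 0)
        [0.90, 0.85, 0.80]]    # P(symptom = 1 | state 1)
samples = [(1, 1, 1), (1, 0, 0), (0, 0, 0), (1, 1, 0)]

cold_cluster = hard_cluster(samples, prior, cond, state=1)
```

With these made-up numbers only the sample showing all three symptoms clears the 0.95 threshold; a sample with two of the three falls just short, which illustrates how the threshold keeps borderline cases out of the cluster.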

Statistical Validation of TCM Theory
[Figure: diagram with the labels Experiences, TCM Theory, Ancient Times, Data, LT Model]

Other TCM Data Sets
- From Beijing U of TCM, 973 project
  - Depression
  - Hepatitis B
  - Chronic Renal Failure
  - …
- China Academy of TCM
  - Subhealth
  - Type 2 Diabetes
- In all cases, claimed TCM classes
  - Validated
  - Quantified and refined

Another Perspective
- Just now: validation of TCM theory.
- Another perspective: improve diagnosis
  - TCM diagnosis: classification
  - Problem: boundaries between classes not clear
  - Our work is helpful in clarifying the boundaries

Conclusions
- Latent tree models, and latent structure models in general, offer a framework for
  - Density estimation
  - Latent structure discovery
  - Multidimensional clustering
  - Can play a fundamental role in modernizing TCM
  - Can be useful in many other areas
    - Probabilistic inference, classification, semi-supervised learning, …
    - Marketing, survey studies, …
- We have only scratched the surface.

Thank You!