MUDIM (Petr Šimeček, Euromise): a system for multidimensional compositional models (Radim Jiroušek). C++ code, distributed as an R package, focused on medical applications.


MUDIM (Petr Šimeček, Euromise): system for multidimensional compositional models (Radim Jiroušek); C++ code, distributed as an R package focused on medical applications

Contents: idea of conditional independence and (de)composition; possible applications of MUDIM (expert system, data mining); STULONG dataset

CI - Theory of Storks
BIRTH RATE and STORK POPULATION are statistically connected. Do storks deliver newborns? No! Both are connected to ENVIRONMENT.
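The stork example can be checked numerically. The sketch below assumes a common-cause model in which ENVIRONMENT influences both STORK POPULATION and BIRTH RATE; all probabilities and the category names ("rural", "urban") are hypothetical values chosen for illustration, not figures from the slides.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration (not from the slides).
p_env = {"rural": 0.5, "urban": 0.5}
p_storks_given_env = {"rural": 0.8, "urban": 0.1}  # P(many storks | environment)
p_births_given_env = {"rural": 0.7, "urban": 0.2}  # P(high birth rate | environment)


def joint(env, storks, births):
    """P(env, storks, births) under the common-cause model E -> S, E -> B."""
    ps = p_storks_given_env[env] if storks else 1 - p_storks_given_env[env]
    pb = p_births_given_env[env] if births else 1 - p_births_given_env[env]
    return p_env[env] * ps * pb


def prob(**fixed):
    """Sum the joint over all configurations consistent with the fixed values."""
    total = 0.0
    for env, storks, births in product(p_env, (True, False), (True, False)):
        config = {"env": env, "storks": storks, "births": births}
        if all(config[name] == value for name, value in fixed.items()):
            total += joint(env, storks, births)
    return total


# Unconditionally, knowing about storks changes our belief about births.
p_b = prob(births=True)
p_b_given_s = prob(births=True, storks=True) / prob(storks=True)

# Once the environment is known, storks carry no further information.
p_b_given_e = prob(births=True, env="rural") / prob(env="rural")
p_b_given_s_e = (prob(births=True, storks=True, env="rural")
                 / prob(storks=True, env="rural"))

print(p_b, p_b_given_s)            # these two differ
print(p_b_given_e, p_b_given_s_e)  # these two agree
```

The first pair of numbers differs (statistical connection), while the second pair agrees: given ENVIRONMENT, BIRTH RATE is conditionally independent of STORK POPULATION.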

CI – Weather: WEATHER YESTERDAY, WEATHER TODAY, WEATHER TOMORROW

CI – Sample Medical Data. Legend: node = variable (attribute), e.g. AGE, BLOOD PRESSURE, …

CI – Sample Medical Data. Legend: edge = (unconditional) statistical connection (correlation) between the pair of variables; node = variable (attribute), e.g. AGE, BLOOD PRESSURE, …

CI – Storks & Weather BIRTH RATE STORK POPULATION ENVIRONMENT YESTERDAY TODAY TOMORROW


CI – Sample Medical Data. Legend: arrow = causality between the pair of variables; node = variable (attribute), e.g. AGE, BLOOD PRESSURE, …

Locality - illustration. Groups shown: variable X; directly explanatory variables for X; other variables. If we know the values of the directly explanatory variables for X, then knowledge of the other variables is useless for predicting X.

Applications – Expert Systems Causality


Idea of Compositional Models
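The body of this slide is not preserved in the transcript. As a rough illustration only, the sketch below assumes the composition operator as it is usually defined in the compositional-models literature: for a distribution π over variables K and κ over variables L, the composite is π(x_K) · κ(x_L) / κ(x_{K∩L}). The function names, the table representation and the numbers are hypothetical and are not MUDIM's actual API.

```python
def marginal(dist, variables, keep):
    """Marginalize a table {value-tuple: probability} down to the variables in `keep`."""
    idx = [variables.index(v) for v in keep]
    out = {}
    for values, p in dist.items():
        key = tuple(values[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


def compose(pi, pi_vars, kappa, kappa_vars):
    """Compose pi with kappa: pi * kappa / (marginal of kappa on the shared variables).

    Returns the composite table and its variable list. Cells where the
    denominator is zero are skipped; the composition is properly defined only
    when kappa's marginal is positive wherever pi's marginal is.
    """
    shared = [v for v in pi_vars if v in kappa_vars]
    extra_vars = [v for v in kappa_vars if v not in pi_vars]
    kappa_shared = marginal(kappa, kappa_vars, shared)
    result = {}
    for x, p in pi.items():
        x_shared = tuple(x[pi_vars.index(v)] for v in shared)
        for y, q in kappa.items():
            y_shared = tuple(y[kappa_vars.index(v)] for v in shared)
            if x_shared != y_shared:
                continue  # the two cells disagree on a shared variable
            denom = kappa_shared[x_shared]
            if denom == 0:
                continue
            extra = tuple(y[kappa_vars.index(v)] for v in extra_vars)
            result[x + extra] = p * q / denom
    return result, pi_vars + extra_vars


# Hypothetical low-dimensional tables (illustrative numbers only).
pi_vars = ["ENV", "STORKS"]
pi = {("rural", 1): 0.40, ("rural", 0): 0.10,
      ("urban", 1): 0.05, ("urban", 0): 0.45}
kappa_vars = ["ENV", "BIRTHS"]
kappa = {("rural", 1): 0.35, ("rural", 0): 0.15,
         ("urban", 1): 0.10, ("urban", 0): 0.40}

composed, composed_vars = compose(pi, pi_vars, kappa, kappa_vars)
print(composed_vars)                 # ['ENV', 'STORKS', 'BIRTHS']
print(composed[("rural", 1, 1)])
```

Under this definition the composite keeps π as its marginal on K and makes the remaining variables of κ conditionally independent of the rest of K given the shared variables, which is how a high-dimensional model is assembled from low-dimensional pieces.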

Applications – Expert Systems Causality. What is the distribution of … if we know …?
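The question on this slide is a conditional query against the model. The variables it names are not preserved in the transcript, so the sketch below uses a hypothetical two-variable joint table over HYPLIP and HT with made-up numbers, and a hypothetical helper `query`.

```python
# Hypothetical joint table over (HYPLIP, HT); numbers are illustrative only.
VARIABLES = ("HYPLIP", "HT")
joint = {
    (True, True): 0.12,
    (True, False): 0.08,
    (False, True): 0.20,
    (False, False): 0.60,
}


def query(joint, target, evidence):
    """Return P(target | evidence) as a dict {value: probability}."""
    t = VARIABLES.index(target)
    scores = {}
    for values, p in joint.items():
        # Keep only the cells that agree with every piece of evidence.
        if all(values[VARIABLES.index(name)] == val
               for name, val in evidence.items()):
            scores[values[t]] = scores.get(values[t], 0.0) + p
    total = sum(scores.values())
    return {value: p / total for value, p in scores.items()}


prior = query(joint, "HT", {})                     # no evidence
posterior = query(joint, "HT", {"HYPLIP": True})   # evidence: HYPLIP present
print(prior[True], posterior[True])
```

With these invented numbers, learning that HYPLIP is present raises the probability of HT from 0.32 to 0.6.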

Data Mining. We don’t know “anything”: there are many variables and many possible relations between them. We need to formulate possible hypotheses, suggest some promising models, etc. (useful in pre-research).

Data Mining Variables Data

Direction of Causality Problem: … is equivalent to …; … are equivalent, but they are not equivalent to …
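The graphs on this slide are missing from the transcript. One reading, offered here as an assumption, is that models differing only in the direction of an arrow can describe exactly the same distribution, so data alone cannot tell them apart. The sketch below checks this for two binary variables X and Y with a hypothetical joint table.

```python
# Hypothetical joint distribution of two binary variables X and Y.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

# Marginals of X and Y.
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

# Parameters of the model "X -> Y": P(x) and P(y | x).
p_y_given_x = {(x, y): joint[(x, y)] / p_x[x] for (x, y) in joint}
# Parameters of the model "Y -> X": P(y) and P(x | y).
p_x_given_y = {(x, y): joint[(x, y)] / p_y[y] for (x, y) in joint}

# Rebuild the joint distribution from each model.
x_to_y = {(x, y): p_x[x] * p_y_given_x[(x, y)] for (x, y) in joint}
y_to_x = {(x, y): p_y[y] * p_x_given_y[(x, y)] for (x, y) in joint}

same = all(abs(x_to_y[k] - y_to_x[k]) < 1e-12 for k in joint)
print(same)
```

Both factorizations reproduce the same table, so the direction of the arrow between two variables is not identifiable from the distribution alone.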

STULONG Dataset = dataset containing research data on cardiovascular disease; 1417 patients (Czech middle-aged men); 244 attributes surveyed for each patient at the entry examination; 37 selected attributes are described here

(Incomplete) List of Attributes: AGE, MARITAL STATUS, EDUCATION, OCCUPATION, PHYSICAL ACTIVITY, TRANSPORT TO JOB, SMOKING, ALCOHOL, TEA AND COFFEE, MYOCARDIAL INFARCTION, HYPERTENSION, ICTUS, HYPERLIPIDEMIA, CHEST PAIN, ASTHMA, HEIGHT & WEIGHT, BLOOD PRESSURE, …

Graph of Correlated Pairs: 464 of 666 possible pairs are statistically connected (significance level p = 0.05)

Graph of Correlated Pairs: … of 666 possible pairs are statistically connected (significance level p = 0.05/666, i.e. with a Bonferroni correction)
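The figures on these two slides can be reproduced with a little arithmetic: 37 attributes give 666 unordered pairs, and dividing 0.05 by the number of tests is the Bonferroni correction. The variable names below are illustrative.

```python
from math import comb

n_attributes = 37                   # selected attributes described in the deck
n_pairs = comb(n_attributes, 2)     # number of unordered attribute pairs
alpha = 0.05                        # per-test significance level

# Bonferroni: divide the level by the number of tests performed.
bonferroni_alpha = alpha / n_pairs

# At the uncorrected level, this many pairs would be expected to pass the
# test by chance alone if no pair were truly connected.
expected_false_positives = alpha * n_pairs

print(n_pairs)                      # 666
print(bonferroni_alpha)
print(expected_false_positives)
```

At the uncorrected level roughly 33 of the 666 pairs would be flagged by chance even with no real connections, which is why the second graph uses the much stricter threshold.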

56 arrows

Risk Factors for Hypertension
> summary(glm(HT ~ HYPLIP + IM + AGE + SUBSC, data = C, family = "binomial"))
Coefficients (the Estimate, Std. Error, z value and Pr(>|z|) columns are not preserved in the transcript; only the significance marks remain):
(Intercept) ***
IM *
HYPLIP ***
SUBSC *
AGE
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Risk Factors for Hypertension. Interpretation: HYPERLIPIDEMIA and IM (myocardial infarction) triple the odds of hypertension. Every three years of AGE doubles the odds. There is also a small but demonstrable connection to the skinfold above the musculus subscapularis (SUBSC).
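In a logistic regression, a coefficient β multiplies the odds by exp(β) per unit of the predictor. The fitted estimates are not preserved in the transcript, so the coefficients below are back-calculated from the stated interpretation (tripling, and doubling per three years); they are assumptions for illustration, not the model's actual output. The helper names and the 20% baseline are hypothetical.

```python
from math import exp, log


def odds_ratio(beta, delta=1.0):
    """Multiplicative change in odds when the predictor increases by `delta`."""
    return exp(beta * delta)


def apply_odds_ratio(probability, ratio):
    """Convert a baseline probability to the probability after scaling its odds."""
    odds = probability / (1 - probability) * ratio
    return odds / (1 + odds)


# Coefficients implied by the slide's interpretation (assumed, not fitted).
beta_tripling = log(3)           # a binary factor that triples the odds
beta_age_per_year = log(2) / 3   # odds double every three years

print(odds_ratio(beta_tripling))          # about 3
print(odds_ratio(beta_age_per_year, 3))   # about 2

# Tripling the odds does not triple the probability:
# a hypothetical 20% baseline becomes 3/7, roughly 43%.
print(apply_odds_ratio(0.20, 3))
```

The last line shows why the interpretation speaks of odds rather than risk: the same odds ratio translates into different probability changes depending on the baseline.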