Rank-Based Approach to Optimal Score via Dimension Reduction. Shao-Hsuan Wang, National Taiwan University, Taiwan, Nov 2015


1 Rank-Based Approach to Optimal Score via Dimension Reduction. Shao-Hsuan Wang, National Taiwan University, Taiwan, Nov 2015

2 Rank-based measures: Kendall's $\tau$, the concordance index, and rank correlation. Widely used in medical statistics, epidemiology, economics, sociology, and other fields.

3 Rank-based measures. Regression model: $Y$ is a univariate response; $Z = (Z_1, \dots, Z_p)$ are multiple covariates.

4 Rank-based measures. Response: $Y \in \mathbb{R}$. Composite score: $\beta^T Z \in \mathbb{R}$.

5 Rank-based measures. For a pair of observations $(Y_1, \beta^T Z_1)$ and $(Y_2, \beta^T Z_2)$:
concordant: $Y_1 > Y_2$ and $\beta^T Z_1 > \beta^T Z_2$, or $Y_1 < Y_2$ and $\beta^T Z_1 < \beta^T Z_2$;
discordant: $Y_1 > Y_2$ and $\beta^T Z_1 < \beta^T Z_2$, or $Y_1 < Y_2$ and $\beta^T Z_1 > \beta^T Z_2$.

6 Rank-based measures.
Kendall's $\tau = P(Y_1 > Y_2, \beta^T Z_1 > \beta^T Z_2) - P(Y_1 > Y_2, \beta^T Z_1 < \beta^T Z_2)$
Rank correlation: $rc = P(Y_1 > Y_2, \beta^T Z_1 > \beta^T Z_2)$
Concordance index: $CI = P(\beta^T Z_1 > \beta^T Z_2 \mid Y_1 > Y_2)$
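
As a rough illustration of the measures on this slide, the following Python sketch counts concordant and discordant pairs for a given score and turns the counts into sample analogues of the rank correlation and the concordance index. The function names and the toy data are assumptions made for illustration; they are not taken from the talk.

```python
import numpy as np

def pair_counts(y, s):
    """Count ordered pairs (i, j) with y_i > y_j, split by the order of the scores."""
    y = np.asarray(y, dtype=float)
    s = np.asarray(s, dtype=float)
    y_gt = y[:, None] > y[None, :]   # entry (i, j) is True when y_i > y_j
    s_gt = s[:, None] > s[None, :]   # entry (i, j) is True when s_i > s_j
    s_lt = s[:, None] < s[None, :]
    concordant = int(np.sum(y_gt & s_gt))
    discordant = int(np.sum(y_gt & s_lt))
    comparable = int(np.sum(y_gt))
    return concordant, discordant, comparable

def concordance_index(y, s):
    """Sample analogue of P(s_1 > s_2 | y_1 > y_2)."""
    concordant, _, comparable = pair_counts(y, s)
    return concordant / comparable

def rank_correlation(y, s):
    """Sample analogue of P(y_1 > y_2, s_1 > s_2) over all ordered pairs i != j."""
    n = len(y)
    concordant, _, _ = pair_counts(y, s)
    return concordant / (n * (n - 1))

# Toy data (hypothetical): the score swaps the order of the two largest responses.
y = [1.0, 2.0, 3.0, 4.0]
s = [1.0, 2.0, 4.0, 3.0]
c, d, m = pair_counts(y, s)
ci = concordance_index(y, s)
rc = rank_correlation(y, s)
print(c, d, m, ci, rc)
```

Here `s` plays the role of the composite score $\beta^T Z$; the pairwise comparison costs $O(n^2)$ memory, which is fine for a sketch but not for large samples.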

7 Rank-based measures. Response: $Y \in \mathbb{R}$. Composite score: $\beta^T Z \in \mathbb{R}$.

8 Rank-based measures. A monotonic association may not exist!

9 Motivation

10 Composite score: $\beta^T Z \to g(Z)$, with $g$ ranging over measurable functions.

11 C-max. Response: $Y \in \mathbb{R}$. Composite score: $g(Z) \in \mathbb{R}$.
Concordance-index function: $C(g) = P(g(Z_1) > g(Z_2) \mid Y_1 > Y_2)$
C-max: $C_{\max} = \sup_{g \in \mathcal{F}_c} C(g)$
Optimal score: $m(Z)$ such that $C(m) = \sup_{g \in \mathcal{F}_c} C(g)$
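
To see why a general score $g(Z)$ can beat a linear one, here is a small simulation sketch in Python. It assumes a non-monotonic model in which the response is the square of a linear index plus noise; the coefficient vector, sample size, and noise level are hypothetical choices, and `concordance` is an illustrative helper rather than anything defined in the talk.

```python
import numpy as np

def concordance(score, y):
    """Empirical C(g): share of ordered pairs with y_i > y_j that also have score_i > score_j."""
    score = np.asarray(score, dtype=float)
    y = np.asarray(y, dtype=float)
    y_gt = y[:, None] > y[None, :]
    s_gt = score[:, None] > score[None, :]
    return np.sum(y_gt & s_gt) / np.sum(y_gt)

rng = np.random.default_rng(0)
n = 500
Z = rng.standard_normal((n, 2))
beta = np.array([1.0, -1.0])               # hypothetical coefficient vector
index = Z @ beta                           # linear composite score
y = index**2 + 0.05 * rng.standard_normal(n)

c_linear = concordance(index, y)           # linear score against a symmetric response
c_square = concordance(index**2, y)        # score matching the assumed model
print(f"C(linear score)  = {c_linear:.3f}")
print(f"C(squared score) = {c_square:.3f}")
```

Under this assumed model the linear score carries almost no rank information about the response, while the squared score recovers most of it.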

12 Intrinsic model behind rank-based measures. M1, distributional assumption: generalized regression model (Han 1987). M2, structural assumption: dimension reduction (Li 1991, Cook 1991).

13 Intrinsic model behind rank-based measures. M1: $Y = D \circ G(m_{d_0}(Z), \varepsilon)$, where $D$ is a non-degenerate monotonic function on $\mathbb{R}$.

14 Intrinsic model behind rank-based measures. M1: $Y = D \circ G(m_{d_0}(Z), \varepsilon)$, where $D$ is a non-degenerate monotonic function on $\mathbb{R}$ and $G$ is an unspecified bivariate function, strictly increasing in each component when the other is held fixed.

15 Intrinsic model behind rank-based measures. M2: $Y = D \circ G(m_{d_0}(Z), \varepsilon)$, where $m_{d_0}$ is a multivariate polynomial of unknown degree $d_0$.

16 Intrinsic model behind rank-based measures. M2, dimension reduction: $m_{d_0}(Z) = m_{d_0 k_0}(B_0^T Z)$, where
(1) $d_0$ is the smallest degree such that $Y \perp Z \mid m_{d_0}(Z)$;
(2) $B_0 = \{\beta_{01}, \dots, \beta_{0k_0}\}$ is a basis of the central subspace (CS).

17 Model flexibility.
Linear regression model: $Y = \beta_0^T Z + \varepsilon$
Binary choice model: $Y = I(\beta_0^T Z + \varepsilon > 0)$
Accelerated failure time model: $\log(Y) = \beta_0^T Z + \varepsilon$
Generalized linear regression model (GLM)
Non-monotonic regression model: $Y = (\beta_0^T Z)^2 + \varepsilon$

18 Types of covariates. Covariates that are all discrete, not only continuous ones. Covariates whose moments may not exist.

19 Theories. Propositions:
(1) Existence: $m_{d_0}(Z) \in \arg\max_g C(g)$.
(2) Uniqueness: if $f_{d_0}(Z) \in \arg\max_g C(g)$ for a polynomial $f_{d_0}(z)$ of degree $d_0$, then $f_{d_0}(Z) = c_1 m_{d_0}(Z) + c_2$.
(3) Optimality: if $g(Z) \in \arg\max_g C(g)$, then $g(Z) = T(m_{d_0}(Z))$ for some monotonic function $T$.
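
One ingredient behind the uniqueness and optimality statements is easy to check numerically: the concordance index depends on a score only through its ranks, so a strictly increasing transform $T$ leaves it unchanged, while reversing the order turns $C$ into $1 - C$ when there are no ties. The Python sketch below checks this on simulated data; the data-generating choices and the helper `concordance` are assumptions for illustration only.

```python
import numpy as np

def concordance(score, y):
    """Empirical concordance index of a score with respect to the response y."""
    score = np.asarray(score, dtype=float)
    y = np.asarray(y, dtype=float)
    y_gt = y[:, None] > y[None, :]
    s_gt = score[:, None] > score[None, :]
    return np.sum(y_gt & s_gt) / np.sum(y_gt)

rng = np.random.default_rng(2)
m = rng.standard_normal(200)                 # stand-in for an optimal score m(Z)
y = m + 0.5 * rng.standard_normal(200)       # response increasing in m plus noise

c_m = concordance(m, y)
c_exp = concordance(np.exp(m), y)            # strictly increasing transform
c_affine = concordance(3.0 * m + 1.0, y)     # positive affine transform
c_neg = concordance(-m, y)                   # order-reversing transform
print(c_m, c_exp, c_affine, c_neg)
```

This only illustrates the invariance; it does not by itself establish that every maximizer has the form $T(m_{d_0}(Z))$.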

20 Summary. $\beta^T Z$ may not be the best composite score. Model flexibility. Various types of covariates. Optimal score: existence, uniqueness, and optimality.

21 How to estimate:
$d_0$: the structural degree
$k_0$: the structural dimension
$S(B_0)$: the central subspace
$m_{d_0 k_0}(B_0^T Z)$: the optimal score
$C_{\max}$: the C-max

22 Estimation Procedure

23 Estimation Procedure. Step 1: derive $\hat{m}_d(Z)$ by maximizing the concordance index function via the generalized single-index form of the polynomial.
Tips:
(1) $m_d(Z) = \sum_{r_0 + r_1 + \cdots + r_p = d} c_r \prod_{j=1}^{p} Z_j^{r_j} = \beta^T \tilde{Z}_d$, where $\tilde{Z}_d$ stacks the monomials
(2) $C_n(m_d(Z)) = C_n(\beta) = \dfrac{\sum_{i=1}^{n} \sum_{j=1}^{n} I(\beta^T \tilde{Z}_{d,i} > \beta^T \tilde{Z}_{d,j},\ Y_i > Y_j)}{\sum_{i=1}^{n} \sum_{j=1}^{n} I(Y_i > Y_j)}$
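
A minimal Python sketch of Step 1, under stated assumptions: the polynomial is written as a linear combination of monomial features (the single-index form), the empirical concordance index is evaluated for a coefficient vector, and a crude random search over directions stands in for whatever optimizer the talk actually uses. The names `poly_features`, `c_n`, and `random_search`, and the simulated data, are hypothetical.

```python
import itertools
import numpy as np

def poly_features(Z, d):
    """Monomials of total degree 1..d in the columns of Z.
    The constant term is dropped: adding a constant never changes the ranking."""
    Z = np.asarray(Z, dtype=float)
    p = Z.shape[1]
    cols = []
    for deg in range(1, d + 1):
        for idx in itertools.combinations_with_replacement(range(p), deg):
            cols.append(np.prod(Z[:, list(idx)], axis=1))
    return np.column_stack(cols)

def c_n(beta, F, y):
    """Empirical concordance index of the score F @ beta."""
    s = F @ beta
    y_gt = y[:, None] > y[None, :]
    s_gt = s[:, None] > s[None, :]
    return np.sum(y_gt & s_gt) / np.sum(y_gt)

def random_search(F, y, n_draws=200, seed=0):
    """Crude maximizer (an assumption, not the talk's method): best of random unit directions."""
    rng = np.random.default_rng(seed)
    best_beta, best_c = None, -np.inf
    for _ in range(n_draws):
        beta = rng.standard_normal(F.shape[1])
        beta /= np.linalg.norm(beta)
        c = c_n(beta, F, y)
        if c > best_c:
            best_beta, best_c = beta, c
    return best_beta, best_c

# Simulated data (hypothetical): the response depends on (Z1 - Z2)^2.
rng = np.random.default_rng(1)
Z = rng.standard_normal((300, 2))
y = (Z[:, 0] - Z[:, 1]) ** 2 + 0.05 * rng.standard_normal(300)

F = poly_features(Z, d=2)                          # columns: Z1, Z2, Z1^2, Z1*Z2, Z2^2
beta_true = np.array([0.0, 0.0, 1.0, -2.0, 1.0])   # coefficients of (Z1 - Z2)^2
c_true = c_n(beta_true, F, y)
beta_hat, c_hat = random_search(F, y)
print(f"C_n at true coefficients: {c_true:.3f}; best of random search: {c_hat:.3f}")
```

Because the objective is a step function of the coefficients, gradient-based optimizers do not apply directly; the random search above is only meant to show the objective being evaluated and compared.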

24 Estimation Procedure. Step 2: apply the outer gradient approach to obtain $\hat{B}_k$.
Tips:
(1) $m_{d_0}(u) = m_{d_0 k_0}(B_0^T u)$
(2) $\mathrm{col}(S(B_0)) = \mathrm{col}\left(\int_{u \in \mathbb{R}^p} \nabla m_{d_0}(u) (\nabla m_{d_0}(u))^T \, dW(u)\right)$
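
The identity in tip (2) can be illustrated with a short Python sketch, reading the "outer gradient approach" as an outer product of gradients averaged over a weight distribution $W$. Everything specific here is an assumption: $W$ is taken to be standard normal, the gradient is approximated by central differences, the polynomial `m` is known rather than estimated, and the function names are illustrative.

```python
import numpy as np

def numerical_gradient(m, u, h=1e-5):
    """Central-difference gradient of a scalar function m at the point u."""
    u = np.asarray(u, dtype=float)
    grad = np.empty_like(u)
    for j in range(u.size):
        e = np.zeros_like(u)
        e[j] = h
        grad[j] = (m(u + e) - m(u - e)) / (2 * h)
    return grad

def opg_basis(m, p, k, n_draws=500, seed=0):
    """Average grad m(u) grad m(u)^T over draws u ~ N(0, I_p); return the top-k eigenvectors."""
    rng = np.random.default_rng(seed)
    M = np.zeros((p, p))
    for u in rng.standard_normal((n_draws, p)):
        g = numerical_gradient(m, u)
        M += np.outer(g, g)
    M /= n_draws
    eigvals, eigvecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    return eigvecs[:, -k:], eigvals

# Hypothetical example with p = 4 covariates and a 2-dimensional index B0^T u.
B0 = np.array([[1.0, 0.0],
               [1.0, 1.0],
               [0.0, -1.0],
               [0.0, 0.0]])

def m(u):
    a, b = B0.T @ u
    return a**2 + a * b + b                  # degree-2 polynomial in the two indices

B_hat, eigvals = opg_basis(m, p=4, k=2)
P_hat = B_hat @ B_hat.T                              # projection onto the estimated span
P0 = B0 @ np.linalg.solve(B0.T @ B0, B0.T)           # projection onto col(B0)
print(np.round(eigvals, 6))
print(np.max(np.abs(P_hat - P0)))
```

Comparing projection matrices rather than the bases themselves avoids the sign and rotation ambiguity of eigenvectors: only the spanned subspace is identified.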

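25 Estimation Procedure. Step 3: derive the estimator of $m_{dk}(B_k^T Z)$.
Tips:
(1) $\tilde{Z} = \hat{B}_k^T Z$
(2) $\hat{\beta} = \arg\max_{\beta} \dfrac{\sum_{i=1}^{n} \sum_{j=1}^{n} I(\beta^T \tilde{Z}_i > \beta^T \tilde{Z}_j,\ Y_i > Y_j)}{\sum_{i=1}^{n} \sum_{j=1}^{n} I(Y_i > Y_j)}$
(3) $\hat{m}_{dk}(\hat{B}_k^T Z) = \hat{\beta}^T \tilde{Z}$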

26 Estimation Procedure. Step 4: adopt the concordance-based generalized BIC to estimate $d_0$, $k_0$, $S(B_0)$, $m_{d_0 k_0}(B_0^T Z)$, and $C_{\max}$.
Tips:
(1) $IC(d, k)$ combines the fitted concordance $C_n(\hat{m}_{dk}(\hat{B}_k^T Z))$ with a $\log n$ penalty that depends on $d$ and $k$, with $IC(0, k) = 1/2$
(2) $(\hat{d}, \hat{k}) = \arg\max_{0 \le d,\ 1 \le k \le p} IC(d, k)$

27 Asymptotic results. Consistent model selection: the parsimonious model $(d_0, k_0)$ among the class of correct models. $\sqrt{n}$-consistency of the estimators of $S(B_0)$ and $m_{d_0 k_0}(B_0^T Z)$. Asymptotic normality of the estimator of $C_{\max}$.

28 Wine data. Vinho verde wine, red and white, from the Minho region of northern Portugal. Collected from May 2004 to February 2007. Red wine: sample size $n = 1599$. White wine: $n = 4898$. Physicochemical and sensory tests.

29 Wine data. Response ($Y$): preference, from 0 (bad) to 10 (excellent). 11 covariates ($Z$): fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, and alcohol.

30 Wine data

31 Wine data

32 Thank you!

