Published by Roy Cannon. Modified over 9 years ago.
1
Rank-Based Approach to Optimal Score via Dimension Reduction
Shao-Hsuan Wang
National Taiwan University, Taiwan
Nov 2015
2
Rank-based measures
Kendall's $\tau$
Concordance index
Rank correlation
Widely used in medical statistics, epidemiology, economics, sociology, etc.
3
Rank-based measures
Regression model
$Y$: a univariate response
$Z = (Z_1, \dots, Z_p)^T$: multiple covariates
4
Rank-based measures
Response: $Y \in \mathbb{R}$
Composite score: $\beta^T Z \in \mathbb{R}$
5
Rank-based measures
For a pair of observations $(Y_1, Z_1)$ and $(Y_2, Z_2)$:
concordant: $Y_1 > Y_2$ and $\beta^T Z_1 > \beta^T Z_2$, or $Y_1 < Y_2$ and $\beta^T Z_1 < \beta^T Z_2$
discordant: $Y_1 > Y_2$ and $\beta^T Z_1 < \beta^T Z_2$, or $Y_1 < Y_2$ and $\beta^T Z_1 > \beta^T Z_2$
6
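Rank-based measures
Kendall's $\tau$: $\tau = P\big((Y_1 - Y_2)(\beta^T Z_1 - \beta^T Z_2) > 0\big) - P\big((Y_1 - Y_2)(\beta^T Z_1 - \beta^T Z_2) < 0\big)$
Rank correlation: $rc = P(Y_1 > Y_2,\ \beta^T Z_1 > \beta^T Z_2)$
Concordance index: $CI = P(\beta^T Z_1 > \beta^T Z_2 \mid Y_1 > Y_2)$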
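The sample versions of these measures can be computed by counting pairs. The following is a minimal sketch, not the speaker's code: the function names `concordance_index` and `kendall_tau` and the toy data are assumptions made for illustration, and ties are counted as neither concordant nor discordant.

```python
from itertools import permutations


def concordance_index(scores, y):
    """Empirical analogue of CI = P(s_1 > s_2 | Y_1 > Y_2) over ordered pairs."""
    num = 0  # ordered pairs with y_i > y_j and s_i > s_j
    den = 0  # ordered pairs with y_i > y_j
    for i, j in permutations(range(len(y)), 2):
        if y[i] > y[j]:
            den += 1
            if scores[i] > scores[j]:
                num += 1
    if den == 0:
        raise ValueError("no pair with y_i > y_j")
    return num / den


def kendall_tau(scores, y):
    """Proportion of concordant minus proportion of discordant unordered pairs."""
    n = len(y)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (y[i] - y[j]) * (scores[i] - scores[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) // 2)


# Hypothetical toy data: y is the response, s plays the role of the composite score.
y = [1.0, 2.0, 3.0, 4.0]
s = [0.1, 0.4, 0.3, 0.9]
print(concordance_index(s, y), kendall_tau(s, y))
```

Both functions take O(n^2) time, which is fine for illustration but would need a sorting-based algorithm for large samples.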
7
Rank-based measures
Response: $Y \in \mathbb{R}$
Composite score: $\beta^T Z \in \mathbb{R}$
8
Rank-based measures
A monotonic association may not exist!
9
Motivation
10
Composite score
$\beta^T Z \to g(Z)$, where $g$ ranges over measurable functions
11
C-max
Response: $Y \in \mathbb{R}$
Composite score: $g(Z) \in \mathbb{R}$
Concordance-index function: $C(g) = P(g(Z_1) > g(Z_2) \mid Y_1 > Y_2)$
C-max: $C_{\max} = \sup_{g \in F_c} C(g)$
Optimal score: $m(Z)$ such that $C(m) = \sup_{g \in F_c} C(g)$
12
Intrinsic model behind rank-based measures
M1 Distributional assumption: Generalized Regression Model (Han 1987)
M2 Structural assumption: Dimension Reduction (Li 1991, Cook 1991)
13
Intrinsic model behind rank-based measures
M1: $Y = D \circ G(m_{d_0}(Z), \varepsilon)$
$D$: a non-degenerate monotonic function on $\mathbb{R}$
14
Intrinsic model behind rank-based measures
M1: $Y = D \circ G(m_{d_0}(Z), \varepsilon)$
$D$: a non-degenerate monotonic function on $\mathbb{R}$
$G$: an unspecified bivariate function, strictly increasing in each component when the other is held fixed
15
Intrinsic model behind rank-based measures
M2: $Y = D \circ G(m_{d_0}(Z), \varepsilon)$
$m_{d_0}$: a multivariate polynomial of unknown degree $d_0$
16
Intrinsic model behind rank-based measures
M2 Dimension Reduction: $m_{d_0}(Z) = m_{d_0 k_0}(B_0^T Z)$
(1) $d_0$ is the smallest degree such that $Y \perp Z \mid m_{d_0}(Z)$
(2) $B_0 = \{\beta_{01}, \dots, \beta_{0 k_0}\}$ is a basis of the central subspace (CS)
17
Model Flexibility
Linear regression model: $Y = \beta_0^T Z + \varepsilon$
Binary choice model: $Y = I(\beta_0^T Z + \varepsilon > 0)$
Accelerated failure time model: $\log(Y) = \beta_0^T Z + \varepsilon$
Generalized linear regression model (GLM)
Non-monotonic regression model: $Y = (\beta_0^T Z)^2 + \varepsilon$
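The non-monotonic case is where a linear composite score breaks down. The simulation below is a rough sketch under assumed settings (two standard normal covariates, a hypothetical coefficient vector `beta`, noise standard deviation 0.1, seed 0): under $Y = (\beta_0^T Z)^2 + \varepsilon$, the linear index has an empirical concordance index near 1/2, while its square ranks the responses much better.

```python
import random


def concordance(scores, y):
    """Share of ordered pairs with y_i > y_j that also have s_i > s_j."""
    num = den = 0
    n = len(y)
    for i in range(n):
        for j in range(n):
            if y[i] > y[j]:
                den += 1
                num += scores[i] > scores[j]
    return num / den


rng = random.Random(0)      # fixed seed so the run is reproducible
beta = (1.0, -1.0)          # hypothetical coefficients, chosen for illustration
n = 200

Z = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
index = [beta[0] * z1 + beta[1] * z2 for z1, z2 in Z]
Y = [s ** 2 + rng.gauss(0, 0.1) for s in index]   # non-monotonic in the index

c_linear = concordance(index, Y)                  # linear composite score
c_square = concordance([s ** 2 for s in index], Y)  # squared index as the score
print(round(c_linear, 3), round(c_square, 3))
```

The exact values depend on the seed and the noise level; only the gap between the two scores is the point of the sketch.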
18
Types of covariates
Covariates can be all discrete, not only continuous
Covariates whose moments may not exist
19
Theories
Propositions:
(1) Existence: $m_{d_0}(Z) \in \arg\max_g C(g)$
(2) Uniqueness: if $f_{d_0}(Z) \in \arg\max_g C(g)$ for a polynomial $f_{d_0}(z)$ of degree $d_0$, then $f_{d_0}(Z) = c_1 m_{d_0}(Z) + c_2$
(3) Optimality: $g(Z) \in \arg\max_g C(g)$ for $g(Z) = T(m_{d_0}(Z))$ with some monotonic function $T$
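The role of the monotonic function $T$ rests on the fact that a concordance index depends on a score only through its ranks. The short check below is an illustrative sketch with made-up numbers: a strictly increasing transformation (the particular `T` is an arbitrary choice) leaves the empirical concordance unchanged, while reversing the order of a tie-free score turns $C$ into $1 - C$.

```python
import math


def concordance(scores, y):
    """Share of ordered pairs with y_i > y_j that also have s_i > s_j."""
    pairs = [(i, j) for i in range(len(y)) for j in range(len(y)) if y[i] > y[j]]
    return sum(scores[i] > scores[j] for i, j in pairs) / len(pairs)


def T(s):
    """An arbitrary strictly increasing transformation."""
    return math.exp(s) + s ** 3


# Hypothetical responses and score values (the scores contain no ties).
y = [3, 1, 4, 1, 5, 9, 2, 6]
m = [0.2, -1.0, 0.9, 0.1, 1.4, 2.5, -0.3, 1.1]

c_m = concordance(m, y)
c_T = concordance([T(s) for s in m], y)     # same ranks, same concordance
c_neg = concordance([-s for s in m], y)     # reversed ranks
print(c_m, c_T, c_neg)
```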
20
Summary
$\beta^T Z$ may not be the best composite score
Model flexibility
Various types of covariates
Optimal score: existence, uniqueness, and optimality
21
How to estimate
$d_0$: structural degree
$k_0$: structural dimension
$S(B_0)$: the central subspace
$m_{d_0 k_0}(B_0^T Z)$: the optimal score
$C_{\max}$: the C-max
22
Estimation Procedure
23
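Step 1: Derive $m_d(Z)$ by maximizing the concordance index function via the generalized single-index form of the polynomial.
Tips:
(1) $m_d(Z) = \sum_{r_0 + r_1 + \dots + r_p = d} c_{r_0 r_1 \cdots r_p} \prod_{j=1}^{p} Z_j^{r_j} = \theta^T \tilde{Z}_d$, where $\tilde{Z}_d$ collects the monomials $\prod_{j=1}^{p} Z_j^{r_j}$ and $\theta$ their coefficients
(2) $C_n(m_d(Z)) = C_n(\theta) = \dfrac{\sum_{i=1}^{n} \sum_{j=1}^{n} I(\theta^T \tilde{Z}_{d,i} > \theta^T \tilde{Z}_{d,j},\ Y_i > Y_j)}{\sum_{i=1}^{n} \sum_{j=1}^{n} I(Y_i > Y_j)}$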
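The single-index form can be sketched in a few lines: expand each covariate vector into its monomials, then evaluate the empirical concordance for a given coefficient vector. This is an illustration only. The helper names, the toy data, and the decision to drop the constant monomial (it does not affect ranks) are assumptions, and the maximization over the coefficients is not shown.

```python
from itertools import product
from math import prod


def monomial_exponents(p, d):
    """All (r_1, ..., r_p) with 1 <= r_1 + ... + r_p <= d (constant term dropped)."""
    return [r for r in product(range(d + 1), repeat=p) if 1 <= sum(r) <= d]


def monomial_features(z, exponents):
    """Vector of monomials z_1^{r_1} * ... * z_p^{r_p}, one per exponent tuple."""
    return [prod(zj ** rj for zj, rj in zip(z, r)) for r in exponents]


def empirical_concordance(theta, features, y):
    """Empirical concordance of the score theta^T features against y."""
    scores = [sum(t * f for t, f in zip(theta, row)) for row in features]
    num = den = 0
    for i in range(len(y)):
        for j in range(len(y)):
            if y[i] > y[j]:
                den += 1
                num += scores[i] > scores[j]
    return num / den


# Hypothetical data with p = 2 covariates; the response depends on z_1 squared.
Z = [(-2.0, 0.5), (-1.0, 1.0), (0.5, -0.3), (1.5, 0.2), (3.0, 0.0)]
Y = [z1 ** 2 for z1, _ in Z]

exps = monomial_exponents(p=2, d=2)   # [(0,1), (0,2), (1,0), (1,1), (2,0)]
X = [monomial_features(z, exps) for z in Z]

theta_linear = [0, 0, 1, 0, 0]        # score = z_1
theta_square = [0, 0, 0, 0, 1]        # score = z_1^2
print(empirical_concordance(theta_linear, X, Y),
      empirical_concordance(theta_square, X, Y))
```

Because the objective is a step function of the coefficients, a practical maximizer would need a derivative-free or smoothed search; that choice is left open here.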
24
Estimation Procedure
Step 2: Apply the outer gradient approach to obtain $B$.
Tips:
(1) $m_{d_0}(u) = m_{d_0 k_0}(B_0^T u)$
(2) $S(B_0) = \mathrm{col}\left( \int_{u \in \mathbb{R}^p} \nabla m_{d_0}(u) \, (\nabla m_{d_0}(u))^T \, dW(u) \right)$
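The idea behind the outer gradient step is that every gradient of $m_{d_0 k_0}(B_0^T u)$ lies in the column space of $B_0$, so the leading eigenvectors of the averaged outer product of gradients span that space. The sketch below assumes a known cubic function with an analytic gradient, a hypothetical direction `b0`, and a sample average in place of the integral with respect to $W$; in practice the gradients would have to be estimated from the fitted polynomial.

```python
import numpy as np


def opg_basis(gradients, k):
    """Top-k eigenvectors of the averaged outer product of gradients."""
    m_hat = gradients.T @ gradients / gradients.shape[0]
    eigvals, eigvecs = np.linalg.eigh(m_hat)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort to descending
    return eigvecs[:, order[:k]], eigvals[order]


rng = np.random.default_rng(0)
b0 = np.array([1.0, 2.0, -1.0]) / np.sqrt(6.0)   # hypothetical direction, k_0 = 1
U = rng.normal(size=(500, 3))                    # evaluation points u in R^3

# m(u) = (b0^T u)^3 + b0^T u, so grad m(u) = (3 (b0^T u)^2 + 1) * b0.
index = U @ b0
grads = (3.0 * index ** 2 + 1.0)[:, None] * b0[None, :]

basis, spectrum = opg_basis(grads, k=1)
alignment = abs(float(basis[:, 0] @ b0))   # 1 means the same direction up to sign
print(alignment, spectrum)
```

The number of clearly non-zero eigenvalues gives a rough indication of the structural dimension, here one.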
25
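Estimation Procedure
Step 3: Derive the estimator of $m_{dk}(B_k^T Z)$.
Tips:
(1) $m_{dk}(B_k^T Z) = \theta^T \tilde{Z}_{dk}$, where $\tilde{Z}_{dk}$ collects the monomials of $B_k^T Z$ up to degree $d$
(2) $\hat{\theta} = \arg\max_{\theta} \dfrac{\sum_{i=1}^{n} \sum_{j=1}^{n} I(\theta^T \tilde{Z}_{dk,i} > \theta^T \tilde{Z}_{dk,j},\ Y_i > Y_j)}{\sum_{i=1}^{n} \sum_{j=1}^{n} I(Y_i > Y_j)}$
(3) $\hat{m}_{dk}(\hat{B}_k^T Z) = \hat{\theta}^T \tilde{Z}_{dk}$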
26
Estimation Procedure
Step 4: Adopt the concordance-based generalized BIC to estimate $d_0$, $k_0$, $S(B_0)$, $m_{d_0 k_0}(B_0^T Z)$, and $C_{\max}$.
Tips:
(1) $IC(d, k) = C_n(\hat{m}_{dk}(\hat{B}_k^T Z)) - \dfrac{\log n}{2n} \left( C^{k+d}_{k} - 1 \right)$, with $IC(0, k) = 1/2$
(2) $(\hat{d}, \hat{k}) = \arg\max_{0 \le d,\ 1 \le k \le p} IC(d, k)$
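The selection step amounts to a grid search over $(d, k)$. The sketch below shows only the mechanics. The penalty used here, $\frac{\log n}{2n}$ times the number of non-constant polynomial coefficients $\binom{k+d}{k} - 1$, is an assumed form (the slide's exact constant is not recoverable with certainty), and the table of fitted concordance values is hypothetical, not taken from the talk.

```python
from math import comb, log


def information_criterion(c_n, n, d, k):
    """Concordance minus an assumed BIC-type penalty; IC(0, k) is fixed at 1/2."""
    if d == 0:
        return 0.5
    n_coefficients = comb(k + d, k) - 1   # polynomial terms, constant excluded
    return c_n - log(n) / (2 * n) * n_coefficients


def select_model(c_table, n):
    """Return the (d, k) pair with the largest criterion value."""
    return max(c_table, key=lambda dk: information_criterion(c_table[dk], n, *dk))


# Hypothetical fitted concordance values C_n for each candidate (d, k).
c_table = {
    (0, 1): 0.500,
    (1, 1): 0.700,
    (2, 1): 0.800,
    (3, 1): 0.801,
    (2, 2): 0.802,
}
best = select_model(c_table, n=1000)
print(best)
```

In this toy table the small gains in concordance from the larger models do not offset their extra coefficients, which is the parsimony effect a BIC-type rule is meant to produce.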
27
Asymptotic results
Consistent model selection: the parsimonious model among the class of correct models, $(\hat{d}, \hat{k}) \to (d_0, k_0)$
$\sqrt{n}$-consistency of the estimators of $S(B_0)$ and $m_{d_0 k_0}(B_0^T Z)$
Asymptotic normality of the estimator of $C_{\max}$
28
Wine data
Vinho verde wine: red wine and white wine (from the Minho region of northern Portugal)
Collected from May 2004 to February 2007
Red wine: sample size n = 1599
White wine: n = 4898
Physicochemical and sensory tests
29
Wine data
Response (Y): preference score, from 0 (bad) to 10 (excellent)
11 covariates (Z): fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, and alcohol
30
Wine data
31
Wine data
32
Thank you!