Comparative Knowledge Discovery with Partial Order and Composite Indicator Partial Order Ranking of Objects with Weights for Indicators and Its Representability by a Composite Indicator.

Presentation transcript:

Comparative Knowledge Discovery with Partial Order and Composite Indicator Partial Order Ranking of Objects with Weights for Indicators and Its Representability by a Composite Indicator.

Objective
To develop a data-based methodology for multicriterion prioritization using partial order
To determine whether a partial order ranking can be represented by a composite indicator
To determine equivalence classes of composite indicators
To seek reconciliation between a stakeholder-designed composite indicator and the partial order ranking based on evidence supplied by the data

Problem
Rank a set of n objects based on some intrinsic quality.
The intrinsic quality is not directly measurable.
Measurements on m surrogate indicators are used instead.
Data matrix: an n by m matrix, with one column per indicator.
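As a concrete sketch of this setup (using Python with NumPy/SciPy, a toolchain the slides do not specify), the per-indicator rank matrix can be derived from the data matrix by ranking each column. The convention that larger indicator values are better, and that ties get average ranks, is an assumption here:

```python
import numpy as np
from scipy.stats import rankdata

def rank_matrix(X):
    """Rank the objects within each indicator column (rank 1 = best).

    Assumes larger indicator values are better and that ties receive
    average ranks; the slides do not state either convention explicitly.
    """
    X = np.asarray(X, dtype=float)
    # rankdata ranks ascending, so negate to give large values rank 1
    return np.column_stack([rankdata(-X[:, j]) for j in range(X.shape[1])])

# Data matrix of Example 1 later in the talk: 9 objects, 2 indicators
X = [[9, 3], [2, 7], [6, 1], [6, 4], [7, 4], [9, 6], [3, 1], [1, 1], [8, 5]]
R = rank_matrix(X)
```

Each column of `R` then gives the ranking the corresponding indicator would induce on its own.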

Approach
Modification of classical partial order (POSET) ranking:
– Use stochastic ordering based on the cumulative rank frequency (CRF) distribution of objects, instead of averaging the ranks assigned by linear extensions
– Use indicator weights, derived from the data, in forming the CRFs
– Base the indicator weights on the proximity of the ranks assigned by linear extensions to the ranks assigned to objects by the individual indicator columns

Why Weighting
Expert opinions and stakeholders' interests may need to be incorporated.
It also makes sense to use weights derived from the data when prioritizing.
This opens an opportunity for reconciliation between stakeholder-based and evidence-based weighted rankings.

Data-Based Weights
Traditional partial order based prioritization does not use any weighting.
A novel modification involving cumulative rank frequencies allows us to use weights.
Weights can originate from an external stakeholder source or can be data based.
We begin with data-based weights for POSET ranking.

Data-Based Weighted POSET Ranking (DBWPR)
An iterative method. In the first iteration:
– Use uniform weights to form the cumulative rank frequency (CRF) matrix from the object ranks in each indicator column.
– Use the CRF matrix in place of the original data matrix.
– Compute the POSET ranking with the CRF method of Patil and Taillie (2004).
– Compute data-based proximity weights for the indicators from the correlation between the object ranks assigned by the linear extensions and the object ranks in each indicator column.
End of first iteration.

DBWPR (continued)
In each subsequent iteration:
– Compute a new CRF matrix from the object ranks in each indicator column, using the current weights.
– Use the CRF matrix in place of the original data matrix.
– Compute the POSET ranking with the CRF method of Patil and Taillie (2004).
– Recompute the data-based proximity weights as above.
If the new data-based weights match those of the previous iteration, accept the most recent ranking as the DBWPR and terminate; otherwise do another iteration.
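The iterate-until-weights-converge structure can be sketched in Python. This is a loud simplification: it replaces the linear-extension CRF step of Patil and Taillie with a plain weighted average of indicator ranks, and uses Spearman correlation for the proximity weights, so it only illustrates the shape of the loop, not the authors' full POSET method. The function name and all parameter choices are illustrative:

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

def dbwpr_sketch(R, tol=1e-9, max_iter=50):
    """Simplified DBWPR-style loop on an n x m rank matrix R (rank 1 = best).

    NOT the Patil-Taillie method: the POSET/linear-extension step is
    replaced by a weighted average of indicator ranks for illustration.
    """
    n, m = R.shape
    w = np.full(m, 1.0 / m)            # iteration 1: uniform weights
    for _ in range(max_iter):
        scores = R @ w                 # weighted rank profile per object
        ranking = rankdata(scores)     # smaller weighted rank -> better
        # proximity weights: correlation of the current ranking with
        # the ranks in each indicator column
        corr = np.array([spearmanr(ranking, R[:, j])[0] for j in range(m)])
        corr = np.clip(corr, 0, None)
        w_new = corr / corr.sum() if corr.sum() > 0 else np.full(m, 1.0 / m)
        if np.allclose(w_new, w, atol=tol):
            break                      # weights match previous iteration
        w = w_new
    return ranking, w

R = np.array([[1., 2.], [2., 1.], [3., 4.], [4., 3.]])
ranking, w = dbwpr_sketch(R)
```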

An Example
10 objects, 3 indicators.
[Data matrix and rank matrix (10 by 3, indicators q1, q2, q3); the table values did not survive the transcript.]

Rank Frequency Matrix with equal weights for the three indicators
[Table values did not survive the transcript.]

Cumulative Rank Frequency Matrix (CRFM)
[Table values did not survive the transcript.]

Example (Continued)

Equivalent Weights
Given a composite indicator with a given vector of weights, it is of interest to find an equivalent composite indicator: another vector of weights whose composite indicator induces the same ranking as the given one.
Beyond academic curiosity, knowing all such classes of equivalent weight vectors can be used to bring about reconciliation between competing indicators.

Determining Regions of Equivalent Weights
Each object has a vector of m indicator values in m-dimensional Euclidean space: its data vector.
Given a vector of weights w = (w1, w2, …, wm), the composite indicator values of the objects are the projections of their data vectors on w.
Determining regions of equivalent weights amounts to examining, for each pair of incomparable objects, the position of the weight vector relative to the two respective data vectors such that they have equal projections on the weight vector.
Each weight vector on which two incomparable data vectors have equal projections intersects the weight space w1 + w2 + … + wm = 1 and divides it in two.
All such intersections, taken over all pairs of incomparable data vectors, partition the weight space into equivalence classes of weight vectors.

An Illustration in a Two-Indicator Space
The vector w' has equal projections of the data vectors x and y. It intersects the line w1 + w2 = 1 at a certain point.
Collecting such intersection points over all pairs of incomparable objects partitions the line w1 + w2 = 1.
Any two weight vectors whose intersections with w1 + w2 = 1 lie in the interior of the same interval of the partition define equivalent composite indicators.
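In the two-indicator case these breakpoints have a closed form: with d = x − y, equal projections w1·d1 + w2·d2 = 0 and w2 = 1 − w1 give w1 = d2 / (d2 − d1), which lies in (0, 1) exactly when the pair is incomparable (d1 and d2 have opposite signs). A small sketch, with a hypothetical helper name:

```python
import numpy as np
from itertools import combinations

def weight_breakpoints(X):
    """Breakpoints on the line w1 + w2 = 1 (two-indicator case).

    For each incomparable pair of objects (each indicator favours a
    different one), the weight vector giving both objects equal
    composite scores crosses the line at w1 = d2 / (d2 - d1), d = x - y.
    Between consecutive breakpoints every weight vector induces the
    same ranking.
    """
    pts = set()
    for x, y in combinations(np.asarray(X, dtype=float), 2):
        d1, d2 = x - y
        if d1 * d2 < 0:                      # incomparable pair
            pts.add(round(d2 / (d2 - d1), 10))
    return sorted(p for p in pts if 0 < p < 1)

# Example 1 data (9 objects, 2 indicators) from later in the talk
X = [[9, 3], [2, 7], [6, 1], [6, 4], [7, 4], [9, 6], [3, 1], [1, 1], [8, 5]]
breaks = weight_breakpoints(X)
```

For instance, the pair a = (9, 3), b = (2, 7) yields the breakpoint w1 = 4/11, where both objects score 57/11.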

Example 1. 9 by 2 Data Matrix
Object  q1  q2
a        9   3
b        2   7
c        6   1
d        6   4
e        7   4
f        9   6
g        3   1
h        1   1
i        8   5

Example 2. 7 by 3 Data Matrix
Object  q1  q2  q3
a        9   3   7
b        2   7   8
c        6   1   9
d        6   4   4
e        7   3   5
f        3   1   8
g        7   4   9

Representability
Having obtained the POSET ranks, is there a composite index, with some vector w of indicator weights, that produces an identical ranking of the objects?
If so, the POSET ranking is representable by a composite index. A POSET ranking is not always representable.
If such a vector exists, we denote it by w#. How do we find a w# if one exists?
If one exists, are there more, leading to equivalent weights in the weight space with a common ranking?
We reduce the problem of finding w# to a linear programming problem.

Search for an Equivalent Composite Index
Let X = (x_ij) be the n by m data matrix with rows arranged in increasing order by the POSET ranking of the objects, where n = number of objects and m = number of indicators.
Consider the (n−1) by m matrix D = (d_ij), with d_ij = x_ij − x_(i+1)j, i = 1, 2, …, n−1.
Let w' = (w1, w2, …, wm).
Then an index with weights w exists if and only if maximizing or minimizing some suitable linear function of the w's subject to Dw >= 0, together with some additional constraints, has a solution.
To enforce ties, some of the constraints are equalities.
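This feasibility check can be sketched as a linear program, e.g. with SciPy (an assumed toolchain). The function name, the best-to-worst ordering convention, and the small eps margin used to make the ranking strict are illustrative choices, not the authors':

```python
import numpy as np
from scipy.optimize import linprog

def representing_weights(X, order, eps=1e-6):
    """Try to find weights w (w >= 0, sum w = 1) whose composite
    indicator X @ w reproduces a given strict ranking of the objects.

    `order` lists object indices from best to worst. The slides pose
    this as Dw >= 0 on the consecutive-difference matrix D; strictness
    is enforced here with an eps margin. Returns w, or None if the
    ranking is not representable.
    """
    X = np.asarray(X, dtype=float)[list(order)]   # rows best -> worst
    D = X[:-1] - X[1:]                            # consecutive differences
    m = X.shape[1]
    res = linprog(
        c=np.zeros(m),                            # pure feasibility problem
        A_ub=-D, b_ub=np.full(len(D), -eps),      # D w >= eps
        A_eq=np.ones((1, m)), b_eq=[1.0],
        bounds=[(0, None)] * m,
    )
    return res.x if res.success else None
```

For example, with X = [[9, 3], [8, 5], [2, 7]], the ranking (0, 1, 2) is representable (any w with w1 > 2/3 works), while (2, 0, 1) is not, since it would require both w2 > (7/4)·w1 and w1 > 2·w2.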

Representability Region
If a given ranking is representable, its representability region is the set of all weight vectors each of which defines a composite indicator that represents the given ranking.
The representability region can be obtained by solving the associated linear programming problem with different objective functions.
For a data matrix with two or three indicators, the representability region can be obtained with simple graphing tools.
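The "different objective functions" idea can be illustrated in the two-indicator case by minimizing and then maximizing w1 over the same constraints, which brackets the representability region on the line w1 + w2 = 1. A sketch under the same assumed conventions (hypothetical helper, eps margin for strictness):

```python
import numpy as np
from scipy.optimize import linprog

def w1_range(X, order, eps=1e-6):
    """Bound the representability region in the two-indicator case by
    solving the same LP twice with different objectives (min w1, then
    max w1). Returns the (lo, hi) interval of feasible w1 values, or
    None if the ranking in `order` (best to worst) is not representable.
    """
    X = np.asarray(X, dtype=float)[list(order)]
    D = X[:-1] - X[1:]
    out = []
    for c in ([1.0, 0.0], [-1.0, 0.0]):          # min w1, then max w1
        res = linprog(c, A_ub=-D, b_ub=np.full(len(D), -eps),
                      A_eq=[[1.0, 1.0]], b_eq=[1.0], bounds=[(0, None)] * 2)
        if not res.success:
            return None
        out.append(res.x[0])
    return tuple(out)
```

With X = [[9, 3], [8, 5], [2, 7]] and the ranking (0, 1, 2), the feasible interval is roughly w1 in (2/3, 1]: every weight vector in it induces the same ranking, matching the equivalence-class picture of the earlier slides.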

Example 1. 9 x 2 Data Matrix and Representability Region
Object  q1  q2  DBWPR
a        9   3   3
b        2   7   6
c        6   1   7
d        6   4   5
e        7   4   4
f        9   6   1
g        3   1   8
h        1   1   9
i        8   5   2

Example 2. 7 x 3 Data Matrix and Representability Region
Obj. Id  q1  q2  q3  DBWPR
a         9   3   7   2
b         2   7   8   5
c         6   1   9   3
d         6   4   4   (lost)
e         7   3   5   4
f         3   1   8   (lost)
g         7   4   9   1

Absence of Representability
Representability of the DBWP ranking is desirable: if the DBWP ranking is representable, there may be a representability region, in which case there is some scope for reconciliation with another index if that becomes necessary.
Hence, if the DBWP ranking is not representable, we look for approximate representability.
The DBWP ranking is approximately representable if there is a composite indicator whose ranking has a statistically significant correlation coefficient with the DBWP ranking.

Search for an Approximately Representable Composite Indicator
Relax constraints in the linear programming problem:
– By examining the output of the linear program solver, one may be able to eliminate some constraints from the linear programming setup and obtain a composite index that is highly correlated with the DBWP ranks.
Alternatively, search the weight space for a weight vector with maximum correlation with the DBWP ranks.
Not enough is known about the chances of success of either method, but we have an example for which both methods worked.
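The second strategy, searching the weight space for maximum correlation with the DBWP ranks, can be sketched as a brute-force grid search over the three-indicator weight simplex. This is an illustrative implementation (hypothetical function name, Spearman correlation, 0.05 grid step), not the authors' procedure:

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

def best_weights_by_grid(X, target_ranks, step=0.05):
    """Grid-search the 3-indicator weight simplex for the weight vector
    whose composite-indicator ranking has maximum Spearman correlation
    with a target ranking (e.g. the DBWP ranks)."""
    X = np.asarray(X, dtype=float)
    best = (-2.0, None)
    grid = np.arange(0.0, 1.0 + step / 2, step)
    for w1 in grid:
        for w2 in grid:
            w3 = 1.0 - w1 - w2
            if w3 < -1e-12:
                continue                      # outside the simplex
            w = np.array([w1, w2, max(w3, 0.0)])
            r = rankdata(-X @ w)              # higher score = better rank
            rho = spearmanr(r, target_ranks)[0]
            if rho > best[0]:
                best = (rho, w)
    return best                               # (max correlation, weights)

# Example 2 data matrix from earlier in the talk
X = [[9, 3, 7], [2, 7, 8], [6, 1, 9], [6, 4, 4], [7, 3, 5], [3, 1, 8], [7, 4, 9]]
```

A finer grid or a proper optimizer would sharpen the result; the point is only that a high-correlation weight vector, if one exists, can be located by direct search.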

Example: Approximate Representability
11 x 3 data matrix (objects a–l, indicators q1, q2, q3) with DBWP ranks.
[Most of the table values did not survive the transcript.]

Microsoft Excel Solver Output
[Solver output table (columns: Row Number, Z, w1, w2, w3, Constraint, Right Side); the numeric values did not survive the transcript.]

Approximate Representability

Reconciliation
The DBWP ranking is based on evidence supported by the data.
A stakeholders' ranking, if based on subjective grounds, may need to be reconciled with the DBWP ranking. Possibilities:
– The stakeholder ranking may lie in the representability or approximate-representability (RAR) region: no reconciliation needed.
– It may lie in an equivalency region adjoining the RAR region: still a difficult situation to handle.
– It may lie in a distant equivalency region: a harder situation (continued on the next slide).

Reconciliation: the Harder Situation
Suggested solution:
– Locate the equivalency region of the stakeholder index.
– Follow a path from the center of the stakeholder region to the center of the RAR region, through the intervening regions.
– At each region border, compute the correlation between the ranking due to the stakeholder index at the current location and the DBWP ranking.
– If the correlation is significant, try to persuade the stakeholder to accept the modified composite indicator.
– If it is not acceptable, it is time to review the selection of surrogate indicators, their measurements, and the data collection.

Reconciliation Example
[Table (columns: Position, w1, w2, w3, Corr. Coeff.) for positions P, Q*, R**, S; the numeric values did not survive the transcript.]