Project and Product Selection. He Jiang, Department of Management, University of Utah. April 1, 2003.

Presentation transcript:


Outline: (1) On Integrating Catalogs; (2) A Hierarchical Constraint Satisfaction Approach to Product Selection for Electronic Shopping Support; (3) A Multiple Attribute Utility Theory Approach to Ranking and Selection.

On Integrating Catalogs. Rakesh Agrawal and Ramakrishnan Srikant, IBM Almaden Research Center.

Summary. Problem: integrating documents from different sources into a master catalog. Gap: many data sources have their own categorizations, and the implicit similarity information in these source catalogs may be ignored. Approach: Naïve Bayes classification. Contribution: classification accuracy can be improved by incorporating the implicit similarity information present in the source categorizations.

Problem: Why Integration? B2C shops need to integrate catalogs from multiple vendors (e.g., Amazon); B2B portals are merged when companies merge (Chipcenter and Questlink into eChips); information portals sort documents into categories (Google and Yahoo!); corporate portals merge intra-company and external information into a uniform categorization.

Problem Identification and Model Building. Problem identification: a classification problem. The master catalog M has categories C1, C2, …, Cn; the source catalog N has categories S1, S2, …, Sm; the task is to merge the documents in N into M.

Question: How to integrate?

Straightforward Approach: completely ignore N's categorization and place each of N's products into one of M's categories according to M's classification rule.
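The straightforward approach can be sketched as an ordinary Naïve Bayes text classifier trained only on M. This is a minimal illustration, assuming a multinomial word model with add-one smoothing; the function names and the tiny catalog are hypothetical and do not come from the slides.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(master_docs):
    """master_docs: list of (category, list_of_words) pairs taken from catalog M."""
    doc_counts = Counter()               # number of documents per category
    word_counts = defaultdict(Counter)   # word frequencies per category
    vocabulary = set()
    for category, words in master_docs:
        doc_counts[category] += 1
        word_counts[category].update(words)
        vocabulary.update(words)
    return doc_counts, word_counts, vocabulary

def log_likelihood(words, category, word_counts, vocabulary):
    """log Pr(d | C) with add-one smoothing; words unseen in M are skipped."""
    counts = word_counts[category]
    total = sum(counts.values()) + len(vocabulary)
    return sum(math.log((counts[w] + 1) / total) for w in words if w in vocabulary)

def classify(words, model):
    """Pick the category of M that maximizes log Pr(C) + log Pr(d | C)."""
    doc_counts, word_counts, vocabulary = model
    n_docs = sum(doc_counts.values())

    def score(category):
        prior = math.log(doc_counts[category] / n_docs)
        return prior + log_likelihood(words, category, word_counts, vocabulary)

    return max(doc_counts, key=score)

# Hypothetical master catalog M with two categories.
master = [
    ("cameras", ["digital", "camera", "lens", "zoom"]),
    ("cameras", ["camera", "tripod", "lens"]),
    ("laptops", ["laptop", "battery", "screen"]),
    ("laptops", ["laptop", "keyboard", "memory", "screen"]),
]
model = train_naive_bayes(master)

# A document from source catalog N; its category in N is never consulted.
predicted = classify(["zoom", "lens", "kit"], model)
print(predicted)
```

Note that the source category of the document plays no role here, which is exactly the information the enhanced approach tries to exploit.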

Enhanced Approach: incorporate the implicit categorization information present in N when merging N into M.

Assumptions and Limitations. M and N are homogeneous and have significant overlap; M and N use the same vocabulary (Larkey, 1999). Catalog hierarchies are flattened and treated as a set of categories (Good 1965; Chakrabarti 1997). Different hierarchy levels: if M > N, can help distinguish categories that M doesn't have; if N > M, NBHC can be applied.

Related Work and Gaps. Naïve Bayes classifiers are accurate and fast (Chakrabarti et al. 1997, …), so a Bayesian model is chosen. Folder systems cover routing (Agrawal et al. 2000, …), action prediction (Maes 1994; Payne et al. 1997), query organization using text clustering (Sahami et al. 1998), and transferring filings (Dolin et al. 1999), but none of these systems addresses the task of merging hierarchies. The Athena system includes a facility for reorganizing a folder hierarchy into a new hierarchy (Agrawal et al. 2000), but no information from the old hierarchy is used in either building the model or routing the documents.

Straightforward Approach

Straightforward Approach (continued)

Enhanced Bayes Classification

Effect of Weight on Accuracy. The weight can make a difference for a given M and N; a tune set is used to select a good value for the weight. … in which the document will be correctly classified or will never be correctly classified. The highest possible accuracy achievable with the enhanced algorithm is no worse than what can be achieved with the basic algorithm.
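The role of the weight can be illustrated with a short sketch. It assumes that the enhanced prior for a source category S takes the form Pr(Ci | S) proportional to |Ci| times (number of documents of S that the basic classifier placed in Ci) raised to the weight, so that a weight of 0 gives back the basic classifier. The slides do not show the formula, so this form, the function names, and all numbers below are assumptions for illustration only.

```python
import math

def enhanced_log_prior(master_sizes, predicted_counts, weight):
    """log Pr(Ci | S), assumed proportional to |Ci| * n_i ** weight.

    master_sizes[c]: number of documents already in master category c.
    predicted_counts[c]: how many documents of source category S the basic
    classifier assigned to c. With weight == 0 this is the ordinary prior.
    """
    raw = {c: size * (predicted_counts.get(c, 0) ** weight)
           for c, size in master_sizes.items()}
    total = sum(raw.values())
    if total == 0:  # degenerate case: fall back to the ordinary prior
        raw = dict(master_sizes)
        total = sum(raw.values())
    return {c: (math.log(v / total) if v > 0 else float("-inf"))
            for c, v in raw.items()}

def classify_enhanced(doc_log_likelihoods, log_prior):
    """doc_log_likelihoods[c] = log Pr(d | c) from the basic model."""
    return max(doc_log_likelihoods,
               key=lambda c: log_prior[c] + doc_log_likelihoods[c])

def pick_weight(tune_set, master_sizes, predicted_counts, candidates):
    """Choose the candidate weight with the best accuracy on a tune set.

    tune_set: list of (doc_log_likelihoods, true_category) pairs for
    documents of S whose correct master category is known.
    """
    def accuracy(w):
        prior = enhanced_log_prior(master_sizes, predicted_counts, w)
        hits = sum(classify_enhanced(ll, prior) == truth for ll, truth in tune_set)
        return hits / len(tune_set)
    return max(candidates, key=accuracy)

# Hypothetical numbers: most of S's documents were placed in "cameras".
master_sizes = {"cameras": 50, "laptops": 50}
predicted_counts = {"cameras": 18, "laptops": 2}
borderline_doc = {"cameras": -10.2, "laptops": -10.0}

basic = classify_enhanced(
    borderline_doc, enhanced_log_prior(master_sizes, predicted_counts, 0))
enhanced = classify_enhanced(
    borderline_doc, enhanced_log_prior(master_sizes, predicted_counts, 1))
print(basic, enhanced)

tune_set = [(borderline_doc, "cameras"),
            ({"cameras": -9.0, "laptops": -12.0}, "cameras")]
best_weight = pick_weight(tune_set, master_sizes, predicted_counts, [0, 1, 2])
print(best_weight)
```

Because a weight of 0 is among the candidates, the tuned classifier can always fall back to the basic one, which is consistent with the claim that the best achievable accuracy is no worse than the basic algorithm's.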

Experimental Results: Data Sets Used. Synthetic catalogs: the source catalog N is derived from M using different distributions (e.g., Gaussian). Real catalogs: two real-world catalogs that have some documents in common; the first catalog minus the common documents is treated as M, and the remaining documents in the second catalog as N.

Experimental Results

Experimental Results: Catalog Size

Contributions and Future Research Directions. Contributions: standard Naïve Bayes classification is enhanced by incorporating the category information of the source catalogs; the highest accuracy of the enhanced technique is no worse than what can be achieved by standard Naïve Bayes classification. Future research: using other classifiers, such as SVMs, to incorporate the implicit information of N requires further work.

A Hierarchical Constraint Satisfaction Approach to Product Selection for Electronic Shopping Support. Young U. Ryu, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, Vol. 29, No. 6, November 1999.

Summary. Problem: proposing a product selection mechanism for electronic shopping support. Approach: hierarchical constraint satisfaction (HCS). Gap: the simple taxonomy hierarchy (STH) approach is flawed in that the search is conducted on a single generic product hierarchy; HCS is more powerful and flexible than STH.

Simple Taxonomy Hierarchy Approach

Question. 1. How do we search for a sugar-free decaffeinated cola? 2. If no cola satisfies all the requirements (cola, sugar-free, and decaffeinated), what is your recommendation?

Gaps. The search is conducted on a single generic product hierarchy; there may be no product that satisfies all the constraints; a product may be evaluated as better than another even though there is no big difference between the two.

Hierarchical Constraint Satisfaction Approach. Constraint satisfaction: a methodology for determining assignments of values to variables that are consistent with the given constraints. Hierarchical constraint satisfaction: an extension of STH that minimizes the satisfaction errors of hierarchically organized constraints based on their importance. Value of HCS: it can be applied to cases in which no solution is consistent with the given constraints because the constraints conflict.

Concepts Introduced. Constraint domain transformation: transformation of a Boolean constraint into an arithmetic constraint. Tree domain: a domain whose elements are structured as a tree and can therefore be handled more flexibly. Indifference interval: overcomes a shortcoming of hierarchical reasoning when the difference between two alternatives is small.

Constraint Satisfaction Error. The degree of satisfaction of an arithmetic constraint c is measured by the constraint satisfaction error function; Boolean constraints are first transformed into arithmetic constraints, e.g.
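One way to picture a constraint satisfaction error is as the distance by which a value misses its constraint, with zero meaning full satisfaction. The functions below are a hypothetical sketch of that idea, not the error function defined in the paper; in particular, mapping a Boolean constraint to an error of 0 or 1 is an assumption.

```python
def error_at_most(value, bound):
    """Error of the arithmetic constraint value <= bound (0 when satisfied)."""
    return max(0.0, value - bound)

def error_at_least(value, bound):
    """Error of the arithmetic constraint value >= bound (0 when satisfied)."""
    return max(0.0, bound - value)

def error_equals(value, target):
    """Error of the arithmetic constraint value == target."""
    return abs(value - target)

def error_boolean(satisfied):
    """Assumed transformation of a Boolean constraint: 0 if true, 1 if false."""
    return 0.0 if satisfied else 1.0

# Hypothetical product: costs 2.5 cents per sheet, breaks at 30 psi.
cost_error = error_at_most(2.5, 2.0)         # wanted cost <= 2.0
strength_error = error_at_least(30, 25)      # wanted strength >= 25
scent_error = error_boolean("unscented" == "aloe")  # wanted unscented
print(cost_error, strength_error, scent_error)
```

With errors expressed as numbers, constraints that cannot all be met can still be compared by how badly each candidate violates them.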

Hierarchical Reasoning and Indifference Interval

Constraint Hierarchies

Example: shopping for wipes products using the hierarchical constraint satisfaction approach. Each product is described by the following attributes. Cost: cents per sheet. Add-on materials: "baking soda", "aloe vera", …. Strength: measured by the pressure (psi) that breaks a sheet. Dispenser type: "box", "pop-up". Added artificial scent: unscented, natural aloe scented, natural jasmine scented, or chemical perfume scented. Product purpose: "general purpose", "diaper change".
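The wipes example can be sketched in code to show how a constraint hierarchy and an indifference interval interact. Everything below is hypothetical: the three products, the constraint levels, the error functions, and the simplification of using one indifference interval for every level are assumptions made for illustration, not data or rules from the paper.

```python
def select(products, hierarchy, indifference=0.0):
    """Filter products level by level, most important constraints first.

    hierarchy: list of levels; each level is a list of functions that map a
    product to a non-negative satisfaction error. At each level, only the
    products whose total error is within `indifference` of the best error
    survive to the next level.
    """
    candidates = list(products)
    for level in hierarchy:
        errors = {p["name"]: sum(c(p) for c in level) for p in candidates}
        best = min(errors.values())
        candidates = [p for p in candidates
                      if errors[p["name"]] - best <= indifference]
    return [p["name"] for p in candidates]

products = [
    {"name": "A", "cost": 2.0, "strength": 30,
     "scent": "unscented", "purpose": "diaper change"},
    {"name": "B", "cost": 1.9, "strength": 18,
     "scent": "unscented", "purpose": "diaper change"},
    {"name": "C", "cost": 1.5, "strength": 35,
     "scent": "chemical perfume", "purpose": "general purpose"},
]

hierarchy = [
    # Level 1 (most important): purpose and scent, as 0/1 errors.
    [lambda p: 0.0 if p["purpose"] == "diaper change" else 1.0,
     lambda p: 0.0 if p["scent"] == "unscented" else 1.0],
    # Level 2: cost of at most 1.8 cents per sheet.
    [lambda p: max(0.0, p["cost"] - 1.8)],
    # Level 3: strength of at least 25 psi.
    [lambda p: max(0.0, 25 - p["strength"])],
]

strict = select(products, hierarchy)                       # ['B']
tolerant = select(products, hierarchy, indifference=0.15)  # ['A']
print(strict, tolerant)
```

Without an indifference interval, B wins on a cost difference of about 0.1 cent even though it is far weaker; with the interval, A and B tie on cost and the strength constraint decides in favor of A. No product satisfies every constraint, yet a recommendation is still produced.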

Example: Result

Contributions and Future Research Directions. Contribution: the product search mechanism is viewed as a satisfaction problem of hierarchically organized constraints over product attributes, which makes it more powerful and flexible than product selection based on a single product taxonomy hierarchy. Future research: purchasing requirement specification or constraint hierarchy elicitation; a complete prototype implementation of the HCS approach; actual purchasing/sales transactions based on speech-act theory, illocutionary logic, and inter-organizational activity coordination.

A Multiple Attribute Utility Theory Approach to Ranking and Selection. John Butler, Douglas J. Morrice, and Peter W. Mullarkey, Management Science, Vol. 47, No. 6, June 2001.

Summary. Problem: developing a ranking and selection procedure for comparing systems that have multiple performance measures. Approach: combining Multiple Attribute Utility Theory (MAUT) with statistical ranking and selection (R&S) using an indifference zone. Gaps: the costing approach is flawed in that accurate cost data may not be available, and it may be difficult to measure performance using costs. Advantages: rigorous; close to business practice; simpler to implement; can estimate the number of simulations required; can assess the relative importance of criteria.

Gaps. Most of the R&S literature focuses on procedures that reduce the multivariate performance measures to a scalar performance measure, but these procedures have disadvantages: accurate cost data may not be available, and it may be difficult to attach an accurate dollar value to intangible variables. Current techniques may require the complicated step of estimating a covariance matrix (Gupta & Panchapakesan 1979). Previous work does not provide a way to estimate the number of simulations required to select the best configuration with a high probability (Andijani 1998; Kim & Lin 1999). Previous work also lacks a trade-off mechanism that allows the decision maker to combine disparate performance measures.

Assumptions. The decision maker's preferences are accurately represented (Clemen 1991; Keeney & Raiffa 1976). Performance measures that are converted to "utils" can be converted back to meaningful units by choosing an invertible utility function. There is an indifference zone for the decision maker on all the performance measures.

General Outline of the Procedure

Multilinear Utility Function

Multiplicative MAU Model

Additive MAU Model. If the attributes are mutually additive independent, then the utility function takes the additive form. Example of additive independence:

Single Attribute Utility Function Used. Methods for assigning weights: the trade-off method; the analytic hierarchy process (AHP).
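An additive MAU model combines single-attribute utilities as a weighted sum, u(x) = w1 u1(x1) + … + wn un(xn), with weights that sum to 1. The sketch below assumes an exponential single-attribute utility scaled to run from 0 at the worst level to 1 at the best level; the slide's actual function is not preserved in this transcript, so that choice, the parameter `rho`, the weights, and the attribute ranges are all illustrative assumptions.

```python
import math

def exp_utility(x, worst, best, rho=0.5):
    """Assumed exponential utility, scaled so u(worst) = 0 and u(best) = 1.

    Works for attributes where lower is better (worst > best) as well as
    for attributes where higher is better. rho controls the curvature.
    """
    z = (x - worst) / (best - worst)  # normalized position in [0, 1]
    return (1 - math.exp(-z / rho)) / (1 - math.exp(-1 / rho))

def additive_utility(outcome, attributes):
    """Weighted sum of single-attribute utilities (weights sum to 1)."""
    return sum(spec["weight"] * exp_utility(outcome[name], spec["worst"], spec["best"])
               for name, spec in attributes.items())

# Hypothetical attributes: lower cost and shorter duration are better.
attributes = {
    "cost":     {"weight": 0.6, "worst": 500.0, "best": 300.0},
    "duration": {"weight": 0.4, "worst": 60.0,  "best": 30.0},
}

u_best = additive_utility({"cost": 300.0, "duration": 30.0}, attributes)
u_worst = additive_utility({"cost": 500.0, "duration": 60.0}, attributes)
u_mixed = additive_utility({"cost": 400.0, "duration": 30.0}, attributes)
print(u_best, u_worst, u_mixed)
```

Because this utility function is strictly increasing in the normalized level, it is invertible, so a utility value can be translated back into the units of a single performance measure.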

Question: What is the benefit of using this function?

R&S Experimental Set-up. Correct selection (CS): the R&S procedure correctly identifies the configuration with the largest expected utility. A two-stage indifference zone procedure is used for R&S.
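A two-stage indifference zone procedure can be sketched as follows. The slide does not say which procedure is used; this sketch assumes a Rinott-style rule, in which a first stage of n0 replications per configuration estimates the variance, and the total sample size is max(n0, ceil((h * S / delta)^2)). The constant h normally comes from published tables for the chosen n0, number of configurations, and probability of correct selection; the value used here, like the utility samples, is only a placeholder.

```python
import math
import statistics

def second_stage_sizes(first_stage, h, delta):
    """Total replications needed per configuration.

    first_stage: dict mapping configuration name to its first-stage
    utility observations (n0 values each).
    h: Rinott-style constant (assumed to be looked up in a table).
    delta: indifference zone width, in utility units.
    """
    sizes = {}
    for name, samples in first_stage.items():
        n0 = len(samples)
        s = statistics.stdev(samples)  # first-stage sample standard deviation
        sizes[name] = max(n0, math.ceil((h * s / delta) ** 2))
    return sizes

# Hypothetical first-stage utilities from five simulation runs each.
first_stage = {
    "config_1": [0.62, 0.58, 0.65, 0.60, 0.61],
    "config_2": [0.70, 0.40, 0.55, 0.85, 0.50],
}
sizes = second_stage_sizes(first_stage, h=3.0, delta=0.05)
print(sizes)
```

After the additional replications are run, the configuration with the largest overall sample mean utility is selected. The noisier configuration needs far more runs, which is how such a procedure lets the number of simulations be estimated in advance.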

Selection of A Utility Exchange Approach. Table 1: Alternatives by Measures Matrix for Car Selection. Table 2: Equivalent Hypothetical Cars.

Question Again: Does it mean that 20 horsepower is worth $1,200?

Selection of

Establishing the Indifference Zone. Curve dividing the indifference and preference zones:


Example:

Application of the Procedure: Case Description. Case example: land seismic survey. Performance measures: survey cost; survey duration; utilization of the four crews. Relationship of the crews:

Application of the Procedure: Results

Application of the Procedure: Sensitivity Analysis to Weight

Contributions and Future Research Directions. Contribution: provides a formal procedure that can be applied to realistic problems; presents a scalar performance measure that summarizes performance on multiple criteria, including nonlinear preference functions and the relative importance of the measures. Future research: combine MAU theory with the work of Chen et al.; extend the MAU methodology with Chick and Inoue's work to include their Bayesian technique and relieve some of the computational burden of R&S procedures; combine the work in this paper with R&S procedures designed to facilitate variance reduction through the use of common random numbers (see Matejcik and Nelson 1995 and Goldman and Nelson 1998).