Published by Addison Lackland. Modified over 10 years ago.
1
Similarity Evaluation Techniques for Filtering Problems
Vagan Terziyan, University of Jyvaskyla, vagan@it.jyu.fi
2
Evaluating Distance between Various Domain Objects and Concepts: one of the basic abilities of an intelligent agent
Are these two the same? … No! The difference is equal to 0.234
3
Contents
- Goal
- Basic Concepts
- External Similarity Evaluation
- An Example
- Internal Similarity Evaluation
- Conclusions
4
Reference
Puuronen S., Terziyan V., A Similarity Evaluation Technique for Data Mining with an Ensemble of Classifiers, In: A.M. Tjoa, R.R. Wagner and A. Al-Zobaidie (Eds.), Proc. of the 11th Intern. Workshop on Database and Expert Systems Applications, IEEE CS Press, Los Alamitos, California, 2000, pp. 1155-1159. http://dlib.computer.org/conferen/dexa/0680/pdf/06801155.pdf
5
Goal
- The goal of this research is to develop a simple similarity evaluation technique to be used for social filtering.
- The result of social filtering here is a prediction of a customer's evaluation of a certain product, based on the known opinions of other customers about this product.
6
Basic Concepts: Virtual Training Environment (VTE)
VTE is a quadruple <D, C, S, P>:
- D is the set of goods D_1, D_2, ..., D_n in the VTE;
- C is the set of evaluation marks C_1, C_2, ..., C_m that are used to rank the products;
- S is the set of customers S_1, S_2, ..., S_r who select evaluation marks to rank the products;
- P is the set of semantic predicates that define relationships between D, C, S.
7
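Basic Concepts: Semantic Predicate P
P(D_i, C_j, S_k) = 1 if the customer S_k selects the mark C_j to evaluate the product D_i; -1 if S_k refuses to select C_j for D_i; 0 if S_k neither selects nor refuses C_j for D_i.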
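As a minimal sketch, the predicate can be held in a lookup table keyed by (product, mark, customer). The names `VOTES` and `predicate` are illustrative assumptions and do not come from the slides. The values 1, -1 and 0 follow the coding used in the example tables later in the deck; treating an unlisted triple as 0 is a further assumption.

```python
from typing import Dict, Tuple

# Hypothetical storage for P: (product, mark, customer) -> vote.
# 1 = the customer selects the mark, -1 = refuses it, 0 = does neither.
VOTES: Dict[Tuple[str, str, str], int] = {
    ("D1", "C4", "S2"): 1,   # Wolf selects "Reliable" for the Reel
    ("D1", "C2", "S2"): -1,  # Wolf refuses "Expensive" for the Reel
}

def predicate(product: str, mark: str, customer: str) -> int:
    """Return P(product, mark, customer); unlisted triples count as 0 (assumption)."""
    return VOTES.get((product, mark, customer), 0)

print(predicate("D1", "C4", "S2"))
print(predicate("D1", "C1", "S2"))
```

A sparse dictionary is only one possible layout; a dense three-dimensional array indexed by (i, j, k) would serve equally well for small sets.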
8
Problem 1: Deriving External Similarity Values
9
External Similarity Values
External Similarity Values (ESV): binary relations DC, SC, and SD between the elements of (sub)sets of D and C, S and C, and S and D. ESV are based on the total support among all the customers for voting for the appropriate connection (or refusal to vote).
10
Problem 2: Deriving Internal Similarity Values
11
Internal Similarity Values
Internal Similarity Values (ISV): binary relations between two subsets of D, two subsets of C, and two subsets of S. ISV are based on the total support among all the customers for voting for the appropriate connection (or refusal to vote).
12
Why Do We Need Similarity Values (or a Distance Measure)?
- The distance between products is used to advertise a new product to customers based on their evaluations of already known similar products.
- The distance between evaluations is necessary to estimate the evaluation error when needed, e.g. when adaptive filtering technologies are used.
- The distance between customers is useful to evaluate the weights of all customers when needed, e.g. to integrate their opinions by weighted voting.
13
Deriving External Relation DC: How well an evaluation fits the product
[Diagram labels: Customers, Products, Evaluation marks]
14
Deriving External Relation SC: Measures a customer's competence in the use of evaluation marks
- The value of the relation (S_k, C_j) represents the total support that the customer S_k obtains by selecting (or refusing to select) the mark C_j to evaluate all the products.
15
Example of SC Relation
[Diagram labels: Customers, Products, Evaluation marks]
16
Deriving External Relation SD: Measures a customer's competence in the products
- The value of the relation (S_k, D_i) represents the total support that the customer S_k receives by selecting (or refusing to select) all the evaluation marks to evaluate the product D_i.
17
Example of SD Relation
[Diagram labels: Products, Evaluation marks, Customers]
18
Normalizing External Relations to the Interval [0,1]
n is the number of products
m is the number of evaluation marks
r is the number of customers
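The normalization formulas themselves are not preserved in this transcript. The following is a hedged sketch of one consistent reading, assuming votes coded as 1, -1, 0, DC defined as the sum of votes over all r customers, and SC and SD defined as sums of a customer's own vote multiplied by DC. Under those assumptions DC lies in [-r, r], SC in [-n(r-2), n·r] and SD in [-m(r-2), m·r]; the function names are illustrative.

```python
def rescale(value: float, low: float, high: float) -> float:
    """Min-max rescaling of value from [low, high] onto [0, 1]."""
    return (value - low) / (high - low)

def normalize_dc(dc: float, r: int) -> float:
    # DC sums r votes from {-1, 0, 1}, so it lies in [-r, r].
    return rescale(dc, -r, r)

def normalize_sc(sc: float, n: int, r: int) -> float:
    # Assumed range: each of the n terms (own vote * DC) lies in [-(r-2), r].
    # Requires r > 1, otherwise the range collapses to a single point.
    return rescale(sc, -n * (r - 2), n * r)

def normalize_sd(sd: float, m: int, r: int) -> float:
    # Same argument as for SC, with m marks instead of n products.
    return rescale(sd, -m * (r - 2), m * r)

print(normalize_dc(2, r=4))       # 0.75
print(normalize_sc(3, n=3, r=4))  # 0.5
```

The asymmetric lower bound comes from the customer's own vote being counted inside DC: a customer can be outvoted by at most r-1 others, so one term cannot fall below -(r-2).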
19
Competence of a customer
[Diagram: a customer is linked to the goods D_i (conceptual pattern of goods features) by competence in the goods, and to the evaluation marks C_j (conceptual pattern of evaluation mark definitions) by competence in the evaluation marks]
20
Customer's Evaluation: competence quality in products
21
Customer's Evaluation: competence quality in the use of evaluation marks
22
Quality Balance Theorem
The evaluation of a customer's competence (ranking, weighting, quality evaluation) does not depend on the competence area (the virtual world of products or the conceptual world of evaluation marks), because both competence values are always equal.
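The statement can be illustrated numerically (this is a check on toy data, not the proof). The sketch assumes that DC sums the votes of all customers, that SD and SC weight a customer's own votes by DC, and that a competence quality is the mean of the values rescaled to [0, 1]; the toy votes and all function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical toy data: votes[i][k][j] is the vote of customer k on
# mark j for product i, coded 1 (select), -1 (refuse), 0 (neither).
votes = [
    [[1, -1, 0], [1, 0, -1], [0, -1, 1]],   # product 0
    [[-1, 1, 1], [0, 1, -1], [1, 1, 0]],    # product 1
]
n, r, m = len(votes), len(votes[0]), len(votes[0][0])

# Assumed definitions: DC sums the votes of all customers; SD and SC
# weight a customer's own votes by that total support.
dc = [[sum(votes[i][k][j] for k in range(r)) for j in range(m)] for i in range(n)]
sd = [[sum(dc[i][j] * votes[i][k][j] for j in range(m)) for i in range(n)] for k in range(r)]
sc = [[sum(dc[i][j] * votes[i][k][j] for i in range(n)) for j in range(m)] for k in range(r)]

def quality_in_products(k):
    """Mean of customer k's SD values, each rescaled from [-m(r-2), m*r] to [0, 1]."""
    return sum(Fraction(v + m * (r - 2), 2 * m * (r - 1)) for v in sd[k]) / n

def quality_in_marks(k):
    """Mean of customer k's SC values, each rescaled from [-n(r-2), n*r] to [0, 1]."""
    return sum(Fraction(v + n * (r - 2), 2 * n * (r - 1)) for v in sc[k]) / m

for k in range(r):
    print(k, quality_in_products(k), quality_in_marks(k))
```

Both qualities reduce to the same double sum of DC times the customer's vote over all products and marks, divided by the same constant, which is why the two printed columns agree for every customer.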
23
Proof...
24
An Example
- Let us suppose that four customers have to evaluate three products from a virtual shop using five available evaluation marks.
- The customers should select an appropriate mark for every product.
- The final goal is to obtain a cooperative evaluation result from all the customers concerning the quality of the products.
25
C Set (Evaluation Marks) in the Example
Evaluation mark    Notation
Nicely designed    C_1
Expensive          C_2
Easy to use        C_3
Reliable           C_4
Safe               C_5
26
S Set (Customers) in the Example
Customer ID    Notation
Fox            S_1
Wolf           S_2
Cat            S_3
Hare           S_4
27
D Set (Products) in the Example
D_1 - Ultra Cast Spinning Reel
D_2 - Nokia Communicator 9110
D_3 - iGrafx Process Management Software
28
Evaluations Made for the Good Reel (D_1)
P(D,C,S)   C_1    C_2    C_3    C_4    C_5
S_1         1     -1     -1      0     -1
S_2         0+    -1**    0++    1*    -1***
S_3         0      0     -1      1      0
S_4         1     -1      0      0      1
Customer Wolf (S_2) prefers to select the mark Reliable (*) to evaluate the Reel and refuses to select Expensive (**) or Safe (***). Wolf neither selects nor refuses the marks Nicely designed (+) and Easy to use (++).
29
Evaluations Made for the Good Communicator (D_2)
P      C_1   C_2   C_3   C_4   C_5
S_1    -1     0    -1     0     1
S_2     1    -1    -1     0     0
S_3     1    -1     0     1     1
S_4    -1     0     0     1     0
30
Evaluations Made for the Good Software (D_3)
P      C_1   C_2   C_3   C_4   C_5
S_1     1     0     1    -1     0
S_2     0     1     0    -1     1
S_3    -1    -1     1    -1     1
S_4    -1    -1     1    -1     1
31
Example: Calculating Value DC_{3,4}
D_3
P      C_1   C_2   C_3   C_4   C_5
S_1     1     0     1    -1     0
S_2     0     1     0    -1     1
S_3    -1    -1     1    -1     1
S_4    -1    -1     1    -1     1
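The calculation on this slide can be sketched as follows, assuming that DC_{i,j} is the sum over all customers of their votes for mark C_j on product D_i (the slide's own formula is not preserved in the transcript). The votes are copied from the table for D_3; the function name is illustrative.

```python
# Votes for product D3 (Software), copied from the table: one row per
# customer S1..S4, one column per mark C1..C5.
d3_votes = [
    [1, 0, 1, -1, 0],     # S1 (Fox)
    [0, 1, 0, -1, 1],     # S2 (Wolf)
    [-1, -1, 1, -1, 1],   # S3 (Cat)
    [-1, -1, 1, -1, 1],   # S4 (Hare)
]

def dc_value(votes, mark_index):
    """Sum the votes of all customers for one mark (assumed definition of DC)."""
    return sum(row[mark_index] for row in votes)

dc_3_4 = dc_value(d3_votes, 3)  # C4 ("Reliable") is column index 3
print(dc_3_4)  # -4
```

All four customers refuse the mark Reliable for the Software, so the sum reaches its minimum of -4, which fits the later verbal result that D_3 is not reliable.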
32
Resulting DC relation
33
Normalized and Thresholded DC relation
[Figure: scale from 0 to 1 with marks at 0.25, 0.5 and 0.75]
34
Result of Cooperative Goods Evaluation Based on DC Relation
- D_1 is nicely designed, reliable, not expensive, but not easy to use
- D_2 is reliable, safe, not expensive, but not easy to use
- D_3 is easy to use, safe, but not reliable
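These verbal results can be reproduced from the three vote tables under two assumptions that are not stated explicitly in the transcript: DC is the sum of all customers' votes, rescaled to [0, 1] as (DC + r) / (2r), and a mark is asserted at 0.75 or above and negated at 0.25 or below. The names `MARKS`, `VOTES` and `describe` are illustrative.

```python
MARKS = ["nicely designed", "expensive", "easy to use", "reliable", "safe"]

# Rows are customers S1..S4, columns are marks C1..C5 (from the tables).
VOTES = {
    "D1": [[1, -1, -1, 0, -1], [0, -1, 0, 1, -1], [0, 0, -1, 1, 0], [1, -1, 0, 0, 1]],
    "D2": [[-1, 0, -1, 0, 1], [1, -1, -1, 0, 0], [1, -1, 0, 1, 1], [-1, 0, 0, 1, 0]],
    "D3": [[1, 0, 1, -1, 0], [0, 1, 0, -1, 1], [-1, -1, 1, -1, 1], [-1, -1, 1, -1, 1]],
}

def describe(rows, low=0.25, high=0.75):
    """Label a product with the marks whose normalized support passes a threshold."""
    r = len(rows)
    labels = []
    for j, mark in enumerate(MARKS):
        dc = sum(row[j] for row in rows)   # assumed DC: total of all votes
        score = (dc + r) / (2 * r)         # assumed rescaling from [-r, r] to [0, 1]
        if score >= high:
            labels.append(mark)
        elif score <= low:
            labels.append("not " + mark)
    return labels

for product, rows in VOTES.items():
    print(product, describe(rows))
```

Marks whose score falls strictly between the two thresholds, such as Safe for D_1, are left out of the description, matching the slide's omission of them.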
35
An Example: Calculating Value SD_{1,1}
36
An Example: Calculating Value SC_{4,4}
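The two calculations can be sketched under the assumption that SD_{k,i} sums, over all marks, the customer's own vote multiplied by DC_{i,j}, and that SC_{k,j} does the same over all products. These definitions follow the verbal descriptions of SD and SC given earlier, but the exact formulas are not preserved in the transcript, so treat them as one plausible reading.

```python
# votes[i][k][j]: vote of customer S(k+1) on mark C(j+1) for product D(i+1),
# copied from the three evaluation tables of the example.
votes = [
    [[1, -1, -1, 0, -1], [0, -1, 0, 1, -1], [0, 0, -1, 1, 0], [1, -1, 0, 0, 1]],   # D1
    [[-1, 0, -1, 0, 1], [1, -1, -1, 0, 0], [1, -1, 0, 1, 1], [-1, 0, 0, 1, 0]],    # D2
    [[1, 0, 1, -1, 0], [0, 1, 0, -1, 1], [-1, -1, 1, -1, 1], [-1, -1, 1, -1, 1]],  # D3
]
n, r, m = len(votes), len(votes[0]), len(votes[0][0])

# Assumed DC: total of all customers' votes per (product, mark) pair.
dc = [[sum(votes[i][k][j] for k in range(r)) for j in range(m)] for i in range(n)]

def sd(k, i):
    """Support customer k receives across all marks for product i (assumed form)."""
    return sum(dc[i][j] * votes[i][k][j] for j in range(m))

def sc(k, j):
    """Support customer k receives across all products for mark j (assumed form)."""
    return sum(dc[i][j] * votes[i][k][j] for i in range(n))

print(sd(0, 0))  # SD_{1,1}: Fox on the Reel -> 8
print(sc(3, 3))  # SC_{4,4}: Hare on "Reliable" -> 6
```

With the rescaling (SD + m(r-2)) / (2m(r-1)), which is also an assumption, Fox's values for D_1, D_2 and D_3 become 0.60, 0.47 and 0.53; taking 0.5 as the cut-off gives the same accept and reject pattern that the thresholded SD slide describes for Fox.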
37
Resulting SD and SC relations
38
Normalized and Thresholded SD relation
[Figure rows: Fox, Wolf, Cat, Hare]
Evaluations obtained from the customer Fox should be accepted if he evaluates goods similar to Reels or similar to Software. Fox's evaluations should be rejected if they concern goods similar to Communicator.
39
Normalized and Thresholded SD relation
[Figure rows: Fox, Wolf, Cat, Hare]
Only an evaluation from the customer Cat can be accepted if it concerns goods similar to Communicator. All four customers are expected to give acceptable evaluations of Software-related goods.
40
Normalized and Thresholded SC relation
[Figure rows: Fox, Wolf, Cat, Hare; columns: Nicely designed, Expensive, Easy to use, Reliable, Safe]
An evaluation obtained from the customer Fox should be accepted if it concerns the usability (easy to use) or the reliability of a good. Fox's evaluations should be rejected if they concern the design of goods.
41
Problem 2: Deriving Internal Similarity Values
43
Deriving Internal Similarity Values
- Via one intermediate set
- Via two intermediate sets
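One way to read "via one intermediate set" and "via two intermediate sets" is as a composition of external relations, computed like a matrix product. The sketch below assumes exactly that; the slide's own formulas are not preserved, and the small matrices and the helper names `matmul` and `transpose` are hypothetical.

```python
def transpose(a):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Plain matrix product: sum of products over the shared (intermediate) set."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Hypothetical external relations for 2 customers, 2 goods, 3 marks.
sd = [[2, -1], [1, 3]]          # customers x goods
sc = [[1, 0, 2], [-1, 2, 1]]    # customers x marks
dc = [[1, -2, 0], [0, 1, 2]]    # goods x marks

# One intermediate set: customers compared through the goods (S -> D -> S).
ss_goods = matmul(sd, transpose(sd))

# Two intermediate sets: S -> C -> D -> S.
ss_marks_goods = matmul(matmul(sc, transpose(dc)), transpose(sd))

print(ss_goods)
print(ss_marks_goods)
```

Under this reading the one-set similarity is symmetric, while the two-set chain need not be, because it starts from a customer's mark competence and ends at another customer's goods competence. The resulting values would still need normalization and thresholding before being read as similar, neutral or different.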
44
Internal Similarity for Customers: Goods-Based Similarity
[Diagram labels: Goods, Customers]
45
Internal Similarity for Customers: Evaluation Marks-Based Similarity
[Diagram labels: Evaluation marks, Customers]
46
Internal Similarity for Customers: Evaluation Marks-Goods-Based Similarity
[Diagram labels: Customers, Evaluation marks, Goods]
47
Internal Similarity for Evaluation Marks
- Customers-based similarity
- Goods-based similarity
- Goods-customers-based similarity
48
Internal Similarity for Goods
- Customers-based similarity
- Evaluation marks-based similarity
- Evaluation marks-customers-based similarity
49
Normalized and Thresholded DD^C relation
[Figure legend: similar, neutral, different]
50
Conclusion
- We discussed methods of deriving the total support of each binary similarity relation. This can be used, for example, to derive the most supported goods evaluation and to rank the customers according to their competence.
- We also discussed relations between elements taken from the same set: goods, evaluation marks, or customers. This can be used, for example, to divide customers into groups of similar competence relative to the goods evaluation environment.