Published by Allison Richards. Modified over 8 years ago.
1
Intelligent Database Systems Lab, 國立雲林科技大學 (National Yunlin University of Science and Technology)
Electricity Based External Similarity of Categorical Attributes
Advisor: Dr. Hsu
Presenter: Zih-Hui Lin
Authors: Christopher R. Palmer and Christos Faloutsos
PAKDD
2
Outline: Motivation, Objective, Algorithm, Experiments, Conclusions
3
Motivation
A great deal of categorical data is stored in databases, but it is difficult to define similarity between categorical values. The only previously proposed external similarity function is ad hoc, whereas REP is theoretically grounded.
4
Objective
The goal of this research is to make effective use of previously established theory.
5
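Introduction
Ohm's law and conductance: $I = V/R$, $C = 1/R$.
Edge conductance and resistance: $C(x,y) = w(x,y)$, $R(x,y) = 1/w(x,y)$.
$P(x \to^{*} y / S) =$ [right-hand side missing from the transcript; the fragment $w(x,y)/C_y$ appears nearby]
C is the conductance, taken here to be the edge weight: the higher the conductance, the larger the value. The higher the value, the higher the probability of going from x to y and back to x.
Escape probability is defined as the probability that a walk started from x will reach S before it returns to x.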
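The escape probability defined above can be estimated by simulating walks directly. The following is a minimal sketch, not the authors' method: it assumes the standard weighted random walk, in which the walk at a node steps to neighbour y with probability equal to the edge weight divided by the sum of that node's edge weights. The function name `escape_probability` and the toy graph are hypothetical.

```python
import random


def escape_probability(weights, x, sinks, trials=20000, seed=0):
    """Monte Carlo estimate of the probability that a walk started at x
    reaches a node in `sinks` before it returns to x.

    `weights[u][v]` is the conductance w(u, v) of edge (u, v).
    Assumption: the walk moves from u to v with probability w(u, v) / C_u,
    where C_u is the sum of the weights of the edges at u.
    """
    rng = random.Random(seed)
    escaped = 0
    for _ in range(trials):
        node = x
        while True:
            neighbours = list(weights[node])
            conductances = [weights[node][n] for n in neighbours]
            node = rng.choices(neighbours, weights=conductances)[0]
            if node in sinks:   # reached S first: the walk escaped
                escaped += 1
                break
            if node == x:       # returned to x first: no escape
                break
    return escaped / trials


# Toy path graph x - a - s with unit conductances (illustrative data only).
graph = {
    "x": {"a": 1.0},
    "a": {"x": 1.0, "s": 1.0},
    "s": {"a": 1.0},
}
estimate = escape_probability(graph, "x", {"s"})
```

On this toy graph the exact value is 0.5: the walk always steps from x to a, and from a it is equally likely to reach s or fall back to x. The estimate should land close to that value.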
6
REP algorithm
1. Adding sink nodes to a graph
2. S_REP is symmetric
3. S_REP as electrical currents
4. Kirchhoff relaxation algorithm for voltages
5. Distance function
Sink edge weights: $C_{G_0}(x) + w(x,s) = \text{total}$
[Figure: old and new graphs with sink nodes S1, S2, …, Si attached; voltages 1 and 0]
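Step 4 names a relaxation algorithm for voltages. The sketch below shows the general idea under stated assumptions, not the slide's exact procedure: the source node is held at voltage 1, a set of grounded nodes (for example, sinks) is held at 0, and every other node is repeatedly set to the conductance-weighted average of its neighbours' voltages, which is what Kirchhoff's current law requires at equilibrium. The names `relax_voltages` and `current_out_of`, and the toy circuit, are hypothetical.

```python
def relax_voltages(weights, source, grounded, sweeps=2000, tol=1e-12):
    """Iteratively solve for node voltages with V(source) = 1 and
    V(g) = 0 for every g in `grounded`.

    `weights[u][v]` is the conductance of edge (u, v).
    """
    voltage = {node: 0.0 for node in weights}
    voltage[source] = 1.0
    for _ in range(sweeps):
        change = 0.0
        for node in weights:
            if node == source or node in grounded:
                continue  # boundary nodes keep their fixed voltage
            total = sum(weights[node].values())
            new = sum(w * voltage[n] for n, w in weights[node].items()) / total
            change = max(change, abs(new - voltage[node]))
            voltage[node] = new
        if change < tol:
            break
    return voltage


def current_out_of(weights, voltage, source):
    """Total current leaving `source`, using I = (V_u - V_v) * C(u, v) per edge."""
    return sum(w * (voltage[source] - voltage[n])
               for n, w in weights[source].items())


# Toy circuit: two parallel paths from x to s (illustrative data only).
circuit = {
    "x": {"a": 1.0, "b": 2.0},
    "a": {"x": 1.0, "s": 1.0},
    "b": {"x": 2.0, "s": 2.0},
    "s": {"a": 1.0, "b": 2.0},
}
volts = relax_voltages(circuit, "x", {"s"})
current = current_out_of(circuit, volts, "x")
```

Here both interior nodes settle at 0.5 volts and the current leaving x is 1.5. Dividing that current by the total conductance at x (3.0) gives 0.5, which matches the classical link between unit-voltage current and escape probability; whether S_REP uses exactly this normalisation is an assumption here.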
7
Experiments: Clustering
8
Experiments: Classification
Table 1. Error rates show that REP has classification error similar to C4.5 and better than nearest-neighbour (NN) classification with Hamming distance.
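For reference, the NN-with-Hamming-distance baseline named in Table 1 can be sketched as follows. This is an illustrative 1-nearest-neighbour classifier, not the experimental setup from the paper: Hamming distance counts the attributes on which two categorical tuples disagree, so every mismatch costs the same regardless of which values are involved. The training rows and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length tuples differ."""
    return sum(1 for u, v in zip(a, b) if u != v)


def nearest_neighbour_label(train, query):
    """Return the label of the training row closest to `query`.

    `train` is a list of (attributes, label) pairs; ties go to the
    earliest row because min() keeps the first minimum it sees.
    """
    _, label = min(train, key=lambda row: hamming(row[0], query))
    return label


# Hypothetical categorical training data.
train = [
    (("red", "small", "round"), "apple"),
    (("yellow", "long", "curved"), "banana"),
    (("green", "small", "round"), "lime"),
]
prediction = nearest_neighbour_label(train, ("red", "small", "oval"))
```

The query differs from the "apple" row in one attribute, from "lime" in two, and from "banana" in three, so the prediction is "apple". An external similarity function such as S_REP would replace the all-or-nothing mismatch count with graded distances between values.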
9
Experiments: Sensitivity to sink_p and scalability (figure annotation: 0.85)
10
Conclusions
We used this node similarity function to provide an external similarity function called S_REP, which:
─ is better than the best existing external distance function,
─ is built upon a theoretical foundation (the existing approach is not),
─ provides excellent scalability with input size, and
─ allows cross-attribute similarity computations, which in turn allow excellent nearest neighbour classification.
11
My opinion
Advantage: combines knowledge from other fields …..
Disadvantage:
Application: automatically building concept hierarchies ……