Web Information Extraction: Concept Detection. Amir R. Tahamtan


1 Web Information Extraction: Concept Detection. Amir R. Tahamtan

2 Concept Detection
- Goals: discover knowledge, find associations
- Discussed techniques: concept mining, document clustering
- Related work: keyword-based search, resource discovery, wrapper information extraction, Web queries, user preferences

3 Fu Y., Bauer T., Mostafa J., Palakal M., and Mukhopadhyay S. (2002): Concept Extraction and Association from Cancer Literature. Proceedings of the 4th International Workshop on Web Information and Data Management. McLean, Virginia, USA.
- Introduction
- Algorithm
- Experiments & Conclusion

4 The Algorithm
- Token discovery
- tf.idf: $W_{ik} = t_{ik} \times \log(N / n_k)$
- LSA
- Data representation as a term-document matrix
- Factorization: $X_{t \times o} = T_{t \times r} \, S_{r \times r} \, O_{r \times o}$
- Approximation: $X_{t \times o} \approx X'_{t \times o} = T_{t \times k} \, S_{k \times k} \, O_{k \times o}$
- Token association discovery
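
The tf.idf weighting and the rank-k LSA approximation on this slide can be sketched with NumPy. This is a minimal illustration, not the implementation used by Fu et al. It assumes that i indexes documents and k indexes terms, that N is the number of documents and n_k the number of documents containing term k, and that the factorization is a singular value decomposition. The function names `tfidf` and `lsa_approx` and the toy counts are made up for the example.

```python
import numpy as np


def tfidf(counts):
    """Weight a (documents x terms) count matrix: W_ik = t_ik * log(N / n_k)."""
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[0]                     # N: number of documents
    doc_freq = np.count_nonzero(counts, axis=0)  # n_k: documents containing term k
    # Terms that occur in no document get weight 0 instead of log(N / 0).
    idf = np.zeros(counts.shape[1])
    seen = doc_freq > 0
    idf[seen] = np.log(n_docs / doc_freq[seen])
    return counts * idf


def lsa_approx(X, k):
    """Rank-k approximation X' = T_k S_k O_k of a (terms x documents) matrix."""
    T, s, O = np.linalg.svd(X, full_matrices=False)
    return T[:, :k] @ np.diag(s[:k]) @ O[:k, :]


# Toy data (hypothetical): 3 documents, 4 terms.
counts = np.array([
    [2, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 3, 1],
])
W = tfidf(counts)           # documents x terms
X = W.T                     # terms x documents, the t x o layout on the slide
X_k = lsa_approx(X, k=2)    # keep the two largest singular values
```

Keeping only the k largest singular values discards the directions that carry the least variance, which is what lets terms that co-occur across documents end up close together in the reduced space.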

7 Liu B., Chin C. W., Ng H. T. (2003): Mining Topic-Specific Concepts and Definitions on the Web. Proceedings of the Twelfth International Conference on World Wide Web. Budapest, Hungary.
- Introduction
- The Proposed Technique
- System Architecture
- Experiments & Conclusion

8 The Proposed Technique
- Algorithm WebLearn(T)
- Subtopic Discovery
- Definition Finding
- Dealing with Ambiguity
- Mutual Reinforcement
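
The slide does not spell out how definition finding works. One common way to approach it is to match sentences against lexical patterns that typically introduce a definition. The sketch below assumes that approach. The patterns, the function names `definition_patterns` and `find_definitions`, and the sample sentences are illustrative only and are not taken from Liu et al.

```python
import re


def definition_patterns(concept):
    """Build a few illustrative (hypothetical) definition cues for a concept."""
    c = re.escape(concept)
    return [
        # "<concept> is defined as ...", "<concept> is known as ..."
        re.compile(rf"\b{c}\s+(?:is|are)\s+(?:defined as|called|known as)\b", re.I),
        # "<concept> refers to ..."
        re.compile(rf"\b{c}\s+refers?\s+to\b", re.I),
        # "... is called <concept>"
        re.compile(rf"\b(?:is|are)\s+(?:called|known as)\s+{c}\b", re.I),
        # "What is a <concept>"
        re.compile(rf"\bwhat\s+is\s+(?:a|an|the)?\s*{c}\b", re.I),
    ]


def find_definitions(concept, sentences):
    """Return the sentences that match at least one definition cue."""
    patterns = definition_patterns(concept)
    return [s for s in sentences if any(p.search(s) for p in patterns)]


sentences = [
    "Clustering refers to grouping similar objects together.",
    "We ran clustering on the full data set.",
    "The task of grouping similar objects is called clustering.",
]
hits = find_definitions("clustering", sentences)
# hits holds the first and third sentences; the second only mentions the term.
```

A pattern list like this favors precision over recall: it misses definitions phrased in other ways, so in practice it would be extended and combined with page-level evidence.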

9 System Architecture

11 Thank you!

