Web Information Extraction 1
Concept Detection
Amir R. Tahamtan

Web Information Extraction 2
Concept Detection
- Goals: discover knowledge, find associations
- Discussed techniques: Concept Mining, Document Clustering
- Related work: keyword-based search, resource discovery, wrapper information extraction, Web queries, user preferences

Web Information Extraction 3
Fu Y., Bauer T., Mostafa J., Palakal M., and Mukhopadhyay S. (2002): Concept Extraction and Association from Cancer Literature. Proceedings of the 4th International Workshop on Web Information and Data Management. McLean, Virginia, USA.
- Introduction
- Algorithm
- Experiments & Conclusion

Web Information Extraction 4
The Algorithm
- Token discovery
  - tf.idf: $W_{ik} = t_{ik} \times \log(N / n_k)$
  - LSA
    - Data representation as a term-document matrix
    - Factorization: $X_{t \times o} = T_{t \times r} \, S_{r \times r} \, O_{r \times o}$
    - Approximation: $X_{t \times o} \approx X'_{t \times o} = T_{t \times k} \, S_{k \times k} \, O_{k \times o}$
- Token Association Discovery
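
The two token-discovery formulas above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation from Fu et al. It assumes the usual reading of the symbols: t_ik is the raw count of term k in document i, N is the number of documents, n_k is the number of documents containing term k, and the logarithm is natural (the slide does not give a base). The function names `tfidf_weights` and `lsa_approximation` and the toy count matrix are hypothetical.

```python
import numpy as np


def tfidf_weights(counts):
    """Apply W = t * log(N / n_k) to a (terms x documents) count matrix.

    Assumption: rows are terms and columns are documents, which matches
    the t x o layout of the LSA matrix X.
    """
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[1]                      # N
    doc_freq = np.count_nonzero(counts, axis=1)   # n_k for each term
    # A term that occurs nowhere has an all-zero row, so clamping n_k
    # to 1 only avoids a division by zero and does not change a weight.
    idf = np.log(n_docs / np.maximum(doc_freq, 1))
    return counts * idf[:, np.newaxis]


def lsa_approximation(x, k):
    """Return the rank-k approximation X' = T_k S_k O_k of X."""
    # Thin SVD: x = T @ diag(s) @ O, singular values in descending order.
    t, s, o = np.linalg.svd(x, full_matrices=False)
    return t[:, :k] @ np.diag(s[:k]) @ o[:k, :]


if __name__ == "__main__":
    # Hypothetical toy data: 4 terms (rows) counted in 4 documents (columns).
    toy_counts = np.array([
        [2, 0, 1, 0],
        [0, 3, 0, 1],
        [1, 1, 1, 1],   # occurs in every document, so its weight is 0
        [0, 0, 2, 2],
    ])
    x = tfidf_weights(toy_counts)
    print(np.round(x, 3))
    print(np.round(lsa_approximation(x, k=2), 3))
```

Keeping only the k largest singular values is what turns the exact factorization into the approximation X'. The choice of k is a tuning parameter that the slide does not specify.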

Web Information Extraction 7
Liu B., Chin C. W., and Ng H. T. (2003): Mining Topic-Specific Concepts and Definitions on the Web. Proceedings of the Twelfth International Conference on World Wide Web. Budapest, Hungary.
- Introduction
- The Proposed Technique
- System Architecture
- Experiments & Conclusion

Web Information Extraction 8
The Proposed Technique
- Algorithm Weblearn (T)
- Subtopic Discovery
- Definition Finding
- Dealing with Ambiguity
- Mutual Reinforcement

Web Information Extraction 9
System Architecture

Web Information Extraction 11
Thank you!
