Published by Rose Arnold. Modified over 6 years ago.
1
Word AdHoc Network: Using Google Core Distance to extract the most relevant information
Presenter: Wei-Hao Huang
Authors: Ping-I Chen, Shi-Jen Lin
KBS 2010
2
Outline
Motivation, Objectives, Methodology, Experiments, Conclusions, Comments
3
Motivation
Most previous research methods need predictive models, which are built from training data or from Web logs of users' browsing behavior.
These models are complex, and the keyword extraction methods are limited to certain domains.
4
Objectives
To present a new algorithm called "Word AdHoc Network" (WANET). The method needs no pre-processing, and all computation runs in real time.
To extract keyword sequences from documents in various knowledge domains.
[Figure: Document → WANET System → Relevant Documents]
5
Methodology
Word AdHoc Network system architecture
1-gram filtering method: part-of-speech, length of the words, number of Google search results
Google Core Distance
Hop-by-Hop Routing algorithm: PageRank algorithm, BB's graph-based clustering algorithm
6
WANET System Architecture
7
1-gram filtering method
Part-of-speech: NN (common noun, singular), NP (proper noun), DT (determiner), or JJ (adjective)
Length of the words: at least 3 characters
Number of Google search results
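The first two filtering criteria can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `filter_unigrams` and the input format (a list of `(word, tag)` pairs from some part-of-speech tagger) are assumptions, and the word-length rule is read here as a character count. The slide does not say how the number of Google search results is used, so that criterion is left out.

```python
from typing import Iterable, List, Tuple

# Tags named on the slide: common noun, proper noun, determiner, adjective.
ALLOWED_TAGS = {"NN", "NP", "DT", "JJ"}
# Assumption: "length of the words" is measured in characters.
MIN_LENGTH = 3


def filter_unigrams(tagged_words: Iterable[Tuple[str, str]]) -> List[str]:
    """Keep words whose tag is allowed and whose length is at least MIN_LENGTH."""
    kept = []
    for word, tag in tagged_words:
        if tag in ALLOWED_TAGS and len(word) >= MIN_LENGTH:
            kept.append(word)
    return kept
```

For example, `filter_unigrams([("network", "NN"), ("is", "VBZ"), ("ad", "JJ")])` keeps only `"network"`: `"is"` has a disallowed tag and `"ad"` is too short.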
8
Google Core Distance
The original algorithm: Normalized Google Distance (NGD)
The new algorithm: Google Core Distance (GCD)
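The formulas on this slide did not survive in the transcript. For orientation, the sketch below computes the standard Normalized Google Distance as defined by Cilibrasi and Vitányi; it is not taken from the slide, and the GCD formula is not reproduced here. The function name `ngd` and its parameters are illustrative: `fx` and `fy` are the hit counts for each term, `fxy` is the hit count for both terms together, and `n` is the assumed total number of indexed pages.

```python
import math


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Standard NGD:
    (max(log fx, log fy) - log fxy) / (log n - min(log fx, log fy)).
    Returns infinity when any hit count is zero (the terms never co-occur
    or one of them is absent from the index).
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    log_x, log_y, log_xy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_x, log_y) - log_xy) / (math.log(n) - min(log_x, log_y))
```

Two terms that always appear together get a distance of 0; the distance grows as the terms co-occur less often relative to their individual frequencies.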
9
Hop-by-Hop Routing Algorithm
PageRank algorithm
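The slide names PageRank without showing how WANET applies it. As background only, here is a minimal sketch of the textbook PageRank power iteration on a small directed graph. The graph representation (a dict mapping each node to the list of nodes it links to, with every target also present as a key), the damping factor 0.85, and the fixed iteration count are assumptions for illustration, not details from the paper.

```python
from typing import Dict, Hashable, List


def pagerank(graph: Dict[Hashable, List[Hashable]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[Hashable, float]:
    """Textbook PageRank by power iteration.

    Every link target must also appear as a key of `graph`.
    Rank held by nodes without out-links is spread evenly over all nodes.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Total rank currently held by nodes with no outgoing links.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for v in nodes:
            for target in graph[v]:
                new_rank[target] += damping * rank[v] / len(graph[v])
        rank = new_rank
    return rank
```

In a word graph, nodes would be candidate keywords and edges some relatedness link between them; how the edges are defined in WANET is not stated in this transcript.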
10
Hop-by-Hop Routing Algorithm
BB’s graph-based clustering algorithm BB score = 1 6
11
Hop-by-Hop Routing Algorithm
12
Experiments
Time variance effect of the Google search results
Execution time
Precision and recall rate
Top-k search results analysis
Dataset: four knowledge domains were selected from the Elsevier Web site, and the top 25 most-downloaded papers in each journal were chosen.
13
Time variance effect of the Google search results
Spearman's footrule is used to compare the sequences extracted by the two algorithms.
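Spearman's footrule measures how far two rankings of the same items are from each other: it sums, over all items, the absolute difference between the item's position in one sequence and its position in the other. A minimal sketch follows; the function name `spearman_footrule` is illustrative, and it assumes both sequences contain exactly the same items with no repeats.

```python
from typing import Hashable, Sequence


def spearman_footrule(seq_a: Sequence[Hashable], seq_b: Sequence[Hashable]) -> int:
    """Sum of absolute position differences between two orderings of the same items."""
    if len(set(seq_a)) != len(seq_a) or len(set(seq_b)) != len(seq_b):
        raise ValueError("sequences must not contain repeated items")
    if set(seq_a) != set(seq_b):
        raise ValueError("sequences must contain the same items")
    position_in_b = {item: index for index, item in enumerate(seq_b)}
    return sum(abs(index - position_in_b[item]) for index, item in enumerate(seq_a))
```

Identical sequences give 0, and a fully reversed three-item sequence gives 4, so a smaller value means the two extracted keyword sequences agree more closely.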
14
Execution time
15
Precision and recall rate
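The results for this slide are not preserved in the transcript. For reference, precision and recall in their usual information-retrieval sense can be computed as in the sketch below; the function name and the use of plain item collections are assumptions, and the paper's exact evaluation protocol is not shown here.

```python
from typing import Hashable, Iterable, Tuple


def precision_recall(retrieved: Iterable[Hashable],
                     relevant: Iterable[Hashable]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant (0.0 if a set is empty)."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall
```

With four retrieved items of which two are among three relevant ones, precision is 2/4 and recall is 2/3.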
16
Top-k search results analysis
17
Conclusions
The authors propose a new system that extracts the most important keyword sequence to represent a document.
It helps users automatically find relevant documents or Web pages.
Future work: the authors hope the system can be used on a mobile device or an e-book.
18
Comments
Advantages: extracts the most important keyword sequence.
Applications: information retrieval.