Published by Brenda Casey. Modified over 9 years ago.
1
Fatemeh Lashkari, UNB University, May 7th, 2014
2
- Indexing
- Semantic Search
- Semantic Search Architecture
- Index process
- Index Maintenance
3
Inverted Index
- Sort-based inversion
- Single-pass in-memory inversion
HYB Index
- Prefix search
- Autocompletion search
- Query expansion and faceted search
- Fast error-tolerant search
- Support for database-style "select" and "join"
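To make the first bullet concrete, here is a minimal sketch of sort-based inversion, followed by a prefix lookup over the sorted vocabulary. It is a plain inverted index, not the HYB index; the function names `sort_based_inversion` and `prefix_search` and the whitespace tokenizer are assumptions made for illustration.

```python
from bisect import bisect_left
from itertools import groupby


def sort_based_inversion(docs):
    """Build an inverted index by emitting (term, doc_id) pairs and sorting them."""
    # A set removes duplicate pairs when a term occurs twice in one document.
    pairs = sorted({(term, doc_id)
                    for doc_id, text in enumerate(docs)
                    for term in text.lower().split()})
    # After sorting, the postings of each term are adjacent and ordered by doc id.
    return {term: [doc_id for _, doc_id in group]
            for term, group in groupby(pairs, key=lambda pair: pair[0])}


def prefix_search(index, vocabulary, prefix):
    """Return sorted doc ids that contain at least one term starting with prefix.

    `vocabulary` must be the sorted list of index terms, so that all terms
    sharing a prefix form one contiguous range.
    """
    start = bisect_left(vocabulary, prefix)
    doc_ids = set()
    for term in vocabulary[start:]:
        if not term.startswith(prefix):
            break  # left the contiguous range of matching terms
        doc_ids.update(index[term])
    return sorted(doc_ids)


docs = ["astronauts walk on moon", "moon landing astronomy", "walk in the park"]
index = sort_based_inversion(docs)
vocabulary = sorted(index)
```

With these three toy documents, `prefix_search(index, vocabulary, "astro")` matches both "astronauts" and "astronomy". Scanning the vocabulary range term by term is the simple baseline; it says nothing about how the HYB index groups terms internally.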
4
- Indexing
- Semantic Search
- Semantic Search Architecture
- Index process
- Index Maintenance
5
Query: “astronauts walk on moon”
http://broccoli.cs.uni-freiburg.de/demos/BroccoliFreebase/
6
- Indexing
- Semantic Search
- Semantic Search Architecture
- Index process
- Index Maintenance
7
- Indexing
- Query Process
- Answers to the question
- Ontology
- Text Collection
8
- Indexing
- Semantic Search
- Semantic Search Architecture
- Index process
  - Parsing
- Index Maintenance
9
Preprocessing
- Stemming
- Lowercasing, e.g. General Motors -> general motors
- Removing some stop words, e.g. is, do, a, of, ...
Text annotation
- Annotators
- Machine learning approaches
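The preprocessing steps above can be sketched in a few lines. This is only an illustration: the stop-word set contains just the four words named on the slide, and `naive_stem` is a hypothetical toy suffix stripper standing in for a real stemmer such as Porter's.

```python
# Only the stop words named on the slide; a real list would be much longer.
STOP_WORDS = {"is", "do", "a", "of"}


def naive_stem(token):
    """Strip a few common English suffixes (toy stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, split on whitespace, drop stop words, then stem."""
    tokens = text.lower().split()
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [naive_stem(t) for t in tokens]
```

For example, `preprocess("General Motors is a maker of cars")` lowercases "General Motors", drops "is", "a" and "of", and reduces "motors" and "cars" to "motor" and "car".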
10
- Indexing
- Semantic Search
- Semantic Search Architecture
- Index process
  - Parsing
  - Index Structure
- Index Maintenance
11
A fast and efficient index:
- does not need the whole vocabulary of the indexed collection in main memory
- needs to sort postings
- needs to merge postings
- uses the cache efficiently
12
- Indexing
- Semantic Search
- Semantic Search Architecture
- Index process
  - Parsing
  - Index Structure
  - Building Index
- Index Maintenance
13
How many indexes do we need?
- Index for relations
- Index for text
What is the structure of the vocabulary?
What is the structure of a posting?
What statistical information does a posting contain? e.g. apple:
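One possible answer to the posting-structure question, sketched under assumptions: each posting stores a document id and the term's positions, from which term frequency and document frequency follow. The slide's own "apple" example is not recoverable, so the `Posting` class and the sample documents below are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Posting:
    """One entry of a posting list: a document and where the term occurs in it."""
    doc_id: int
    positions: list = field(default_factory=list)

    @property
    def term_frequency(self):
        # Number of occurrences of the term in this document.
        return len(self.positions)


def build_positional_index(docs):
    """Map each term to a list of Posting objects, ordered by doc id."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        per_term = {}
        for position, term in enumerate(text.lower().split()):
            per_term.setdefault(term, Posting(doc_id)).positions.append(position)
        for term, posting in per_term.items():
            index[term].append(posting)
    return index


docs = ["apple pie with apple sauce", "green apple"]
index = build_positional_index(docs)
document_frequency = len(index["apple"])  # number of documents containing "apple"
```

Here the posting list for "apple" has two entries: positions 0 and 3 in document 0, and position 1 in document 1. Positions cost space, so an index that never answers phrase or proximity queries might store only the term frequency.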
14
How to compute scores to improve the final result?
How to store the index?
- Distribute the index
- Process queries in parallel
Which compression methods can be used?
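As one candidate answer to the compression question, posting lists are commonly stored as gaps between consecutive document ids, with each gap written in a variable-byte code. The slide does not name a method, so this choice and the function names below are assumptions; the byte layout (low 7 bits first, high bit marking the last byte of a number) is one of several conventions.

```python
def vbyte_encode(numbers):
    """Encode non-negative integers, 7 bits per byte, low-order bits first."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)  # continuation byte: high bit clear
            n >>= 7
        out.append(n | 0x80)      # final byte of this number: high bit set
    return bytes(out)


def vbyte_decode(data):
    """Inverse of vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers


def compress_postings(doc_ids):
    """Store a sorted doc-id list as the first id followed by gaps."""
    if not doc_ids:
        return b""
    gaps = [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]
    return vbyte_encode(gaps)


def decompress_postings(data):
    """Rebuild the doc-id list by summing the decoded gaps."""
    doc_ids, total = [], 0
    for gap in vbyte_decode(data):
        total += gap
        doc_ids.append(total)
    return doc_ids
```

For the list `[3, 7, 130, 100000]` the gaps are 3, 4, 123 and 99870, which take 6 bytes in this encoding instead of 16 bytes as four 32-bit integers. Dense lists compress best, because their gaps are small.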
15
- Indexing
- Semantic Search
- Semantic Search Architecture
- Index process
- Index Maintenance
16
Strategies for maintaining the index:
- Merge-based (remerge)
- In-place
- Hybrid index update operation
- Geometric partitioning
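A simplified sketch in the spirit of geometric partitioning may help here: new documents go into a small in-memory buffer, and flushed buffers are merged into partitions whose sizes grow geometrically, so no single update rewrites the whole index as a full remerge would. The version below fixes the growth factor at 2 and allows at most one partition per level; the class name, the `buffer_limit` parameter and this particular merge policy are assumptions, not details taken from the slides.

```python
from collections import defaultdict


class PartitionedIndex:
    def __init__(self, buffer_limit=2):
        self.buffer_limit = buffer_limit  # documents per buffer (hypothetical knob)
        self.buffer = defaultdict(list)   # in-memory postings: term -> doc ids
        self.buffered_docs = 0
        self.levels = []                  # levels[k] is a partition (dict) or None
        self.next_doc_id = 0

    def add_document(self, text):
        doc_id = self.next_doc_id
        self.next_doc_id += 1
        for term in set(text.lower().split()):
            self.buffer[term].append(doc_id)
        self.buffered_docs += 1
        if self.buffered_docs >= self.buffer_limit:
            self._flush()
        return doc_id

    def _flush(self):
        partition = dict(self.buffer)
        self.buffer = defaultdict(list)
        self.buffered_docs = 0
        level = 0
        # Cascade like a binary counter: an occupied level is merged into the
        # incoming partition and the result moves one level up.
        while level < len(self.levels) and self.levels[level] is not None:
            partition = self._merge(self.levels[level], partition)
            self.levels[level] = None
            level += 1
        if level == len(self.levels):
            self.levels.append(None)
        self.levels[level] = partition

    @staticmethod
    def _merge(older, newer):
        # Older doc ids are smaller, so appending keeps each list sorted.
        merged = {term: list(postings) for term, postings in older.items()}
        for term, postings in newer.items():
            merged.setdefault(term, []).extend(postings)
        return merged

    def search(self, term):
        result = []
        for partition in reversed(self.levels):  # higher levels hold older docs
            if partition is not None:
                result.extend(partition.get(term, []))
        result.extend(self.buffer.get(term, []))
        return result
```

A query has to consult every non-empty partition plus the buffer, which is the price paid for cheaper updates. With n flushed buffers there are at most about log2(n) + 1 partitions, so the extra query cost grows slowly.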
17
Thank You