Efficient Algorithm for Web Search Query Reformulation Using Genetic Algorithm
A Paper Presentation
Vikram Singh
Dept. of Computer Engineering, National Institute of Technology, Kurukshetra, Haryana-136119, India
Email: vishalsheokand007@gmail.com, viks@nitkkr.ac.in
Introduction
- Query Reformulation: the process of altering a given query to improve search or retrieval performance.
- Types of Query Reformulation:
  - Query log analysis
  - Genetic Algorithm approach
  - Other nature-based optimization techniques (ACO)
ICCIDM, 2015
Query Reformulation
- Query answering in web-based search engines is a critical issue in dynamic computing.
- Query reformulation is interactive and iterative.
- About 28% of web search queries are generalizations of a previously submitted query.
- About 52% of web users reformulate their initial queries in anticipation of better results.
Reformulation Process
Query reformulation involves:
- Query submission
- Query analysis
- Identification of relevant data sources
- Query reformulation
- Query execution
- Answers and integration
- Query result presentation
Genetic Algorithm
- Each query is treated as an individual (a chromosome in GA terms).
- The GA generates an optimal path on the term association graph.
- The query is reformulated using the terms on the optimal path.
GA-based Query Reformulation
- Search query q
- Retrieve top-k documents
- Load query keywords into a ternary search tree
- Construct the term graph
- Apply the GA
- Select the best path
- Extract keywords from the top-k paths
- Query suggestion
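The keyword-loading step can be illustrated with a minimal ternary search tree. This is only a sketch under assumptions: the slides do not describe the tree's layout, so the `Node` fields (`lo`, `eq`, `hi`, `is_end`) and the `TernarySearchTree` class are hypothetical names following the textbook structure, and per-keyword payloads such as the term graph are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    ch: str
    is_end: bool = False          # True if a keyword ends at this node
    lo: Optional["Node"] = None   # subtree for characters smaller than ch
    eq: Optional["Node"] = None   # subtree for the next character of the keyword
    hi: Optional["Node"] = None   # subtree for characters greater than ch


class TernarySearchTree:
    """Textbook ternary search tree storing a set of keywords."""

    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def insert(self, word: str) -> None:
        if word:  # empty strings are ignored
            self.root = self._insert(self.root, word, 0)

    def _insert(self, node: Optional[Node], word: str, i: int) -> Node:
        ch = word[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, word, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, word, i)
        elif i + 1 < len(word):
            node.eq = self._insert(node.eq, word, i + 1)
        else:
            node.is_end = True
        return node

    def contains(self, word: str) -> bool:
        node, i = self.root, 0
        while node is not None and word:
            ch = word[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i + 1 < len(word):
                node, i = node.eq, i + 1
            else:
                return node.is_end
        return False
```

For example, after inserting the keywords "genetic", "query" and "quest", `contains("query")` is true while the bare prefix "que" is not reported as a keyword.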
Associated Term Graph
- Every keyword in the ternary search tree has an associated term graph.
- Each node of the graph is a document that contains the keyword.
- Edges between nodes are based on a similarity value.
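A term graph for a single keyword could be built roughly as follows. The slides do not say which similarity measure is used, so Jaccard similarity over whitespace tokens is an assumption here, as are the function names `jaccard` and `build_term_graph` and the `min_sim` cut-off.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets (assumed measure, not from the paper)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_term_graph(keyword: str, docs: dict, min_sim: float = 0.0) -> dict:
    """Build the graph for one keyword.

    docs maps a document id to its text. Nodes are the documents that
    contain the keyword; an edge u-v carries the similarity of u and v.
    Returns an adjacency dict {doc_id: {neighbour_id: similarity}}.
    """
    tokens = {d: set(text.lower().split()) for d, text in docs.items()}
    nodes = [d for d, toks in tokens.items() if keyword.lower() in toks]
    graph = {d: {} for d in nodes}
    for u, v in combinations(nodes, 2):
        sim = jaccard(tokens[u], tokens[v])
        if sim > min_sim:  # keep only edges above the cut-off
            graph[u][v] = sim
            graph[v][u] = sim
    return graph
```

With three toy documents, of which two contain "genetic", the resulting graph has exactly those two documents as nodes and a single weighted edge between them.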
Applying Genetic Algorithm
- Each graph traversal is encoded as a chromosome.
- Appropriate paths are selected based on a threshold similarity value.
- Crossover and mutation are applied.
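The steps on this slide can be sketched as a small genetic algorithm over paths. Everything beyond "paths as chromosomes, threshold-based selection, crossover, mutation" is an assumption for illustration: the fitness (mean similarity of consecutive nodes), the fixed path length, the specific operators, and all names and default parameters (`path_fitness`, `crossover`, `mutate`, `evolve`, `p_mut`, and so on) are hypothetical and not taken from the paper.

```python
import random


def path_fitness(path, sim):
    """Mean similarity of consecutive nodes; sim maps frozenset({u, v}) to a value."""
    if len(path) < 2:
        return 0.0
    edges = zip(path, path[1:])
    return sum(sim.get(frozenset(e), 0.0) for e in edges) / (len(path) - 1)


def crossover(a, b, rng):
    """Keep a prefix of parent a, then fill up with b's nodes not yet used."""
    cut = rng.randint(1, len(a) - 1)
    head = a[:cut]
    tail = [n for n in b if n not in head]
    return head + tail[: len(a) - cut]


def mutate(path, nodes, p_mut, rng):
    """With probability p_mut, swap in an unused node (or swap two positions)."""
    path = list(path)
    if rng.random() < p_mut:
        unused = [n for n in nodes if n not in path]
        i = rng.randrange(len(path))
        if unused:
            path[i] = rng.choice(unused)
        else:
            j = rng.randrange(len(path))
            path[i], path[j] = path[j], path[i]
    return path


def evolve(nodes, sim, path_len=3, pop_size=20, generations=50,
           threshold=0.0, p_mut=0.2, seed=0):
    """Return the final population, best path first.

    Requires 2 <= path_len <= len(nodes) and pop_size >= 2.
    """
    rng = random.Random(seed)
    pop = [rng.sample(nodes, path_len) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda p: path_fitness(p, sim), reverse=True)
        # Selection: paths reaching the similarity threshold, at most half the population
        parents = [p for p in ranked if path_fitness(p, sim) >= threshold][: pop_size // 2]
        if len(parents) < 2:
            parents = ranked[:2]  # fall back to the two best paths
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), nodes, p_mut, rng))
        pop = parents + children
    return sorted(pop, key=lambda p: path_fitness(p, sim), reverse=True)
```

Because the selected parents are carried over unchanged into the next generation, the best fitness in the population never decreases from one generation to the next in this sketch.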
Query Reformulation
- The GA converges to a set of optimal paths.
- A path with a higher similarity value is preferred.
- Distinct words are extracted from the optimal paths for query suggestion.
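The extraction of distinct words from the optimal paths might look like the sketch below. The function name `suggest_terms`, the `top_k` and `max_terms` parameters, and the choice to skip words already present in the original query are assumptions made for illustration.

```python
def suggest_terms(paths, doc_terms, query_terms, top_k=2, max_terms=5):
    """Collect distinct suggestion words from the best paths.

    paths       : list of (path, similarity) pairs, path being a list of doc ids
    doc_terms   : dict mapping a doc id to its list of terms
    query_terms : terms of the original query (excluded from suggestions)
    """
    # Prefer paths with a higher similarity value
    ranked = sorted(paths, key=lambda p: p[1], reverse=True)[:top_k]
    seen = {t.lower() for t in query_terms}
    suggestions = []
    for path, _ in ranked:
        for doc in path:
            for term in doc_terms[doc]:
                t = term.lower()
                if t not in seen:  # keep each word only once
                    seen.add(t)
                    suggestions.append(t)
    return suggestions[:max_terms]
```

Words are returned in the order they are encountered, so terms from the highest-similarity path come first.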
[Figure: Precision vs Recall]
Paper Contributions
- A GA-based query reformulation strategy.
- A comparative analysis of optimization performance against other nature-inspired optimization techniques.
Conclusion
- A Genetic Algorithm based approach to query reformulation is presented.
- It is built upon a ternary search tree and a term graph extracted from the documents retrieved for the initial user query.
- Optimal paths are selected by the GA according to similarity values.
- The optimization performance of other approaches (ACO and PSO) is also discussed.
- GA emerges as competitively effective for query reformulation.
Thank You!