Scalable Image Annotation with ConceptRank Petra Budíková, Michal Batko, Pavel Zezula.

Slide 2 Outline
- Search-based annotation
  - Motivation
  - Problem formalization
  - Challenges
- ConceptRank
  - Idea
  - Semantic network construction
  - PageRank and ConceptRank
  - Image annotation with ConceptRank
- MUFIN Image Annotation
  - Framework description
  - Current implementation and parameters
  - Examples
- Experimental evaluation
- Future work

What and why?

Slide 4 Motivation
- What is in the image?
- Why do I care?
  - Keyword-based image retrieval
  - Impaired users
  - Data summarization
  - Scientific data classification
  - …
- Example descriptions of one image (figure):
  - Yellow flower
  - Flower, yellow, dandelion, detail, close-up, nature, plant, beautiful
  - Taraxacum officinale
  - The first dandelion that bloomed this year in front of the White House.
  - nature dandelion

Slide 5 Problem formalization
- The annotation task is defined by a query image I and a vocabulary V of target concepts
  - Example: V = {flower, animal, person, building}
- The annotation function f_A assigns to each concept c ∈ V a value from [0, 1] that expresses the probability of the concept c being relevant for I
- Depending on the application, only a subset of V can be returned to the user
  - a fixed number of the most probable concepts
  - concepts with probability higher than a given threshold
  - some advanced selection of interesting concepts
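The first two output-selection options above can be sketched in a few lines of Python. This is only an illustration: the function name `select_annotation`, its parameters, and the probability values are assumptions made for the example and do not come from the presentation.

```python
from typing import Dict, List, Optional


def select_annotation(
    probs: Dict[str, float],
    top_k: Optional[int] = None,
    threshold: Optional[float] = None,
) -> List[str]:
    """Return concepts ordered by decreasing probability, optionally filtered."""
    # Sort by probability (descending); break ties alphabetically for determinism.
    ranked = sorted(probs.items(), key=lambda item: (-item[1], item[0]))
    if threshold is not None:
        # Keep only concepts whose probability exceeds the threshold.
        ranked = [(concept, p) for concept, p in ranked if p > threshold]
    if top_k is not None:
        # Keep only a fixed number of the most probable concepts.
        ranked = ranked[:top_k]
    return [concept for concept, _ in ranked]


# Hypothetical values of f_A for the example vocabulary V.
f_a = {"flower": 0.8, "animal": 0.05, "person": 0.1, "building": 0.05}
print(select_annotation(f_a, top_k=2))        # ['flower', 'person']
print(select_annotation(f_a, threshold=0.5))  # ['flower']
```

The third option ("advanced selection") depends on the application and is not covered by this sketch.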

Slide 6 How can we describe the image?

Option 1: Classifiers
- Principles
  - Learning phase: use reliable training data to create classifiers for selected concepts
  - Annotation phase: run classifiers
- Main advantages
  - mature technologies available (e.g. neural networks)
  - fast
  - high precision and recall
- Use cases: annotations with fixed vocabulary and reliable training data
  - identification of people
  - classification of cancer cells
  - …

Option 2: Search-based approach
- Principles
  - Learning phase: none
  - Annotation phase: similarity search over annotated data + postprocessing
- Main advantages
  - reducing the reliance on cleanly labeled data, utilization of web data
  - no costly learning phase, annotation phase can be easily adjusted to user's preferences
  - scalability w.r.t. vocabulary size
- Use cases: annotations with open/adaptable vocabulary
  - proposing keyword annotations for web image databases, which need to be rich and adapt to the changing vocabulary of users

Slide 7 Search-based approach: basic scheme
Diagram (processing flow):
1. Input: query image and vocabulary V = {flower, animal, person, building}
2. Content-based image retrieval over the annotated image collection
3. Similar annotated images with their keywords: "Yellow, bloom, pretty"; "Meadow, outdoors, dandelion"; "Mary's garden, summer"
4. Text processing, supported by semantic resources
5. Candidate keywords with probabilities/scores: Plant 0.3, Flower 0.3, Garden 0.15, Animal 0.05, Human 0.1, Park 0.1
6. Selection of the final annotation: flower

Slide 8 Search-based approach: challenges
- Selection and preprocessing of underlying database of annotated images
  - Size vs. quality
- Effective and efficient image search
  - Descriptors, indexing technique
- Image search results processing
  - Baseline: word cloud
  - Advanced: semantic analysis, annotation with hierarchic structure
- Selection of output
  - (user?) selected level of the hierarchic structure

ConceptRank

Slide 10 Idea
- Baseline word cloud solution
  - ???
- What would a person do?
  - Search for semantic connections between candidate keywords
    - Flowers bloom; dandelion is a flower; there are usually flowers in a garden; …
  - Based on the connections, estimate probabilities of vocabulary terms
    - "Flower" is rather likely
- Figure: content-based image retrieval returns similar annotated images with keywords "Yellow, bloom, pretty", "Meadow, outdoors, dandelion", "Mary's garden, summer"; which concept from V = {flower, animal, person, building} should be chosen?

Slide 11 Idea (cont.)
- What can the computer do?
  - Search for semantic connections between candidate keywords?
    - Yes! Ontologies, WordNet, image dataset statistics, web, …
  - Based on the connections, estimate probabilities of vocabulary terms?
    - Yes! Based on the connections, add new candidates and/or adjust the score of existing candidates
- So, let's try it!
- Tasks:
  - find a suitable source of semantic information
  - propose an algorithm that uses the selected resource to discover semantic connections between candidate concepts and performs score recomputation
- We want a generic and theoretically sound solution: ConceptRank

Slide 12 ConceptRank overview
- Let us assume we have some semantic resource S that contains
  - Semantic objects
  - Relationships between semantic objects
  - Mapping from English words to semantic objects
- For ConceptRank, we need to
  - Transform the input keywords into semantic objects from S
    - Let's call the result "initial candidate objects"
  - Retrieve relationships between candidate objects and, if suitable, add new candidate objects
    - We need a suitable representation for this: semantic networks
  - Compute the probability of candidate objects
    - The actual ConceptRank algorithm

Slide 13 Semantic network for annotations
- Graph representation of semantic relationships
  - Nodes: candidate objects
    - Node probability: current probability of the respective candidate concept
  - Edges: relationships between candidate objects
    - Edge weight: "relevance transfer" capacity
    - the weight of the edge from A to B expresses the ratio of probability which node A contributes to node B
- Figure: example network with nodes dog, cat, animal, mouse, computer, keyboard

Slide 14 Building the semantic network
Input: initObjectsWithProb (set of initial objects with probabilities), S (semantic resource), rels (set of interesting relationships)
Output: semanticNet (the semantic network)

begin
  queue <- initObjectsWithProb.getObjects();
  while (queue is not empty) do
    o <- queue.removeFirst();
    semanticNet.addNode(o);
    for (r : rels) do
      for (o2 : S.getConnectedObjects(o,r)) do
        if (semanticNet.contains(o2)) then
          semanticNet.addEdge(o,o2,r,computeWeight(r,…));
        else if (r.isExpandingRel) then
          queue.add(o2);
          semanticNet.addNode(o2);
          semanticNet.addEdge(o,o2,r,computeWeight(r,…));
        fi
      done
    done
  done
end
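The construction above can be sketched as a runnable breadth-first expansion in Python. This is a simplified illustration, not the MUFIN implementation: the semantic resource is a plain dictionary, all names (`build_semantic_network`, `expanding_rels`, `compute_weight`) are hypothetical, and, unlike the pseudocode, all initial objects are registered as nodes before expansion starts, so that edges between two initial objects are never missed.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Toy semantic resource: (object, relationship) -> connected objects.
Resource = Dict[Tuple[str, str], List[str]]
# Edge: (source, target, relationship, weight).
Edge = Tuple[str, str, str, float]


def build_semantic_network(
    init_objects: Iterable[str],
    resource: Resource,
    rels: Iterable[str],
    expanding_rels: Set[str],
    compute_weight: Callable[[str, str, str], float],
) -> Tuple[List[str], List[Edge]]:
    rel_list = list(rels)
    nodes = list(dict.fromkeys(init_objects))  # de-duplicate, keep order
    known = set(nodes)
    queue = deque(nodes)
    edges: List[Edge] = []
    while queue:
        obj = queue.popleft()
        for rel in rel_list:
            for other in resource.get((obj, rel), []):
                if other not in known:
                    if rel not in expanding_rels:
                        continue  # this relationship may not add new nodes
                    known.add(other)
                    nodes.append(other)
                    queue.append(other)
                edges.append((obj, other, rel, compute_weight(obj, other, rel)))
    return nodes, edges


# Hypothetical mini-resource: only hypernyms are allowed to expand the network.
resource = {
    ("dog", "hypernym"): ["animal"],
    ("cat", "hypernym"): ["animal"],
    ("animal", "hyponym"): ["dog", "cat", "mouse"],
}
nodes, edges = build_semantic_network(
    ["dog", "cat"], resource, ["hypernym", "hyponym"], {"hypernym"},
    lambda src, dst, rel: 1.0,
)
print(nodes)       # ['dog', 'cat', 'animal']
print(len(edges))  # 4
```

In the example, "mouse" is not added because it is reachable only through a hyponym link, which here serves only to connect existing nodes.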

Slide 15 ConceptRank algorithm
- Task: Using the probabilities of initial concepts (which were obtained from previous annotation phases) and the semantic network, compute the probability of each node in the network
- Observations:
  - The nodes in the network mutually influence each other's probability
  - The computation of node probabilities needs to be an iterative process
- Goal: theoretically sound algorithm that finds a balanced state of the iterative process
- Inspiration: Google PageRank algorithm
- Figure: example network with nodes dog, cat, animal, mouse, computer, keyboard

Slide 16 PageRank
- Input: Web pages and links represented in a graph
- Output: Importance score of pages
- Algorithm idea: In its simplest form, PageRank is a solution to the recursive equation "a page is important if important pages link to it."
  - The importance of any node is computed as the probability that this node is reached by a random surfer who starts in an arbitrary node of the network graph and moves for a long time.
- Network graph construction:
  - Pages are represented by nodes, hyperlinks by oriented edges.
  - For each node in the graph, the sum of weights of all outgoing edges is 1.
- Figure: example graph with nodes A, B, C and edge weights 0.5 and 1

Slide 17 PageRank (cont.)
- Some math behind:
  - Since the probability of reaching a node depends solely on the probabilities of referencing nodes, the random surfer model is a Markov process.
  - For Markov processes, it is known that the distribution of the surfer approaches a limiting distribution, provided two conditions are met:
    - the graph is strongly connected (it is possible to get from any node to any other node)
    - there are no dead ends (nodes that have no outgoing edges)
  - To meet these conditions, the random surfer can perform random restarts: with a probability P_restart, the surfer can restart at any moment in any node
- Computation of scores: eigenvector computation over the matrix representation of the adjusted graph
- Figure: example graph with nodes A, B, C, adjusted with P_restart = 0.3
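A minimal sketch of the random surfer with uniform restarts, computed by repeated propagation instead of an explicit eigenvector solver. The three-node graph is an assumption made for illustration (the slide's exact edges are not recoverable from the transcript), and the function name and parameters are hypothetical.

```python
from typing import Dict, List


def pagerank(
    links: Dict[str, List[str]],
    p_restart: float = 0.3,
    iterations: int = 100,
) -> Dict[str, float]:
    """Power iteration; every link target must also be a key of `links`."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # With probability p_restart the surfer jumps to a uniformly random node.
        nxt = {v: p_restart / n for v in nodes}
        for src in nodes:
            targets = links[src]
            if targets:
                # Outgoing edge weights sum to 1: equal share per hyperlink.
                share = (1.0 - p_restart) * rank[src] / len(targets)
                for dst in targets:
                    nxt[dst] += share
            else:
                # Dead end: treat it as a uniform restart.
                share = (1.0 - p_restart) * rank[src] / n
                for dst in nodes:
                    nxt[dst] += share
        rank = nxt
    return rank


# Assumed example: A links to B and C (weight 0.5 each), B -> C, C -> A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
print(sorted(ranks, key=ranks.get, reverse=True))  # ['C', 'A', 'B']
```

In this assumed graph, C ends up most important because both A and B link to it, and A follows because it receives all of C's importance.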

Slide 18 ConceptRank vs. PageRank
- Input:
  - PageRank: web pages and hyperlinks
  - ConceptRank: candidate concepts and semantic links
- Output:
  - PageRank: importance score of pages
  - ConceptRank: importance score of candidate concepts
- Similarities:
  - We have nodes and links that can be used to form a graph/network
  - The network can be modelled as a Markov process
  - The random walk intuition makes sense for both problems
    - Random walk on the internet: simulates a randomly surfing user
    - Random walk over keywords: simulates the user's thinking while looking for relevant concepts
- Differences:
  - For ConceptRank, we want to consider initial probabilities associated with nodes

Slide 19 Adaptation of initial probabilities into the model
- Random restarts will not be uniformly random
  - Instead, the probability that the walk will restart in a given node will correspond to the initial probability of that node
- The initial probability is determined by previous steps of the annotation process
  - For concepts found among the keywords of similar images, the initial probability corresponds to the frequency of the concept
  - For concepts that were added during the semantic network building, the initial probability is 0
- Figure: example network with nodes dog, cat, animal, mouse, computer, keyboard

Slide 20 ConceptRank algorithm
Input: initObjectsWithProb (initial concepts and their probabilities), semanticNet (the semantic network), rels (selected relationships and their weights), restartProb (probability of random surfer restart)
Output: nodeProbs (probabilities of network nodes)

begin
  // construct the restart vector and matrix
  restartVector <- constant vector of 0 values;
  for (n : semanticNet.getNodes()) do
    if (initObjectsWithProb.contains(n)) then
      restartVector[semanticNet.indexOf(n)] <- initObjectsWithProb.get(n);
    fi
  done
  restartM <- unityVector*restartVector;

  // construct the transition matrix, normalize, solve dead ends
  transitionM <- new Matrix;
  for (r : rels.getRelationshipTypes()) do
    relM <- constructTypeMatrix(semanticNet.getNodes(), semanticNet.getEdges(r));
    transitionM.add(relM*rels.getWeight(r));
  done
  transitionM.normalize();
  for (i=0; i<transitionM.getColumnDimension(); i++) do
    if (transitionM.getColumn(i).getSum() == 0) then
      transitionM.replaceColumn(i, restartVector);
    fi
  done

  // compute the eigenvector
  completeMatrix <- (1-restartProb)*transitionM + restartProb*restartM;
  nodeProbs <- completeMatrix.getPrincipalEigenvector();
end
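The algorithm above can be approximated without explicit matrices by propagating probability along the edges for a fixed number of rounds. The sketch below is an illustration under stated assumptions, not the authors' code: edges already carry their final weights (relationship weight times edge weight), the initial probabilities are normalized to sum to 1 before use, and the names `concept_rank`, `init_probs` and `iterations` are hypothetical.

```python
from typing import Dict, List, Tuple


def concept_rank(
    nodes: List[str],
    edges: List[Tuple[str, str, float]],  # (source, target, positive weight)
    init_probs: Dict[str, float],
    restart_prob: float = 0.3,
    iterations: int = 50,
) -> Dict[str, float]:
    total = sum(init_probs.get(n, 0.0) for n in nodes)
    if total <= 0.0:
        raise ValueError("at least one node needs a positive initial probability")
    # Restart distribution: initial probabilities; added nodes get 0.
    restart = {n: init_probs.get(n, 0.0) / total for n in nodes}
    # Sum of outgoing weights per node, used to normalize transitions.
    out_sum = {n: 0.0 for n in nodes}
    for src, _, weight in edges:
        out_sum[src] += weight
    probs = dict(restart)
    for _ in range(iterations):
        nxt = {n: restart_prob * restart[n] for n in nodes}
        # Dead ends hand their probability back to the restart distribution.
        dead_mass = sum(probs[n] for n in nodes if out_sum[n] == 0.0)
        for n in nodes:
            nxt[n] += (1.0 - restart_prob) * dead_mass * restart[n]
        for src, dst, weight in edges:
            if out_sum[src] > 0.0:
                nxt[dst] += (1.0 - restart_prob) * probs[src] * weight / out_sum[src]
        probs = nxt
    return probs


# Hypothetical network: three animals linked to their hypernym and back.
nodes = ["dog", "cat", "mouse", "animal"]
edges = [("dog", "animal", 1.0), ("cat", "animal", 1.0), ("mouse", "animal", 1.0),
         ("animal", "dog", 1 / 3), ("animal", "cat", 1 / 3), ("animal", "mouse", 1 / 3)]
result = concept_rank(nodes, edges, {"dog": 0.5, "cat": 0.3, "mouse": 0.2})
print(max(result, key=result.get))  # animal
```

Although "animal" starts with an initial probability of 0, it collects probability from all three hyponyms and ends up as the most probable concept, which is the effect the semantic network is meant to achieve.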

Slide 21 Efficiency issues
- For larger sets of similar images, the number of initial keywords and subsequently the number of nodes in the network may get high (1000+)
  - Costly construction of the semantic network
  - Costly computation of the ConceptRank
- Therefore, approximations can be used
  - For semantic network construction: limiting the number of initial nodes
  - For ConceptRank computation: a limited number of multiplications by the transition matrix instead of the exact mathematical computation of the eigenvector
    - Approximation used by Google, known to work very well

Putting theory to use

Slide 23 The basic annotation scheme again
Diagram (processing flow):
1. Input: query image and vocabulary V = {flower, animal, person, building}
2. Content-based image retrieval over the annotated image collection
3. Similar annotated images with their keywords: "Yellow, bloom, pretty"; "Meadow, outdoors, dandelion"; "Mary's garden, summer"
4. Text processing, supported by semantic resources (this is where ConceptRank is applied)
5. Candidate keywords with probabilities/scores: Plant 0.3, Flower 0.3, Garden 0.15, Animal 0.05, Human 0.1, Park 0.1
6. Selection of the final annotation: flower

Slide 24 MUFIN Image Annotation Framework
- Modular architecture for image annotation
  - There is an extensible set of modules that implement the same interface
    - Can be arbitrarily combined into an "annotation pipeline"
  - There is an "annotation record" object that is passed from one module to another
    - Carries information about query and candidate keywords, current estimate of probabilities, and any other knowledge deemed relevant by individual modules
- Clear structure, easy adaptability
  - Upgrade from MPEG7 to DeCAF descriptors = replacing one module without disturbing others
- MUFIN Image Annotation application

Slide 25 MUFIN Image Annotation: current version
- Objective:
  - Annotation with semantic relationships evaluated by ConceptRank
- Basic decisions:
  - Reference dataset: 20M Profiset
    - 20M high-quality images with rich and systematic annotation
    - 20 keywords per image on average
    - Obtained from a commercial website selling stock images
  - Evaluation of visual similarity: DeCAF descriptors
    - State-of-the-art for image content description
  - Indexing: PPP-codes
  - Source of semantic information: WordNet
    - Lexical database of English
    - Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets)
    - Synsets are interlinked by conceptual-semantic and lexical relations
      - Hypernyms, hyponyms, …

Slide 26 WordNet ConceptRank details
- Basic objects for semantic analysis: synsets
- Step 1: Transformation of keywords to synsets
  - For keywords with multiple meanings, there are several synsets (e.g. mouse). How do we decide which synset(s) to pick?
    - There is an additional resource that, for most English words, lists the possible synsets together with a score that corresponds to the frequency of use of the keyword in the meaning described by the given synset
    - We take a fixed number of the most probable synsets for each keyword
  - There may be many synsets retrieved by the previous step, which could lead to costly processing of the semantic network
    - Therefore, only a fixed number of the most probable synsets are used to build the network

Slide 27 WordNet ConceptRank details (cont.)
- Step 2: Construction of WordNet-based semantic network
  - Which relationships are interesting?
    - For now: hypernyms, hyponyms, holonyms, meronyms
  - Which relationships should be used to extend the network and which should be used only to add edges between existing nodes?
    - Extending mode: bottom-up relationships (hypernyms, maybe holonyms)
  - How shall we compute the weights of semantic network edges for each relationship?
    - Bottom-up relationships: edge weight 1
    - Top-down relationships: edge weight 1/(number of child nodes)
- Figure: example network with nodes dog, cat, animal, mouse, computer, keyboard
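The weighting rule above can be written as a small function. This is a sketch under assumptions: the grouping of holonyms with the bottom-up and meronyms with the top-down relationships is inferred from the slide, `child_count` is taken to be the number of child nodes of the edge's source node, and the name `compute_weight` merely mirrors the pseudocode helper whose real signature is not given.

```python
# Assumed grouping of WordNet relationships by direction.
BOTTOM_UP = {"hypernym", "holonym"}
TOP_DOWN = {"hyponym", "meronym"}


def compute_weight(relationship: str, child_count: int) -> float:
    """Edge weight for one relationship type, following the slide's rule."""
    if relationship in BOTTOM_UP:
        return 1.0
    if relationship in TOP_DOWN:
        if child_count < 1:
            raise ValueError("a top-down edge needs at least one child node")
        # The parent's relevance is split evenly among its children.
        return 1.0 / child_count
    raise ValueError(f"unknown relationship: {relationship}")


print(compute_weight("hypernym", 3))  # 1.0
print(compute_weight("hyponym", 4))   # 0.25
```

With this rule the top-down edges leaving a node sum to 1, which matches the PageRank requirement on outgoing edge weights for that relationship type.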

Slide 28 The complete annotation pipeline
- Similarity search
  - Extraction of the DeCAF descriptor from the query image
  - Retrieval of k visual nearest neighbors
- Semantic analysis
  - Frequency analysis of keywords + normalization
  - Transformation of keywords to synsets
  - Construction of WordNet-based semantic network
  - Computation of ConceptRank
- Selection of the final annotation
  - Mapping synsets with probabilities to vocabulary concepts
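The "frequency analysis of keywords + normalization" step can be sketched as follows. The exact normalization used in the MUFIN pipeline is not stated in the slides; dividing each count by the total number of keyword occurrences is an assumption, and the function name and the sample keywords are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def keyword_frequencies(neighbor_keywords: Iterable[List[str]]) -> Dict[str, float]:
    """Relative frequency of each keyword across the retrieved similar images."""
    counts = Counter(
        keyword.lower() for keywords in neighbor_keywords for keyword in keywords
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize so that the scores form a probability distribution.
    return {keyword: count / total for keyword, count in counts.items()}


# Hypothetical keyword lists of three visually similar images.
neighbors = [["flamingo", "pink", "water"], ["Flamingo", "wings"], ["duck", "water"]]
freqs = keyword_frequencies(neighbors)
print(freqs["flamingo"] == 2 / 7)  # True
```

The resulting scores can serve as the initial probabilities that later steps attach to synsets and use as the restart distribution.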

Slide 29 Overview of annotation parameters
- Similarity search
  - # of similar images
- Transformation of keywords to synsets
  - # of most probable synsets per keyword
  - # of most probable synsets that enter the network construction
- Construction of WordNet-based semantic network
  - types of relationships
    - for extending network
    - for adding edges
  - weights of edges for individual relationships
- Computation of ConceptRank
  - restart probability
  - weights of individual relationship matrices

Slide 30 Annotation query example
- Input: query image (figure)
- Vocabulary: all English words

Slide 31 Example: kNN search and initial synsets
- kNN search: k=5
- Keywords to synsets: at most 3 most probable synsets per keyword
- Merge synsets: 20 synsets with the highest probability

Keywords of the retrieved similar images:
- beak, cotswolds, flamingoes (2), head, janes (2), pink, site, slimbridge (2), trust, water, wetlands, wildfowl
- beak, cotswolds, flamingoes (2), head, janes (2), pink, preening, site, slimbridge (2), trust, water, wetlands, wildfowl
- american, birds, darwin, flamingo (2), flap, flapping (2), galapagos, greater (2), islands, markings, phoenicopterus, race, ruber, south, wing, wings (2)
- aythya, drake, duck, sv, swimming

Initial synsets:
  flamingo             0.185
  greater              0.062
  wildfowl             0.062
  Cotswolds            0.062
  Aythya               0.062
  wetland              0.062
  site                 0.058
  head                 0.049
  pink                 0.047
  water                0.046
  trust                0.037
  wings                0.037
  duck                 0.034
  Drake                0.031
  drake                0.031
  swimming             0.031
  Galapagos_Islands    0.031
  beak                 0.025
  American             0.023

Slide 32 Example: semantic network – hypernyms

Slide 33 Example: annotation results
- Top 5 keywords, demonstration settings:
  - Flamingoes (4.15), Duck (2.44), Wildfowl (1.74), Birds (1.48), Wetlands (1.41)
- Top 5 keywords, 70 images, 7 synsets/kw, 100 init. synsets, all relationships:
  - Animal (2.68), Bird (2.42), Travel (2.30), Vertebrates (2.04), Swimming (1.42)

Slide 34 Experimental evaluation
- ImageCLEF 2014: Scalable Concept Image Annotation
  - Focus on concept-wise scalability
  - No reasonable training data
  - Provided development queries, ground truth (GT) and evaluation scripts
- Vocabulary: aerial, airplane, baby, beach, bicycle, bird, boat, bridge, building, car, cartoon, castle, cat, chair, child, church, cityscape, closeup, cloud, cloudless, coast, countryside, daytime, desert, diagram, dog, drink, drum, elder, embroidery, fire, firework, fish, flower, fog, food, footwear, furniture, garden, grass, guitar, harbor, hat, helicopter, highway, horse, indoor, instrument, lake, lightning, logo, monument, moon, motorcycle, mountain, nighttime, overcast, painting, park, person, plant, portrait, protest, rain, rainbow, reflection, river, road, sand, sculpture, sea, shadow, sign, silhouette, smoke, snow, soil, space, spectacles, sport, sun, sunrise/sunset, table, teenager, toy, traffic, train, tricycle, truck, underwater, unpaved, wagon, water
- GT for the example query image (figure): countryside, daytime, grass, horse, plant

Slide 35 Experimental evaluation (cont.)
- Development data results
  - Results table (numeric values are not preserved in this transcript)
    - Columns: MP-c, MR-c, MF-c, MP-s, MR-s, MF-s, MAP-s
    - Rows: Random baseline; DISA baseline (frequency analysis, 1 synset per kw); DISA baseline with multiple synsets per kw; DISA with hyper-hypo; DISA with hyper-hypo-holo-mero
- Processing time: 1500 ms on average for parameters used in the table
  - 1000 ms for descriptor extraction (can be improved)
  - 300 ms for similarity search
- Competition results: a close 2nd place

What next?

Slide 37 Summary and Future work
- Already done
  - The ConceptRank algorithm
  - Working annotation system
  - Good results in the ImageCLEF competition
- Near future
  - More evaluations
    - Influence of dataset size and quality, approximation params, …
    - Google ground truth
  - Publish or perish
- More distant future
  - Other resources of semantic relationships
    - Ontologies, Word2Vec
  - Relevance feedback
  - Combined architecture: search-based approach and modern NN classifiers