Presentation is loading. Please wait.

Presentation is loading. Please wait.

Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.

Similar presentations


Presentation on theme: "Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423."— Presentation transcript:

1 Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423

2 Next lecture Paper: Personalized Mobile search engine Some concepts Ontology Cyc, WordNet, ODP(Open directory project) Entropy disorder, uncertainty the smaller, the better Ranking SVM (Support Vector Machines) Supervised machine learning K-means clustering

3 Review on information retrieval Text representation Find all the unique words/terms in all documents, each term is a feature Each document is represented as a vector Weight for each term Term frequency: for one particular document Inversed document frequency : the same for all documents Query is also a vector Comparison with Query The similarity between query vector and each document vector Cosine similarity or other distance measure (Euclidean distance) Ranking The documents with higher similarity are more relevant to query PageRank

4 Evaluation Precision and Recall For a query, If a system finds 200 results, among them 50 are relevant. The human labels ( model solutions) have 120 relevant documents. Precision and recall curve AUC: Area under curve Top N precision ARR: the average rank of the documents rated as “relevant”

5 Personalised search Personalised information retrieval A related area is called Adaptive Hypermedia

6 Information gathering Information gathering approach Implicit, Explicit, Both Type of information Queries, clicked documents, snippets of documents Cashed web pages, dwell time on page, desktop documents Email, calendar items Tags and bookmarks on online social applications User supplied information User’s categorical interests Source of information Server side, Client –side, user intervention

7 Information representation User model Short-term interests, long-term interests Static, dynamic, periodic Terms or conceptual terms (use wordnet, ontology) Vector-based Models where user’s interests are maintained in a vector of weighted keywords (concepts). Semantic network based Models where user’s interests are maintained in a network structure of terms and related terms (concepts and related concepts)

8 Two directions Query adaptation or result adaptation Query expansion/adaptation implicit individualised User model Aggregate Usage information (search logs) Not user-focused Pseudo-relevance feedback Global analysis explicit individual, relevance feedback, interactive query expansion

9 Search results filtering/adaptation Different applications: individual, aggregate, web search or recommendation, databases search Typically use supervised machine learning Relevant, not relevant: binary classification Training data: Labeled data Assume clicked docs are relevant Machine learning methods KNN: K nearest neighbour Naïve Bayes SVM: Support vector machines …

10 Evaluation In lab setting 10-500 users Quantitative & Qualitative System performance User evaluation, system usability Data sets open web corpora, in-lab generated logs, TREC collection, search engine query logs subset of annotated documents from specific sites


Download ppt "Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423."

Similar presentations


Ads by Google