
1 Learning Techniques for Information Retrieval We cover: 1. Perceptron algorithm 2. Least mean square algorithm 3. Chapter 5.2 User relevance feedback (pp. 118-123) 4. Chapter 5.3.1 Query expansion through local clustering (pp. 124-126)

2 Adaptive linear model Let X_1, X_2, …, X_n be the n vectors of n documents. D_1 ∪ D_2 = {X_1, X_2, …, X_n}, where D_1 is the set of relevant documents and D_2 is the set of non-relevant documents. D_1 and D_2 are obtained from user feedback. Question: find a weight vector W such that ∑_{i=1..m} W_i X_{ij} + 1 > 0 for each X_j ∈ D_1, and ∑_{i=1..m} W_i X_{ij} + 1 < 0 for each X_j ∈ D_2.
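
As a small illustration of the separation condition above, here is a sketch in Python (the slides contain no code, so the function name is an assumption; the constant +1 plays the role of the threshold term):

```python
def relevant_side(w, x):
    """True if x satisfies sum_i w[i]*x[i] + 1 > 0, i.e. x falls on the side
    of the hyperplane that is supposed to hold the relevant documents."""
    return sum(wi * xi for wi, xi in zip(w, x)) + 1 > 0
```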

3

4 [Diagram: a linear threshold unit. Inputs X_0, X_1, X_2, X_3, …, X_n are weighted by W_0, W_1, W_2, W_3, …, W_n and summed together with a +1 threshold term; the output is sign(y).]

5 Remarks: W is the new query vector. W is computed from the feedback, i.e., from D_1 and D_2. The following is a hyperplane: ∑_{i=1..m} w_i X_i + d = 0, where W = (w_1, w_2, …, w_m). The hyperplane cuts the whole space into two parts, and hopefully one part contains the relevant documents and the other contains the non-relevant documents.

6 Perceptron Algorithm (1) For each X ∈ D_1, if X·W + d < 0, then increase the weight vector for the next iteration: W = W_old + CX, d = d + C. (2) For each X ∈ D_2, if X·W + d > 0, then decrease the weight vector for the next iteration: W = W_old - CX, d = d - C. C is a positive constant. Repeat until X·W + d > 0 for each X ∈ D_1 and X·W + d < 0 for each X ∈ D_2.
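
A minimal Python sketch of this update rule (the slides give no code, so the function name, parameters, and iteration cap are assumptions; this version also updates d on every correction, exactly as in steps (1) and (2), whereas the worked example on slides 8-10 keeps d fixed at 0.5):

```python
def perceptron(D1, D2, w, d, C=1.0, max_iters=100):
    """Perceptron updates: after training, w.x + d > 0 for every x in D1
    and w.x + d < 0 for every x in D2 (if the data are linearly separable)."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    for _ in range(max_iters):
        updated = False
        for x in D1:                      # misclassified relevant doc -> add C*x
            if dot(w, x) + d < 0:
                w = [wi + C * xi for wi, xi in zip(w, x)]
                d += C
                updated = True
        for x in D2:                      # misclassified non-relevant doc -> subtract C*x
            if dot(w, x) + d > 0:
                w = [wi - C * xi for wi, xi in zip(w, x)]
                d -= C
                updated = True
        if not updated:                   # every document is on the correct side
            break
    return w, d
```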

7 Perceptron Convergence Theorem The perceptron algorithm finds such a W in finitely many iterations if the training set {X_1, X_2, …, X_n} is linearly separable. References: Wong, S.K.M., Yao, Y.Y., Salton, G., and Buckley, C., Evaluation of an adaptive linear model, Journal of the American Society for Information Science, Vol. 42, No. 10, pp. 723-730, 1991. Wong, S.K.M. and Yao, Y.Y., Query formulation in linear retrieval models, Journal of the American Society for Information Science, Vol. 41, No. 5, pp. 334-341, 1990.

8 An Example of the Perceptron Algorithm X_1=(2,0), X_2=(2,2), X_3=(-1,2), X_4=(-2,1), X_5=(-1,-1), X_6=(1/2,-3/4). D_1={X_1, X_2, X_3}, D_2={X_4, X_5, X_6}, W=(-1,0). Set d=0.5. [Figure: the six points and the initial weight vector W plotted in the plane.] W·X_1+0.5 = -1.5 < 0, so W = W_old + X_1 = (1,0).

9 With W=(1,0): W·X_2+0.5 = 2.5 > 0 (no change); W·X_3+0.5 = -0.5 < 0, so W = W_old + X_3 = (0,2). With W=(0,2): W·X_4+0.5 = 2.5 > 0, so W = W_old - X_4 = (2,1). [Figures: the six points plotted with the weight vectors W=(1,0), W=(0,2), and W=(2,1).]

10 With W=(2,1): W·X_5+0.5 = -2.5 < 0 (no change); W·X_6+0.5 = 3/4 > 0, so W = W_old - X_6 = (3/2, 7/4). Checking all points with W=(3/2, 7/4): W·X_1+0.5 = 3.5, W·X_2+0.5 = 7, W·X_3+0.5 = 2.5, W·X_4+0.5 = -3/4, W·X_5+0.5 = -11/4, W·X_6+0.5 = -1/16. Every relevant document now scores positive and every non-relevant document scores negative, so the algorithm stops here. [Figure: the six points and the final weight vector W=(3/2, 7/4).]
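
A short script that reproduces this hand computation (a sketch only; as on the slides, d is kept fixed at 0.5 and only W is updated, and the variable names are illustrative):

```python
# Data from slide 8: X1..X3 are relevant (D1), X4..X6 are non-relevant (D2).
X = [(2, 0), (2, 2), (-1, 2), (-2, 1), (-1, -1), (0.5, -0.75)]
relevant = [True, True, True, False, False, False]
w, d = [-1.0, 0.0], 0.5

changed = True
while changed:
    changed = False
    for x, rel in zip(X, relevant):
        score = w[0] * x[0] + w[1] * x[1] + d
        if rel and score < 0:                 # relevant doc on wrong side: add it
            w = [w[0] + x[0], w[1] + x[1]]
            changed = True
        elif not rel and score > 0:           # non-relevant doc on wrong side: subtract it
            w = [w[0] - x[0], w[1] - x[1]]
            changed = True

print(w)   # expected final weight vector: [1.5, 1.75], i.e. (3/2, 7/4)
```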

11 LMS Learning Algorithm Given a set of input vectors {x_1, x_2, …, x_L}, each with its own desired output d_k, for k=1, 2, …, L, find a vector w such that ∑_{k=1..L} (d_k - w·x_k)^2 is minimized. For IR, d_k is just the relevance order the user gives. From "Neural Networks: Algorithms, Applications and Programming Techniques", by James A. Freeman and David M. Skapura, Addison-Wesley, 1991.

12 The algorithm 1. Choose an initial vector w(1) = (1, 1, …, 1). 2. For each x_k, compute 3. ε_k(t) = d_k - w(t)·x_k, giving the squared error ε_k^2(t) = (d_k - w(t)·x_k)^2. 4. w(t+1) = w(t) + 2μ·ε_k·x_k. 5. Repeat steps 2-4 until the error is acceptably small. μ is a parameter: if μ is too large, the algorithm will never converge; if μ is too small, convergence is slow. In practice choose a value between 0.1 and 1.0; you can start with a larger value at the beginning and reduce it gradually.
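
A minimal sketch of these update steps in Python (the function name, stopping tolerance, and iteration cap are assumptions, not values from the slides):

```python
def lms(xs, ds, mu=0.1, tol=1e-6, max_iters=1000):
    """Least-mean-square updates:
    w(t+1) = w(t) + 2 * mu * eps_k * x_k, with eps_k = d_k - w(t).x_k."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    w = [1.0] * len(xs[0])                    # step 1: w(1) = (1, 1, ..., 1)
    for _ in range(max_iters):
        total_sq_error = 0.0
        for x, d in zip(xs, ds):              # steps 2-4 for each training pair
            eps = d - dot(w, x)
            total_sq_error += eps * eps
            w = [wi + 2 * mu * eps * xi for wi, xi in zip(w, x)]
        if total_sq_error < tol:              # step 5: stop when the error is acceptable
            break
    return w
```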

13

14 Query Expansion and Term Reweighting for the Vector Model D_r: set of relevant documents, as identified by the user, among the retrieved documents; D_n: set of non-relevant documents among the retrieved documents; C_r: set of relevant documents among all documents in the collection; |D_r|, |D_n|, |C_r|: number of documents in the sets D_r, D_n, C_r, respectively; α, β, γ: tuning constants.

15 Query Expansion and Term Reweighting for the Vector Model
Standard_Rocchio: q_m = α·q + (β/|D_r|)·∑_{d_j ∈ D_r} d_j - (γ/|D_n|)·∑_{d_j ∈ D_n} d_j
Ide_Regular: q_m = α·q + β·∑_{d_j ∈ D_r} d_j - γ·∑_{d_j ∈ D_n} d_j
Ide_Dec_Hi: q_m = α·q + β·∑_{d_j ∈ D_r} d_j - γ·max_non-relevant(d_j), where max_non-relevant(d_j) is a reference to the highest ranked non-relevant document.
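
A minimal sketch of the Standard_Rocchio reformulation above, with the query and documents represented as plain Python weight lists; the function name and the default values of α, β, γ are illustrative assumptions, not values from the slides:

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Standard Rocchio: q_m = alpha*q + (beta/|Dr|)*sum(Dr) - (gamma/|Dn|)*sum(Dn)."""
    m = len(query)

    def centroid(docs):
        if not docs:
            return [0.0] * m
        return [sum(doc[i] for doc in docs) / len(docs) for i in range(m)]

    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    # Clipping negative term weights to zero is a common convention (not from the slides).
    return [max(0.0, alpha * query[i] + beta * rel_c[i] - gamma * non_c[i])
            for i in range(m)]
```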

16 Evaluation of Relevance Feedback Strategies (Chapter 5) Simple way: use the new query to search the database and recalculate the results. Problem: the documents used for feedback are counted again, so the comparison is not fair. Better way: consider only the documents that were not used for feedback.

17 Query Expansion Through Local Clustering Definition: Let V(s) be a non-empty subset of words which are grammatical variants of each other. A canonical form s of V(s) is called a stem. For instance, if V(s) = {polish, polishing, polished} then s = polish. Definition: For a given query q, the set of documents retrieved is called the local document set D_l. Further, the set of all distinct words in the local document set is called the local vocabulary V_l. The set of all distinct stems derived from the set V_l is referred to as S_l.

18 Association Clusters Definition: The frequency of a stem s_i in a document d_j, d_j ∈ D_l, is referred to as f_{s_i,j}. Let m = (m_{ij}) be an association matrix with |S_l| rows and |D_l| columns, where m_{ij} = f_{s_i,j}. Let m^t be the transpose of m. The matrix s = m·m^t is a local stem-stem association matrix. Each element s_{u,v} in s expresses a correlation c_{u,v} between the stems s_u and s_v, namely c_{u,v} = ∑_{d_j ∈ D_l} f_{s_u,j} × f_{s_v,j} (5.5), and s_{u,v} = c_{u,v} (5.6).
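
A minimal sketch of this computation, assuming the local documents are already reduced to per-document stem frequency dictionaries (the data layout and the function name are assumptions, not from the slides):

```python
def association_matrix(freqs, stems):
    """Unnormalized stem-stem associations (equations 5.5/5.6):
    c[u][v] = sum over local documents of f(s_u, d_j) * f(s_v, d_j).

    freqs: list of dicts, one per local document, mapping stem -> frequency.
    stems: list of local stems S_l.
    """
    c = {u: {v: 0 for v in stems} for u in stems}
    for doc in freqs:
        for u in stems:
            fu = doc.get(u, 0)
            if fu == 0:
                continue
            for v in stems:
                c[u][v] += fu * doc.get(v, 0)
    return c
```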

19 Association Clusters (Normalized) Definition: Consider the u-th row in the association matrix s (i.e., the row with all the associations for the stem s_u). Let S_u(n) be a function which takes the u-th row and returns the set of n largest values s_{u,v}, where v varies over the set of local stems and v ≠ u. Then S_u(n) defines a local association cluster around the stem s_u. If s_{u,v} is given by equation (5.6), the association cluster is said to be unnormalized. If s_{u,v} is given by equation (5.7), the association cluster is said to be normalized: s_{u,v} = c_{u,v} / (c_{u,u} + c_{v,v} - c_{u,v}) (5.7).
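
Continuing the sketch above (same assumed data layout; the function name is illustrative): normalization via equation (5.7) followed by selection of the n largest values gives the cluster S_u(n).

```python
def normalized_cluster(c, s_u, n):
    """Local association cluster S_u(n): the n stems s_v (v != u) with the
    largest normalized association s_uv = c_uv / (c_uu + c_vv - c_uv)  (5.7)."""
    scores = {}
    for s_v, c_uv in c[s_u].items():
        if s_v == s_u:
            continue
        denom = c[s_u][s_u] + c[s_v][s_v] - c_uv
        scores[s_v] = c_uv / denom if denom else 0.0
    # take the n stems with the largest normalized association values
    return sorted(scores, key=scores.get, reverse=True)[:n]
```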

20 Interactive Search Formulation A stem s_u that belongs to a cluster associated with another stem s_v is said to be a neighbor of s_v. Query reformulation: for each stem s_v in the query, select m neighbor stems from the cluster S_v(n) and add them to the query.
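
A short usage sketch tying the two previous code blocks together on hypothetical toy data (the documents, stems, and chosen m and n are invented for illustration only):

```python
# Toy local document set: stem -> frequency for each retrieved document.
local_docs = [
    {"polish": 3, "shine": 1, "wax": 2},
    {"polish": 1, "wax": 1, "clean": 2},
    {"shine": 2, "clean": 1},
]
stems = ["polish", "shine", "wax", "clean"]

c = association_matrix(local_docs, stems)
query_stems = ["polish"]

expanded = list(query_stems)
for s_v in query_stems:
    # add m = 2 neighbor stems from the cluster around each query stem
    expanded += normalized_cluster(c, s_v, n=2)

print(expanded)   # ['polish', 'wax', 'shine'] for this toy data
```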

