Popular Ranking Algorithms
Prepared by Ranjan Dash
Contents
Efficient ways of Ranking. Algorithms for ranking: Sort Algorithm, Scan Algorithm, FA Algorithm, TA Algorithm.
Efficient ways of Ranking
Besides choosing a proper ranking function, the way that function is executed also determines performance. For a given ranking function, the ranking algorithm used to evaluate it plays a key role in efficiency.
Algorithms for ranking
The prominent algorithms for getting the top K results are the Sort Algorithm, the Scan Algorithm, the FA Algorithm and the TA Algorithm.
Sort Algorithm
The simplest way to find the top K results of a ranking function such as Score(ObjectId) = linear combination of attributes is to compute every score, sort the result and take the top K. This takes O(n log n) time, which is very slow for large relations where n is large.
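The sort approach can be sketched in a few lines of Python. This is only an illustration: the attribute values in `RESTAURANTS` are hypothetical (the slides do not give them), and the names `score` and `top_k_by_sort` are made up for this sketch. The ranking function is the restaurant example used later in these slides.

```python
# Hypothetical attribute values (not from the slides): rest_id -> (cuisine, location)
RESTAURANTS = {
    1: (4, 6), 2: (3, 9), 3: (7, 4), 4: (9, 7),
    5: (6, 8), 6: (8, 3), 7: (5, 5),
}

def score(cuisine, location):
    """Ranking function from the slides: Score = 2 * Cuisine + Location."""
    return 2 * cuisine + location

def top_k_by_sort(relation, k):
    """Score every tuple, sort all n of them, keep the first k: O(n log n)."""
    scored = [(score(*attrs), rest_id) for rest_id, attrs in relation.items()]
    scored.sort(reverse=True)  # highest score first
    return [(rest_id, s) for s, rest_id in scored[:k]]

print(top_k_by_sort(RESTAURANTS, 2))
```

Note that the whole relation is sorted even though only K entries are returned, which is where the wasted work comes from.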
Scan Algorithm
Keep K tuples in a buffer. For every tuple in the relation, scan this buffer and replace the lowest-scoring buffered tuple if the input tuple scores higher. This takes O(n·K) time, which is still slow for large n.
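A minimal Python sketch of the scan approach follows. The data values are hypothetical and the function names are illustrative only. The buffer is deliberately a plain list that is scanned linearly for its lowest entry, so that the O(n·K) cost stated above is visible in the code.

```python
# Hypothetical attribute values (not from the slides): rest_id -> (cuisine, location)
RESTAURANTS = {
    1: (4, 6), 2: (3, 9), 3: (7, 4), 4: (9, 7),
    5: (6, 8), 6: (8, 3), 7: (5, 5),
}

def score(cuisine, location):
    """Ranking function from the slides: Score = 2 * Cuisine + Location."""
    return 2 * cuisine + location

def top_k_by_scan(relation, k):
    """One pass over the relation with a K-entry buffer: O(n * K)."""
    if k <= 0:
        return []
    buffer = []  # at most k (score, rest_id) pairs
    for rest_id, attrs in relation.items():
        s = score(*attrs)
        if len(buffer) < k:
            buffer.append((s, rest_id))
            continue
        # Linear scan of the buffer to find its lowest score: O(K) per tuple.
        lowest = min(range(len(buffer)), key=lambda i: buffer[i][0])
        if s > buffer[lowest][0]:
            buffer[lowest] = (s, rest_id)
    buffer.sort(reverse=True)  # order the K survivors for presentation
    return [(rest_id, s) for s, rest_id in buffer]

print(top_k_by_scan(RESTAURANTS, 2))
```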
FA Algorithm
Fagin's Algorithm, known as the FA Algorithm, was developed by Ron Fagin. It relies on data structures prepared offline. Building them has a cost, but because they are reused across queries the amortized cost is very low. Two kinds of access are needed. Sorted access: one sorted table per attribute, read sequentially through a GetNext() operation. Random access: lookup by ObjectId through a Get(ObjId) operation. Preprocessing consists of preparing these two types of data structures, which are then used again and again during query processing.
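The two access paths might be prepared as in the sketch below. Everything here is an assumption made for illustration: the attribute values are hypothetical, and `build_sorted_lists`, `SortedAccess` and `get` are invented names standing in for the slides' GetNext() and Get(ObjId) operations. A real system would more likely back these with indexes than with in-memory lists.

```python
# Hypothetical attribute values (not from the slides): rest_id -> (cuisine, location)
RESTAURANTS = {
    1: (4, 6), 2: (3, 9), 3: (7, 4), 4: (9, 7),
    5: (6, 8), 6: (8, 3), 7: (5, 5),
}

def build_sorted_lists(relation, attribute_names):
    """Offline step: one (rest_id, value) list per attribute, best value first."""
    lists = {}
    for pos, name in enumerate(attribute_names):
        lists[name] = sorted(
            ((rest_id, attrs[pos]) for rest_id, attrs in relation.items()),
            key=lambda pair: pair[1],
            reverse=True,
        )
    return lists

class SortedAccess:
    """Sequential cursor over one sorted list; get_next() plays the role of GetNext()."""

    def __init__(self, entries):
        self._entries = entries
        self._pos = 0

    def get_next(self):
        if self._pos >= len(self._entries):
            return None  # list exhausted
        entry = self._entries[self._pos]
        self._pos += 1
        return entry

def get(relation, rest_id):
    """Random access by object id; plays the role of Get(ObjId)."""
    return relation[rest_id]

lists = build_sorted_lists(RESTAURANTS, ("cuisine", "location"))
cursor = SortedAccess(lists["cuisine"])
print(cursor.get_next(), get(RESTAURANTS, 4))
```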
FA Algorithm Step1
Example: determine the top 1 restaurant using the ranking function Score(RestId) = 2·Cuisine + Location.
Original relation (columns RestId, Cuisine, Location), RestIds in stored order: 1, 2, 3, 4, 5, 6, 7
Sorted for Cuisine (columns RestId, Cuisine), RestIds in sorted order: 4, 6, 3, 5, 7, 1, 2
Sorted for Location (columns RestId, Location), RestIds in sorted order: 2, 5, 4, 1, 7, 3, 6
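FA Algorithm Step1
Do GetNext on both sorted tables in round-robin fashion. Stop when K objects have been seen in common in all lists (K = 1 in our example).
Sorted for Cuisine (RestId): 4, 6, 3, 5, 7, 1, 2
Sorted for Location (RestId): 2, 5, 4, 1, 7, 3, 6
1st round: 4 (Cuisine), 2 (Location). 2nd round: 6 (Cuisine), 5 (Location). 3rd round: 3 (Cuisine), 4 (Location).
Objects seen: 4, 2, 6, 5, 3. RestId 4 is the first object seen in both lists, so sorted access stops after the 3rd round. RestId 4 is the winner in our case.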
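FA Algorithm Step2
Use random access to calculate the score of every object visited in Step 1, then take the top K after evaluation. This algorithm is applicable only if the ranking function is monotonic, that is, the score never decreases when an attribute value increases. In the worst case it visits every tuple in the relation, as the Scan Algorithm does. The worst-case memory requirement is not bounded by K: the buffer of visited objects can grow with the size of the relation.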
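Both FA steps can be put together in a short sketch. The attribute values below are hypothetical: they were chosen only so that the two sort orders match the restaurant example in these slides (Cuisine: 4, 6, 3, 5, 7, 1, 2; Location: 2, 5, 4, 1, 7, 3, 6). The function name `fagin_top_k` and its return shape are likewise assumptions for illustration.

```python
# Hypothetical attribute values (not from the slides): rest_id -> (cuisine, location)
RESTAURANTS = {
    1: (4, 6), 2: (3, 9), 3: (7, 4), 4: (9, 7),
    5: (6, 8), 6: (8, 3), 7: (5, 5),
}
N_ATTRS = 2

def score(cuisine, location):
    """Monotonic ranking function from the slides: 2 * Cuisine + Location."""
    return 2 * cuisine + location

def fagin_top_k(relation, k):
    """Return (top-k list, number of round-robin rounds used in Step 1)."""
    # Sorted lists of rest_ids, one per attribute (prepared offline in practice).
    lists = [sorted(relation, key=lambda r, i=i: relation[r][i], reverse=True)
             for i in range(N_ATTRS)]
    seen_in = {}          # rest_id -> set of list indexes where it was seen
    seen_everywhere = 0   # objects already seen in all lists
    depth = 0
    # Step 1: round-robin sorted access until k objects were seen in every list.
    while seen_everywhere < k and depth < len(relation):
        for i, lst in enumerate(lists):
            rest_id = lst[depth]
            seen_in.setdefault(rest_id, set()).add(i)
            if len(seen_in[rest_id]) == N_ATTRS:
                seen_everywhere += 1
        depth += 1
    # Step 2: random access to score every object visited in Step 1.
    scored = sorted(((score(*relation[r]), r) for r in seen_in), reverse=True)
    return [(r, s) for s, r in scored[:k]], depth

print(fagin_top_k(RESTAURANTS, 1))
```

With these assumed values the sketch stops after three rounds, having buffered objects 4, 2, 6, 5 and 3, and reports RestId 4 as the top result. The `seen_in` dictionary is the buffer whose size is not bounded by K.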
TA Algorithm
Known as the Threshold Algorithm. Similar to FA, but sorted access and random access are interleaved.
Step 1: Do sorted access on each list and, for every object seen, the corresponding random accesses to compute its overall score. Remember the K best objects seen so far.
Step 2: Determine the threshold value, which is the score of a hypothetical tuple built from the last value seen under sorted access in each list. If K objects have an overall score ≥ the threshold value, stop. Otherwise go to the next entry position in the sorted lists and repeat Step 1.
TA never reads deeper into the sorted lists than FA does, and it needs to buffer only K objects.
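The interleaving of sorted access, random access and the threshold test can be sketched as follows. As before, the attribute values are hypothetical (picked only to reproduce the sort orders of the restaurant example), and `threshold_top_k` is an illustrative name rather than a reference implementation. The sketch uses a min-heap for the K-entry buffer, which is a convenience and not something the slides prescribe.

```python
import heapq

# Hypothetical attribute values (not from the slides): rest_id -> (cuisine, location)
RESTAURANTS = {
    1: (4, 6), 2: (3, 9), 3: (7, 4), 4: (9, 7),
    5: (6, 8), 6: (8, 3), 7: (5, 5),
}
N_ATTRS = 2

def score(cuisine, location):
    """Monotonic ranking function from the slides: 2 * Cuisine + Location."""
    return 2 * cuisine + location

def threshold_top_k(relation, k):
    """Return (top-k list, number of sorted-access rounds before stopping)."""
    if k <= 0 or not relation:
        return [], 0
    lists = [sorted(relation, key=lambda r, i=i: relation[r][i], reverse=True)
             for i in range(N_ATTRS)]
    best = []    # min-heap of (score, rest_id); never holds more than k entries
    rounds = 0
    for depth in range(len(relation)):
        rounds += 1
        last_seen = []  # last value read under sorted access, one per list
        for i, lst in enumerate(lists):
            rest_id = lst[depth]              # sorted access
            attrs = relation[rest_id]         # random access for the other attributes
            last_seen.append(attrs[i])
            if any(r == rest_id for _, r in best):
                continue                      # already buffered
            s = score(*attrs)
            if len(best) < k:
                heapq.heappush(best, (s, rest_id))
            elif s > best[0][0]:
                heapq.heapreplace(best, (s, rest_id))
        # Threshold: score of the hypothetical tuple made of the last values seen.
        threshold = score(*last_seen)
        if len(best) == k and best[0][0] >= threshold:
            break
    return [(r, s) for s, r in sorted(best, reverse=True)], rounds

print(threshold_top_k(RESTAURANTS, 1))
```

With these assumed values the threshold after the second round is 2·8 + 8 = 24, which RestId 4 (score 25) already meets, so the sketch stops one round earlier than the FA walk-through while holding only one object in its buffer.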