
1 Stemming
Stemming is the crude chopping of affixes from inflected words. It is used to conflate related terms for more effective information retrieval. The base form of a word is the stem, and the pieces attached to the stem are affixes.
Example:
Affixes -> Stem: Affix, Affix: es
Functional -> Stem: Function, Affix: al
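The chopping described on this slide can be sketched as simple suffix stripping. This is a minimal illustration, not a production stemmer: the suffix list, the `crude_stem` name, and the `min_stem_len` guard are assumptions chosen only to cover the slide's two examples.

```python
# Hypothetical suffix list; it only needs to cover the slide's examples.
SUFFIXES = ("es", "al", "ing", "ed", "s")


def crude_stem(word, min_stem_len=3):
    """Split a word into (stem, affix) by chopping the first matching suffix.

    Returns (word, "") when no suffix matches or the stem would be too short.
    """
    lower = word.lower()
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(lower) - len(suffix) >= min_stem_len:
            return word[:-len(suffix)], suffix
    return word, ""


for w in ("Affixes", "Functional"):
    stem, affix = crude_stem(w)
    print(f"{w} -> Stem: {stem}, Affix: {affix}")
```

Because the rule is purely textual, it will also chop words where the ending is not really an affix, which is why stemming is called crude.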

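2 Lemmatization
Lemmatization is treated here as a more complex form of stemming that identifies synonyms of the words in user queries. (Strictly, lemmatization reduces a word to its dictionary form, or lemma; mapping a word to its synonyms is usually called query expansion.)
Example:
Engineering -> Technology
Attire -> Wear, Dress
Stemming and lemmatization simplify the designer's job and serve users better.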

3 Implementation
Step 1:
A. Expand query
Query Input: Office Attire          Query Output: wear, apparels, dress for
Query Input: Eradicate Mosquitoes   Query Output: remove, kill, mosquito
B. Assign QueryId
Office Attire: 1-Office 2-Attire 3-wear 4-apparels 5-dress for
Eradicate Mosquitoes: 1-Eradicate 2-Mosquitoes 3-remove 4-kill 5-mosquito
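Step 1 can be sketched as a dictionary lookup followed by numbering. The `SYNONYMS` table below is hypothetical and filled in only from the slide's two example queries; `expand_query` and `assign_query_ids` are illustrative names, not functions from the presentation.

```python
# Hypothetical synonym table, taken from the slide's two example queries.
SYNONYMS = {
    "attire": ["wear", "apparels", "dress for"],
    "eradicate": ["remove", "kill"],
    "mosquitoes": ["mosquito"],
}


def expand_query(query):
    """Return the original query words followed by their expansion terms."""
    terms = query.split()
    expanded = list(terms)
    for term in terms:
        expanded.extend(SYNONYMS.get(term.lower(), []))
    return expanded


def assign_query_ids(terms):
    """Number the expanded terms starting at 1, as in step 1B."""
    return {query_id: term for query_id, term in enumerate(terms, start=1)}


ids = assign_query_ids(expand_query("Eradicate Mosquitoes"))
print(ids)
```

The QueryId ties every expanded term back to a position in its query, which the reduce step later uses as its output key.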

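4 Implementation (cont’d)
Step 2: Map Function
Input: a query word (Qword) and the SERP text retrieved for it
Output: intermediate (Qword, proximity word) pairs

    map(String key, String value)
      // key: Qword
      // value: SERP (search engine results page) text
      FOREACH Dword IN value
        EmitIntermediate(Qword, Proximity word);
      NEXT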
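A rough Python rendering of the map step follows. The slide does not say how a proximity word is chosen, so this sketch assumes it is any word within `window` positions of an occurrence of the query word in the SERP text; both `window` and the name `map_fn` are assumptions.

```python
def map_fn(qword, serp_text, window=1):
    """Yield (qword, proximity_word) for words adjacent to qword in serp_text.

    Assumption: "proximity" means within `window` positions of a match.
    """
    words = serp_text.split()
    target = qword.lower()
    for i, dword in enumerate(words):
        if dword.lower() != target:
            continue
        lo = max(0, i - window)
        hi = min(len(words), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield qword, words[j]


pairs = list(map_fn("kill", "how to kill mosquito larvae"))
print(pairs)
```

In a real MapReduce job the framework would group these pairs by key before handing them to the reducer; here the generator simply stands in for `EmitIntermediate`.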

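5 Implementation (cont’d)
Step 3: Reduce Function
Input: a query word (Qword) and the list of proximity words emitted for it
Output: (QueryId, phrase) pairs

    reduce(String key, Iterator values)
      // key: Qword
      // values: a list of Proximity words
      QwId = fn_GetQueryId(Qword)
      FOREACH Pword IN values
        IF Qword IS verb
          Emit(QwId, Qword + Pword);   // verb first, e.g. verb + object
        ELSE
          Emit(QwId, Pword + Qword);   // proximity word first
      NEXT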
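The reduce step can be sketched in the same style. The slide does not define `fn_GetQueryId` or the verb test, so the `QUERY_IDS` table, the `VERBS` set, and the space used to join the two words are all assumptions made for illustration.

```python
# Hypothetical lookup tables based on the "Eradicate Mosquitoes" example.
QUERY_IDS = {"Eradicate": 1, "Mosquitoes": 2, "remove": 3, "kill": 4, "mosquito": 5}
VERBS = {"eradicate", "remove", "kill"}


def get_query_id(qword):
    """Stand-in for fn_GetQueryId: look up the QueryId assigned in step 1."""
    return QUERY_IDS[qword]


def reduce_fn(qword, proximity_words):
    """Yield (QueryId, phrase): verbs go before the proximity word, others after."""
    qw_id = get_query_id(qword)
    for pword in proximity_words:
        if qword.lower() in VERBS:
            yield qw_id, f"{qword} {pword}"
        else:
            yield qw_id, f"{pword} {qword}"


print(list(reduce_fn("kill", ["mosquito"])))
print(list(reduce_fn("mosquito", ["dead"])))
```

Placing a verb before its proximity word and any other word after it yields phrases such as "kill mosquito" and "dead mosquito", each keyed by the QueryId of the term that produced it.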


