Information Retrieval Homework #1 Members: Wesley, Lbr, Shuang CSIE, NCU.


1 Information Retrieval Homework #1
Members: Wesley, Lbr, Shuang
CSIE, NCU

2 Outline
Introduction
Stemming Algorithm
Suffix Tree
Performance
Conclusion

3 Stemming Algorithm (optional)
Goal of stemming: improve performance and require fewer resources by reducing the number of unique words
–Ex. "computable", "computation", "computability"
Porter Algorithm (most commonly accepted)
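To make the goal concrete, here is a minimal sketch of suffix stripping. It is not the Porter algorithm: the suffix list and the `toy_stem` helper are hypothetical, chosen only so that the three example words collapse into one index term.

```python
# Toy suffix stripper: NOT the Porter algorithm, only an illustration
# of how several word forms can collapse into a single index term.
SUFFIXES = ("ability", "ation", "able")  # hypothetical, hand-picked list


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["computable", "computation", "computability"]
stems = {toy_stem(w) for w in words}
print(stems)  # all three words share the stem "comput"
```

With stemming, the index stores one term instead of three, which is where the saving in unique words comes from.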

4 Suffix Tree Library
libsfxdisk-1.2 is a fast indexing library based on suffix trees
Supports storing, retrieving, deleting, and dumping/loading the database
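The libsfxdisk API is not shown on the slides, so the sketch below does not use it. Instead it illustrates the general idea of suffix-based indexing with a simpler in-memory structure, a sorted suffix array. The function names are assumptions made for this example, and the construction is deliberately naive.

```python
import bisect


def build_suffix_array(text: str) -> list:
    """Return the start positions of all suffixes, sorted lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text: str, suffix_array: list, pattern: str) -> list:
    """Return every position in text where pattern occurs."""
    # Suffixes are materialized for clarity; a real index would avoid this.
    suffixes = [text[i:] for i in suffix_array]
    pos = bisect.bisect_left(suffixes, pattern)
    hits = []
    # All suffixes that start with the pattern are adjacent in sorted order.
    while pos < len(suffixes) and suffixes[pos].startswith(pattern):
        hits.append(suffix_array[pos])
        pos += 1
    return sorted(hits)


text = "banana"
sa = build_suffix_array(text)
print(find_all(text, sa, "ana"))  # [1, 3]
```

A suffix tree answers the same substring queries but shares common prefixes between suffixes, which is what makes it practical for large, disk-backed collections.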

5 Indexing
[Flow diagram; box labels: Dir Name, Dir Reader, File Name, File Reader, Filter (Delete Stop Words), Stem (Optional), Suffix Tree, Index File]
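One plausible reading of the indexing flow is: read the directory, read each file, filter out stop words, optionally stem, then add the terms to the index. The sketch below follows that reading under stated assumptions: it uses a plain in-memory inverted index (a dictionary) in place of the suffix tree library, and the stop-word list and all function names are hypothetical.

```python
import os
import re
from collections import defaultdict

STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}  # hypothetical short list


def list_files(dir_name):
    """Dir Reader: yield the path of every regular file under dir_name."""
    for root, _dirs, files in os.walk(dir_name):
        for name in sorted(files):
            yield os.path.join(root, name)


def read_tokens(path):
    """File Reader: return the lowercase alphanumeric tokens of one file."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return re.findall(r"[a-z0-9]+", f.read().lower())


def remove_stop_words(tokens):
    """Filter: drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(dir_name, stem=None):
    """Map each term to the set of file paths that contain it."""
    index = defaultdict(set)
    for path in list_files(dir_name):
        for token in remove_stop_words(read_tokens(path)):
            term = stem(token) if stem else token  # stemming is optional
            index[term].add(path)
    return index
```

Passing a stemming function as `stem` corresponds to the optional stemming step; leaving it as `None` indexes the raw tokens.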

6 Searching
[Flow diagram; box labels: Key Word, Search Engine, Index, Print Out Results]
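The search step can be sketched as a lookup of the keyword in the index followed by printing the matching files. This is an illustration only: the index here is a small hand-written dictionary, and `search` and `print_results` are hypothetical names, not the homework's actual code.

```python
def search(index, keyword):
    """Return the sorted list of documents indexed under keyword."""
    return sorted(index.get(keyword.lower(), ()))


def print_results(keyword, hits):
    """Print one line per matching document, or a notice if there are none."""
    if not hits:
        print(f"No results for '{keyword}'")
        return
    print(f"{len(hits)} result(s) for '{keyword}':")
    for doc in hits:
        print(f"  {doc}")


# Hypothetical index: term -> set of file names.
example_index = {"suffix": {"b.txt", "a.txt"}, "tree": {"a.txt"}}
print_results("suffix", search(example_index, "suffix"))
```

Because the lookup is a single dictionary access, query time does not depend on how long indexing took.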

7 Performance
Total indexing time
–Takes a long time
–One file takes about one minute
Average searching time
–Very quick
http://140.115.156.49/~wesley/IR.html

8 Future
Add a stemming scheme
Reduce indexing time
Additional search operators
–AND, OR
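As a rough idea of how AND and OR queries could work, the sketch below combines per-term result sets with set intersection and union. It assumes an index that maps each term to a set of documents; that structure and the function names are assumptions for illustration, not part of the existing system.

```python
def search_and(index, terms):
    """Return documents that contain every term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


def search_or(index, terms):
    """Return documents that contain at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


# Hypothetical index: term -> set of file names.
sample_index = {"suffix": {"a.txt", "b.txt"}, "tree": {"b.txt", "c.txt"}}
print(sorted(search_and(sample_index, ["suffix", "tree"])))  # ['b.txt']
print(sorted(search_or(sample_index, ["suffix", "tree"])))   # ['a.txt', 'b.txt', 'c.txt']
```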

