1
Information Retrieval Homework #1
Members: Wesley, Lbr, Shuang
CSIE, NCU
2
Outline
Introduction
Stemming Algorithm
Suffix Tree
Performance
Conclusion
3
Stemming Algorithm (optional)
Goal of stemming: improve performance and use fewer resources by reducing the number of unique words
–Ex. “computable”, “computation”, and “computability” can share one stem
Porter algorithm (the most commonly accepted stemmer)
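To make the goal concrete, here is a minimal sketch of suffix stripping in Python. It is a toy illustration and not the Porter algorithm: the suffix list, the three-letter minimum, and the name `toy_stem` are all assumptions made for this example.

```python
# Toy suffix stripper: NOT the Porter algorithm, only an illustration.
# The suffix list below is a hypothetical choice made for this example.
SUFFIXES = ("ability", "ation", "able")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["computable", "computation", "computability"]
stems = {toy_stem(w) for w in words}
print(stems)  # three unique words collapse to a single stem
```

A real stemmer applies ordered rewrite rules with conditions on the remaining stem; the sketch only shows why the number of unique index terms drops.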
4
Suffix Tree Library
libsfxdisk-1.2 is a fast indexing library based on a suffix tree
Supports storing, retrieving, and deleting entries, and dumping/loading the database
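The four operations can be illustrated with a small in-memory stand-in. This sketch does not use libsfxdisk and does not mirror its API. It keeps a sorted list of suffixes instead of a real suffix tree, and the class name `ToySuffixIndex` and its method names are hypothetical.

```python
import bisect
import json


class ToySuffixIndex:
    """In-memory stand-in for a suffix-based index (not libsfxdisk's API)."""

    def __init__(self):
        self.entries = []  # sorted list of (suffix, key) pairs

    def store(self, key: str) -> None:
        """Insert every suffix of key, keeping the list sorted."""
        for i in range(len(key)):
            pair = (key[i:], key)
            pos = bisect.bisect_left(self.entries, pair)
            if pos == len(self.entries) or self.entries[pos] != pair:
                self.entries.insert(pos, pair)

    def retrieve(self, pattern: str) -> set:
        """Return every stored key that contains pattern as a substring."""
        pos = bisect.bisect_left(self.entries, (pattern, ""))
        found = set()
        while pos < len(self.entries) and self.entries[pos][0].startswith(pattern):
            found.add(self.entries[pos][1])
            pos += 1
        return found

    def delete(self, key: str) -> None:
        """Remove all suffixes that belong to key."""
        self.entries = [entry for entry in self.entries if entry[1] != key]

    def dump(self) -> str:
        """Serialize the index to a JSON string."""
        return json.dumps(self.entries)

    @classmethod
    def load(cls, text: str) -> "ToySuffixIndex":
        """Rebuild an index from a string produced by dump()."""
        index = cls()
        index.entries = [tuple(entry) for entry in json.loads(text)]
        return index


index = ToySuffixIndex()
index.store("retrieval")
index.store("tree")
print(sorted(index.retrieve("tr")))
```

Because every suffix is stored, a substring query becomes a prefix query on the sorted list, which is the same idea a suffix tree exploits with a more compact structure.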
5
Indexing
Dir Name -> Dir Reader -> File Name -> File Reader -> Filter (Delete Stop Words) -> Stem (Optional) -> Suffix Tree -> Index File
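The indexing stages named on this slide can be sketched as a small Python pipeline. Several things here are assumptions: a plain dictionary (an inverted index) stands in for the suffix tree, the stop-word list is a short hypothetical one, the index file is written as JSON, and the function names are made up for the example.

```python
import json
import os
import re

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def tokenize(text):
    """Filter step: lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(root, stem=None):
    """Walk root, read each file, and map each term to the files containing it."""
    index = {}
    for dir_name, _subdirs, file_names in os.walk(root):  # Dir Reader
        for file_name in sorted(file_names):
            path = os.path.join(dir_name, file_name)
            # File Reader: undecodable bytes are skipped rather than raising.
            with open(path, encoding="utf-8", errors="ignore") as handle:
                tokens = tokenize(handle.read())
            for token in tokens:
                if token in STOP_WORDS:  # Delete Stop Words
                    continue
                term = stem(token) if stem else token  # Stem (optional)
                index.setdefault(term, set()).add(path)
    return index


def save_index(index, index_path):
    """Write the index file as JSON, with a sorted file list per term."""
    with open(index_path, "w", encoding="utf-8") as handle:
        json.dump({term: sorted(paths) for term, paths in index.items()}, handle)
```

Passing a function as `stem` turns on the optional stemming stage; leaving it as `None` indexes the raw tokens.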
6
Searching
Key Word -> Search Engine (uses Index) -> Print Out Results
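The search step reduces to a single lookup followed by printing. A minimal sketch, assuming the index is a dictionary from term to a set of file paths; the index contents and file names below are invented for illustration.

```python
def search(index, keyword):
    """Return the sorted list of files indexed under keyword (case-insensitive)."""
    return sorted(index.get(keyword.lower(), ()))


# Hypothetical index contents, for illustration only.
index = {
    "suffix": {"docs/a.txt", "docs/b.txt"},
    "stemming": {"docs/b.txt"},
}

for path in search(index, "Suffix"):
    print(path)
```

Since the expensive work happens at indexing time, a lookup like this stays fast regardless of how long the index took to build.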
7
Performance
Total indexing time
–Takes much longer than searching
–One file takes about one minute
Average searching time
–Very quick
http://140.115.156.49/~wesley/IR.html
8
Future
Add a stemming scheme
Limit indexing time
Additional search operators
–AND, OR
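One common way to support AND and OR is set intersection and union over the per-term file sets. The sketch below assumes the index maps each term to a set of file paths; the function names and sample data are hypothetical and do not describe the project's planned design.

```python
def search_and(index, terms):
    """Files that contain every term (set intersection)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search_or(index, terms):
    """Files that contain at least one term (set union)."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


# Hypothetical index contents, for illustration only.
index = {
    "suffix": {"a.txt", "b.txt"},
    "tree": {"b.txt", "c.txt"},
}

print(sorted(search_and(index, ["suffix", "tree"])))
print(sorted(search_or(index, ["suffix", "tree"])))
```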