Tweet Search
Cody, Darin, Kyle, Vincent

General Architecture
  - Application
  - GUI
  - Index Builder/Loader
  - Datastructure
  - TriTree
  - Posting Lists
  - Tweet
  - Tweets
  - Ranker
  - Getter
  - Tokenizer
  - Utilities
  - Visitor
  - Parser
  - Query Handler
  - Reporter
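
To illustrate how the TriTree and Posting Lists components might fit together, here is a minimal sketch. It assumes TriTree is a character trie whose nodes hold a posting list of tweet IDs; the class names, the `build_index` helper, and the whitespace tokenizer are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Child nodes keyed by the next character of the token.
    children: dict = field(default_factory=dict)
    # IDs of tweets containing the token that ends at this node (assumed layout).
    postings: list = field(default_factory=list)


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, token, tweet_id):
        """Walk or create the path for token, then record tweet_id in its posting list."""
        node = self.root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        # Tweets are indexed one at a time, so a repeated token in the same
        # tweet always matches the last entry and is skipped.
        if not node.postings or node.postings[-1] != tweet_id:
            node.postings.append(tweet_id)

    def lookup(self, token):
        """Return the posting list for an exact token, or an empty list."""
        node = self.root
        for ch in token:
            node = node.children.get(ch)
            if node is None:
                return []
        return node.postings


def build_index(tweets):
    """tweets maps tweet_id -> text. Lowercase whitespace tokenizing is an assumption."""
    index = Trie()
    for tweet_id in sorted(tweets):
        for token in tweets[tweet_id].lower().split():
            index.add(token, tweet_id)
    return index
```

Indexing tweets in increasing ID order keeps each posting list sorted, which makes intersecting lists for Boolean queries straightforward.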

Page Ranking
  - TF-IDF based
      - Works with Boolean and multiword queries
  - Content based
      - Number of links
      - Number of hashtags
      - Number of replies
      - Ratio of words spelled correctly
  - In/out links: removed
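
The content-based signals listed above could be extracted roughly as follows. This is a sketch under stated assumptions: the regular expressions, the externally supplied `reply_count` and `dictionary`, and the weighted sum in `content_score` are all hypothetical, since the slides do not say how the features were measured or combined.

```python
import re

LINK_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
WORD_RE = re.compile(r"[a-z']+")


def content_features(text, reply_count, dictionary):
    """Compute the four content signals for one tweet.

    reply_count is assumed to come from tweet metadata; dictionary is a
    set of correctly spelled lowercase words (both are assumptions).
    """
    links = LINK_RE.findall(text)
    hashtags = HASHTAG_RE.findall(text)
    # Strip links and hashtags so they are not counted as misspelled words.
    plain = HASHTAG_RE.sub(" ", LINK_RE.sub(" ", text)).lower()
    words = WORD_RE.findall(plain)
    correct = sum(1 for w in words if w in dictionary)
    return {
        "links": len(links),
        "hashtags": len(hashtags),
        "replies": reply_count,
        "spelling_ratio": correct / len(words) if words else 0.0,
    }


def content_score(features, weights):
    """Combine features with caller-supplied weights (a hypothetical linear scheme)."""
    return sum(weights[name] * value for name, value in features.items())
```

Because none of these signals depend on the query, a score of this kind can be computed once per tweet and stored with it.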

Page Ranking
  - Originally, all ranking calculations were done at query time; now most are done at index time
  - TF-IDF based
      - Minimal calculation still done at query time
  - Content based
      - All calculated at index time
  - Index files became a lot larger, both on disk and in memory
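
The shift from query-time to index-time ranking can be sketched as below. The sketch assumes one common TF-IDF variant (raw term frequency times log(N / document frequency)) and Boolean AND semantics for multiword queries; the project's actual formula and query handling are not given in the slides, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_weighted_index(tweets):
    """Index time: precompute a TF-IDF weight for every (term, tweet) pair."""
    term_counts = {tid: Counter(text.lower().split()) for tid, text in tweets.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    n = len(tweets)
    index = defaultdict(dict)
    for tid, counts in term_counts.items():
        for term, tf in counts.items():
            # Assumed weighting; storing it per posting is what grows the index.
            index[term][tid] = tf * math.log(n / doc_freq[term])
    return index


def search(index, query):
    """Query time: intersect postings (Boolean AND) and sum the stored weights."""
    terms = query.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    scores = {tid: sum(index[t][tid] for t in terms) for tid in candidates}
    # Highest score first; ties broken by tweet ID for a stable order.
    return sorted(scores, key=lambda tid: (-scores[tid], tid))
```

Storing a weight with every posting is the kind of change that trades larger index files for less work per query.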