Project Knowledge Based Intelligent Search Engine.


Presentation on theme: "Project Knowledge Based Intelligent Search Engine."— Presentation transcript:

1 Project Knowledge Based Intelligent Search Engine

2 Project Components: Web Crawler (will be given); Text Processor (will be given); Granule Finder; Geometry of Granular Structure; Graph of Granular Structure

3 Text Processing 1. Input data: http://kdd.ics.uci.edu/databases/20newsgroups/20newsgroups.data.html 2. Output: (1) Doc_names, (2) Stems, (3) Position of words in document, (4) Stemmed table (stem and stemmed words)
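A minimal sketch of the four outputs listed on this slide, assuming documents arrive as a dict of name to raw text. The names `toy_stem` and `process` are hypothetical, and the suffix stripper is only a stand-in for a real stemmer such as Porter's; it is not the text processor that will be given.

```python
import re
from collections import defaultdict

def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: strips a few common suffixes.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def process(docs):
    """docs: dict mapping doc name -> raw text.
    Returns (positions, stemmed_table):
      positions[(doc_name, stem)] -> list of word positions in that document
      stemmed_table[stem]         -> set of original words reduced to that stem
    """
    positions = defaultdict(list)
    stemmed_table = defaultdict(set)
    for name, text in docs.items():
        words = re.findall(r"[a-z]+", text.lower())
        for pos, word in enumerate(words):
            stem = toy_stem(word)
            positions[(name, stem)].append(pos)
            stemmed_table[stem].add(word)
    return positions, stemmed_table

docs = {"doc1": "Searching indexes and searched index"}
positions, stemmed_table = process(docs)
```

Here the document names are the keys of `docs`, the stems are the keys of `stemmed_table`, and `positions` records where each stem occurs in each document.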

4 Project Components: Tokenizer (will be given); TFIDF; Keywords; Stemming. http://xanadu.cs.sjsu.edu/~drtylin/classes/cs257/tokenizer

5 Project Components: Tokenizer: recognize the tokens (watch out for international languages; apply stemming). Keywords or stopword list (TFIDF). Concept (frequent ordered sets of keywords)
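One common way to separate keywords from stopwords with TFIDF can be sketched as follows. This assumes tokens are already stemmed, uses the plain tf * log(N / df) weighting (one of several variants), and the names `tfidf` and `keywords` and the threshold are hypothetical choices, not something the slides specify.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: dict mapping doc name -> list of tokens (already stemmed).
    Returns dict mapping doc name -> {token: tf-idf weight}."""
    n_docs = len(docs)
    df = Counter()  # number of documents each token appears in
    for tokens in docs.values():
        df.update(set(tokens))
    weights = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        weights[name] = {
            tok: (count / len(tokens)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        }
    return weights

def keywords(weights, threshold):
    # Keep tokens whose weight exceeds the threshold in at least one document.
    return {tok for w in weights.values() for tok, v in w.items() if v > threshold}

docs = {"a": ["the", "search", "engine"], "b": ["the", "granule"]}
weights = tfidf(docs)
```

A token that occurs in every document, such as "the" here, gets weight 0 and so behaves like a stopword under any positive threshold.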

6 Project Components: Oracle 1. Find the high-frequency keywords 2. Find the high-frequency pairs 3. Find the high-frequency triples, etc. These will be the input to the simplex and graph theory teams
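The keyword, pair, and triple counts above can be sketched level by level in the style of the Apriori algorithm. This is an in-memory illustration rather than the Oracle-based approach the slide has in mind; it treats a set as occurring when all its keywords appear in the same document and ignores word order, and `frequent_sets`, `min_support`, and `max_size` are hypothetical names.

```python
from collections import Counter
from itertools import combinations

def frequent_sets(docs, min_support, max_size=3):
    """docs: list of token lists.
    For each size k = 1..max_size, returns the keyword sets that occur
    together in at least min_support documents, with their counts."""
    result = {}
    allowed = None  # keywords still worth considering at the next level
    for k in range(1, max_size + 1):
        counts = Counter()
        for tokens in docs:
            present = set(tokens) if allowed is None else set(tokens) & allowed
            counts.update(combinations(sorted(present), k))
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        if not frequent:
            break
        result[k] = frequent
        # Pruning: a larger frequent set can only use keywords that
        # already appear in some frequent set of size k.
        allowed = {tok for s in frequent for tok in s}
    return result

docs = [
    ["search", "engine", "granule"],
    ["search", "engine"],
    ["search", "granule"],
    ["engine"],
]
result = frequent_sets(docs, min_support=2)
```

With these four toy documents, `result[1]` holds the frequent keywords and `result[2]` the frequent pairs; no triple reaches the support threshold, so there is no `result[3]`.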

7

8 Project Components: Oracle 1. Upload datasort into Oracle 2. Convert to bitmap 3. Each bit string has a name (an attribute-value pair: 16 bits for the column, 16 bits for the value) 4. Rotate 90 degrees and output into files
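One possible reading of the bitmap steps, sketched in memory: each (column, value) pair gets a 32-bit name made of a 16-bit column id and a 16-bit value id, and its bit string has bit r set when row r holds that value. Storing one bit string per pair rather than one per row is what "rotate 90 degrees" is taken to mean here; that interpretation, the id assignment, and the name `build_bitmap_index` are assumptions.

```python
def build_bitmap_index(rows, columns):
    """rows: list of tuples; columns: list of column names.
    Returns (index, column_ids, value_ids), where index maps a 32-bit
    name to an int used as a bit string over the rows."""
    column_ids = {col: i for i, col in enumerate(columns)}
    value_ids = {}  # per column: value -> id assigned in order of appearance
    index = {}
    for r, row in enumerate(rows):
        for col, value in zip(columns, row):
            ids = value_ids.setdefault(col, {})
            vid = ids.setdefault(value, len(ids))
            if column_ids[col] >= 1 << 16 or vid >= 1 << 16:
                raise ValueError("id does not fit in 16 bits")
            # High 16 bits: column id. Low 16 bits: value id.
            name = (column_ids[col] << 16) | vid
            index[name] = index.get(name, 0) | (1 << r)
    return index, column_ids, value_ids

rows = [("sjsu", "cs"), ("sjsu", "ee"), ("uci", "cs")]
index, column_ids, value_ids = build_bitmap_index(rows, ["school", "dept"])
```

Writing each entry of `index` to a file, one bit string per attribute-value pair, would then correspond to the final output step.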

9 Project Components: Hardware 1 1. read_sector (address) 2. read_track (address) 3. read_cylinder (address) 4. read_volume (address)

10 Project Components: Hardware 2 1. VSAM-tree (do not assume the whole tree fits in main memory at the same time; it needs management) 2. Bit-AND, Bit-count
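The Bit-AND and Bit-count operations can be illustrated in software, assuming each keyword has a bit string (here a Python int) with one bit per document. ANDing the strings and counting the set bits gives the number of documents containing all the keywords. The name `support` is hypothetical, and this sketch says nothing about the hardware or VSAM-tree side.

```python
def support(bitmaps):
    """bitmaps: iterable of ints, each a bit string over documents.
    Returns how many positions are set in every bitmap
    (Bit-AND across all of them, then Bit-count)."""
    bitmaps = list(bitmaps)
    if not bitmaps:
        return 0
    combined = bitmaps[0]
    for b in bitmaps[1:]:
        combined &= b  # Bit-AND
    return bin(combined).count("1")  # Bit-count

pair_support = support([0b1011, 0b0110])
triple_support = support([0b1011, 0b1110, 0b1010])
```

In the first call only one document has both keywords; in the second, two documents have all three.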

