1
CyberMiner Software Architecture Group
Kimberly West, Nadia Noori, Stanislav Minkevych
Basic Goal: a web search engine that:
- Accepts a list of keywords
- Returns a list of URLs whose descriptions contain any of the given keywords
- Uses KWIC (Key Word In Context) to maintain a database of URLs and descriptions
2
Requirements Specification
Functional:
- After input, the descriptor part of each line is circularly shifted by repeatedly removing the first word and appending it to the end of the line
- Outputs a list of all circular shifts of the descriptor parts of all lines in ascending alphabetical order, together with their corresponding URLs
- Output lines must not start with noise words such as “a”, “the”, or “of”
- Indices can grow with later additions
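The functional requirements above can be sketched in a few lines of Python. This is a minimal illustration rather than the group's implementation: the names `NOISE_WORDS`, `circular_shifts`, and `kwic_index` are hypothetical, the noise-word set holds only the three words the slide names, and the sort is assumed to ignore case.

```python
# Hypothetical noise-word set; the slides name only "a", "the", and "of".
NOISE_WORDS = {"a", "the", "of"}


def circular_shifts(descriptor):
    """Return every circular shift of a descriptor, as lists of words."""
    words = descriptor.split()
    # Shift i moves the first i words to the end of the line.
    return [words[i:] + words[:i] for i in range(len(words))]


def kwic_index(lines):
    """Build a sorted list of (shifted descriptor, URL) pairs.

    `lines` is an iterable of (url, descriptor) pairs. Shifts that start
    with a noise word are dropped; the rest are sorted alphabetically,
    ignoring case (an assumption, the slides do not specify this).
    """
    entries = []
    for url, descriptor in lines:
        for shift in circular_shifts(descriptor):
            if shift[0].lower() not in NOISE_WORDS:
                entries.append((" ".join(shift), url))
    return sorted(entries, key=lambda entry: (entry[0].lower(), entry[1]))


index = kwic_index([
    ("http://example.com/a", "The Art of Search"),
    ("http://example.com/b", "Search Engines"),
])
for shifted, url in index:
    print(shifted, "->", url)
```

Re-running `kwic_index` over the enlarged input is the simplest way to honor later additions; an incremental insert into the sorted list would be the natural next step.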
3
Requirements Specification
Non-Functional:
- Easily Understood & Used: clear capabilities and features, simplicity of design
- Portability/Reuse: not restricted to certain operating systems, machines, or developers; anyone can use the system and understand its architecture well enough to adapt it to their environment; few system limitations
- Traceability: object-oriented style using abstract data types; each process is linked to a specific individual module
- Good Performance & Responsive: reacts readily to changes; output-to-input ratio; time factor
4
Components & Connections : Indexing
Repository
- Contains the full HTML of every web page
- Documents are stored one after the other, each prefixed by ID, length, and URL
- Requires no other data structures in order to be accessed (helps with data consistency and makes development easier)
Index
- Keeps information about each document
- Is a fixed-width index, ordered by docID
- Contains the current document status, a pointer into the repository, a document checksum, and various statistics
- If the document has been crawled, also contains a pointer into a variable-width file called docinfo, which contains its URL and title
- Otherwise, the pointer points into the URL list, which contains just the URL
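A fixed-width index ordered by docID means an entry can be found by multiplying the docID by the record size. The sketch below illustrates that idea with Python's `struct` module. The field widths, the `STATUS_*` constants, the CRC-32 checksum, and the function names are all assumptions made for illustration; the slide gives no layout, and the statistics fields are omitted.

```python
import struct
import zlib

# Hypothetical record layout: status (1 byte), repository offset (8 bytes),
# checksum (4 bytes), pointer into docinfo or the URL list (8 bytes).
# The slides do not give field widths; these are assumptions.
RECORD = struct.Struct("<BQIQ")

STATUS_UNCRAWLED = 0
STATUS_CRAWLED = 1


def pack_entry(status, repo_offset, html, info_pointer):
    """Pack one fixed-width index entry for a document."""
    checksum = zlib.crc32(html) & 0xFFFFFFFF
    return RECORD.pack(status, repo_offset, checksum, info_pointer)


def read_entry(index_bytes, doc_id):
    """Look up an entry by docID; fixed width makes this a direct offset."""
    start = doc_id * RECORD.size
    return RECORD.unpack(index_bytes[start:start + RECORD.size])


# docID 0 has been crawled; docID 1 has not, so its pointer is taken to
# refer to the URL list rather than docinfo.
index_bytes = b"".join([
    pack_entry(STATUS_CRAWLED, 0, b"<html>first</html>", 0),
    pack_entry(STATUS_UNCRAWLED, 0, b"", 42),
])
status, repo_offset, checksum, info_pointer = read_entry(index_bytes, 1)
print(status, repo_offset, checksum, info_pointer)
```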
5
Line Storage
- Creates, accesses, and possibly deletes characters, words, and lines
- Listens for InputEvent using the interface LSListener
- Stores the lines
- Generates an event called LSEvent
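The event flow described above might look roughly like the following. This is a sketch under stated assumptions: the slide names `LSListener` and `LSEvent` but not their members, so the event payload (`line_number`, `words`), the `add_listener` and `on_input` methods, and the use of plain callables in place of a listener interface are all hypothetical.

```python
class LSEvent:
    """Event emitted by LineStorage when a line is stored (assumed payload)."""

    def __init__(self, line_number, words):
        self.line_number = line_number
        self.words = words


class LineStorage:
    """Stores lines and notifies registered listeners of each new line."""

    def __init__(self):
        self._lines = []
        self._listeners = []

    def add_listener(self, listener):
        """Register a callable taking one LSEvent (stand-in for LSListener)."""
        self._listeners.append(listener)

    def on_input(self, text):
        """Handle an input event: store the line, then notify listeners."""
        words = text.split()
        self._lines.append(words)
        event = LSEvent(len(self._lines) - 1, list(words))
        for listener in self._listeners:
            listener(event)


received = []
storage = LineStorage()
storage.add_listener(lambda event: received.append((event.line_number, event.words)))
storage.on_input("Search Engines")
storage.on_input("Key Word In Context")
print(received)
```

With implicit invocation of this kind, a component such as the circular shifter only needs to register itself as a listener; Line Storage does not need to know which components consume its events.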
6
Line Storage
Procedure setchar (l-line, w-word, c-char, a)
Function char (l-line, w-word, c-char) returns a character representing the c-th character in the w-th word of the l-th line; returns blank if out of range
Function word (l-line) returns the number of words in line l
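One possible shape for this interface is sketched below. Several details are assumptions, since the slide does not state them: indices are taken to be zero-based, `setchar` is taken to store `a` at position (l, w, c) and to create missing lines, words, and character slots on demand, and `word` returns 0 for a line that does not exist.

```python
class LineStorage:
    """Lines stored as lists of words, each word a list of characters."""

    def __init__(self):
        self._lines = []

    def setchar(self, l, w, c, a):
        """Set the c-th character of the w-th word of line l to `a`.

        Missing lines, words, and character slots are created on demand,
        padded with blanks (an assumption; the slides do not say).
        """
        if min(l, w, c) < 0:
            raise ValueError("indices must be non-negative")
        while len(self._lines) <= l:
            self._lines.append([])
        line = self._lines[l]
        while len(line) <= w:
            line.append([])
        word = line[w]
        while len(word) <= c:
            word.append(" ")
        word[c] = a

    def char(self, l, w, c):
        """Return the character at (l, w, c), or a blank if out of range."""
        if min(l, w, c) < 0:
            return " "
        try:
            return self._lines[l][w][c]
        except IndexError:
            return " "

    def word(self, l):
        """Return the number of words in line l (0 if the line is missing)."""
        return len(self._lines[l]) if 0 <= l < len(self._lines) else 0


storage = LineStorage()
for i, ch in enumerate("Web"):
    storage.setchar(0, 0, i, ch)
storage.setchar(0, 1, 0, "X")
print(storage.char(0, 0, 1), storage.word(0))
```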
7
[Architecture diagram. Legend: subprogram call, system I/O, implicit invocation. Labels: Master Control, Line Storage, Alphabetizing, Control, Input, Input medium, Output, Output medium, Circular Shift, Searcher.]
8
CyberMiner Engine
- Searches indexed keywords
- Uses Boolean arguments
- Case-sensitivity selector
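A keyword search with Boolean arguments and a case-sensitivity selector could be sketched as follows. The `search` function, its `operator` and `exclude` parameters, and the whole-word matching rule are hypothetical choices made for illustration; the slide does not say which Boolean operators the engine supports or how they are entered.

```python
def search(entries, keywords, operator="OR", exclude=(), case_sensitive=False):
    """Return URLs whose descriptor matches the keywords.

    `entries` is a list of (url, descriptor) pairs. `operator` is "OR"
    (any keyword) or "AND" (all keywords); words in `exclude` act as NOT.
    Matching is on whole words split at whitespace.
    """
    if operator not in ("OR", "AND"):
        raise ValueError("operator must be 'OR' or 'AND'")

    def normalize(word):
        return word if case_sensitive else word.lower()

    wanted = {normalize(k) for k in keywords}
    banned = {normalize(k) for k in exclude}
    if not wanted:
        return []

    results = []
    for url, descriptor in entries:
        words = {normalize(w) for w in descriptor.split()}
        if words & banned:
            continue
        matched = wanted <= words if operator == "AND" else bool(wanted & words)
        if matched and url not in results:
            results.append(url)
    return results


entries = [
    ("http://example.com/a", "The Art of Search"),
    ("http://example.com/b", "search engines"),
    ("http://example.com/c", "Art History"),
]
print(search(entries, ["Search"]))
print(search(entries, ["Search"], case_sensitive=True))
print(search(entries, ["art", "search"], operator="AND"))
print(search(entries, ["art"], exclude=["history"]))
```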