Final Project Presentation


1 Final Project Presentation
CS/SE Fall 2015 Final Project Presentation
A KWIC Based Search Engine
Presented by:
- Karthik Kannambadi Sridhar
- Ramakrishnan Sathyavageeswaran
- Vaidehi Jariwala
Dec 3rd, 2015

2 Agenda
- Product Description
- Next Questions: What? How? Why?
- Functional Requirements
- Non-Functional Requirements
- Use Cases
- Traceability of Functional Requirements, Non-Functional Requirements and Use Cases
- Cyberminer and KWIC System Architecture
- Alternate Architectures and Tradeoffs
- Implementation
- A Short Demo of Cyberminer

3 Product Description
- Architect a web search engine (Cyberminer) using a KWIC software system.
- The Cyberminer system should accept a URL and description pair, where the description is an ordered set of lines, each line is an ordered set of words, and each word is an ordered set of characters. The URL follows the generic web address format.
- The Cyberminer system should then process the descriptions through a Key Word In Context (KWIC) system and store the results in a database, mapped to the corresponding URL.
- The Cyberminer system should provide the user with addition, deletion and search capabilities.
- The Cyberminer system will use an Object-Oriented architectural style and shall be accessible from any standard web browser.

4 Next Questions What ? How ? Why ?

5 What?

6 What?
Build a search engine based on a KWIC system.
Phase 1
- Build the KWIC system in J2EE
- Use in-memory storage for the KWIC indices
- Provide a web-based interface for the user

7 What? (contd.)
Phase 2
- Use the KWIC system for indexing the descriptions
- Use a database for permanent storage of URLs and descriptions
- Provide search capabilities
- Allow the user to add a URL and description
- Allow the user to delete a record
- Allow the user to filter out symbols from search

8 Functional Requirements
- The KWIC index system of Cyberminer shall accept input as an ordered set of lines, where each line is an ordered set of words and each word is an ordered set of characters.
- Cyberminer shall accept entries in two parts, a URL and a descriptor, which shall follow the syntax below.
- The descriptor shall consist of identifiers made up of the letters a through z and A through Z and the digits 0 through 9:
  Identifier := {Letter | Digit}+
  Letter := ['a' | 'b' | … | 'y' | 'z' | 'A' | 'B' | … | 'Y' | 'Z']
  Digit := ['1' | '2' | … | '9' | '0']
- The URL shall follow the pattern: <identifier>.<identifier>.[edu|com|org|net|co|me]
- The descriptor shall follow the syntax: identifier {' ' identifier}*
- The user of Cyberminer shall be able to insert a URL and its corresponding descriptor by clicking a submit button, which saves the information into the system.
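The URL and descriptor grammar above maps directly onto regular expressions. The following is a minimal sketch of such a check, not the project's actual code; the class name `EntryValidator` and its method names are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical validator for Cyberminer entries; all names are illustrative. */
final class EntryValidator {
    // Identifier := {Letter | Digit}+
    private static final String IDENTIFIER = "[A-Za-z0-9]+";

    // URL := <identifier>.<identifier>.[edu|com|org|net|co|me]
    private static final Pattern URL = Pattern.compile(
            IDENTIFIER + "\\." + IDENTIFIER + "\\.(edu|com|org|net|co|me)");

    // Descriptor := identifier {' ' identifier}*
    private static final Pattern DESCRIPTOR = Pattern.compile(
            IDENTIFIER + "( " + IDENTIFIER + ")*");

    private EntryValidator() {}

    /** True only if the whole string matches the three-part URL pattern. */
    static boolean isValidUrl(String url) {
        return url != null && URL.matcher(url).matches();
    }

    /** True only if the descriptor is identifiers separated by single spaces. */
    static boolean isValidDescriptor(String descriptor) {
        return descriptor != null && DESCRIPTOR.matcher(descriptor).matches();
    }
}
```

Read literally, the grammar requires exactly three dot-separated parts, so `www.utdallas.edu` passes while `utdallas.edu` does not. A descriptor containing symbols, such as `C++ basics`, is rejected.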

9 Functional Requirements (contd.)
- The KWIC system of Cyberminer shall take the descriptor as the input for indexing the words.
- Each line shall then be circularly shifted by repeatedly removing the first word and appending it to the end of the line.
- The KWIC index system shall remove noise words such as “a”, “the”, and “of” from all circularly shifted lines.
- The KWIC index system shall then sort the noise-eliminated lines in ascending alphabetical order.
- The system shall store the input as is, i.e. it shall be case sensitive for input and retrieval.
- The system shall store the final alphabetized lines along with their corresponding URL in the database.
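The shift, noise-elimination and sort steps above can be sketched as follows. This is an illustration under stated assumptions, not the presenters' implementation: the class `KwicSketch` is hypothetical, noise elimination is assumed to mean dropping every shift whose leading word is a noise word (matched ignoring case), and "ascending alphabetical order" is assumed to be Java's natural, case-sensitive `String` order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical KWIC pipeline: circular shift, noise elimination, sort. */
final class KwicSketch {
    // Noise words named in the requirements; a real list would be longer.
    private static final Set<String> NOISE_WORDS = Set.of("a", "the", "of");

    private KwicSketch() {}

    /** All circular shifts of a line: the first word moves to the end, repeatedly. */
    static List<String> circularShifts(String line) {
        String[] words = line.trim().split("\\s+");
        List<String> shifts = new ArrayList<>();
        if (words.length == 1 && words[0].isEmpty()) {
            return shifts; // blank input has no shifts
        }
        for (int start = 0; start < words.length; start++) {
            StringBuilder shifted = new StringBuilder();
            for (int i = 0; i < words.length; i++) {
                if (i > 0) {
                    shifted.append(' ');
                }
                shifted.append(words[(start + i) % words.length]);
            }
            shifts.add(shifted.toString());
        }
        return shifts;
    }

    /** Assumption: a shift is "noise" when its first word is a noise word. */
    static boolean startsWithNoiseWord(String shift) {
        String first = shift.split(" ", 2)[0];
        return NOISE_WORDS.contains(first.toLowerCase(Locale.ROOT));
    }

    /** Shifts the descriptor, drops noise-led shifts, and sorts what remains. */
    static List<String> index(String descriptor) {
        List<String> lines = new ArrayList<>();
        for (String shift : circularShifts(descriptor)) {
            if (!startsWithNoiseWord(shift)) {
                lines.add(shift);
            }
        }
        Collections.sort(lines); // natural order: uppercase sorts before lowercase
        return lines;
    }
}
```

For the descriptor `the art of programming`, this sketch keeps only `art of programming the` and `programming the art of`, in that order. Because the sort is case sensitive, `Zebra apple` sorts before `apple Zebra`.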

10 Functional Requirements (contd.)
- The user shall be able to search by providing the complete descriptor or part of it (i.e. one or more identifiers) to retrieve the URL along with its original descriptor.
- The Cyberminer system shall perform case-sensitive search, i.e. it shall match the input as stored.
- The Cyberminer system shall display the URL and descriptor as part of the search results.

11 Functional Requirements (contd.)
- The system shall allow the user to choose the mode of search: “OR”, “AND” or “NOT”.
- The user shall perform an OR search by separating keywords with spaces in the search input box.
- The user shall perform an AND search by placing the “&&” symbol between keywords in the search input box.
- The user shall perform a NOT search by placing the “!” symbol before the keyword in the search input box.
- The Cyberminer system shall be able to run concurrently, i.e. support multiple instantiation.
- The Cyberminer system shall support deletion of out-of-date URLs and their corresponding descriptions from the database.
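A small matcher shows one way the three search modes could combine. The slides do not define operator precedence, so this sketch assumes that `&&` binds tighter than a space (OR), that `!` negates only the keyword it prefixes, and that keywords match whole descriptor words case-sensitively. The class `QueryMatcher` is hypothetical; the slides indicate the real system delegates search to an Elastic Search module.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical matcher for the OR (space), AND ("&&") and NOT ("!") modes. */
final class QueryMatcher {
    private QueryMatcher() {}

    /** True if the descriptor satisfies at least one space-separated group. */
    static boolean matches(String query, String descriptor) {
        Set<String> words =
                new HashSet<>(Arrays.asList(descriptor.trim().split("\\s+")));
        // Allow "a && b" as well as "a&&b" by removing spaces around "&&".
        String normalized = query.trim().replaceAll("\\s*&&\\s*", "&&");
        if (normalized.isEmpty()) {
            return false;
        }
        for (String group : normalized.split("\\s+")) {
            if (allTermsHold(group, words)) {
                return true; // OR: one satisfied group is enough
            }
        }
        return false;
    }

    /** AND: every "&&"-separated term in the group must hold. */
    private static boolean allTermsHold(String group, Set<String> words) {
        for (String term : group.split("&&")) {
            boolean negated = term.startsWith("!");
            String keyword = negated ? term.substring(1) : term;
            if (keyword.isEmpty()) {
                return false; // malformed term such as "!" or a leading "&&"
            }
            // A plain term needs the word present; a negated term needs it absent.
            if (words.contains(keyword) == negated) {
                return false;
            }
        }
        return true;
    }
}
```

Against the descriptor `KWIC based search`, the query `search engine` matches (OR), `search && engine` does not (AND), and `KWIC && !engine` does (AND with NOT). The query `Search` does not match because comparison is case sensitive.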

12 Functional Requirements (contd.)
- The Cyberminer system shall list the query results in ascending or descending alphabetical order, or in most or least frequently accessed order.
- The Cyberminer system shall allow setting the number of results to show per page, and navigation between pages.
- The Cyberminer system shall support autofill at the user interface, using the descriptions already stored in the database.
- The Cyberminer system shall filter out symbols that are not meaningful, according to the user configuration.
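The results-per-page requirement comes down to slicing an ordered result list. Below is a minimal sketch with a hypothetical `Pager` helper, assuming 1-based page numbers; it is not taken from the project.

```java
import java.util.List;

/** Hypothetical paging helper for an already ordered result list. */
final class Pager {
    private Pager() {}

    /** Returns the given 1-based page, or an empty list past the last page. */
    static <T> List<T> page(List<T> results, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        // long arithmetic avoids int overflow for very large page numbers
        long from = (long) (pageNumber - 1) * pageSize;
        if (from >= results.size()) {
            return List.of();
        }
        int to = (int) Math.min(from + pageSize, results.size());
        return List.copyOf(results.subList((int) from, to));
    }

    /** Number of pages needed to show totalResults items. */
    static int pageCount(int totalResults, int pageSize) {
        if (totalResults < 0 || pageSize < 1) {
            throw new IllegalArgumentException("invalid totalResults or pageSize");
        }
        return (int) (((long) totalResults + pageSize - 1) / pageSize);
    }
}
```

With five results and a page size of two, page 2 holds the third and fourth results, page 3 holds the fifth, and `pageCount` returns 3.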

13 Non Functional Requirements
Enhanceability: The Cyberminer system shall be enhanceable and support changes over time.
- Issue: What kind of changes are anticipated?
- Resolution: To be able to move from Phase 1 to Phase 2, the KWIC system shall be modular enough to fit into a larger system comprising a backend database and a rich set of user capabilities in the user interface.
Understandability: The UI shall be user friendly, i.e. understandable by the end user.
- Issue: Should the interface come with a user document, or should the interface itself be intuitive enough to remove the need for one?
- Resolution: The interface shall be built to be as intuitive as possible, so that the end user spends minimal time getting familiar with the controls presented.

14 Non Functional Requirements (contd.)
Portability: The system shall be platform independent, making it easily deployable on all platforms.
- Issue: Is the system expected to execute on all host operating system platforms? Or is it expected to support multiple browsers on the same host machine?
- Resolution: The system shall be deployable on any host operating system and shall support any browser, such as Internet Explorer, Chrome, Mozilla Firefox or Safari. Developing the tool in the J2EE framework is a good fit for this requirement.
Scalability: The system shall be scalable, i.e. able to process large amounts of data without major changes to the system.
- Issue: How many data stores and retrievals are projected? The projected statistics will affect the choice of backend database system.
- Resolution: Given the large amount of data that might eventually be stored in the database, the best option is to use a distributed database, which can handle extreme data activity by end users.

15 Non Functional Requirements (contd.)
Reusability: Reusable code is a major requirement in the current industry scenario.
- Issue: How to identify which components might be candidates for reuse in other contexts?
- Resolution: To meet this requirement, an OOP architecture shall be used, paving the way for a modular design of the system. This potentially enables all of the components used in the system to be reused in other contexts. It also helps achieve enhanceability of the system.
Reliability: The system shall be reliable, i.e. it shall provide the same set of controls and output to the user every single time.
- Issue: Additions to and retrievals from the database have to be mutually exclusive; otherwise they might produce conflicting results.
- Resolution: The database system shall be chosen such that it supports atomic inserts and retrievals at the lowest level, which helps achieve this requirement.

16 Non Functional Requirements (contd.)
Performance: The system shall provide good performance from the user’s perspective.
- Issue: What is considered good performance? The requirement is vague.
- Resolution: Specific performance numbers are to be considered right from the design phase, so that the overall system achieves the target numbers.
  - Insert: shall take less than a second for a single element insertion.
  - Delete: shall take about two seconds to delete and to display the deleted elements.
  - Search: shall take less than a second to display one page of results.
Robustness: The system shall be robust enough to handle all sorts of erroneous scenarios and report the corresponding errors.
- Issue: What about all the invalid scenarios? How should those be handled?
- Resolution: Right from the design phase, equal importance shall be given to identifying all possible invalid scenarios, and the interface shall be designed to display detailed error and warning messages to the user accordingly.

17 Use Case Scenarios
Use case ID 1
- Primary Actor: User
- Main Flow: User enters a URL and a description in the input boxes, and then clicks on “ADD URL”. This input is passed to the KWIC system for circular shift and alphabetizer processing, and the output is stored in the database.
Use case ID 2
- Primary Actor: User
- Main Flow: User enters an invalid URL. This input is processed and the error message “Input should be a valid URL” shall be shown.
Use case ID 3
- Primary Actor: User
- Secondary Actor: KWIC system
- Main Flow: User enters input with special characters (#, $, %, *, etc.) in the input box. This input is processed and the error message “Input Should not contain special characters” shall be shown.

18 Use Case Scenarios (contd.)
Use case ID 4
- Primary Actor: User
- Secondary Actor: Cyberminer system
- Main Flow: User enters the set of words to be filtered out of the output, via the User Config menu.
Use case ID 5
- Primary Actor: User
- Secondary Actor: Cyberminer system
- Main Flow: User enters the search query with an AND, OR or NOT specifier. The Cyberminer system sends a search query to the Elastic Search module, retrieves the relevant entries and displays them to the user.
Use case ID 6
- Primary Actor: User
- Secondary Actor: KWIC system
- Main Flow: User enters the specific entry to be deleted from the database. The Cyberminer system creates a search query accordingly and deletes the entry if it is found in the database. Otherwise, it returns an error to the user.

19 Use Case Diagram

20 Traceability Matrix
Columns: Non-Functional Requirements (section) | Use Case | Functional Requirement
- Understandability, Performance (2.4.2, 2.4.7) | 1 | User shall be able to add URL and Description
- Robustness (2.4.8) | 2 | KWIC system of Cyberminer will accept input as ordered set of lines
- (none listed) | 3 | System will not allow user to enter number as an input
- (none listed) | 5 | User can perform different search operations
- (none listed) | 6 | User shall be able to delete the outdated URL
- Understandability, Performance, Robustness, Scalability (2.4.2, 2.4.8, 2.4.6, 2.4.7) | (none listed) | User shall be able to search URL and Description
- Understandability, Performance (2.4.2, 2.4.7) | (none listed) | System displays URL and Description as part of search result
- Performance (2.4.7) | (none listed) | System displays output in ascending order

21 How ?

22 Chosen Architecture: Cyberminer System Architecture (MVC)
- The View comprises the GUI-specific components of Cyberminer. It enables users to add, delete and search for entries in the backend.
- The Control layer consists of a delegate which forwards transactions from the View to the corresponding element in the Model layer.
- The Model layer consists of individual processing elements for add, delete, search and user configuration.
- Control interacts with the previously defined KWIC system as part of the Insert transaction and passes the alphabetized output on to the Model. The Model, in turn, stores it in the Elastic database.
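The insert path described for this architecture (Control calls the KWIC system, then hands the alphabetized lines to the Model) can be sketched in a few types. Every name here is hypothetical, and an in-memory map stands in for the Elastic database; the sketch only illustrates the direction of the dependencies, not the project's classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Facade over the KWIC system: Control sees only this single operation. */
interface KwicFacade {
    List<String> index(String descriptor);
}

/** Model: stores indexed lines per URL. The map is a stand-in for the database. */
final class EntryModel {
    private final Map<String, List<String>> linesByUrl = new LinkedHashMap<>();

    void save(String url, List<String> indexedLines) {
        linesByUrl.put(url, new ArrayList<>(indexedLines));
    }

    /** Returns true if an entry existed and was removed. */
    boolean delete(String url) {
        return linesByUrl.remove(url) != null;
    }

    List<String> linesFor(String url) {
        return linesByUrl.getOrDefault(url, List.of());
    }
}

/** Control: forwards View requests; only the insert path touches KWIC. */
final class EntryController {
    private final KwicFacade kwic;
    private final EntryModel model;

    EntryController(KwicFacade kwic, EntryModel model) {
        this.kwic = kwic;
        this.model = model;
    }

    /** Insert transaction: index through KWIC, then hand the result to the Model. */
    void add(String url, String descriptor) {
        model.save(url, kwic.index(descriptor));
    }

    boolean delete(String url) {
        return model.delete(url);
    }
}
```

Because the Model never references `KwicFacade`, a different KWIC implementation can be swapped in by passing another instance to the controller, which is the decoupling the alternative architectures give up.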

23 Cyberminer System – Class Diagram

24 Patterns Used
- MVC Pattern: The Cyberminer System
- Mediator Pattern: Single Servlet
- Façade Pattern: The KWIC System

25 Alternate Architecture Style #1
Here, the KWIC system is included within the Model layer. The Model-specific components interact directly with the KWIC system, without any external interfaces.
Advantages:
- Performance of KWIC system access is good.
Disadvantages:
- High coupling of the KWIC system with the overall system.
- High degree of dependency on the KWIC system APIs within the Model.
- The KWIC system is not replaceable with a different one.

26 Alternate Architecture Style #2
Here, the Model layer interfaces directly with the KWIC system, instead of Control interfacing with the KWIC system.
Advantages:
- The Control layer is simple; all the processing happens in the Model.
Disadvantages:
- The Model layer is tightly coupled with the KWIC system.
- The Model does more than interface with the Elastic database, which is against the MVC guidelines.
- The Model layer is no longer reusable and is bulky.
- Performance of the Model layer is adversely affected, and hence so is that of the Cyberminer system.

27 Alternate Architecture Style #3
Here, the KWIC system is interfaced with both the Control layer and the Model layer. In addition, the KWIC system interfaces with the Elastic database.
Advantages:
- High-performance inserts into and retrievals from the database.
Disadvantages:
- The KWIC system is tightly coupled with the Control and Model layers, and with the Elastic database. This affects reusability of the system modules.
- Portability and scalability of the Cyberminer system are also affected.

28 Comparing all the Options
Columns: Chosen Architecture | Arch Alternative #1 | Arch Alternative #2 | Arch Alternative #3
Rows: Modifiability (Algorithms, Data Representation), Enhanceability, Performance (Space, Time), Reusability
Ratings surviving in the transcript: Very Good, Good, Not Good, Bad (their cell positions are not recoverable)

29 Why?

30 Why?
Why Object-Oriented (or Abstract Data Type) style for the KWIC system?
This style gives strong support to modifiability and scalability, which are the prime non-functional requirements.
- Modifiability:
  - Change in processing algorithm: if we change the processing in one algorithm, it will not affect the others.
  - Change in data representation.
- Scalability: There is no constraint on the number of lines processed and stored in the database.
Why MVC architecture for the Cyberminer system?
- It gives strong support for a user-interface-based system.
- It separates business logic from presentation logic.
- Different views for different users can be provided with ease.
- It gives strong support for reusability and modifiability.
- It enables good system maintenance at low cost.
- It uses the Observer design pattern to notify the Model elements whenever there is any change in the input/output.

31 Let us have a short demo!

32 Our Unique Selling Point
- Chose the best architectural style for the given non-functional requirements.
- Came up with several alternate architectural styles and chose the best of them by performing trade-off analysis.
- Used MVC, which separates business logic from presentation logic. This lets us provide different views for different users, and hence increases reusability and decreases complexity.
- One of the patterns used in MVC is the behavioral Observer design pattern, in which an object that changes automatically notifies the remaining objects without having knowledge of them.

33 Thank you. Any questions?

