CIS 895 – MSE Project KDD-Research Entity Search Tool (KREST) Presentation 1 Eric Davis

Slides:



Advertisements
Similar presentations
“The Honeywell Web-based Corrective Action Solution”
Advertisements

Airline Reservation System
A Toolbox for Blackboard Tim Roberts
Sustainable Grading Ralph Westfall, Ph.D. April 2010
CS 325: Software Engineering April 7, 2015 Software Configuration Management Task Scheduling & Prioritization Reporting Project Progress Configuration.
Online School Registration System Solomon Ng Pei-Yu Wang Evan Chiu Curtis Wong.
Java Chapter 22 - Student. Why Java? ADVANTAGESDISADVANTAGES Has _____________ capabilities__________ ( times) than languages compiled directly.
CS 337 Final Project Presentation Asset Management and Tracking Developers: –Jimmy Hoo –Edwin Panameno –Manuel Segura –Sheng-Tian Lin Customers –Alexandre.
Effort in hours Duration Over Weeks Or Months Inception Launch Web Lifecycle Methodology Maintenance Phases Copyright Wonderlane Studios.
Introduction to Software Testing
AVERSTAR GROUP September 6, 2001NASA Software IV&V Facility1 SIAT C++ CSIP Presentation.
Working with SharePoint Document Libraries. What are document libraries? Document libraries are collections of files that you can share with team members.
Toll Free: Project Manager Tutorial.
Databases & Data Warehouses Chapter 3 Database Processing.
Karl Banks Aaron Birencwaig Andrew Harmic Jason Heintz Stephen Rodriguez Tyler Zaino.
BRUE Behavioral Reverse Engineering in UML as Eclipse Plugin MSE Presentation 1 Sri Raguraman.
New Vision Concept School Portal
CIS 895 – MSE P ROJECT Cooperative Robotics Organization Simulator (CROS) Presentation 1 on Mar 1 st, 2011 Daniel Christopher Greene 1.
Software Engineering Modern Approaches
DE&T (QuickVic) Reporting Software Overview Term
PowerPoint 2003 – Level 1 Computer Concepts Cathy Horwitz April 25, 2011.
Bogor-Java Environment for Eclipse MSE Presentation II Yong Peng.
AgentTool (III) Dynamic MSE Presentation 1 Binti Sepaha.
Class Instructor Name Date. Classroom Tips Class Roster – Please Sign In Class Roster – Please Sign In Internet Usage Internet Usage –Breaks and Lunch.
1 ECE 453 – CS 447 – SE 465 Software Testing & Quality Assurance Lecture 22 Instructor Paulo Alencar.
ELP Helper MSE Project Presentation I Aghsan Ahmad Major Professor: Dr. Bill Hankley.
XP New Perspectives on Browser and Basics Tutorial 1 1 Browser and Basics Tutorial 1.
CIS 895 – MSE Project KDD-Research Entity Search Tool (KREST) Presentation 2 Eric Davis
Multi-agent Research Tool (MART) A proposal for MSE project Madhukar Kumar.
Nutch in a Nutshell (part I) Presented by Liew Guo Min Zhao Jin.
Practical Project of the 2006 Joint International Master’s Degree.
Konza PrairieKonza Prairie Long-Term Ecological Research (LTER)LTER Henry Mikhail.
Mastergoal Machine Learning Environment Phase 1 Completion Assessment MSE Project Kansas State University Alejandro Alliana.
Key Applications Module Lesson 21 — Access Essentials
Student Curriculum Planning System MSE Project Presentation I Kevin Sung.
Authentication Training Guide 1 The Red Flag Ruling requires automotive dealerships to detect red flags that are applicable to their operation. After.
REAL TIME GPS TRACKING SYSTEM MSE PROJECT PHASE I PRESENTATION Bakor Kamal CIS 895.
Environment Model Building Tool MSE Presentation 1 Esteban Guillen.
MSE Presentation 1 By Padmaja Havaldar- Graduate Student Under the guidance of Dr. Daniel Andresen – Major Advisor Dr. Scott Deloach-Committee Member Dr.
Enhancing Forms with OLE Fields, Hyperlinks, and Subforms – Project 5.
Self-assembling Agent System Presentation 1 Donald Lee.
Natural Language to Machine Readable Format By: Damian Tamayo Presentation 1 – Oct. 12, 2009 CIS 895 – MSE Project.
Software Architecture in Practice Practical Exercise in Performance Engineering.
Graphical User Interface and Job Distribution Optimizer for a Virtual Pipeline Simulation Testbed Walamitien Oyenan October 8, 2003 MSE Presentation 1.
Evaluating & Maintaining a Site Domain 6. Conduct Technical Tests Dreamweaver provides many tools to assist in finalizing and testing your website for.
CIS 895 – MSE Project KDD-Research Entity Search Tool (KREST) Presentation 3 Eric Davis
ACIS Introduction to Data Analytics & Business Intelligence Database s Benefits & Components.
Function Points Synthetic measure of program size used to estimate size early in the project Easier (than lines of code) to calculate from requirements.
MSE Portfolio Presentation 1 Doug Smith November 13, 2008
MSE Presentation 1 Lakshmikanth Ganti
Department of Computing and Information Sciences MSE Project Presentation 1 A Three-tier On-line Model For Transaction- based Applications Using VB.NET.
The World Wide Web. What is the worldwide web? The content of the worldwide web is held on individual pages which are gathered together to form websites.
Communication Model for Cooperative Robotics Simulator MSE Presentation 1 Acharaporn Pattaravanichanon.
Kansas State University Purchasing Contracts Management System (KSU – PCMS) Presentation 1 Date : 14 th October 2010 By Arthi Subramanian CIS 895 – MSE.
SPI NIGHTLIES Alex Hodgkins. SPI nightlies  Build and test various software projects each night  Provide a nightlies summary page that displays all.
T Project Review MalliPerhe Iteration 3 Implementation
GROUP PresentsPresents. WEB CRAWLER A visualization of links in the World Wide Web Software Engineering C Semester Two Massey University - Palmerston.
3M Partners and Suppliers Click to edit Master title style USER GUIDE Supplier eInvoicing USER GUIDE The 3M beX environment: Day-to-day use.
Critical Step Web Site Presentation. Ing. Carlo Sanghez Consorzio Interuniversitario Nazionale per l'Informatica Napoli,
Informed Traveler Program and Applications Agile / Scrum Overview Jerry Inberg.
بشرا رجائی برآورد هزینه نرم افزار.
Bogor-Java Environment for Eclipse
 Corpus Formation [CFT]  Web Pages Annotation [Web Annotator]  Web sites detection [NEACrawler]  Web pages collection [NEAC]  IE Remote.
System Design Ashima Wadhwa.
Hi and welcome to the Order Centre – Ordering training.
Introduction to Software Testing
COCOMO Models.
Honeywell Safety Products Pricing Portal Customer Guide
Jincheng Gao CIS895 – MSE Project
COCOMO MODEL.
Presentation transcript:

CIS 895 – MSE Project KDD-Research Entity Search Tool (KREST) Presentation 1 Eric Davis

Outline  Project Overview  Project Requirements  Project Schedule  Cost Estimation  Software Quality Assurance Plan  Prototype Demonstration  Questions / Comments

Project Overview  Goal To develop an application which allows web crawling, web searching and entity searching To reproduce the results from the entity search work of Kevin Chang and Tao Cheng at UIUC  Motivation Improve upon current web searches by making it easier to find contact information Results up front No more sifting through results pages for contact info By providing self-contained application, searching done on local machine rather than server

Project Overview (cont.)  Primary Focus: Entity Search Search for contact information Works like a database query from user’s perspective Breaks apart web pages to extract the entities Sample Queries  Kansas State University database professor #  Amazon Customer Service #phone Provide contact information + back-links

Project Overview (cont.)

Project Requirements  Requirements Broken into 4 sections Application Requirements Loading / Saving Data General System Level Requirements Web Crawling Requirements Web Searching Requirements Entity Searching Requirements  Each section has it’s own identifier + numbering system  Each requirement has an associated build release noted in the Vision Plan  Screenshot before each section

Application Screenshot

Application Requirements  ARI The program shall provide a GUI for user interaction  ARI The application shall be executable in a single step (e.g. without having to perform any setup steps)  ARI The application shall have a menu bar that contains at a minimum: a File menu and a Help menu  ARI The application shall allow the user to load a data set of web pages  ARI The application shall allow the user to save entity search results  ARI The application's Help menu shall contain at a minimum an About menu item  ARI The application's menu bar shall contain shortcut keys  ARI The application shall be platform independent  ARI The application shall be able to be minimized  ARI The application shall be able to be closed without having to perform a Control-C from the command line Bolded requirements represent Critical Project Requirements

Web Crawl Screenshot

Web Crawling Requirements  WCRI The user shall have the ability to perform a web crawl based on a starting website  WCRI The user shall be allowed to specify the starting website (if none is specified, will be used)  WCRI The user shall have the ability to specify the number of back-links required for a website to be maintained in the final list  WCRI The user shall have the ability to specify a log file in which to save the results of the crawl  WCRI The user shall be allowed to specify the maximum number of websites to crawl before stopping  WCRI The user shall be allowed to stop the crawl at any time before it finishes  WCRI The user shall be notified when the crawl is complete  WCRI The user shall be kept apprised of the total number of pages left to crawl  WCRI The user shall be kept apprised of the total number of pages crawled  WCRI The crawler shall follow the robot exclusionary protocol  WCRI The crawler shall use multiple threads to avoid putting too much stress on an individual web host  WCRI The user shall have an option to search only within the specified domain Bolded Requirements represent Critical Project Requirements

Web Search Screenshot

Web Search Requirements  WSRI The user shall be allowed to search over previously crawled web pages  WSRI The user shall have a box to enter search terms  WSRI The user shall be allowed to specify the minimum number of back- links required for a page containing the search term to be considered a match  WSRI The URLs that match the search terms shall be sorted in order of number of back-links  WSRI The URLs that match the search terms shall be displayed in a scrollable text box Bolded requirements represent Critical Project Requirements

Entity Search Screenshot

Entity Search Requirements  ESRI The user shall have the ability to search for entities from previously crawled websites  ESRI The user shall have a box to enter search terms  ESRI There shall entities for at a minimum: address, phone number, fax number, street address, and zip code  ESRI There shall be an overarching entity that gathers all contact info  ESRI The entity search results shall be ranked based on highest score  ESRI The user shall be allowed to specify search terms in addition to entity terms  ESRI The entities that match the search terms shall be displayed in a scrollable text box Bolded requirements represent Critical Project Requirements

Project Schedule  Key Dates Presentation 1: November 13 Presentation 2: February 15 Presentation 3:April 25

Project Schedule (cont).

Cost Estimation Formulas  Intermediate COCOMO Important Formulas: Effort = 3.2 * EAF * (KLOC) 1.05  EAF represents Effort Adjustment Factor  KLOC represents Source Lines of Code (in thousands) Time (in months) = 2.5 * (Effort) 0.38

Effort Adjustment Factors IdentifierClassificationValueReasoning RELYLow0.88Project is not safety critical, and does not have to be completely reliable DATAHigh1.08A large # of web pages are needed in order to perform a thorough search CPLXNominal1.00Web crawling, Web Search, and Entity Search are not overly complicated TIMENominal1.00Response time is important yet not overly critical STORVery High1.21Crawling and searching will require a lot of memory usage VIRTLow0.87Low complexity of the hardware and software TURNLow0.87Since this is a single developer project, the turnaround time is low ACAPHigh0.86Developer has 4+ years experience in software engineering AEXPHigh0.91Developer has 3+ years experience in applications development PCAPHigh0.86Developer has applicable experience VEXPNominal1.00Developer has 2+ years experience developing for Java virtual machine LEXPHigh0.95Developer has 2+ years experience developing using Java TOOLNominal1.00Moderate experience with tools being used MODPVery High0.83Developer has 4+ years exp. using modern software engineering practices SCEDNominal1.00Project has a tight schedule, but some slippage is allowable

Cost Estimation Calculations  KLOC estimated at: 2 Based on other available web crawlers + searchers  EAF = 0.95  Effort = 6.29 Staff-months  Time = 5.04 Chronological months Result: Project should be able to be accomplished within 2 semesters.

Software Quality Assurance Plan  References Vision Plan Project Plan IEEE Standard for Software Quality Assurance Planning IEEE Guide for Software Quality Assurance Planning  Supervisory Committee Dr. Scott DeLoach Dr. David Gustafson Dr. William Hsu  Major Professor Dr. William Hsu  Developer Eric Davis  Formal Technical Inspectors TBD

Software Quality Assurance Plan (cont).  Documentation A listing of the required documentation is available at: Project Documentation will be available at:

Software Quality Assurance Plan (cont).  Standards, Practices, Conventions & Metrics Documentation – IEEE standards will be followed for all applicable documentation Coding – Java naming conventions + Javadoc will be used Metrics – COCOMO will be used to measure project effort  Reviews & Audits Supervisory committee will review all documentation at each milestone TBD Formal Technical Inspectors will review the architecture before the second presentation

Software Quality Assurance Plan (cont).  Testing Defined in Software Test Plan Will be available by Presentation 2  Problem Reporting Issues will be tracked in a spreadsheet All issues will be reported to Major Professor

Software Quality Assurance Plan (cont).  Tools, Technologies, & Methodologies Eclipse IDE – for software development Eclipse FatJar – for building executable JAR files Eclipse Jigloo Plugin – for GUI development Microsoft Word – for documentation development Microsoft Excel – for risk and problem report tracking and time logs Microsoft Powerpoint – for project presentation creation Adobe Acrobat – for document conversion to PDF Microsoft Project – for project planning Microsoft Visio – for software design development USE – for developing formal specifications

Software Quality Assurance Plan (cont).  Code & Media Control CVS will be used for source code control Repository is available at: Change logs will be maintained for all documents Versions will be maintained on the developer’s computer

Prototype Demonstration

Phase 2 Deliverables  Vision Plan 2.0  Project Plan 2.0  Architectural Design Document  Software Test Plan 1.0  Technical Inspection List  Presentation 2  Prototype 2.0 Source Code

Current Obstacles / Questions  Technical Inspectors Two are still needed  Masters Degree Final Examination?

Questions / Comments