PROJECT OMNIGLEAN
Team Members: Kenny Trytek, Derek Woods, Abby Birkett, Joe Briggie
Advisor: Simanta Mitra
Client: Kingland Systems
PROBLEM STATEMENT
Large companies have many layers of corporate hierarchy.
Financial and data records sometimes conflict between the various layers and entities.
Accurate and comprehensive company records are needed.
This calls for “data mastering”: taking multiple conflicting sources of data and determining the reality of the matter in conflict.
CONCEPT SKETCH
[Figure 2.1.1: Concept Sketch. Elements shown: Client User, Analyst User, Omniglean, Internet; Omniglean harvests and stores data.]
FUNCTIONAL REQUIREMENTS
System shall autonomously traverse publicly available websites
System shall parse information from downloaded files in Portable Document Format (PDF)
System shall store parsed information in a flat file
System shall allow users to create, modify, and delete records
System shall maintain a normalized database
System shall expose functionality through web services
NON-FUNCTIONAL REQUIREMENTS
System shall support up to 250 concurrent users
A single run of the system shall complete execution in less than six hours
System shall be easily extensible to include more websites than originally specified
System shall be completed by May of 2011
CONSTRAINTS AND TECHNICAL CONSIDERATIONS
System shall interact with a third-party library to facilitate database interaction
The database may not be available at all times
Using SVN to manage code
Using the spiral design process
MARKET SURVEY
Omniglean provides a unique combination of access to freely available FDIC and FFIEC data through a data mastering suite.
Omniglean provides access to the mastering capability through web services, enabling rapid delivery of functionality to customers of Kingland Systems as well as to analysts located in different geographical areas.
POTENTIAL RISKS AND MITIGATION
External format change
External availability
New technologies
Web services
Not enough time for testing and debugging
Team members
COST AND RESOURCE ESTIMATE
Item                     Cost
Reporting
  Poster Materials       $50.00
  Report Materials       $50.00
Labor ($20.00/hr)
  Kenny Trytek           $3,560
  Abby Birkett           $3,220
  Joe Briggie            $3,160
  Derek Woods            $3,360
Total                    $13,100
PROJECT MILESTONES
Complete modules related to harvesting and transforming the data
Complete web services and user interface modules
Integrate all modules successfully
FUNCTIONAL DECOMPOSITION
Harvester – Gathers data
ETL – Transforms data
DAL – Database access layer
Web services – Exposes data to external users
User interfaces
SYSTEM DIAGRAM
[System diagram. Components shown: WWW, Data Scraper Tool, HTML Parser, PDF Parser, Flat File, ETL Tool, Database (Normalized Kingland Data), DAL (Create, Read, Update, Delete), Web Svcs., Analyst UI, External Client UI. Decision point: No Conflicts?]
HARVESTER
[Diagram components: World Wide Web, Scraper, Parser (PDF Parser, HTML Parser), Flat File (XML)]
startGatheringData() - Returns the XML document populated with data from the site.
getLogFile() - Returns the log file that is either being written to or has been written to this session.
stopGatheringData() - Stops all current harvester operations and writes an error to the log file.
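
The three harvester operations above can be sketched as follows. This is only an illustrative Python sketch, not the project's implementation: the `Harvester` class, the `fetch_records` callable (standing in for the scraper and parsers), the `records`/`record` element names, and the in-memory log are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


class Harvester:
    """Hypothetical harvester mirroring the three operations on the slide."""

    def __init__(self, fetch_records):
        # fetch_records is an assumed callable standing in for the scraper
        # and parsers; it returns an iterable of dicts (field name -> value).
        self._fetch_records = fetch_records
        self._log_lines = []      # in-memory stand-in for the session log file
        self._stopped = False

    def start_gathering_data(self):
        """Return an XML document populated with the harvested records."""
        root = ET.Element("records")
        for record in self._fetch_records():
            if self._stopped:
                break             # honor a stop request between records
            node = ET.SubElement(root, "record")
            for name, value in record.items():
                ET.SubElement(node, name).text = str(value)
        self._log_lines.append(f"INFO harvested {len(root)} record(s)")
        return ET.ElementTree(root)

    def get_log_file(self):
        """Return the log text written so far in this session."""
        return "\n".join(self._log_lines)

    def stop_gathering_data(self):
        """Stop harvesting and write an error entry to the log."""
        self._stopped = True
        self._log_lines.append("ERROR harvesting stopped on request")
```

A caller would pass in whatever produces the parsed records, for example `Harvester(lambda: [{"name": "First Bank"}]).start_gathering_data()`, and hand the resulting XML document to the ETL tool.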
ETL TOOL
[Diagram components: Flat File (XML), ETL Tool, DAL]
loadFFIEC() - Loads the data from the XML file into the FFIEC table.
loadFDIC() - Loads the data from the XML file into the FDIC table.
createORGANIZATION() - Takes the information from both the FFIEC table and the FDIC table and puts it into the ORGANIZATION table.
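
A rough sketch of the three ETL steps follows, in Python with an in-memory SQLite database standing in for SQL Server. The XML layout (`record` elements with `id` and `name` children), the two-column table schema, and the rule that the FFIEC value wins when both sources hold the same id are assumptions made purely for illustration; the actual conflict handling is not specified here.

```python
import sqlite3
import xml.etree.ElementTree as ET


def _load(conn, table, xml_text):
    # table is always one of the two fixed names below, never user input.
    rows = [(r.findtext("id"), r.findtext("name"))
            for r in ET.fromstring(xml_text).iter("record")]
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} (id TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(f"INSERT OR REPLACE INTO {table} VALUES (?, ?)", rows)


def load_ffiec(conn, xml_text):
    """Load the records from the XML text into the FFIEC table."""
    _load(conn, "FFIEC", xml_text)


def load_fdic(conn, xml_text):
    """Load the records from the XML text into the FDIC table."""
    _load(conn, "FDIC", xml_text)


def create_organization(conn):
    """Combine FFIEC and FDIC rows into the ORGANIZATION table.

    Assumed placeholder rule: FFIEC wins when both tables share an id.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ORGANIZATION "
        "(id TEXT PRIMARY KEY, name TEXT)")
    conn.execute("""
        INSERT OR REPLACE INTO ORGANIZATION (id, name)
        SELECT id, name FROM FFIEC
        UNION
        SELECT id, name FROM FDIC
        WHERE id NOT IN (SELECT id FROM FFIEC)
    """)
```

Both load functions must run before `create_organization`, since it reads from both source tables.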
DATA ACCESS LAYER
[Diagram components: User Interface, ETL Tool, DAL, Database; operations shown: Add(), Find(), Update(), Delete()]
Organization - A class that creates and maintains a connection to the ORGANIZATION table
  GetConnection()
  CloseConnection()
OrganizationService - This class will allow CRUD functionality with an Organization object.
  Find(String organizationId)
  Add(...)
  Delete(String organizationId)
  Update(String organizationId,...)
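
The two DAL classes could look roughly like the sketch below. It is written in Python against SQLite only to keep the example self-contained; the single `name` column, the dict returned by `find`, and the boolean results of `update` and `delete` are assumptions, since the slide elides the remaining parameters.

```python
import sqlite3


class Organization:
    """Creates and maintains a connection to the ORGANIZATION table."""

    def __init__(self, database=":memory:"):
        self._database = database
        self._conn = None

    def get_connection(self):
        if self._conn is None:
            self._conn = sqlite3.connect(self._database)
            # Hypothetical two-column schema, for illustration only.
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS ORGANIZATION "
                "(id TEXT PRIMARY KEY, name TEXT)")
        return self._conn

    def close_connection(self):
        if self._conn is not None:
            self._conn.close()
            self._conn = None


class OrganizationService:
    """CRUD operations on top of an Organization connection."""

    def __init__(self, organization):
        self._conn = organization.get_connection()

    def find(self, organization_id):
        row = self._conn.execute(
            "SELECT id, name FROM ORGANIZATION WHERE id = ?",
            (organization_id,)).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}

    def add(self, organization_id, name):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO ORGANIZATION (id, name) VALUES (?, ?)",
                (organization_id, name))

    def update(self, organization_id, name):
        with self._conn:
            cur = self._conn.execute(
                "UPDATE ORGANIZATION SET name = ? WHERE id = ?",
                (name, organization_id))
        return cur.rowcount == 1  # True if a row was changed

    def delete(self, organization_id):
        with self._conn:
            cur = self._conn.execute(
                "DELETE FROM ORGANIZATION WHERE id = ?", (organization_id,))
        return cur.rowcount == 1  # True if a row was removed
```

Keeping connection handling in one class and the CRUD calls in another means the user interface and the ETL tool can share the same service without knowing how the connection is opened.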
WEB SERVICES
Allows remote users to access the database through the Internet
[Diagram: Unauthenticated and Authenticated states; operations shown: login(), logout(), Create(), Read(), Write(), Update(), Delete()]
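
One way to picture the unauthenticated and authenticated states is the in-process sketch below. It is not a SOAP service, and the split it uses (anyone may read, a login is required to create, update, or delete) is an assumption made for the example rather than something the diagram confirms. The class name, the token scheme, and the plain-text credential store are likewise hypothetical.

```python
import secrets


class OrganizationWebService:
    """Hypothetical service facade with a login gate on write operations."""

    def __init__(self, credentials):
        # username -> password; plain text only to keep the sketch short.
        self._credentials = dict(credentials)
        self._sessions = set()    # tokens of currently logged-in callers
        self._records = {}        # stand-in for calls into the DAL

    def login(self, username, password):
        if self._credentials.get(username) != password:
            raise PermissionError("invalid credentials")
        token = secrets.token_hex(16)
        self._sessions.add(token)
        return token

    def logout(self, token):
        self._sessions.discard(token)

    def _require_session(self, token):
        if token not in self._sessions:
            raise PermissionError("login required")

    def read(self, organization_id):
        # Assumed to be available without authentication.
        return self._records.get(organization_id)

    def create(self, token, organization_id, name):
        self._require_session(token)
        if organization_id in self._records:
            raise KeyError(f"{organization_id} already exists")
        self._records[organization_id] = name

    def update(self, token, organization_id, name):
        self._require_session(token)
        if organization_id not in self._records:
            raise KeyError(organization_id)
        self._records[organization_id] = name

    def delete(self, token, organization_id):
        self._require_session(token)
        del self._records[organization_id]
```

Every write path goes through `_require_session`, so a token that has been logged out is rejected the same way as one that was never issued.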
TECHNOLOGY PLATFORM
SQL Server 2008
Visual Studio 2010 development environment
Microsoft Windows operating system
WSDL and SOAP for web services
TEST PLAN
The team will test the system in three phases:
Phase 1: test the individual modules
Phase 2: test the integration of the modules
Phase 3: test the system as a whole
PROTOTYPING
We have begun prototyping.
The harvester is able to traverse the necessary websites easily.
The ETL can read in XML files.
The user interface has been mocked up.
CURRENT PROJECT STATUS
[Gantt chart: weekly columns from Sept 13-19 through Apr 25 – May 1]
Activity                        Start Date   End Date
Project Plan Presentation       09/21/10     09/27/10
Project Plan Rough Draft        09/21/10     10/05/10
Project Plan Final Draft        10/05/10     10/12/10
Prototyping                     10/05/10     12/10/10
Design Document Rough Draft     10/05/10     11/15/10
Design Document Final Draft     11/15/10     12/03/10
Testing Phase                   01/31/11     04/04/11
System Completion               12/10/10     04/04/11
End Product Documentation       03/21/11     04/04/11
Project Poster                  03/14/11     03/28/11
Project Presentation            04/25/11
Buffer                          04/11/11     04/24/11
TASK RESPONSIBILITY
Kenny Trytek – Team Leader, responsible for the harvester
Derek Woods – Developer, responsible for the ETL and UI
Abby Birkett – Developer, responsible for database and DAL
Joe Briggie – Developer, responsible for web services
PLAN FOR NEXT SEMESTER
Continue prototyping
Continue to meet with the client to be sure we are meeting expectations
Develop a more thorough test plan
QUESTIONS?