1
SOCCER DATA WEB CRAWLER
(By Team 02) B.I. 11/17/2018
2
Team members
Trupti Sardesai - Program Manager
Zhitao Zhou - Feasibility Analyst
Subessware Karunamoorthy - System Architect
Wenchen Tu - Prototyper
Qing Hu - Life Cycle Planner
Yan Zhang - Operational Concept Engineer
Pranshu Kumar - Requirements Engineer
Amir Ali Tahmasebi - Shaper
3
Outline
Team Strength and Weakness
Overall Evaluation
Operational Concept Design
Requirements
SSAD
Life Cycle Plan
Feasibility Evidence
Quality Focal Point
Test Cases
Final Product Demonstration
4
Team Weakness:
Schedule conflicts
Work overlap
Less knowledge about associated technologies
Team Strength:
Communication
Collaboration
Goal-driven
5
Overall Project Evaluation
Identified new requirements
Identified new risks as the project evolved
Developed all agreed-to win conditions
Developed the final product
6
Operational concept design
7
Current Business workflow
8
System purpose: Organizational goals
OG-1: To enable end users to gain well-informed knowledge about the players/team.
OG-2: To save time and thereby increase operational efficiency.
OG-3: To increase accessibility of real-time data/information.
9
Current Business workflow
10
Proposed Business Workflow
11
CAPABILITY GOALS
OC-1 Crawl predefined websites: The web crawler shall gather team information from the websites in the website list.
OC-2 Crawl predefined websites: The web crawler shall gather player information from the websites in the website list.
OC-3 Crawl Social Media: The web crawler shall get the comments, the name and number of members, and the likes from specified Facebook pages.
OC-4 Crawl Social Media: The web crawler shall get the number of followers, the comments, and the number of retweets for a specified Twitter account.
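As an illustration of what OC-1 and OC-2 ask for, the sketch below pulls named fields out of a page using only Python's standard library. It is not the team's crawler: the sample HTML, the CSS class names (team-name, coach, stadium) and the helper extract_team_info are all hypothetical, and a real crawler would first fetch the page over HTTP.

```python
from html.parser import HTMLParser


class TeamInfoParser(HTMLParser):
    """Collect the text of elements whose class attribute names a wanted field."""

    def __init__(self, wanted_fields):
        super().__init__()
        self.wanted_fields = set(wanted_fields)
        self.current_field = None  # field whose text we are currently inside
        self.record = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.wanted_fields:
            self.current_field = css_class

    def handle_data(self, data):
        # Store non-blank text only while inside a wanted element.
        if self.current_field and data.strip():
            self.record[self.current_field] = data.strip()

    def handle_endtag(self, tag):
        self.current_field = None


def extract_team_info(html, wanted_fields):
    """Return {field: text} for the wanted fields found in the HTML."""
    parser = TeamInfoParser(wanted_fields)
    parser.feed(html)
    return parser.record


# Hypothetical page content; a real site would use its own markup.
SAMPLE_PAGE = """
<div class="team">
  <h1 class="team-name">Example FC</h1>
  <span class="coach">A. Coach</span>
  <span class="stadium">Example Park</span>
</div>
"""

team = extract_team_info(SAMPLE_PAGE, ["team-name", "coach"])
print(team)
```

Passing the list of wanted fields as a parameter, rather than hard-coding it, is one way the "fields to capture" could later be made configurable per website.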
12
CAPABILITY GOALS
OC-5 Ingest Data: The crawler shall ingest crawled data into the PostgreSQL database.
OC-6 STBI Contractor UI: As an STBI contractor, I can update/revise the player data as the season progresses.
OC-7 STBI Contractor UI: As an STBI contractor, I can add, delete, and update the specific websites visited, the fields to capture from each website, and the frequency of crawler refreshes for each specified website.
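A minimal sketch of the ingest step in OC-5, with the later revision of a player record (OC-6) handled by the same statement. To keep it runnable without a database server, Python's built-in sqlite3 module stands in for PostgreSQL. The table layout, column names and the ingest_player helper are assumptions made for illustration, not the project's schema. The INSERT ... ON CONFLICT ... DO UPDATE form used here is also accepted by PostgreSQL 9.5 and later, though a PostgreSQL driver would use its own placeholder style.

```python
import sqlite3

# sqlite3 is a stand-in for PostgreSQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE player (
        name   TEXT NOT NULL,
        source TEXT NOT NULL,   -- website the record was crawled from
        team   TEXT,
        goals  INTEGER,
        PRIMARY KEY (name, source)
    )
    """
)


def ingest_player(conn, record):
    """Insert a crawled record, or update it if (name, source) already exists."""
    conn.execute(
        """
        INSERT INTO player (name, source, team, goals)
        VALUES (:name, :source, :team, :goals)
        ON CONFLICT (name, source)
        DO UPDATE SET team = excluded.team, goals = excluded.goals
        """,
        record,
    )
    conn.commit()


# First crawl, then a later crawl of the same player with updated data.
ingest_player(conn, {"name": "Example Player", "source": "example.com",
                     "team": "Example FC", "goals": 3})
ingest_player(conn, {"name": "Example Player", "source": "example.com",
                     "team": "Example FC", "goals": 5})

rows = conn.execute("SELECT name, source, team, goals FROM player").fetchall()
print(rows)
```

Keying the table on both name and source keeps records from different websites apart, so re-crawling one site never overwrites data gathered from another.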
13
Level of service
LOS 1 Flexibility: The system can crawl and scrape any given URL into the database.
LOS 2 Efficiency: The system can crawl and scrape Facebook and Twitter data for a player in time proportional to the number of comments and posts on the player’s account. The system can crawl and scrape a specific website in about an hour on average.
14
Requirements
15
WIN CONDITION SUCCESS
# | Capability Goals | Priority Level | Success/Fail
OC1 | Crawl predefined websites: The web crawler shall gather team information from the websites in the website list. | Must have (Agreed to) | SUCCESS
OC2 | Crawl predefined websites: The web crawler shall gather player information from the websites in the website list. | Must have (Agreed to) | SUCCESS
OC3 | Crawl Social Media: The web crawler shall get the comments, the name and number of members, and the likes from specified Facebook pages. | Must have (Agreed to) | SUCCESS
OC4 | Crawl Social Media: The web crawler shall get the number of followers, the comments, and the number of retweets for a specified Twitter account. | Must have (Agreed to) | SUCCESS
16
# | Capability Goals | Priority Level | Success/Fail
OC5 | Ingest Data: The crawler shall ingest crawled data into the PostgreSQL database. | Must have (Agreed to) | SUCCESS
OC6 | STBI Contractor UI: As an STBI contractor, I can update/revise the player data as the season progresses. | Must have (Agreed to) | SUCCESS
OC7 | STBI Contractor UI: As an STBI contractor, I can add, delete, and update the specific websites visited, the fields to capture from each website, and the frequency of crawler refreshes for each specified website. | Must have (Agreed to) | SUCCESS
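OC7 describes a small piece of configuration: a list of websites, the fields to capture from each, and a refresh frequency per website. One possible shape for that data is sketched below. The class names CrawlTarget and CrawlTargetRegistry, the example URLs and the field names are hypothetical, and the real system would persist this configuration rather than hold it in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CrawlTarget:
    url: str
    fields: List[str]                        # names of the fields to capture
    refresh_every: timedelta                 # how often the crawler revisits the site
    last_crawled: Optional[datetime] = None  # None until the first crawl


class CrawlTargetRegistry:
    """In-memory add/update/delete of crawl targets, keyed by URL."""

    def __init__(self):
        self._targets = {}

    def add(self, target):
        if target.url in self._targets:
            raise ValueError(f"{target.url} is already registered")
        self._targets[target.url] = target

    def update(self, url, **changes):
        target = self._targets[url]
        for name, value in changes.items():
            if not hasattr(target, name):
                raise AttributeError(f"unknown setting: {name}")
            setattr(target, name, value)

    def delete(self, url):
        del self._targets[url]

    def due(self, now):
        """Return URLs never crawled or whose refresh interval has elapsed."""
        return [
            t.url
            for t in self._targets.values()
            if t.last_crawled is None or now - t.last_crawled >= t.refresh_every
        ]


registry = CrawlTargetRegistry()
registry.add(CrawlTarget("https://example.com/teams", ["team-name", "coach"],
                         timedelta(days=1)))
registry.add(CrawlTarget("https://example.com/players", ["name", "goals"],
                         timedelta(days=7)))

now = datetime(2018, 11, 17, 12, 0)
# The players page was crawled two days ago, so it is not due yet.
registry.update("https://example.com/players", last_crawled=now - timedelta(days=2))
due_urls = registry.due(now)
print(due_urls)
```

A scheduler could call due() periodically and hand the returned URLs to the crawler, setting last_crawled after each successful run.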
17
# | Capability Goals | Priority Level | Success/Fail
OC8 | Crawl Social Media: The web crawler shall gather Instagram pictures, the number of likes, and the comments from a particular Instagram account. | Would like (Potentially Agree) | FAIL
OC9 | Crawl predefined websites: The web crawler shall gather videos from the pages being crawled and ingest them into STBI as is, so that the coach and fans are able to watch the relevant videos. | Would like | FAIL
OC10 | Crawl Social Media: The web crawler shall crawl YouTube to gather videos of specific players. | Would like | FAIL
18
SSAD
19
USE CASE DIAGRAM
20
Design Class Diagram
21
SEQUENCE DIAGRAM- Website
22
SEQUENCE DIAGRAM- FACEBOOK
23
LIFE CYCLE PLAN
24
ITERATION PLAN
# | Capability | Priority | Iteration
OC-1, OC-2 | Retrieve team and player data from specific website | High | 1
OC-3, OC-4 | Retrieve data from Facebook and Twitter | |
OC-5 | Storing data into Postgres database | | 2
OC-6, OC-7 | Develop user interface for the developer | |
TC-01-01, TC-02-01 | Test if the web crawler is able to gather team and player information. | |
TC-03-01 | Integration Test | | 3
25
copyrights@SporTech
26
EFFORT ESTIMATION
27
FEASIBILITY EVIDENCE
28
Activities | Time Spent (Hours)
Nonrecurring Cost |
Initial Client Meeting (1 meeting * 2 people * 1.5 hours + 1 meeting * 1 person * 1 hour) | 4
Win-Win Negotiation Meetings (2 meetings * 2 people * 2 hours + 1 meeting * 1 person * 2 hours) | 10
Communication with development team (2 people * 2 hours/week * 12 weeks) | 48
Architecture Review Board Meeting (1 meeting * 1 person * 1 hour) | 1
Weekly status update (2 people * 0.5 hours * 4 weeks) | 4
Training STBI contractors | 2
Total Time | 69
Cost (estimated at $43/hour) | $2,968
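The hour figures above can be checked by evaluating each activity's formula. The short sketch below does that arithmetic. The dictionary keys are shortened labels chosen here, and the hourly rate is not taken as an input but derived by dividing the stated cost by the total hours, which comes out at roughly $43 per hour.

```python
# Hours per activity, written out as the formulas given on the slide.
activity_hours = {
    "Initial client meeting": 1 * 2 * 1.5 + 1 * 1 * 1,
    "Win-win negotiation meetings": 2 * 2 * 2 + 1 * 1 * 2,
    "Communication with development team": 2 * 2 * 12,
    "Architecture review board meeting": 1 * 1 * 1,
    "Weekly status update": 2 * 0.5 * 4,
    "Training STBI contractors": 2,
}

total_hours = sum(activity_hours.values())

# The slide states a total cost of $2,968; the rate below is derived from it.
stated_cost = 2968
implied_rate = stated_cost / total_hours

for name, hours in activity_hours.items():
    print(f"{name}: {hours:g} h")
print(f"Total: {total_hours:g} h, implied rate: ${implied_rate:.2f}/hour")
```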
29
Current activities & resources used | % Reduce | Money Saved (Dollars/Year)
Nonrecurring Benefit | |
Manual Data Entry & Data Ingestion | 80 | 19,500
Recurring Benefit | |
Manual Data Entry & Data Ingestion & Update database | |
Total | | 19,500
30
Year | Cost | Benefit (Effort Saved) | Cumulative Cost | Cumulative Benefit | ROI
2014 | 2,968 | 0 | 2,968 | 0 | -1
2015 | 6,500 | 19,500 | 9,468 | 19,500 | 1.06
2016 | 7,150 | 21,450 | 16,618 | 40,950 | 1.46
2017 | 7,865 | 23,595 | 24,483 | 64,545 | 1.67
2018 | 8,651 | 25,954 | 33,134 | 90,499 | 1.73
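The cumulative columns and the ROI column can be recomputed from the yearly cost and benefit figures. The sketch below assumes that ROI is defined as (cumulative benefit - cumulative cost) / cumulative cost, which the slide does not state explicitly, that the yearly costs read as 2,968, 6,500, 7,150, 7,865 and 8,651, and that the 2014 benefit is 0. Under those assumptions the recomputed ROI matches the slide to two decimals for 2014, 2015, 2016 and 2018, while 2017 comes out at roughly 1.64.

```python
# Yearly figures as read from the slide (2014 benefit assumed to be 0).
yearly = [
    (2014, 2968, 0),
    (2015, 6500, 19500),
    (2016, 7150, 21450),
    (2017, 7865, 23595),
    (2018, 8651, 25954),
]

rows = []
cumulative_cost = 0
cumulative_benefit = 0
for year, cost, benefit in yearly:
    cumulative_cost += cost
    cumulative_benefit += benefit
    # Assumed definition: net cumulative gain relative to cumulative cost.
    roi = (cumulative_benefit - cumulative_cost) / cumulative_cost
    rows.append((year, cumulative_cost, cumulative_benefit, round(roi, 2)))

for year, cum_cost, cum_benefit, roi in rows:
    print(f"{year}: cumulative cost {cum_cost}, "
          f"cumulative benefit {cum_benefit}, ROI {roi}")
```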
31
Risks, Risk Exposure, and Risk Mitigations
Risk 1: One player may have information on different websites, and two players may have the same name on different websites, causing data duplication or data inaccuracy in the database.
Potential Magnitude: 5 | Probability Loss: 3 | Risk Exposure: 15
Mitigation: Mark the source of the data when it is ingested into the database, and use a "duplicate" attribute to indicate whether a duplicate exists for this player. The STBI contractor then resolves the duplicate manually.
Risk 2: Because the posts, and the comments on each post, may form a very long list, fetching a player’s data from Facebook is inefficient.
Potential Magnitude: 7 | Probability Loss: 3 | Risk Exposure: 21
Mitigation: Limit fetching of a player’s posts and their comments to the last 6 months.
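Both mitigations are simple enough to sketch. The first function flags records that share a player name across different sources, leaving the decision to a human. The second keeps only posts newer than a six-month cutoff. The record layout (name, source, created), the function names and the 30-day approximation of a month are assumptions made for this illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_duplicates(records):
    """Set record["duplicate"] = True when the same name appears in more than one source."""
    by_name = defaultdict(list)
    for record in records:
        by_name[record["name"].strip().lower()].append(record)
    for group in by_name.values():
        sources = {record["source"] for record in group}
        for record in group:
            record["duplicate"] = len(sources) > 1
    return records


def recent_posts(posts, now, months=6):
    """Keep posts created within the last `months` months (30 days per month)."""
    cutoff = now - timedelta(days=30 * months)
    return [post for post in posts if post["created"] >= cutoff]


players = flag_duplicates([
    {"name": "Alex Example", "source": "site-a.example"},
    {"name": "alex example ", "source": "site-b.example"},
    {"name": "Sam Sample", "source": "site-a.example"},
])
print([(p["name"], p["duplicate"]) for p in players])

now = datetime(2018, 11, 17)
posts = [
    {"id": 1, "created": datetime(2018, 10, 1)},
    {"id": 2, "created": datetime(2018, 1, 1)},
]
kept = recent_posts(posts, now)
print([p["id"] for p in kept])
```

Matching on a normalized name alone cannot tell one player on two websites from two players with the same name, which is why the flag only marks candidates for manual review.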
32
QUALITY FOCAL POINT
33
METRIC- BURN DOWN CHART
34
METRIC- TEST PASS COVERAGE
35
TECHNICAL DEBT
Causes | Solutions
Lack of Domain Experience (Python and PostgreSQL) | Learning and Training
Vague Requirements | Win book and Negotiation
Technology Volatility | Choose the most stable API and prototype
36
TRACEABILITY MATRIX
OCD Requirement | Win Condition | SSAD/Use case | Test Cases
OC-1 Crawl predefined websites to gather team information | WC_3473 | UC02 | TC-01-01
OC-2 Crawl predefined websites to gather player information | WC_3472 | UC05 | TC-02-01
OC-3 Crawl Social Media - Facebook | WC_3416 | UC06 |
OC-4 Crawl Social Media - Twitter | WC_3417 | |
OC-5 Ingest Data into PostgreSQL Database | WC_3495 | | TC-03-01
OC-6 STBI Contractor UI to update the player data | WC_3398 | UC06, UC04, UC08 |
OC-7 STBI Contractor UI to add, delete, and update the specific websites visited, the fields to capture from each website, and the frequency of crawler refreshes for each specified website | | UC01, UC02, UC03, UC07 | TC-01-01
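A traceability matrix like this one can be held as a small data structure and queried for gaps. The sketch below encodes one reading of the matrix: the slide text is flattened, so the assignment of values to columns, and the treatment of missing values as empty, are assumptions (an empty cell may simply be a merged cell lost in extraction). The helper name requirements_without is hypothetical.

```python
# One reading of the matrix; empty strings and lists mark cells with no
# value in the slide text.
TRACEABILITY = {
    "OC-1": {"win_condition": "WC_3473", "use_cases": ["UC02"], "tests": ["TC-01-01"]},
    "OC-2": {"win_condition": "WC_3472", "use_cases": ["UC05"], "tests": ["TC-02-01"]},
    "OC-3": {"win_condition": "WC_3416", "use_cases": ["UC06"], "tests": []},
    "OC-4": {"win_condition": "WC_3417", "use_cases": [], "tests": []},
    "OC-5": {"win_condition": "WC_3495", "use_cases": [], "tests": ["TC-03-01"]},
    "OC-6": {"win_condition": "WC_3398",
             "use_cases": ["UC06", "UC04", "UC08"], "tests": []},
    "OC-7": {"win_condition": "",
             "use_cases": ["UC01", "UC02", "UC03", "UC07"], "tests": ["TC-01-01"]},
}


def requirements_without(matrix, column):
    """Return requirement IDs whose entry for `column` is empty."""
    return [req for req, row in matrix.items() if not row[column]]


untested = requirements_without(TRACEABILITY, "tests")
no_use_case = requirements_without(TRACEABILITY, "use_cases")
print("No test case listed:", untested)
print("No use case listed:", no_use_case)
```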
37
TEST IDENTIFICATION
Test Identifier: TC-01 Gather team information
Test Level: Software item level
Test Class: Capability Test
Test Completion Criteria: Team information should be gathered from a webpage correctly and match the expected information that we have gathered by hand.
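The completion criterion amounts to comparing a crawled record with a record gathered by hand, field by field. A minimal sketch of that comparison follows. The function name compare_records and the sample field names are hypothetical, and the real test would take the crawled record from the crawler's output rather than a literal.

```python
def compare_records(expected, actual):
    """Return {field: (expected_value, actual_value)} for every field that differs.

    A field missing from one side is reported with None on that side.
    """
    mismatches = {}
    for key in sorted(expected.keys() | actual.keys()):
        if expected.get(key) != actual.get(key):
            mismatches[key] = (expected.get(key), actual.get(key))
    return mismatches


# Hand-gathered reference data versus a (hypothetical) crawled record.
expected_team = {"team-name": "Example FC", "coach": "A. Coach", "stadium": "Example Park"}
crawled_team = {"team-name": "Example FC", "coach": "B. Coach"}

differences = compare_records(expected_team, crawled_team)
print(differences)

# The test passes only when no field differs.
test_passed = not differences
print("PASS" if test_passed else "FAIL")
```

Reporting the differing fields, rather than a bare pass or fail, makes it easier to see whether a failure comes from the crawler or from stale reference data.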
38
TEST CASE
39
TEST IDENTIFICATION
Test Identifier: TC-02 Gather player information
Test Level: Software item level
Test Class: Capability Test
Test Completion Criteria: Player information should be gathered from a webpage correctly and match the expected information that we have gathered by hand.
40
TEST CASE
41
TEST IDENTIFICATION
Test Identifier: TC-03 Update player information
Test Level: Software item level
Test Class: Capability Test
Test Completion Criteria: When player information is updated, the data in the DB should match the updated data.
42
TEST CASE
43
FINAL PRODUCT DEMO
44
THANK YOU!!!