SOCCER DATA WEB CRAWLER

Presentation transcript:

SOCCER DATA WEB CRAWLER (By Team 02) copyrights@SporTech B.I. 11/17/2018

Team members
  Trupti Sardesai - Program Manager
  Zhitao Zhou - Feasibility Analyst
  Subessware Karunamoorthy - System Architect
  Wenchen Tu - Prototyper
  Qing Hu - Life Cycle Planner
  Yan Zhang - Operational Concept Engineer
  Pranshu Kumar - Requirements Engineer
  Amir Ali Tahmasebi - Shaper

Outline
  Team Strength and Weakness
  Overall Evaluation
  Operational Concept Design
  Requirements
  SSAD
  Life Cycle Plan
  Feasibility Evidence
  Quality Focal Point
  Test Cases
  Final Product Demonstration

Team Weakness
  Schedule conflicts
  Work overlap
  Less knowledge about associated technologies

Team Strength
  Communication
  Collaboration
  Goal-driven

Overall Project Evaluation
  Identified new requirements
  Identified new risks as the project evolved
  Developed all agreed-to win conditions
  Developed the final product

Operational concept design

Current Business workflow

System purpose: Organizational goals
  OG-1: To enable end users to gain well-informed knowledge about the players/team.
  OG-2: To save time and thereby increase operational efficiency.
  OG-3: To increase accessibility of real-time data/information.

Current Business workflow

Proposed Business Workflow

CAPABILITY GOALS
  OC-1 Crawl predefined websites: The web crawler shall gather team information from the websites in the website list.
  OC-2 Crawl predefined websites: The web crawler shall gather player information from the websites in the website list.
  OC-3 Crawl Social Media: The web crawler shall get comments, name and number of members, and likes from specified Facebook pages.
  OC-4 Crawl Social Media: The web crawler shall get the number of followers, the comments and the number of retweets for a specified Twitter account.
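A minimal sketch of what gathering player information from a page (OC-2) could look like, using only the Python standard library. The slides do not show the team's crawler code, so the table layout, the `PlayerTableParser` class, the `scrape_players` helper and the sample page are all hypothetical; a real crawler would fetch the page over HTTP first and handle site-specific markup.

```python
from html.parser import HTMLParser


class PlayerTableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []          # completed rows, each a list of cell strings
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self._in_cell = False
            if self._row:       # header rows use <th>, so they stay empty and are skipped
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def scrape_players(html, fields):
    """Map each data row of the page onto the configured field names."""
    parser = PlayerTableParser()
    parser.feed(html)
    return [dict(zip(fields, row)) for row in parser.rows]


# Assumed sample markup; real sites will differ.
SAMPLE_PAGE = """
<table>
  <tr><th>Name</th><th>Team</th><th>Goals</th></tr>
  <tr><td>Player A</td><td>Team X</td><td>7</td></tr>
  <tr><td>Player B</td><td>Team Y</td><td>3</td></tr>
</table>
"""

players = scrape_players(SAMPLE_PAGE, ["name", "team", "goals"])
```

Passing the field names in as a parameter mirrors the idea in OC-7 that the fields to capture are configured per website rather than hard-coded.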

CAPABILITY GOALS
  OC-5 Ingest Data: The crawler shall ingest crawled data into the PostgreSQL database.
  OC-6 STBI Contractor UI: As an STBI contractor, I can update/revise the player data as the season progresses.
  OC-7 STBI Contractor UI: As an STBI contractor, I can add, delete, and update the specific websites visited, the fields to capture from each website, and the frequency of crawler refreshes for each specified website.
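OC-7 describes a per-website configuration: which site to visit, which fields to capture, and how often to refresh. One possible way to model that is sketched below. The names `CrawlTarget`, `TargetRegistry`, `capture_fields` and `refresh_every`, and the example URLs, are assumptions made for illustration and are not taken from the project.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CrawlTarget:
    """One configured website (hypothetical model of an OC-7 entry)."""
    url: str
    capture_fields: List[str]
    refresh_every: timedelta
    last_crawled: Optional[datetime] = None

    def is_due(self, now):
        # A target that has never been crawled is always due.
        if self.last_crawled is None:
            return True
        return now - self.last_crawled >= self.refresh_every


class TargetRegistry:
    """Add, update and delete targets, and list the ones due for a refresh."""

    def __init__(self):
        self._targets = {}

    def add(self, target):
        self._targets[target.url] = target

    def update(self, url, **changes):
        target = self._targets[url]
        for name, value in changes.items():
            if not hasattr(target, name):
                raise AttributeError(f"unknown setting: {name}")
            setattr(target, name, value)

    def delete(self, url):
        del self._targets[url]

    def due(self, now):
        return [t for t in self._targets.values() if t.is_due(now)]


registry = TargetRegistry()
registry.add(CrawlTarget("https://example.com/teams", ["name", "league"], timedelta(days=7)))
registry.add(CrawlTarget("https://example.com/players", ["name", "team"],
                         timedelta(days=1), last_crawled=datetime(2014, 11, 1)))

now = datetime(2014, 11, 1, 12)
due_before = [t.url for t in registry.due(now)]

# Shortening the refresh interval makes the second site due as well.
registry.update("https://example.com/players", refresh_every=timedelta(hours=6))
due_after = [t.url for t in registry.due(now)]
```

Keeping the schedule as data lets a contractor-facing UI change it without touching the crawler code.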

Level of service
  LOS 1 Flexibility: The system can crawl and scrape any given URL into the database.
  LOS 2 Efficiency: The system can crawl and scrape Facebook and Twitter data for a player in a time proportional to the number of comments and posts the player's account has. The system can crawl and scrape a specific website in about an hour on average.

Requirements

WIN CONDITION SUCCESS
# | Capability Goals | Priority Level | Success/Fail
OC1 | Crawl predefined websites: The web crawler shall gather team information from the websites in the website list. | Must have (Agreed to) | SUCCESS
OC2 | Crawl predefined websites: The web crawler shall gather player information from the websites in the website list. | |
OC3 | Crawl Social Media: The web crawler shall get comments, name and number of members, and likes from specified Facebook pages. | |
OC4 | Crawl Social Media: The web crawler shall get the number of followers, the comments and the number of retweets for a specified Twitter account. | |

# | Capability Goals | Priority Level | Success/Fail
OC5 | Ingest Data: The crawler shall ingest crawled data into the PostgreSQL database. | Must have (Agreed to) | SUCCESS
OC6 | STBI Contractor UI: As an STBI contractor, I can update/revise the player data as the season progresses. | |
OC7 | STBI Contractor UI: As an STBI contractor, I can add, delete, and update the specific websites visited, the fields to capture from each website, and the frequency of crawler refreshes for each specified website. | |

# | Capability Goals | Priority Level | Success/Fail
OC8 | Crawl Social Media: The web crawler shall gather Instagram pictures, the number of likes and the comments from a particular Instagram account. | Would like (Potentially Agree) | FAIL
OC9 | Crawl predefined websites: The web crawler shall gather videos from the pages being crawled and ingest them into STBI as is, so that the coach and fans are able to watch the relevant videos. | Would like |
OC10 | Crawl Social Media: The web crawler shall crawl YouTube to gather videos of specific players. | Would like |

SSAD

USE CASE DIAGRAM

Design Class Diagram

SEQUENCE DIAGRAM- Website

SEQUENCE DIAGRAM- FACEBOOK

LIFE CYCLE PLAN

ITERATION PLAN
# | Capability | Priority | Iteration
OC-1, OC-2 | Retrieve team and player data from specific website | High | 1
OC-3, OC-4 | Retrieve data from Facebook and Twitter | |
OC-5 | Storing data into Postgres database | | 2
OC-6, OC-7 | Develop user interface for the developer | |
TC-01-01, TC-02-01 | Test if the web crawler is able to gather team and player information. | |
TC-03-01 | Integration Test | | 3

EFFORT ESTIMATION

FEASIBILITY EVIDENCE

Activities | Time Spent (Hours)
Nonrecurring Cost |
Initial Client Meeting (1 meeting * 2 people * 1.5 hours + 1 meeting * 1 person * 1 hour) | 4
Win-Win Negotiation Meetings (2 meetings * 2 people * 2 hours + 1 meeting * 1 person * 2 hours) | 10
Communication with development team (2 people * 2 hours/week * 12 weeks) | 48
Architecture Review Board Meeting (1 meeting * 1 person * 1 hour) | 1
Weekly status update (2 people * 0.5 hours * 4 weeks) | 4
Training STBI contractors | 2
Total Time | 69
Cost (estimated at $43/hour) | $2,968
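The arithmetic behind the time total can be checked with a short sketch. The per-activity formulas are copied from the slide; the dictionary name and the derived hourly rate are illustrative. The weekly status update formula evaluates to 4 hours, which is the value needed for the stated total of 69, and dividing the stated cost of $2,968 by 69 hours gives roughly $43 per hour.

```python
# Hours per activity, using the formulas given on the slide.
activity_hours = {
    "Initial client meeting": 1 * 2 * 1.5 + 1 * 1 * 1,
    "Win-Win negotiation meetings": 2 * 2 * 2 + 1 * 1 * 2,
    "Communication with development team": 2 * 2 * 12,
    "Architecture Review Board meeting": 1 * 1 * 1,
    "Weekly status update": 2 * 0.5 * 4,
    "Training STBI contractors": 2,
}

total_hours = sum(activity_hours.values())

# Hourly rate implied by the stated total cost (a derived figure, not quoted from the slide).
stated_cost = 2968
implied_rate = stated_cost / total_hours
```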

Current activities & resources used | % Reduce | Money Saved (Dollars/Year)
Nonrecurring Benefit: Manual Data Entry & Data Ingestion | 80 | 19,500
Recurring Benefit: Manual Data Entry & Data Ingestion & Update database | |
Total | | 19,500

Year | Cost | Benefit (Effort Saved) | Cumulative Cost | Cumulative Benefit | ROI
2014 | 2,968 | 0 | 2,968 | 0 | -1
2015 | 6,500 | 19,500 | 9,468 | 19,500 | 1.06
2016 | 7,150 | 21,450 | 16,618 | 40,950 | 1.46
2017 | 7,865 | 23,595 | 24,483 | 64,545 | 1.67
2018 | 8,651 | 25,954 | 33,134 | 90,499 | 1.73
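The ROI column appears to follow the usual cumulative formula, ROI = (cumulative benefit - cumulative cost) / cumulative cost. The sketch below recomputes it from the yearly figures. The zero benefit for 2014 is an assumption implied by the ROI of -1, and the function name is illustrative. Under this formula the values for 2014, 2015, 2016 and 2018 match the slide, while 2017 comes out at about 1.64 rather than the 1.67 shown.

```python
# Yearly cost and benefit in dollars, as listed on the slide.
costs = {2014: 2968, 2015: 6500, 2016: 7150, 2017: 7865, 2018: 8651}
# Assumption: no benefit in 2014, which is what an ROI of -1 implies.
benefits = {2014: 0, 2015: 19500, 2016: 21450, 2017: 23595, 2018: 25954}


def cumulative_roi(costs, benefits):
    """Return {year: ROI}, where ROI = (cum. benefit - cum. cost) / cum. cost."""
    roi = {}
    cumulative_cost = 0
    cumulative_benefit = 0
    for year in sorted(costs):
        cumulative_cost += costs[year]
        cumulative_benefit += benefits[year]
        roi[year] = round((cumulative_benefit - cumulative_cost) / cumulative_cost, 2)
    return roi


roi_by_year = cumulative_roi(costs, benefits)
```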

Risks

Risk 1: One player may have information on different websites, and two players may have the same name on different websites, causing data duplication or data inaccuracy in the database.
  Potential magnitude: 5
  Probability of loss: 3
  Risk exposure: 15
  Mitigation: Mark the source of the data when it is ingested into the database, and use a "duplicate" attribute to indicate whether a duplicate exists for this player. The STBI contractor then resolves the duplicate manually.

Risk 2: Because the posts, and the comments on each post, may form a very long list, fetching a player's data from Facebook is slow.
  Potential magnitude: 7
  Probability of loss: 3
  Risk exposure: 21
  Mitigation: Limit fetching of a player's posts and their comments to the last 6 months.
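The two mitigations can be sketched as follows. This is an illustration under stated assumptions rather than the team's code: the record layout, the name-based matching rule, the function names and the 182-day approximation of "6 months" are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_duplicates(records):
    """Set record["duplicate"] = True when another record has the same
    (case-insensitive) player name; a human then reviews flagged records."""
    by_name = defaultdict(list)
    for record in records:
        by_name[record["name"].strip().lower()].append(record)
    for group in by_name.values():
        is_duplicate = len(group) > 1
        for record in group:
            record["duplicate"] = is_duplicate
    return records


def recent_posts(posts, now, window=timedelta(days=182)):
    """Keep only posts created within the window (about 6 months by default)."""
    cutoff = now - window
    return [post for post in posts if post["created"] >= cutoff]


# Each record keeps its source so a reviewer can see where it came from.
records = flag_duplicates([
    {"name": "Player A", "source": "site-one.example"},
    {"name": "player a ", "source": "site-two.example"},
    {"name": "Player B", "source": "site-one.example"},
])

posts = [
    {"id": 1, "created": datetime(2014, 11, 1)},
    {"id": 2, "created": datetime(2014, 1, 15)},
]
kept = recent_posts(posts, now=datetime(2014, 12, 1))
```

Flagging rather than merging leaves the final decision to the STBI contractor, since two different players can legitimately share a name.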

QUALITY FOCAL POINT

METRIC- BURN DOWN CHART

METRIC- TEST PASS COVERAGE

TECHNICAL DEBT
Causes | Solutions
Lack of Domain Experience (Python and PostgreSQL) | Learning and Training
Vague Requirements | Win book and Negotiation
Technology Volatility | Choose the most stable API and prototype

TRACEABILITY MATRIX
OCD | Requirement | Win Condition | SSAD/Use case | Test Cases
OC-1 | Crawl predefined websites to gather team information | WC_3473 | UC02 | TC-01-01
OC-2 | Crawl predefined websites to gather player information | WC_3472 | UC05 | TC-02-01
OC-3 | Crawl Social Media- Facebook | WC_3416 | UC06 |
OC-4 | Crawl Social Media- Twitter | WC_3417 | |
OC-5 | Ingest Data into PostgreSQL Database | WC_3495 | | TC-03-01
OC-6 | STBI Contractor UI to update the player data | WC_3398 | UC06, UC04, UC08 |
OC-7 | STBI Contractor UI to add, delete, update the specific websites visited, fields to capture from the website and frequency of crawler refreshes for each specified website. | | UC01, UC02, UC03, UC07 | TC-01-01

TEST IDENTIFICATION
  Test Identifier: TC-01 Gather team information
  Test Level: Software item level
  Test Class: Capability Test
  Test Completion Criteria: Team information should be gathered from a webpage correctly and match the expected information that we have gathered by hand.
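The completion criterion for TC-01 amounts to comparing crawled values against a hand-gathered reference. A minimal sketch of such a check is shown below; the function name `field_mismatches` and the sample team data are hypothetical.

```python
def field_mismatches(expected, actual):
    """Return {field: (expected value, crawled value)} for each field that
    differs or is missing from the crawled record."""
    mismatches = {}
    for field, want in expected.items():
        got = actual.get(field)
        if got != want:
            mismatches[field] = (want, got)
    return mismatches


# Reference values gathered by hand (illustrative data).
expected_team = {"name": "Team X", "league": "League One", "stadium": "Main Ground"}

# Values as a crawler might return them.
crawled_ok = {"name": "Team X", "league": "League One", "stadium": "Main Ground"}
crawled_bad = {"name": "Team X", "league": "League Two"}

passes = field_mismatches(expected_team, crawled_ok) == {}
problems = field_mismatches(expected_team, crawled_bad)
```

Returning the differing fields, rather than a bare pass/fail, shows which part of a page the crawler misread.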

TEST CASE

TEST IDENTIFICATION
  Test Identifier: TC-02 Gather player information
  Test Level: Software item level
  Test Class: Capability Test
  Test Completion Criteria: Player information should be gathered from a webpage correctly and match the expected information that we have gathered by hand.

TEST CASE

TEST IDENTIFICATION
  Test Identifier: TC-03 Update player information
  Test Level: Software item level
  Test Class: Capability Test
  Test Completion Criteria: When player information is updated, the data in the DB should match the updated data.
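The TC-03 criterion can be illustrated with an update followed by a read-back. The project stores data in PostgreSQL; to keep this sketch self-contained it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in. The `players` table, its columns and the `update_player` helper are assumptions, and a PostgreSQL driver such as psycopg2 would use `%s` placeholders instead of `?`.

```python
import sqlite3

# Columns that may be updated; restricting them keeps column names out of user input.
ALLOWED_COLUMNS = {"name", "team", "goals"}


def update_player(conn, player_id, **changes):
    """Update the given columns for one player (hypothetical helper)."""
    unknown = set(changes) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    assignments = ", ".join(f"{column} = ?" for column in changes)
    conn.execute(
        f"UPDATE players SET {assignments} WHERE id = ?",
        (*changes.values(), player_id),
    )
    conn.commit()


# In-memory stand-in for the real database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT, team TEXT, goals INTEGER)"
)
conn.execute(
    "INSERT INTO players (id, name, team, goals) VALUES (1, 'Player A', 'Team X', 7)"
)

# Update, then read back and compare with the data that was submitted.
update_player(conn, 1, team="Team Y", goals=8)
row = conn.execute("SELECT name, team, goals FROM players WHERE id = 1").fetchone()
```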

TEST CASE

FINAL PRODUCT DEMO

THANK YOU!!!