WP2 Internal Meeting
15:00-15:30  Next Milestones and proposed workplan
15:30-16:30  Round Table Discussion
16:30-17:30  SGA2 organization and discussion
Deliverables for SGA-1
- Report with legal aspects
    Due date: Month 12 (January 2017)
    Submission to Reviewing Board: December 15 2016
- Technical and methodological report describing web scraping, prediction and inference procedures
    Due date: Month 18 (July 2017)
    Submission to Reviewing Board: June 15 2017
Final SGA1 Report Structure
- General Motivations for Web Scraping of Enterprises Web Sites (WS-EWS)
    - Use cases
- Description of a framework for WS-EWS
    - Logical building blocks
    - Mapping of pilots to logical building blocks
- Methodological Issues
    - Specific vs generic scraping: when/what
    - Analysis techniques: machine learning vs. deterministic approaches
    - Review of methods used in pilots
- Technological issues
    - Review of tech environments used in pilots
    - Issues, e.g. scalability
- Appendixes by use cases
Final SGA1 Report Structure: responsible countries
- General Motivations for Web Scraping of Enterprises Web Sites (WS-EWS): BG
    - Use cases
- Description of a framework for WS-EWS: IT
    - Logical building blocks
    - Mapping of pilots to logical building blocks
- Methodological Issues: NL+IT
    - Specific vs generic scraping: when/what
    - Analysis techniques: machine learning vs. deterministic approaches
    - Review of methods used in pilots
- Technological issues: PL
    - Review of tech environments used in pilots
    - Issues, e.g. scalability
- Appendixes by use cases: SE and UK?
Agreement to fill the pilot template by Mid/End of April

Use Cases \ Countries: IT, SE, UK, NL, BG, PL
Dates: Dates? Dates? Dates? Dates? Dates? Dates?
1: URLs retrieval  x
2: E-commerce
3: Job advertisement
4: Social Media
Workplan
Months: March 2017, April 2017, May 2017, June 2017, July 2017
Tasks:
- Preparation of template for pilots
- Pilots Appendix (section 5)
- Use case (section 1)
- Framework (section 2)
- Methodology (section 3)
IT
SGA 2
Official start date: August 2017
New Use cases
- To be evaluated: sustainability reporting on enterprises’ websites
- Identifying categories relevant to enterprises’ types of activity (NACE)
- To be evaluated: web site accessibility
- Support to the EuroGroups Register
New Methods
- Testing information extraction techniques and applicability of findings
    - NLP
    - Deep learning
SGA 2 Milestone
- Del 2.1: Final report describing the final procedures set up for accessing enterprises' web sites and using them for the different use cases
    Due date: May 2018
    Review Board: 15 April 2018
SGA2 Meetings
- Internal workshop on enterprise web scraping (Oct 2017)
- Joint internal workshop of WP 1 and WP 2, with report (March 2018)