Big Data: Automatic hotel prices collection on the Internet for the Tourism Survey in the Basque Country EUSTAT. Euskal Estatistika Erakundea – Basque Statistics Office.


1 Big Data: Automatic hotel prices collection on the Internet for the Tourism Survey in the Basque Country
EUSTAT. Euskal Estatistika Erakundea – Basque Statistics Office
Elena Goni, Jorge Aramendi, Anjeles Iztueta, Javier San Vicente
New Techniques and Technologies for Statistics (NTTS), Brussels, March 2017

2 Tourism Occupancy Survey (ETR)
ADR: Average Daily Rate for an occupied double room

3 Background
Tourism Occupancy Survey (ETR): monthly survey
Temporary sampling scheme on all the units in our Tourism Register
Strata: geographical areas and hotel category
Estimates: travelers and night stays, accommodation offered, occupancy rate, average stay, employees
Target in 2017: information about prices, namely the Average Daily Rate (ADR), which is the price of an occupied double room for accommodation only, excluding other services and taxes, and the Revenue per Available Room (REVPAR)
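
As a rough illustration of how the two price indicators relate, the sketch below uses the definitions commonly used in the hotel industry: ADR as room revenue divided by rooms sold, and REVPAR as room revenue divided by rooms available, which equals ADR times the occupancy rate. The slides do not state the estimator EUSTAT uses, so the function names and the sample figures are hypothetical.

```python
import math


def adr(room_revenue: float, rooms_sold: int) -> float:
    """Average Daily Rate: accommodation revenue per occupied room."""
    if rooms_sold <= 0:
        raise ValueError("rooms_sold must be positive")
    return room_revenue / rooms_sold


def revpar(room_revenue: float, rooms_available: int) -> float:
    """Revenue per Available Room: accommodation revenue per room on offer."""
    if rooms_available <= 0:
        raise ValueError("rooms_available must be positive")
    return room_revenue / rooms_available


# Hypothetical figures for one establishment and one night (not survey data).
revenue = 7300.0        # accommodation revenue in euros, excluding taxes
sold = 100              # occupied rooms
available = 125         # rooms on offer

occupancy_rate = sold / available
example_adr = adr(revenue, sold)
example_revpar = revpar(revenue, available)

# REVPAR can also be obtained as ADR multiplied by the occupancy rate.
assert math.isclose(example_revpar, example_adr * occupancy_rate)
```

Because REVPAR combines price and occupancy, it can fall while ADR stays constant if fewer rooms are sold.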

4 Prices in the survey vs. prices on the Web

5 Web scraping
Collaboration with the Information Technology Faculty, Basque Country University (EHU-UPV)
import.io
Booking: 1,000 hotels and establishments
Standard double room, no breakfast, no taxes
Target: prices for hotels and pensions only, 556 establishments
Prices for 7 and 14 days in advance
Data-collection period: 45 days, 27 January to 11 March 2016, records
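
The collection design above (a 45-day window, prices requested 7 and 14 days in advance) can be sketched as a query schedule. This is a minimal sketch only: it assumes one query per collection day and lead time, which the slides do not confirm, and it makes no calls to import.io or Booking. The function and variable names are hypothetical.

```python
from datetime import date, timedelta

# Collection window and lead times taken from the slide.
COLLECTION_START = date(2016, 1, 27)
COLLECTION_END = date(2016, 3, 11)
LEAD_DAYS = (7, 14)


def query_schedule(start: date, end: date, lead_days):
    """Return (collection_date, lead, check_in_date) tuples.

    Assumption: one price query per collection day and lead time.
    """
    schedule = []
    day = start
    while day <= end:
        for lead in lead_days:
            schedule.append((day, lead, day + timedelta(days=lead)))
        day += timedelta(days=1)
    return schedule


schedule = query_schedule(COLLECTION_START, COLLECTION_END, LEAD_DAYS)
collection_days = sorted({row[0] for row in schedule})
```

Counting both end dates, the window from 27 January to 11 March 2016 spans 45 days, which matches the period stated on the slide.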

6 Problems capturing the data
Lack of information for some hotels / dates
Connection errors
Problems making several queries in a short time
Booking modifies the structure of the Website
Outliers, errors and duplicates
Hotel names are not permanent, they may change
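
The slides do not describe how duplicates and outliers were handled. The following is one possible sketch, not the authors' procedure: duplicates are dropped on a hypothetical key (hotel, check-in date, lead time) and prices are flagged with the common 1.5 × IQR rule. All field names and prices are made up for illustration.

```python
from statistics import quantiles


def drop_duplicates(records):
    """Keep the first record for each (hotel_id, check_in, lead_days) key."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["hotel_id"], rec["check_in"], rec["lead_days"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def flag_outliers(prices, k=1.5):
    """Return True for each price outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(prices, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [p < low or p > high for p in prices]


# Hypothetical scraped records; the second one repeats the first.
records = [
    {"hotel_id": "A", "check_in": "2016-02-03", "lead_days": 7, "price": 60},
    {"hotel_id": "A", "check_in": "2016-02-03", "lead_days": 7, "price": 60},
    {"hotel_id": "B", "check_in": "2016-02-03", "lead_days": 7, "price": 75},
]
unique_records = drop_duplicates(records)

# Hypothetical prices within one stratum; 900 looks like a capture error.
prices = [60, 62, 65, 66, 70, 72, 75, 900]
flags = flag_outliers(prices)
```

In practice the rule would be applied within each stratum (hotel category and area), since a price that is normal for a five-star hotel would be extreme for a pension.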

7 Frame coverage
Table 1. Coverage ratio of Booking to the Tourism Register by hotel category and geographical area, in hotel bed-places
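
A coverage ratio of this kind can be computed as the bed-places of register establishments found on Booking divided by all bed-places in the register, within each stratum. The sketch below assumes that reading of Table 1. The record layout, identifiers and figures are hypothetical.

```python
from collections import defaultdict


def coverage_by_stratum(register, scraped_ids):
    """Share of register bed-places covered by scraped establishments,
    per (area, category) stratum."""
    total = defaultdict(int)
    covered = defaultdict(int)
    for est in register:
        stratum = (est["area"], est["category"])
        total[stratum] += est["bed_places"]
        if est["id"] in scraped_ids:
            covered[stratum] += est["bed_places"]
    return {s: covered[s] / total[s] for s in total if total[s] > 0}


# Hypothetical register extract and set of establishments found on Booking.
register = [
    {"id": 1, "area": "A", "category": "H3", "bed_places": 100},
    {"id": 2, "area": "A", "category": "H3", "bed_places": 50},
    {"id": 3, "area": "B", "category": "P1", "bed_places": 20},
]
scraped_ids = {1, 3}

coverage = coverage_by_stratum(register, scraped_ids)
```

Weighting by bed-places rather than counting establishments gives large hotels more influence on the ratio, which is why the two measures can differ.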

8 Output of the scraping
Figure 1. Coverage ratio of Booking over the Tourism Register by date, in establishments and hotel bed-places

9 Output of the scraping
Table 2. Average price in euros by hotel category and number of days in advance (7 days / 14 days)
TOTAL: 73 / 73
H5: 177 / 165
H4: 102
H3: 77
H2: 76
H1: 66
P2: 61 / 62
P1: 49

10 Comparing Booking – Tourism Survey
Figure 2. Average ADR, Online Tour Operator and Booking prices by categories, Feb-Mar 2016

11 Comparing Booking – Tourism Survey
Figure 3. Booking average price and average ADR by categories and month

12 Conclusions - balance
Cost? Easy? Integration? Quality?
Continuous monitoring
Initial programming is intensive
Affordable technology
Cost is acceptable
Reduction of the survey burden
Scraping with difficulties
Data editing necessary
Data linkage
Work calendars very strict and tight
Considerable effort needed for integration into the production process
Harmonization of concepts necessary
High coverage
Scraping biases
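
Data linkage between scraped establishments and the Tourism Register is complicated by hotel names that change over time. The slides do not say how the linkage was done. The sketch below shows one simple approach as an assumption: normalize names (case, accents, spacing) and take the closest register name with `difflib.get_close_matches` from the Python standard library. The hotel names and the 0.85 cutoff are hypothetical.

```python
import unicodedata
from difflib import get_close_matches


def normalize(name: str) -> str:
    """Lowercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def link_name(scraped_name, register_names, cutoff=0.85):
    """Return the best-matching register name, or None if none is close enough."""
    lookup = {normalize(n): n for n in register_names}
    matches = get_close_matches(normalize(scraped_name), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


# Hypothetical register names (not real establishments).
register_names = ["Hotel Ejemplo Bilbao", "Pensión Muestra Donostia"]

match_case = link_name("HOTEL  EJEMPLO BILBAO", register_names)
match_accent = link_name("Pension Muestra Donostia", register_names)
no_match = link_name("Completely Different Inn", register_names)
```

Unmatched names would still need manual review, and a stable identifier such as address or coordinates would be more reliable than names where one is available.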

13 More questions
More data sources
Obtain prices for every day, several prices a day (summer, Easter, …), 3 or 4 months ahead

14 www.eustat.eus
Eskerrik asko
Elena Goni (elena_goni@eustat.eus)
Jorge Aramendi
Anjeles Iztueta
Javier San Vicente

