Analyzing Content-Level Properties of the Web Adversphere

Yong Wang* **, Daniel Burgener**, Aleksandar Kuzmanovic**, Gabriel Maciá-Fernández***

* University of Electronic Science and Technology of China, Chengdu, Sichuan, China
** Northwestern University, Evanston, IL, USA
*** University of Granada, Granada, Spain

Introduction

Motivation
Advertising has become an integral and inseparable part of the World Wide Web. However, neither public auditing nor monitoring mechanisms yet exist in this emerging area.

Contributions
We present our initial efforts on building a content-level auditing service for web-based ad networks:
- Understanding ad distribution mechanisms
- Evaluating behavioral targeting
- Evaluating location-based advertising
Such a web ad auditing system, which brings useful auditing information to all entities involved in the online advertising business, can be universally applied to arbitrary commissioners’ networks to effectively monitor and help regulate the web-based ad industry.

Auditing Platform and Investigated Commissioners

Auditing Platform
We recruit 282 servers from PlanetLab, geographically distributed across 36 countries.

Investigated Commissioners
We perform a content-level analysis of three representative ad networks with divergent design philosophies:
- Google: a large number of distributed data centers
- AOL-Adsonar: CDN services
- Adblade: servers placed at a single location
AOL has four subsidiaries: Adsonar, Advertising, Adtech, and Tacoda. Only Adsonar supports text-based ads that can be feasibly retrieved from the Web, as Google and Adblade do.

Location-based Advertising

Commissioners could send ads containing information related to the geographical location of the web users. We quantify the percentage of vantage points at which the use of location-based advertising is observed.
(Table 3)

Table 3: Percentage of vantage points observing location-based ads
Commissioner    City (%)    State (%)    No association (%)
Google
AOL-Adsonar
Adblade

Summary of Findings
- Google does business all over the world, so exploiting location-based advertising techniques is quite feasible.
- Adblade and AOL-Adsonar apply location-based advertising only in most areas of the U.S.

Distribution Mechanisms

‘Local’ similarity
For each vantage point, we calculate the ‘local’ similarity between it and every other vantage point as the percentage of identical ads observed at both vantage points. (Figure 1)

‘Global’ similarity
We also compute the ‘global’ similarity of each commissioner by averaging the ‘local’ similarities over all vantage points when accessing this commissioner’s ads. (Table 1)

Figure 1: Local similarities among vantage points. Panels: (a) Google, (b) AOL-Adsonar, (c) Adblade. Both axes list vantage points; the color scale shows similarities [%]. The darker the color of a given (x, y) box, the larger the similarity between x and y.

Table 1: Global similarity of each commissioner
Commissioner    Global similarity (%)
Google          13.16
AOL-Adsonar
Adblade         72.62

Summary of Findings
- Google distributes different ads to different servers.
- AOL-Adsonar distributes ads for four regions: U.S., Canada, U.K., and others. Servers in the same region serve the same content.
- Adblade uses a single machine (or a cluster) containing the whole pool of ads to serve the requests.

Behavioral Targeting

Table 2: Incremental percentage of observed ‘sport’-related ads when behavioral targeting is enabled (‘local/uniform cookie’) compared with disabled (‘no cookie’)
Commissioner    Local cookie (%)    Uniform cookie (%)
Google          25                  3
AOL-Adsonar     13                  5
Adblade         0                   0

Many commissioners claim to be able to reach users more effectively with behaviorally targeted ads. We want to examine the extent to which commissioners participate in behavioral targeting. We use the interest category “sports” in our tests.
The test proceeds in four steps (Table 2):

Establish baseline
We disable cookies, then retrieve the text-based ads from a list of websites, which may or may not be related to sports, and scan them for sports-related keywords, e.g., sport, cycling, etc.

Establish browsing pattern
We enable cookies on our PlanetLab nodes and visit websites in the category “sports” that are known to work with the commissioner.

Local cookie
We then repeat the ad retrieval with cookies enabled to determine the difference when behavioral targeting is used.

Uniform cookie
Finally, we repeat the experiment after copying the cookies from one computer to all PlanetLab nodes, retrieving ads again to check whether profile data is stored locally or globally.

Summary of Findings
- Both Google and AOL-Adsonar use behavioral targeting for the “sports” interest category, whereas Adblade does not.
- Both Google and AOL-Adsonar associate a user profile with interest categories only on an ad server close to the user.

Acknowledgements
This work is supported by China Scholarship Council, NSF CAREER Award no , and Spanish MEC project TEC C03-02 (70% FEDER funds).
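
Sketch of the measurements

The similarity measures (Table 1, Figure 1) and the incremental percentage of sports-related ads (Table 2) reduce to a few set and counting operations. The Python sketch below is illustrative only and is not the code used in this work. It rests on several assumptions: the poster does not state the denominator of the ‘local’ similarity, so the sketch uses the share of identical ads among all distinct ads seen at either vantage point; the ‘global’ similarity is taken as the mean over all pairs of distinct vantage points; the incremental percentage is taken as a difference in percentage points; and ads are matched by case-insensitive substring search. All function names are hypothetical, and the keyword tuple holds only the two examples named above.

```python
from itertools import combinations

# Only the two example keywords given in the text; the full list is not known.
SPORTS_KEYWORDS = ("sport", "cycling")


def local_similarity(ads_a, ads_b):
    """Percentage of identical ads observed at two vantage points.

    Assumption: shared ads divided by all distinct ads seen at either point.
    """
    a, b = set(ads_a), set(ads_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


def global_similarity(ads_by_vantage_point):
    """Mean local similarity over all pairs of distinct vantage points."""
    pairs = list(combinations(ads_by_vantage_point.values(), 2))
    if not pairs:
        return 0.0
    return sum(local_similarity(a, b) for a, b in pairs) / len(pairs)


def sports_share(ads, keywords=SPORTS_KEYWORDS):
    """Percentage of ad texts containing at least one sports keyword."""
    if not ads:
        return 0.0
    hits = sum(any(k in ad.lower() for k in keywords) for ad in ads)
    return 100.0 * hits / len(ads)


def incremental_percentage(baseline_ads, targeted_ads):
    """Sports share with cookies minus sports share without cookies.

    Assumption: the increment is expressed in percentage points.
    """
    return sports_share(targeted_ads) - sports_share(baseline_ads)
```

Under these assumptions, `global_similarity` takes a mapping from vantage point to the list of ad texts retrieved there, and `incremental_percentage` is called once with the ‘local cookie’ ads and once with the ‘uniform cookie’ ads, each against the ‘no cookie’ baseline.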