
Presentation transcript:

Cloak and Dagger

In a nutshell…
Cloaking
Cloaking in search engines
Search engines’ response to cloaking
Lifetime of cloaked search results
Cloaked pages in search results

Advertising is ubiquitous on the Internet, and search, by and large, is its primary channel.
Search Engine Optimisation (SEO): doctoring of search results.
For benign ends, such as simplifying page content, optimizing load times, etc.
For malicious purposes, such as manipulating page ranking algorithms.

Cloaking
Conceals the true nature of a Web site.
Keyword stuffing: associating benign content with keywords.
Attracts traffic to scam pages.
Protects the Web servers from being exposed.
Does not scam those who arrive at the site via different keywords.

Types of Cloaking
Repeat Cloaking
User Agent Cloaking
Referrer Cloaking (sometimes also called “Click-through Cloaking”)
IP Cloaking

DAGGER
Dagger encompasses five different functions:
Collection of search terms
Querying search results generated by search engines
Crawling search results
Detecting cloaking
Repeating the above four processes to study variance in measurements

Collection of Search Terms
Two different kinds of cloaked search terms are targeted:
TYPE 1: Search terms which contain popular words. Aimed at gathering high volumes of undifferentiated traffic.
TYPE 2: Search terms which reflect highly targeted traffic. Here the cloaked content matches the cloaked search terms.

TYPE 1: Use popular trending search terms.
Google Hot Searches: sheds light on search-engine-based data collection methods.
Alexa: sheds light on client-based data collection methods.
Twitter: terms give clues to social networking trends.
The cloaked page is entirely unrelated to the trending search terms.
TYPE 2: A set of terms catering to a specific domain.
The content of the cloaked pages actually matches the search terms.

Querying Search Results
Terms collected in the previous step are fed to the search engines.
Study the prevalence of cloaking across engines.
Examine their response to cloaking.
The top 100 search results and accompanying metadata are compiled into a list.
Entries from “known good” domains are eliminated in order to avoid false positives during data processing.
Similar entries are grouped together with an appropriate ‘count’.
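The compilation step on this slide can be sketched roughly as follows. This is only an illustration, not Dagger's actual code: the `KNOWN_GOOD` set, the function names, and the choice to group identical URLs are all assumptions made for the example.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical "known good" list; the slides do not say which domains were used.
KNOWN_GOOD = {"wikipedia.org", "youtube.com"}


def is_known_good(url, known_good=KNOWN_GOOD):
    """True if the URL's host is a known-good domain or one of its subdomains."""
    host = urlsplit(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in known_good)


def compile_results(results, known_good=KNOWN_GOOD):
    """Group (search_term, url) pairs by URL, skipping known-good domains.

    Returns a Counter mapping each remaining URL to the number of times
    it appeared across all search results.
    """
    counts = Counter()
    for _term, url in results:
        if not is_known_good(url, known_good):
            counts[url] += 1
    return counts
```

Grouping by exact URL is the simplest reading of "similar entries"; grouping by host or by normalized URL would be an equally plausible choice.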

Crawling Search Results
Crawl the URLs.
Process the fetched pages.
Detect cloaking in parallel; this helps minimize any possible time-of-day effects.
Multiple crawls

Normal search user
Googlebot Web crawler
A user who does not click through the search result: detects pure user-agent cloaking without any checks on the referrer.
35% of cloaked search results for a single measurement perform pure user-agent cloaking.
Pages that employ both user-agent and referrer cloaking are nearly always malicious.
IP cloaking: half of current cloaked search results do in fact employ IP cloaking via reverse DNS lookups.
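One way to picture the three crawls is as three sets of HTTP request headers, with the cloaking type inferred from which views differ. The sketch below is an assumption-laden illustration: the browser user-agent string, the profile names, and the `classify` labels are invented for the example, and `differs` stands for whatever comparison the detection step provides. IP cloaking is not covered, since a crawler that only fakes the Googlebot user agent cannot observe it this way.

```python
# Placeholder browser string; any ordinary browser user agent would do.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
# User-agent string that Google documents for Googlebot.
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def profiles(search_url):
    """Request headers for the three visitor types named on the slide."""
    return {
        # Normal search user: browser UA, arrives via the search result page.
        "search_user": {"User-Agent": BROWSER_UA, "Referer": search_url},
        # Web crawler: identifies itself as Googlebot.
        "crawler": {"User-Agent": CRAWLER_UA},
        # User who did not click through a search result: no Referer header.
        "direct_user": {"User-Agent": BROWSER_UA},
    }


def classify(views, differs):
    """Label a URL given the content fetched under each profile.

    views maps profile name to fetched content; differs(a, b) returns
    True when two views are considered different.
    """
    if not differs(views["search_user"], views["crawler"]):
        return "not cloaked"
    if differs(views["direct_user"], views["crawler"]):
        # Cloaked content is served even without a referrer.
        return "pure user-agent cloaking"
    # The referrer-less user sees what the crawler sees, so the referrer is checked.
    return "referrer cloaking involved"
```

For instance, `classify({"search_user": "pills", "crawler": "news", "direct_user": "pills"}, lambda a, b: a != b)` yields `"pure user-agent cloaking"`.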

Detecting Cloaking
Process the crawled data using multiple iterative passes.
Various transformations and analyses are applied; this helps compile the information needed to detect cloaking.
Each pass uses a comparison-based approach:
Apply the same transformations to the views of the same URL, as seen from the user and the crawler.
Directly compare the results of the transformations using a scoring function.
Thresholding: detect pages that are actively cloaking and annotate them for later analysis.
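A single comparison pass might look like the sketch below. The slides do not name the transformation or the scoring function, so a word-set extraction and Jaccard distance are used here purely as stand-ins, and the threshold of 0.5 is an arbitrary illustrative value.

```python
import re


def to_words(html):
    """Transformation: strip tags crudely and reduce the page to a set of words."""
    text = re.sub(r"<[^>]+>", " ", html)
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def difference_score(user_html, crawler_html):
    """Scoring function: Jaccard distance between the two word sets (0 = identical)."""
    a, b = to_words(user_html), to_words(crawler_html)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def is_cloaked(user_html, crawler_html, threshold=0.5):
    """Thresholding: flag the URL when the two views differ by more than threshold."""
    return difference_score(user_html, crawler_html) > threshold
```

A real pipeline would chain several such passes, since pages with dynamic but legitimate content (ads, timestamps) also differ between fetches and would need cheaper filters or stricter thresholds.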

Temporal Re-measurement
Used to study the lifetime of cloaked pages.
The temporal component in Dagger:
Fetch search results from search engines.
Crawl and process URLs at later instances of time.
Measure the rate at which search engines respond to cloaking.
Measure the duration pages are cloaked.
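The second measurement, how long a page stays cloaked, can be sketched from repeated crawl observations. The data layout (a time-sorted list of `(hours, is_cloaked)` pairs per URL) and the function name are assumptions for this example; because crawls happen at discrete times, the result is only a lower bound on the true duration.

```python
def cloaked_duration(observations):
    """Hours a page was observed to stay cloaked.

    observations: list of (hours_since_first_crawl, is_cloaked) pairs,
    sorted by time. Returns the time between the first crawl and the last
    crawl in the initial unbroken run of cloaked observations, or None if
    the page was not cloaked at the first crawl.
    """
    last_cloaked = None
    for hours, is_cloaked in observations:
        if not is_cloaked:
            break
        last_cloaked = hours
    if last_cloaked is None:
        return None
    return last_cloaked - observations[0][0]
```

For example, `cloaked_duration([(0, True), (4, True), (8, False)])` gives `4`.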

Cloaking Over Time
In trending searches the terms constantly change.
Cloakers target many more search terms and a broad demographic of potential victims.
Pharmaceutical search terms are static.
They represent product searches in a very specific domain.
Cloakers have much more time to perform SEO to raise the rank of their cloaked pages.
This results in more cloaked pages in the top results.

Sources of Search Terms
Blackhat SEO: artificially boosts the rankings of cloaked pages.
Search engines detect cloaking either directly (analyzing pages) or indirectly (updating the ranking algorithm).
Augmenting popular search terms with suggestions enables targeting the same semantic topic as the popular search terms.
Cloaking in search results is highly influenced by the search terms.

Search Engine Response
Search engines try to identify and thwart cloaking.
Cloaked pages do regularly appear in search results, but many are removed or suppressed by the search engines within hours to a day.
Cloaked search results rapidly begin to fall out of the top 100 within the first day, with a more gradual drop thereafter.
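The fall-off described here can be expressed as the fraction of initially cloaked results still present in the top 100 at each later measurement. The following is a minimal sketch under assumed inputs: a set of cloaked URLs seen at the first measurement and a mapping from elapsed hours to the set of URLs in the top 100 at that time. Neither the names nor the layout come from the slides.

```python
def fraction_remaining(first_seen, later_snapshots):
    """Fraction of initially cloaked URLs still in the top 100 over time.

    first_seen: set of cloaked URLs found in the first measurement.
    later_snapshots: dict mapping hours elapsed to the set of URLs in the
    top 100 at that time. Returns a dict of hours -> fraction, ordered by time.
    """
    if not first_seen:
        return {}
    return {
        hours: len(first_seen & urls) / len(first_seen)
        for hours, urls in sorted(later_snapshots.items())
    }
```

Plotting these fractions against elapsed hours gives the kind of decay curve the slide summarizes.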

Cloaking Duration
Cloakers manage their pages similarly, independent of the search engine.
Pages are cloaked for long durations: over 80% remain cloaked past seven days.
Cloakers will want to maximize the time during which they benefit from cloaking by attracting customers to scam sites, or victims to malware sites.
It is difficult to recycle a cloaked page for reuse at a later time.

Cloaked Content
Users are redirected through a chain of advertising networks.
About half of the time a cloaked search result leads to some form of abuse.
Long-term SEO campaigns constantly change the search terms they are targeting and the hosts they are using.

Domain Infrastructure
Key resources for effectively deploying cloaking in a scam:
Access to Web sites
Access to domains
For TYPE 1 terms, the majority of cloaked search results are in .com.
For TYPE 2 terms, cloakers use the “reputation” of pages to boost their ranking in search results.
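A breakdown such as "the majority are in .com" can be obtained by tallying the top-level domain of each cloaked result. The sketch below is illustrative only; it takes the last dot-separated label of the host name, which is a simplification that ignores multi-label public suffixes such as `co.uk`.

```python
from collections import Counter
from urllib.parse import urlsplit


def tld_distribution(urls):
    """Count cloaked search results per top-level domain (last host label)."""
    counts = Counter()
    for url in urls:
        host = urlsplit(url).hostname
        if host and "." in host:
            counts[host.rsplit(".", 1)[1]] += 1
    return counts
```

Calling `tld_distribution(urls).most_common(1)` then returns the dominant TLD and its count.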

Search Engine Optimization
Since a major motivation for cloaking is to attract user traffic, we can extrapolate SEO performance from the search result positions the cloaked pages occupy.
Cloaking for TYPE 1 terms targets popular terms that are very dynamic, leaving limited time and heavy competition for performing SEO on those search terms.
Cloaking for TYPE 2 terms is a highly focused task on a static set of terms, which provides much longer time frames for performing SEO on cloaked pages for those terms.

Conclusion
Cloaking has become a standard tool in the scammer’s toolbox.
Cloaking adds significant complexity to differentiating legitimate Web content from fraudulent pages.
The majority of cloaked search results remain high in the rankings for 12 hours.
The pages themselves can persist far longer.
Search engine providers will need to further reduce the lifetime of cloaked results to demonetize the underlying scam activity.