1
Search Engine Optimization (SEO)
Most techniques = common sense. Search engine success = integration of common sense. Surface: search engines and search engine optimization. Increase sales, traffic, and conversions.
2
Agenda
What is a Search Engine? Examples of popular Search Engines. Search Engine statistics. Why is Search Engine marketing important? What is an SEO Algorithm? Steps to developing a good SEO strategy. Ranking factors. Basic tips for optimization.
My agenda for today will encompass the following high-level topics, and I have allocated 4-8 minutes for a Q&A session. I will also list resources if you are interested in exploring SEO and other internet marketing techniques such as marketing, PPC ads, in-line text, etc.
3
Examples of popular Search Engines
Today the undisputed leader in search engine usage is Google. Yahoo claims to have the largest index of all search engines, ranking over 5 billion pages. MSN is third; they have done a tremendous job of catching up to Google and Yahoo, but they still have a long way to go. This is a true example of the first-mover advantage. I don’t usually talk about Ask Jeeves, but I have been seeing an increased level of traffic from Ask.com, and according to recent statistics Ask now represents approximately 7-8% of all searches. Every search engine has a different algorithm and ranking. Our main focus today will be to look at the common and basic elements that every search engine company uses to rank and index pages.
5
How Do Search Engines Work?
Mechanics of a typical search. If you take nothing else away from this session today and only understand how spiders (crawlers) work, you will have understood 40% of search engines.
6
Results & ads returned ranked
7
Category of first result
8
Result for phrase query
9
How Do Search Engines Work?
Spiders “crawl” the web to find new documents (web pages, other documents), typically by following hyperlinks from websites already in their database.
Search engines index the content (text, code) in these documents by adding it to their databases, and then periodically update this content.
Search engines search their own databases when a user enters a search to find related documents (they do not search web pages in real time).
Search engines rank the resulting documents using an algorithm (a mathematical formula) that assigns various weights and ranking factors.
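The crawl, index, search, rank steps above can be sketched in a few lines. This is a minimal illustration, not how any real search engine is built: the three documents, the file names, and the "sum of term counts" ranking are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical "crawled" documents; a real spider would fetch these over HTTP.
DOCS = {
    "a.html": "chess openings for computer chess engines",
    "b.html": "computer hardware reviews",
    "c.html": "history of chess",
}

def build_index(docs):
    """Index step: map each term to {url: term count} (a tiny inverted index)."""
    index = defaultdict(dict)
    for url, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][url] = count
    return index

def search(index, query):
    """Search the index (not the live pages) and rank hits by summed term counts."""
    terms = query.lower().split()
    if not terms:
        return []
    # Keep only URLs that contain every query term.
    candidates = set(index.get(terms[0], {}))
    for term in terms[1:]:
        candidates &= set(index.get(term, {}))
    # Assumed toy ranking: more occurrences of the query terms ranks higher.
    scores = {url: sum(index[t][url] for t in terms) for url in candidates}
    return sorted(scores, key=lambda u: (-scores[u], u))

INDEX = build_index(DOCS)
```

Calling `search(INDEX, "chess")` looks only at `INDEX`, which mirrors the point that the engine searches its own database rather than the live Web.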
10
Search on the Web
Corpus: the publicly accessible Web, static + dynamic.
Goal: retrieve high-quality results relevant to the user’s need (not docs!).
Need: Informational (want to learn about something); Navigational (want to go to that page); Transactional (want to do something that is web-mediated: access a service, downloads, shop); Gray areas (find a good hub; exploratory search, “see what’s there”).
Example queries: Low hemoglobin; United Airlines; Tampere weather; Mars surface images; Nikon CoolPix; Car rental Finland; Abortion morality.
11
Search Engines as Info Gatekeepers
Search engines are becoming the primary entry point for discovering web pages. Ranking of web pages influences which pages users will view. Exclusion of a site from search engines will cut off the site from its intended audience. The privacy policy of a search engine is important. The “politics of search engines” argument holds that the Web is a public good and thus its resources should be distributed in accordance with public principles rather than market norms. “Googlearchy”: search engines bias the traffic of users according to their page ranking strategies, and it has been argued that they create a vicious cycle that amplifies the dominance of established and already popular sites. This bias could lead to a dangerous monopoly of information.
12
100+ Billion Searches / Month
According to Nielsen/NetRatings
13
Search Engine Wars
The battle for domination of the web search space is heating up! The competition is good news for users! Crucial: advertising is combined with search results! What if one of the search engines manages to dominate the space?
14
Yahoo!
Synonymous with the dot-com boom, probably the best-known brand on the web. Started off as a web directory service in 1994, acquired leading search engine technology in 2003. Has very strong advertising and e-commerce partners. Acquired Inktomi (a search engine) in 2003, and the search marketing company Overture (which had acquired AlltheWeb and AltaVista in 2003). At that time dropped its collaboration with Google.
15
Lycos!
One of the pioneers of the field. Introduced innovations that inspired the creation of Google. Introduced the concept of popularity of web sites.
16
Google
The verb “google” has become synonymous with searching for information on the web. Has raised the bar on search quality. Has been the most popular search engine in the last few years. Had a very successful IPO in August 2004. Is innovative and dynamic.
17
Live Search (was: MSN Search)
Microsoft: synonymous with PC software. Remember its victory in the browser wars with Netscape. Developed its own search engine technology only recently, officially launched in Feb. May link web search into its next version of Windows. Used Yahoo’s Inktomi and Overture until 2005.
18
Why is Search Engine marketing important?
80% of consumers find your website by first typing a query into the box on a search engine (Google, Yahoo, Bing). 90% choose a site listed on the first page. 85% of all traffic on the internet is referred by search engines. The top three organic positions receive 59% of user clicks. Cost-effective advertising. Clear and measurable ROI. Operates under this assumption: More (relevant) traffic + Good Conversion Rate = More Sales/Leads.
19
Experiment with query syntax
Default is AND, e.g. “computer chess” is normally interpreted as “computer AND chess”, i.e. both keywords must be present in all hits. “+chess” in a query means the user insists that “chess” be present in all hits. “computer OR chess” means that at least one of the keywords must be present in each hit. “"computer chess"” (typed with the quotation marks) means that the phrase “computer chess” must be present in all hits.
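To make the three behaviours concrete, here is a small sketch that evaluates AND, OR, and phrase queries over four made-up documents. The document texts and the function names are assumptions for illustration; real engines parse the query syntax themselves and differ in the details.

```python
# Hypothetical mini-collection: document id -> text.
DOCS = {
    1: "computer chess tournament results",
    2: "chess strategy for beginners",
    3: "buying a new computer",
    4: "chess on a computer screen",
}

def tokens(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def match_and(docs, *words):
    """Default behaviour: every keyword must be present in a hit."""
    return sorted(d for d, t in docs.items() if all(w in tokens(t) for w in words))

def match_or(docs, *words):
    """OR: at least one keyword must be present in a hit."""
    return sorted(d for d, t in docs.items() if any(w in tokens(t) for w in words))

def match_phrase(docs, phrase):
    """Quoted query: the words must appear next to each other, in order."""
    want = tokens(phrase)
    hits = []
    for d, t in docs.items():
        toks = tokens(t)
        if any(toks[i:i + len(want)] == want
               for i in range(len(toks) - len(want) + 1)):
            hits.append(d)
    return sorted(hits)
```

Document 4 contains both words but not the phrase, so it is returned by `match_and` and dropped by `match_phrase`.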
20
The most popular search keywords
AltaVista (1998) AlltheWeb (2002) Excite (2001) sex free applet porno download pictures mp3 software new chat uk nude
21
Free Keyword Research Tools
Keyword Tool and Traffic Estimator: identify competitive phrases and search frequencies. Compare search patterns across specific regions, categories, time frames and properties.
22
Web search users
Ill-defined queries: short length; imprecise terms; sub-optimal syntax (80% of queries without an operator); low effort in defining queries.
Wide variance in: needs; expectations; knowledge; bandwidth.
Specific behavior: 85% look over one result screen only (mostly above the fold); 78% of queries are not modified; 1 query/session; follow links (“the scent of information”) ...
23
How far do people look for results?
24
Architecture of a Search Engine
[Figure: architecture diagram with components User, The Web, Web spider, Indexer, Search, Indexes, Ad indexes]
25
Web Crawling
26
Q: How does a search engine know that all these pages contain the query terms?
A: Because all of those pages have been crawled.
27
Crawling picture
Sec. 20.2 [Figure: the Web, showing seed pages, URLs crawled and parsed, the URLs frontier, and the unseen Web]
28
Motivation for crawlers
Support universal search engines (Google, Yahoo, MSN/Windows Live, Ask, etc.). Vertical (specialized) search engines, e.g. news, shopping, papers, recipes, reviews, etc. Business intelligence: keep track of potential competitors, partners. Monitor Web sites of interest. Evil: harvest emails for spamming, phishing… Can you think of some others?
29
A crawler within a search engine
[Figure: pipeline diagram with components Web, googlebot, Page repository, Text & link analysis, Text index, PageRank, Ranker, Query, hits]
30
One taxonomy of crawlers
Many other criteria could be used: Incremental, Interactive, Concurrent, Etc.
31
Basic crawlers
This is a sequential crawler. Seeds can be any list of starting URLs. Order of page visits is determined by the frontier data structure. Stop criterion can be anything.
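A sequential crawler of this kind can be sketched as a single loop over a frontier. To keep the example runnable without network access, the Web is replaced by a hypothetical in-memory link graph (`WEB`), `get_links` stands in for "fetch the page and extract its links", and the stop criterion is simply a page budget; all of these are assumptions for illustration.

```python
from collections import deque

# Hypothetical link graph standing in for the Web: page -> outgoing links.
WEB = {
    "seed": ["a", "b"],
    "a": ["c", "seed"],
    "b": ["c", "d"],
    "c": [],
    "d": ["e"],
    "e": [],
}

def crawl(seeds, get_links, max_pages):
    """Visit pages one at a time until the frontier is empty or the budget is spent."""
    frontier = deque(seeds)   # the frontier decides the visiting order (FIFO here)
    seen = set(seeds)         # URLs already queued, so nothing is fetched twice
    visited = []
    while frontier and len(visited) < max_pages:   # stop criterion: page budget
        url = frontier.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

def toy_links(url):
    """Assumed stand-in for fetching a page and parsing out its hyperlinks."""
    return WEB.get(url, [])
```

With `crawl(["seed"], toy_links, max_pages=4)` the crawl stops after four pages even though the frontier still holds unvisited URLs.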
32
Graph traversal (BFS or DFS?)
Breadth First Search: implemented with a QUEUE (FIFO); finds pages along shortest paths; if we start with “good” pages, this keeps us close; maybe other good stuff…
Depth First Search: implemented with a STACK (LIFO); wanders away (“lost in cyberspace”).
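The only difference between the two traversals is which end of the frontier the next page is taken from. The sketch below shows this on a made-up site graph (`LINKS` and the page names are hypothetical).

```python
from collections import deque

# Hypothetical site structure: page -> pages it links to.
LINKS = {
    "home": ["news", "shop"],
    "news": ["story1", "story2"],
    "shop": ["item1"],
    "story1": ["archive"],
    "story2": [],
    "item1": [],
    "archive": [],
}

def traverse(start, links, strategy):
    """Return the visiting order for strategy 'bfs' (queue) or 'dfs' (stack)."""
    frontier = deque([start])
    seen = {start}
    order = []
    while frontier:
        # FIFO: take the oldest URL; LIFO: take the most recently added URL.
        page = frontier.popleft() if strategy == "bfs" else frontier.pop()
        order.append(page)
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return order
```

On this graph the breadth-first order visits both pages linked from `home` before going deeper, while the depth-first order follows one branch to its end first.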
33
Universal crawlers
Support universal search engines. Large-scale. Huge cost (network bandwidth) of crawl is amortized over many queries from users. Incremental updates to existing index and other data repositories.
34
Large-scale universal crawlers
Two major issues. Performance: need to scale up to billions of pages. Policy: need to trade off coverage, freshness, and bias (e.g. toward “important” pages).
35
Large-scale crawlers: scalability
Need to minimize overhead of DNS lookups. Need to optimize utilization of network bandwidth and disk throughput (I/O is the bottleneck). Use asynchronous sockets: multi-processing or multi-threading do not scale up to billions of pages; non-blocking I/O keeps hundreds of network connections open simultaneously; sockets are polled to monitor completion of network transfers.
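The idea of many simultaneous non-blocking transfers can be illustrated with Python's `asyncio`. This is only a sketch: `fake_fetch` is a hypothetical stand-in that sleeps instead of opening a socket, the URLs are invented, and the connection limit of 100 is an arbitrary assumption.

```python
import asyncio

async def fake_fetch(url, delay=0.01):
    """Stand-in for a non-blocking HTTP request (hypothetical; no network is used)."""
    await asyncio.sleep(delay)   # while one transfer waits, the others make progress
    return url, f"<html>{url}</html>"

async def crawl_batch(urls, max_connections=100):
    """Fetch all URLs concurrently with at most max_connections in flight."""
    limit = asyncio.Semaphore(max_connections)

    async def bounded(url):
        async with limit:
            return await fake_fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))

def run_demo(n=200):
    """Run the batch on n invented URLs from a single thread."""
    urls = [f"http://example.com/page{i}" for i in range(n)]
    return asyncio.run(crawl_batch(urls))
```

All transfers share one thread; the event loop does the polling that the slide describes.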
36
Universal crawlers: Policy
Coverage: new pages get added all the time. Can the crawler find every page? Freshness: pages change over time, get removed, etc. How frequently can a crawler revisit? Trade-off! Focus on most “important” pages (crawler bias)? “Importance” is subjective.
37
Web coverage by search engine crawlers
This assumes we know the size of the entire Web. Do we? Can you define “the size of the Web”?
38
Maintaining a “fresh” collection
Universal crawlers are never “done”. High variance in rate and amount of page changes. HTTP headers (Last-Modified, Expires) are notoriously unreliable. Solution: estimate the probability that a previously visited page has changed in the meantime, and prioritize by this probability estimate.
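One possible way to turn "estimate the probability of change" into a number is sketched below. It assumes page changes follow a Poisson process with a known average rate per day; that model, the function names, and the example pages are assumptions for illustration and are not stated in the slides.

```python
import math

def change_probability(changes_per_day, days_since_last_visit):
    """P(at least one change since the last visit) under an assumed Poisson model."""
    return 1.0 - math.exp(-changes_per_day * days_since_last_visit)

def revisit_order(pages):
    """Sort (url, changes_per_day, days_since_last_visit) tuples, most likely changed first."""
    ranked = sorted(pages, key=lambda p: change_probability(p[1], p[2]), reverse=True)
    return [url for url, _rate, _age in ranked]

# Hypothetical pages: a news front page, a blog, and a nearly static page.
PAGES = [
    ("static-page", 0.001, 30),
    ("news-front", 2.0, 1),
    ("blog", 0.1, 3),
]
```

Here the rarely changing page is revisited last even though it was visited longest ago, because its estimated change probability is still small.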
39
Do we need to crawl the entire Web?
If we cover too much, it will get stale. There is an abundance of pages in the Web. For PageRank, pages with very low prestige are largely useless. What is the goal? General search engines: pages with high prestige. News portals: pages that change often. Vertical portals: pages on some topic. What are appropriate priority measures in these cases? Approximations?
40
Complications
Web crawling isn’t feasible with one machine: all of the above steps distributed. Malicious pages: spam pages; spider traps, including dynamically generated ones. Even non-malicious pages pose challenges: latency/bandwidth to remote servers vary; webmasters’ stipulations (how “deep” should you crawl a site’s URL hierarchy?); site mirrors and duplicate pages. Politeness: don’t hit a server too often.
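Politeness is often implemented as a minimum gap between requests to the same server. The sketch below only plans start times rather than sleeping, so it runs instantly; the 5-second gap, the function name, and the example hosts are assumptions.

```python
from urllib.parse import urlsplit

def schedule(urls, min_delay=5.0):
    """Give each URL a start time so requests to one host are min_delay seconds apart."""
    next_free = {}   # host -> earliest time the next request to it may start
    plan = []
    for url in urls:
        host = urlsplit(url).netloc
        start = next_free.get(host, 0.0)
        plan.append((start, url))
        next_free[host] = start + min_delay
    return sorted(plan)   # earliest requests first

# Hypothetical frontier with two URLs on the same host.
URLS = ["http://a.example/1", "http://a.example/2", "http://b.example/1"]
```

Requests to different hosts can start at the same time; only the second request to `a.example` is pushed back.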
41
robots.txt
Your guide for the search engines
42
What is robots.txt? It’s a file in the root of your website that can either allow or restrict search engine robots from crawling pages on your website.
43
How does it work? Before a search engine robot crawls your website, it will first look for your robots.txt file to find out where you want it to go. There are 3 things you should keep in mind:
1. Robots can ignore your robots.txt. Malware robots scanning the web for security vulnerabilities, or email address harvesters used by spammers, will not care about your instructions.
2. The robots.txt file is public. Anyone can see what areas of your website you don’t want robots to see.
3. Search engines can still index (but not crawl) a page you’ve disallowed if it’s linked to from another website. The search results will then show only the URL, usually with no title or information snippet. Instead, make use of the robots meta tag for that page.
44
What to put in your robots.txt file
User-agent: This is the line where you define which robot you’re talking to. It’s like saying hello to the robot. Example: “User-agent: *” addresses all robots; named robots include Googlebot (Google) and Slurp (Yahoo).
Disallow: This tells the robots what you don’t want them to crawl on your site. Examples: “Disallow: /” (do not crawl anything on my site); “Disallow: /images/”.
Allow: This tells the robots what you want them to crawl on your site. Example: “Allow: /”.
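From the crawler's side, these directives can be read with Python's standard `urllib.robotparser`. The robots.txt content and the example.com URLs below are invented for illustration; the parser applies the first rule whose path matches, so the order of the lines matters in this sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block /images/ for every robot, allow the rest.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Allow: /
"""

def make_parser(text):
    """Parse robots.txt content given as a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = make_parser(ROBOTS_TXT)

def allowed(url, agent="Googlebot"):
    """Ask whether the given robot may crawl the URL."""
    return parser.can_fetch(agent, url)
```

A polite crawler would call something like `allowed(url)` before every request and skip the URL when the answer is `False`.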
45
What to put in your robots.txt file
* (Asterisk / wildcard): With the * symbol, you tell the robots to match any number of any characters. Very useful, for example, when you don’t want your internal search result pages to be indexed. Example: “Disallow: *contact*” (do not crawl any URLs containing the word contact).
$ (Dollar sign / ends with): The dollar sign tells the robots that it is the end of the URL. Example: “Disallow: *.pdf$”.
# (Hash / comment): You can add comments after the “#” symbol, either at the start of a line or after a directive.
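How a robot might interpret `*` and `$` can be sketched by translating each pattern into a regular expression. This is a hand-written illustration under the assumption that a pattern is matched from the start of the URL path; individual search engines may differ in the details, and the function names are hypothetical.

```python
import re

def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")      # "$" means the URL must end here
    if anchored:
        pattern = pattern[:-1]
    # "*" matches any run of characters; everything else is matched literally.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(path, disallow_patterns):
    """True if any Disallow pattern matches the path from its beginning."""
    return any(rule_to_regex(p).match(path) for p in disallow_patterns)

# The two example rules from the slide.
RULES = ["*contact*", "*.pdf$"]
```

With these rules, `/files/report.pdf` is blocked but `/files/report.pdf?page=2` is not, because the `$` requires the URL to end in `.pdf`.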
46
What to put in your robots.txt file
Crawl-delay: This directive asks the robot to wait a certain number of seconds after each time it has crawled a page on your website. Example: “Crawl-delay: 5”.
Request-rate: Here you tell the robot how many pages you want it to crawl within a certain number of seconds. The first number is pages, and the second number is seconds. Example: “Request-rate: 1/5 # load 1 page per 5 seconds”.
Visit-time: It’s like opening hours, i.e. when you want the robots to visit your website. This can be useful if you don’t want the robots to visit your website during busy hours (when you have lots of human visitors). Example: “Visit-time: # only visit between 21:00 (9PM) and 05:00 (5AM) UTC (GMT)”.
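Python's `urllib.robotparser` (3.6 and later) can read the Crawl-delay and Request-rate values; Visit-time is not covered by this sketch. The robots.txt text and the robot name `ExampleBot` are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt using the two rate directives from the slide.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Request-rate: 1/5  # load 1 page per 5 seconds
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def pacing(agent="ExampleBot"):
    """Return (crawl delay in seconds, pages per window, window length in seconds)."""
    delay = parser.crawl_delay(agent)
    rate = parser.request_rate(agent)   # named tuple with .requests and .seconds
    return delay, rate.requests, rate.seconds
```

A crawler honouring these values would wait `delay` seconds between fetches from this site.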
47
Test your page
48
SEO
Search engine optimization
49
What is SEO? SEO = Search Engine Optimization
Refers to the process of “optimizing” both the on-page and off-page ranking factors in order to achieve high search engine rankings for targeted search terms. Refers to the “industry” that has been created around using keyword searching as a means of increasing relevant traffic to a website.
51
What is an SEO Algorithm?
Top Secret! Only select employees of a search engine company know for certain.
Reverse engineering, research and experiments give SEOs (search engine optimization professionals) a “pretty good” idea of the major factors and approximate weight assignments.
The SEO algorithm is constantly changed, tweaked & updated.
Websites and documents being searched are also constantly changing.
Varies by search engine: some give more weight to on-page factors, some to link popularity.
53
A good SEO strategy:
Research desirable keywords and search phrases (WordTracker, Overture, Google AdWords).
Identify search phrases to target (should be relevant to business/market, obtainable and profitable).
“Clean” and optimize a website’s HTML code for appropriate keyword density, title tag optimization, internal linking structure, headings and subheadings, etc.
Help in writing copy to appeal to both search engines and actual website visitors.
Study competitors (competing websites) and search engines.
Implement a quality link building campaign.
Add quality content.
Constant monitoring of rankings for targeted search terms.
54
Ranking factors
On-Page Factors (Code & Content): #1 - Content, Content, Content (body text) <body>; #2 - Keyword frequency & density; #3 - Title tags <title>; #4 - ALT image tags; #5 - Header tags <h1>; #6 - Hyperlink text.
Off-Page Factors: #1 - Anchor text; #2 - Link popularity (“votes” for your site), which adds credibility.
Anchor text is the visible hyperlinked text on the page. Anchor text is usually used to indicate the subject matter of the page that it links to. For example, the text “7th world congress ebusiness” indicates to visitors that they can expect to see content about a conference pertaining to ebusiness if they visit the link. This pattern of usage has been applied in search engine algorithms to enhance the relevance of the “target” or “landing page” URL for the keywords appearing within the anchor text.
55
What a Search Engine Sees
View > Source (HTML code)
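What a robot "sees" in the HTML source can be approximated by pulling out the on-page elements named earlier: title, `<h1>` headings, image ALT text, and anchor text. The sketch uses Python's standard `html.parser`; the sample page and the class name are invented, and real crawlers extract far more than this.

```python
from html.parser import HTMLParser

class OnPageFactors(HTMLParser):
    """Collect title, <h1> text, image alt text, and anchor text from HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []
        self.alt_texts = []
        self.anchor_texts = []
        self._open = None     # tag whose text is currently being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1", "a"):
            self._open = tag
            self._buffer = []
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alt_texts.append(alt)

    def handle_data(self, data):
        if self._open:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._open:
            return
        text = " ".join("".join(self._buffer).split())   # normalise whitespace
        if tag == "title":
            self.title = text
        elif tag == "h1":
            self.headings.append(text)
        else:
            self.anchor_texts.append(text)
        self._open = None

# Hypothetical page source.
PAGE = """<html><head><title>7th World Congress on eBusiness</title></head>
<body><h1>Conference programme</h1>
<img src="hall.jpg" alt="Main conference hall">
<p>Read about the <a href="/congress">7th world congress ebusiness</a>.</p>
</body></html>"""

def extract(html):
    """Run the parser over a page and return the collected factors."""
    factors = OnPageFactors()
    factors.feed(html)
    factors.close()
    return factors
```

Everything outside these elements (layout, images themselves, scripts) is invisible to this extractor, which is the point of checking View > Source.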
56
Pay Per Click
PPC ads appear as “sponsored listings”. Companies bid on the price they are willing to pay “per click”. Typically have very good tracking tools and statistics. Ability to control ad text. Can set budgets and spending limits. Google AdWords and Overture are the two leaders.
57
PPC vs. “Organic” SEO
Pay-Per-Click | “Organic” SEO
results in 1-2 days | results take 2 weeks to 4 months
easier for a novice or one with little knowledge of SEO | requires ongoing learning and experience to achieve results
ability to turn on and off at any moment | very difficult to control flow of traffic
generally more costly per visitor and per conversion | generally more cost-effective, does not penalize for more traffic
fewer impressions and exposure | SERPs are more popular than sponsored ads
easier to compete in highly competitive market space (but it will cost you) | very difficult to compete in highly competitive market space
ability to generate exposure on related sites (AdSense) | ability to generate exposure on related websites and directories
ability to target “local” markets | more difficult to target local markets
better for short-term and high-margin campaigns | better for long-term and lower-margin campaigns
58
Keys to Successful SEO Strategy
1. Do not underestimate the importance of keyword research.
2. Be sure to include the proper tags in your page coding.
3. You must have optimized content! (3-5 uses of keyword per 250 words)
4. Use content marketing.
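The "3-5 uses of keyword per 250 words" guideline in point 3 is simple arithmetic and can be checked with a short script. The word-splitting rule and the function names are assumptions for this sketch; the 3-5 range is taken from the slide.

```python
import re

def uses_per_250_words(text, keyword):
    """Count keyword (or key phrase) occurrences, scaled to a 250-word passage."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    key = keyword.lower().split()
    if not words or not key:
        return 0.0
    hits = sum(words[i:i + len(key)] == key
               for i in range(len(words) - len(key) + 1))
    return hits * 250 / len(words)

def in_recommended_range(text, keyword, low=3, high=5):
    """True if the keyword appears 3-5 times per 250 words (the slide's guideline)."""
    return low <= uses_per_250_words(text, keyword) <= high

# Hypothetical 250-word copy: 4 uses of "seo" padded with filler words.
SAMPLE = "seo " * 4 + "filler " * 246
```

For `SAMPLE` the rate is exactly 4 uses per 250 words, which falls inside the recommended range.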
59
Keyword Selection
Competition: How much competition (large, authority sites) is there for the particular keyword?
Marketing/Brand Relevance: How closely does the keyword match your product/service offering, messaging, goals and objectives?
Optimization Opportunity: Is there already a logical place on the site to optimize for the particular keyword?
Search Frequency: How many people are searching on the particular keyword?
Together these four criteria lead to the Recommended Keywords.