
Presentation transcript:

Proxy Servers

2 What Is a Proxy Server?
- Intermediary server between clients and the actual server
- Proxy processes the request
- Proxy processes the response
- An intranet proxy may restrict all outbound/inbound requests to the intranet server

3 What Does a Proxy Server Do?
- Sits between client and server
- Receives the client request
- Decides if the request will go on to the server
- May have a cache and may respond from the cache
- Acts as the client with respect to the server
- Uses one of its own IP addresses to get the page from the server

4 Usual Uses for Proxies
- Firewalls
- Employee web use control ( etc.)
- Web content filtering (kids)
  - Black lists (sites not allowed)
  - White lists (sites allowed)
  - Keyword filtering of page content

5 User Perspective
- The proxy is invisible to the client
- The IP address of the proxy is the one used, or the browser is configured to go there
- Speeds up retrieval if caching is used
- Can implement profiles or personalization

6 Main Proxy Functions
- Caching
- Firewall
- Filtering
- Logging

7 Web Cache Proxy
- Our concern is not with the browser cache!
- Store frequently used pages at the proxy rather than asking the server to find or create them again
- Why?
  - Reduce latency: pages are faster to get from the proxy, which makes the server seem more responsive
  - Reduce traffic: fewer requests reach the actual server

8 Proxy Caches
- A proxy cache serves hundreds or thousands of users
- Corporations and intranets often use them
- The most popular requests are generated only once
- Good news: proxy cache hit rates often reach 50%
- Bad news: stale content (e.g., stock quotes)

9 How Does a Web Cache Work?
- Set of rules in either or both:
  - Proxy admin
  - HTTP header

10 Don’t Cache Rules
- HTTP header
  - Cache-control: max-age=xxx, must-revalidate
  - Expires: date…
  - Last-modified: date…
  - Pragma: no-cache (doesn’t always work!)
- Object is authenticated or secure
- Fails proxy filter rules
  - URL
  - Meta data
  - MIME type
  - Contents
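
The "don't cache" decision above can be sketched as a small function. This is a minimal illustration, not any particular proxy's logic: the function name `is_cacheable` and its `authenticated` and `passes_filter` parameters are hypothetical, and the `no-store`, `no-cache`, and `private` directives are standard HTTP Cache-Control values that the slide does not list explicitly.

```python
def is_cacheable(headers, authenticated=False, passes_filter=True):
    """Return False when any "don't cache" rule applies to a response."""
    # HTTP header names are case-insensitive, so normalise them first.
    h = {name.lower(): value.lower() for name, value in headers.items()}

    # Split "max-age=60, must-revalidate" into individual directives.
    directives = {d.strip() for d in h.get("cache-control", "").split(",") if d.strip()}

    # Assumption: a shared proxy cache must not store these responses.
    if directives & {"no-store", "no-cache", "private", "max-age=0"}:
        return False

    # Older HTTP/1.0 style hint; the slide notes it does not always work.
    if "no-cache" in h.get("pragma", ""):
        return False

    # Authenticated/secure objects and objects failing filter rules are skipped.
    if authenticated or not passes_filter:
        return False

    return True
```

A response carrying `Cache-Control: max-age=3600, must-revalidate` would be stored under this sketch; how long it stays usable is a separate freshness question.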

11 Getting From Cache
- Use the cache copy if it is fresh
  - Within date constraint
  - Used recently and modified date is not recent
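
One way to read the freshness test is sketched below. It assumes the cache recorded when it fetched the copy and any `max-age`, `Expires`, and `Last-Modified` values. The fallback that treats a copy as fresh for 10% of the time since it was last modified is a common heuristic, used here as an assumption; the function and parameter names are illustrative only.

```python
from datetime import datetime, timedelta


def is_fresh(now, fetched, max_age=None, expires=None,
             last_modified=None, heuristic_fraction=0.1):
    """Decide whether a cached copy may be served without contacting the server."""
    age = now - fetched

    # Explicit date constraints win: max-age (seconds) first, then Expires.
    if max_age is not None:
        return age <= timedelta(seconds=max_age)
    if expires is not None:
        return now <= expires

    # Heuristic: a page that had not changed for a long time before it was
    # fetched is assumed to stay valid for a fraction of that period.
    if last_modified is not None:
        return age <= (fetched - last_modified) * heuristic_fraction

    # No information at all: treat the copy as stale.
    return False
```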

12 2. Firewalls
- Proxies for security protection
- More on this later

13 3. Filtering at the Proxy
1. URL lists (black and white lists)
2. Meta data
3. Content filters

[Diagram: filtering a Web doc using a label base (ratings, URLs), URL lists, and keywords]

15 The Problem: the Web
- 1 billion documents (April 2000)
- Average query is 2 words (e.g., Sara name)
- Continual growth
- Balance global indexing and access against unintentional access to inappropriate material

16 Filtering Application Types
- Proxies
  - Black lists
  - White lists
  - Keyword profiles
  - Labels

17 Black and White Lists
- Black list: URLs the proxy will not access
- White list: URLs the proxy will allow access to

18 How Is Filtering/Selection Done?
- Build a profile of preferences
- Match input against the profile using rules

19 Black and White Lists
- Black list of URLs
  - No access allowed
- White list of URLs
  - Access permitted
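
A list check at the proxy might look like the sketch below. It assumes lists are kept as host names and that an entry also covers its subdomains; both are illustrative choices, as are the function name `allowed` and the example hosts.

```python
from urllib.parse import urlsplit


def allowed(url, blacklist=(), whitelist=None):
    """Apply a white list if one is configured, otherwise a black list."""
    host = (urlsplit(url).hostname or "").lower()

    def listed(entries):
        # An entry matches the host itself or any of its subdomains.
        return any(host == e or host.endswith("." + e) for e in entries)

    if whitelist is not None:
        # White list: only listed sites are permitted.
        return listed(whitelist)
    # Black list: everything is permitted except listed sites.
    return not listed(blacklist)
```

The two modes differ in their default: a black list lets unknown sites through, while a white list blocks them.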

20 Lists in Action
- 1 billion documents!
- Who builds the lists?
- Who updates them?
- Frequency of updates

21 Labels
- Metadata tags
- Rule driven: PICS rules, for example
- Labels are part of the document or separate
- Separate = label bureau

22 Labels
- Metadata (goes with page)
- Label bureau (stored separately from page)

23 Meta Data as Part of HTML Doc
<META HTTP-EQUIV="keywords" CONTENT="federal">
<META HTTP-EQUIV="keywords" CONTENT="tax">
……
Browser and/or proxy interpret the metadata
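
As a rough sketch of how a proxy could read such tags, the example below uses Python's standard `html.parser` module to collect keyword metadata. The class and function names are hypothetical, and accepting `NAME="keywords"` alongside the slide's `HTTP-EQUIV` form is an added assumption.

```python
from html.parser import HTMLParser


class MetaKeywords(HTMLParser):
    """Collect the CONTENT of every keywords META tag in a page."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        a = dict(attrs)
        label = (a.get("http-equiv") or a.get("name") or "").lower()
        if label == "keywords" and a.get("content"):
            self.keywords.append(a["content"])


def extract_keywords(html):
    parser = MetaKeywords()
    parser.feed(html)
    return parser.keywords
```

The extracted keywords can then be matched against a filtering profile in the same way as words taken from the page text.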

24 Metadata Apart From Doc
- Label bureaus
- A request for a doc is also a request for labels from one or more label bureaus
- Who makes the labels?
  - Text analysis
  - Community of users
  - Creator of document

25 Labels: Collaborative Filtering
[Diagram: Search Engine, Web Site, Label Bureau A, Label Bureau B, Rating Service Labels, Author Labels]

26 PICS and PICS Rules
- Tools for communities to use profiles and control/direct access
- Structure designed by the W3 Consortium
- Content designed by communities of users

27 PICS Rating Data
(PICS1-1 “http// by “John Doe” labels on “ ” until “ ” for ratings (violence 2 blood 1 language 4) )

28 Using URL List Filtering
(PicsRule-1.1 (Policy (RejectByURL ( Policy (AcceptIf “otherwise”) )

29 Using the PICS Data
(PicsRule-1.1
 (serviceinfo ( shortname "PTA" bureauURL UseEmbedded "N" )
 Policy (RejectIf "((PTA.violence >3) or (PTA.language >2))")
 Policy (AcceptIf "otherwise")
)
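
The effect of the two policies can be imitated in a few lines. This is not a PICSRules parser: the function `decide` and its dictionary arguments are hypothetical stand-ins, and treating a missing rating as 0 is an assumption.

```python
def decide(ratings, limits):
    """Reject if any rating exceeds its limit, otherwise accept."""
    for category, limit in limits.items():
        # Assumption: a category with no rating counts as 0.
        if ratings.get(category, 0) > limit:
            return "reject"
    # Mirrors the final Policy (AcceptIf "otherwise").
    return "accept"


# Limits taken from the rule: violence > 3 or language > 2 is rejected.
PTA_LIMITS = {"violence": 3, "language": 2}
```

With the ratings from slide 27 (violence 2, blood 1, language 4), `decide({"violence": 2, "blood": 1, "language": 4}, PTA_LIMITS)` returns "reject", because the language rating is above 2.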

30 Example: Medical PICS Labels
- Su – UMLS vocab word:
- Aud – audience: 1-patient, 3-para, 5-GP, etc.
- Ty – information type: 5-scientist, 3-patient, 4-prod
- C – country: 1-Can, 4-Afghan, etc.
- Etc.
- Ratings(su aud 3:5 Ty 3 C 1)

31 User Profiles for Labels
- Rules for interpreting ratings
- Based on
  - User preferences
  - User access privileges
- Who keeps these?
- Who updates these?
- How fine is the granularity?

32 Labels and Digital Signatures
Labels can also be used to carry digital signature and authority information

33 Example
(''byKey'' ((''N'' ''aba ='') (''E'' ''abcdefghijklmnop=''))) (''on'' '' T22: '') (''SigCrypto'' ''aba =='')) (''Signature'' '' DSig-label/DSS-1_0'' (''ByName'' graz.ac.at'') (''on'' '' T22: '') (''SigCrypto'' ((''R'' ''aba '') (''S'' ''casdfkl3r489'')))))

34 Proxy level (hidden)

35 Text Analysis of Page Content
- Proxy examines the text of a page before showing it
- Generally keyword based
- Profile of ‘black’ and/or ‘white’ keywords

36 Profiles for Text Analysis
- Keywords (+ weights sometimes)
- ‘Reflect’ interest of user or user group
- May be used to eliminate pages
  - ‘All but’
- May be used to select pages
  - ‘Only those’

37 Keyword Matching Algorithms
1. Extract keywords
2. Eliminate ‘noisy’ words with stop list (1/3)
3. Stem (computer, compute, computation)
4. Match to profile
5. Evaluate ‘value’ of match
6. Check against a threshold for match
7. Show or throw!
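
The seven steps can be sketched end to end as follows. Several parts are assumptions made for illustration: the suffix-stripping `stem` function is a crude stand-in for a real stemmer, the match 'value' is taken to be the average profile weight per remaining word, and the 0.2 threshold is arbitrary. The stop words are the ones listed on the stop list slide.

```python
import re

# Stop words as listed on the stop list slide.
STOP = set("the for of on and is to with in by a as be this will are from "
           "that or at been an was were have has it".split())


def stem(word):
    """Crude suffix stripping (illustrative only, not a real stemmer)."""
    for suffix in ("ation", "ing", "ers", "er", "es", "e", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[:-len(suffix)]
    return word


def score(text, profile):
    """Steps 1-5: extract, drop stop words, stem, match, evaluate."""
    words = re.findall(r"[a-z]+", text.lower())
    stems = [stem(w) for w in words if w not in STOP]
    if not stems:
        return 0.0
    # Profile maps stemmed keywords to weights; unmatched stems add 0.
    return sum(profile.get(s, 0.0) for s in stems) / len(stems)


def show(text, profile, threshold=0.2):
    """Steps 6-7: compare with a threshold, then show or throw."""
    return score(text, profile) >= threshold
```

Under this sketch "computer", "compute", and "computation" all reduce to the same stem, so a single profile entry covers all three forms.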

38 Stop List (35%)
the, of, and, to, in, a, be, will, from, or, been, was, have, it,
for, on, is, with, by, as, this, are, that, at, an, were, has
(27 words)

39 Matching Profile to Page
- Similarity?
- How many profile terms occur in the doc?
- How often?
- How many docs does the term occur in?
- How important is the term to the profile?

40 Cosine Similarity Measurement
- Profile terms weighted PW (0,1): importance
- Document terms weighted TW (0,1)
  - frequency in doc
  - frequency in whole set
- Overall closeness of doc to profile:
  similarity = Σ (TW × PW) / √( Σ TW² × Σ PW² ), with each sum taken over all profile terms
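
A small sketch of this measure is given below, using the standard cosine form in which the numerator sums TW × PW and the denominator is the square root of the product of the two sums of squares. The dictionaries mapping terms to weights and the function name `cosine` are illustrative assumptions; following the slide, all sums run over the profile terms.

```python
import math


def cosine(tw, pw):
    """Closeness of a document (term weights tw) to a profile (weights pw)."""
    # A profile term missing from the document has weight 0.
    doc = [tw.get(term, 0.0) for term in pw]
    prof = list(pw.values())

    numerator = sum(t * p for t, p in zip(doc, prof))
    denominator = math.sqrt(sum(t * t for t in doc) * sum(p * p for p in prof))

    # No overlap at all (or an empty profile) gives similarity 0.
    return numerator / denominator if denominator else 0.0
```

The result is 1 when the document weights are proportional to the profile weights and falls toward 0 as fewer profile terms appear in the document.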

41 What works well?

42 What’s the Problem?
- Site labels
  - Who does them?
  - Are they authentic?
  - Has the source changed?
  - A billion docs?
- Black and white lists
  - Ditto
- Text analysis of page contents
  - Poor results