COS 109 Monday November 23 Housekeeping –Lab 6 and Problem Set 7 due dates Lab 6 is due by midnight on Friday November 27 Problem Set 7 is due by 5 PM.


1 COS 109 Monday November 23 Housekeeping –Lab 6 and Problem Set 7 due dates Lab 6 is due by midnight on Friday November 27 Problem Set 7 is due by 5 PM on Monday November 30 –Because these deadlines have been extended, there will be no further extensions –Final exam – January 18 (Monday) at 7:30PM Today’s class –A few more words about the internet –The World Wide Web

2 Grades on Problem set 6 Average score 35.8; a few people did not complete the assignment

3 The geography of the internet In 2012, there were 903.9 million Internet hosts –USA 505M (498M in 2011) –Japan 64.5M –Brazil 26.6M –Italy 25.7M –China 20.6M –Germany 20.0M –… –Iraq 26 –Guam 23 –North Korea 8 –Chad 6 Source: CIA Factbook

4 Internet Users Worldwide Internet Users (2014 Est.) –China 626M –European Union 398M –USA 276.6M –India 237.3M –Japan 109.3M –Brazil 108.2M –Russia 84.4M –Germany 70.3M –Nigeria 66.6M –Total Worldwide 3.2B

5 The backbone of the internet http://upload.wikimedia.org/wikipedia/commons/d/d2/Internet_map_1024.jpg http://internet-map.net/

6 Let's register an internet domain http://www.directnic.com

7 Who manages this? Internet Corp. for Assigned Names and Numbers (ICANN) –Formed in October 1998, –non-profit, private-sector corporation –broad coalition of the Internet's business, technical, academic, and user communities. –recognized by the U.S. and other governments as the global consensus entity to coordinate the technical management of the Internet's domain name system, the allocation of IP address space, the assignment of protocol parameters, and the management of the root server system. –funded through the many registries and registrars that comprise the global domain name and Internet addressing systems. ICANN was formed in 1998. It is a not-for-profit public-benefit corporation with participants from all over the world dedicated to keeping the Internet secure, stable and interoperable. It promotes competition and develops policy on the Internet’s unique identifiers.* ICANN doesn’t control content on the Internet. It cannot stop spam and it doesn’t deal with access to the Internet. But through its coordination role of the Internet’s naming system, it does have an important impact on the expansion and evolution of the Internet.* * From http://www.icann.org/en/about/

8 What does ICANN govern? DNS – domain name system –Relates names to numbers TLD – top level domains –Originally there were 7: .com, .edu, .gov, .int, .mil, .net, .org –200+ country code top level domains –1000+ gTLD (generic top level domains) –.academy, .accountant, .apartments, .biz, .black, .cool, .dad, .money, .ooo, .sucks, .vodka, .xxx, .zone, … Management –One company (called a registry) is in charge of each TLD. –A large number of companies (called registrars) can sell (and manage) names within a TLD

9 How does ICANN govern Draws up contracts with each registry Runs an accreditation system for registrars Oversees IP addresses (through companies) Oversees root servers –Root servers are 13 addresses on the Internet where complete address tables can be found

10 What about the root servers? What do they do? –Ultimately resolve addresses With help from top level domains cs.princeton.edu -> .edu TLD to find princeton -> princeton.edu to find cs.princeton.edu –But things change slowly, so There are intermediate name servers which cache addresses Very few address queries actually come to a root server.
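
The lookup chain on this slide (root, then the .edu TLD, then princeton.edu, with an intermediate server caching the answer) can be sketched as a toy model. Everything in it is an assumption made for illustration: the server names, the lookup tables, and the address (taken from a range reserved for documentation) are invented, and no real DNS traffic is involved.

```python
# Toy model of hierarchical name resolution with a caching resolver.
# All tables, server names, and the address are hypothetical.

ROOT = {"edu": "edu-tld-server"}                       # root knows who runs each TLD
TLD = {"edu-tld-server": {"princeton.edu": "princeton-ns"}}
AUTHORITATIVE = {"princeton-ns": {"cs.princeton.edu": "192.0.2.10"}}


class CachingResolver:
    """Intermediate name server that remembers earlier answers."""

    def __init__(self):
        self.cache = {}
        self.root_queries = 0   # how often we had to go all the way to the root

    def resolve(self, name):
        name = name.lower()                 # host names are case-insensitive
        if name in self.cache:
            return self.cache[name]         # answered without touching the root

        labels = name.split(".")
        tld = labels[-1]                    # "edu"
        domain = ".".join(labels[-2:])      # "princeton.edu"

        self.root_queries += 1
        tld_server = ROOT[tld]                      # root -> .edu TLD
        name_server = TLD[tld_server][domain]       # .edu -> princeton.edu
        address = AUTHORITATIVE[name_server][name]  # princeton.edu -> cs.princeton.edu

        self.cache[name] = address
        return address


resolver = CachingResolver()
first = resolver.resolve("cs.princeton.edu")
second = resolver.resolve("Cs.princeton.edu")   # served from the cache
```

After both calls `resolver.root_queries` is still 1, which is the point of the last bullet: with caching in between, very few queries reach a root server.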

11 List of root servers
Hostname            IP Addresses                       Manager
a.root-servers.net  198.41.0.4, 2001:503:ba3e::2:30    VeriSign, Inc.
b.root-servers.net  192.228.79.201, 2001:500:84::b     University of Southern California (ISI)
c.root-servers.net  192.33.4.12, 2001:500:2::c         Cogent Communications
d.root-servers.net  199.7.91.13, 2001:500:2d::d        University of Maryland
e.root-servers.net  192.203.230.10                     NASA (Ames Research Center)
f.root-servers.net  192.5.5.241, 2001:500:2f::f        Internet Systems Consortium, Inc.
g.root-servers.net  192.112.36.4                       US Department of Defense (NIC)
h.root-servers.net  128.63.2.53, 2001:500:1::803f:235  US Army (Research Lab)
i.root-servers.net  192.36.148.17, 2001:7fe::53        Netnod
j.root-servers.net  192.58.128.30, 2001:503:c27::2:30  VeriSign, Inc.
k.root-servers.net  193.0.14.129, 2001:7fd::1          RIPE NCC
l.root-servers.net  199.7.83.42, 2001:500:3::42        ICANN
m.root-servers.net  202.12.27.33, 2001:dc3::35         WIDE Project

12 Root servers Some are fixed in location (unicast) Others are distributed (anycast) –Queries are routed to the topologically closest of a group of receivers all identified by the same destination address. –So, a decentralized service is provided. –Anycast servers can be used to distribute the load of a distributed denial of service (DDoS) attack and so reduce its impact.
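
A minimal sketch of the anycast idea, assuming "topologically closest" can be modeled as the fewest network hops. The site names and hop counts are hypothetical, and in practice the choice is made by routers inside the network, not by a function the client calls.

```python
def pick_anycast_instance(hops_to_instance):
    """Return the instance reachable in the fewest hops.

    hops_to_instance maps an instance label to a hop count; every
    instance is assumed to answer on the same destination address.
    """
    return min(hops_to_instance, key=hops_to_instance.get)


# Hypothetical view from one client: three copies of the same root server.
view_from_client = {"frankfurt": 4, "tokyo": 11, "new-york": 7}
chosen = pick_anycast_instance(view_from_client)   # "frankfurt"
```

Clients in other parts of the network see different hop counts and so land on different instances, which is how the load of an attack ends up spread across sites.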

13 And where are they? Details at http://www.root-servers.org/

14 Peering points There are several hundred such points Largest is Deutscher Commercial Internet Exchange with 650+ members, a peak speed of 5000 Gbit/sec (average speed 3000 Gbit/sec) of connected capacity, and an average throughput of 1061 Gbit/sec (100% uptime since 1997) Source: Deutscher Commercial Internet Exchange, Quick Facts

15 Summarizing internet Ideas packets versus circuits –different models (mail vs phone) names and addresses –what is a computer called, how to find it routing –how to get from here to there protocols and standards –Internet works because of IP as common mechanism higher level protocols all use IP specific hardware technologies carry IP packets layering –divide system into layers each of which provides services to next higher level while calling on service of next lower level –a way to organize and control complexity, hide details

16 Summarizing internet technical issues: privacy & security are hard –data passes through shared, unregulated, dispersed media and sites scattered over the whole world –it's hard to control access & protect information along the way –many network technologies (e.g., Ethernet, wireless) use broadcast, so encryption is necessary to maintain privacy –many mechanisms are not robust against intentional misuse –it's easy to lie about who you are service guarantees are hard –no assurance of reliable delivery, let alone of bandwidth, delay or jitter some resources are running low –IPv4 addresses are pretty much all assigned –IPv6 (the next generation) uses 128-bit addresses; acceptance is growing, by necessity but the internet has handled exponential growth amazingly well
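
The address-space point can be checked with a little arithmetic. The sketch below relies on the fact that IPv4 addresses are 32 bits wide (not stated on the slide) and uses Python's standard `ipaddress` module; the two sample addresses are those listed for a.root-servers.net on slide 11.

```python
import ipaddress

# Size of each address space: 2 to the power of the address width in bits.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

# How many complete IPv4 address spaces fit inside IPv6.
ratio = ipv6_total // ipv4_total          # equals 2 ** 96

v4 = ipaddress.ip_address("198.41.0.4")
v6 = ipaddress.ip_address("2001:503:ba3e::2:30")

print(ipv4_total)                     # 4294967296
print(v4.version, v4.max_prefixlen)   # 4 32
print(v6.version, v6.max_prefixlen)   # 6 128
```

About 4.3 billion IPv4 addresses against 3.2 billion users (slide 4) leaves little headroom once servers, phones, and other devices are counted.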

18 To summarize How the internet works And now that we’ve reached the end of the internet

19 Website of the day google trends

20 Moving above internet pipes -- information flows to apps

21 Higher level protocols SSH: secure login SMTP: mail transfer HTTP: hypertext transfer -> Web protocol layering: –a single protocol can't do everything –higher-level protocols build elaborate operations out of simpler ones –each layer uses only the services of the one directly below – and provides the services expected by the layer above –all communication is between peer levels: layer N destination receives exactly the object sent by layer N source [Figure: layer stack, top to bottom: application; reliable transport service; connectionless packet delivery service; physical layer]

22 Encapsulation each piece of data at one level is wrapped up with a header and sent as a packet at the next lower level lowest level is what moves across specific network [Figure: data wrapped successively in HTTP, TCP, IP, and Ethernet headers]
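
Encapsulation can be sketched by wrapping a payload one layer at a time. The bracketed labels below merely stand in for headers; real HTTP, TCP, IP, and Ethernet headers carry structured fields, so this is an illustration of the nesting, not of the actual formats.

```python
LAYERS = ["HTTP", "TCP", "IP", "ether"]   # highest level first


def encapsulate(payload, layers=LAYERS):
    """Wrap the payload with one placeholder header per layer, top down."""
    for name in layers:
        payload = f"[{name}]".encode() + payload
    return payload


def decapsulate(packet, layers=LAYERS):
    """Strip the headers in the opposite order, as the receiver would."""
    for name in reversed(layers):
        header = f"[{name}]".encode()
        if not packet.startswith(header):
            raise ValueError(f"expected a {name} header")
        packet = packet[len(header):]
    return packet


frame = encapsulate(b"data")
print(frame)                 # b'[ether][IP][TCP][HTTP]data'
print(decapsulate(frame))    # b'data'
```

The outermost header belongs to the lowest layer, matching the slide: the lowest level is what actually moves across a specific network.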

23 One particular app – the (World Wide) Web a way to connect computers that provide information (servers) with computers that ask for it (clients like you and me) –uses the Internet, but it's not the same as the Internet URL (uniform resource locator, e.g., http://www.amazon.com) –a way to specify what information to find, and where HTTP (hypertext transfer protocol) –a way to request specific information from a server and get it back HTML (hypertext markup language) –a language for describing information for display browser (Firefox, Safari, Internet Explorer, Opera, Chrome, …) –a program for making requests, and displaying results embellishments –pictures, sounds, movies,... –loadable software the set of everything this provides

24 Web history 1989: Tim Berners-Lee at CERN –a way to make physics literature and research results accessible on the Internet 1991: first software distributions Feb 1993: Mosaic browser –Marc Andreessen at NCSA (Univ of Illinois) Mar 1994: Netscape –first commercial browser technical evolution managed by World Wide Web Consortium –non-profit organization at MIT, Berners-Lee is director –official definition of HTML and other web specifications –see www.w3.org

25 HTTP: Hypertext transfer protocol What happens when you click on a URL? client opens TCP/IP connection to host, sends request GET /filename HTTP/1.0 server returns –header info –HTML since server returns the text, it can be created as needed –can contain encoded material of many different types (MIME) URL format service://hostname/filename?other_stuff filename?other_stuff part can encode –data values from client (forms) –request to run a program on server (cgi-bin) –anything else [Figure: client sends GET url to server; server returns HTML]
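
To make the URL format and the request line concrete, here is a small sketch that splits a URL into its parts and builds the text a client would send. The URL is a made-up example, the `Host:` line is an addition that most servers expect but the slide does not show, and the code only constructs the request text without opening a connection.

```python
from urllib.parse import urlsplit, parse_qs


def build_get_request(url):
    """Split service://hostname/filename?other_stuff and build a GET request."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = f"GET {path} HTTP/1.0\r\nHost: {parts.hostname}\r\n\r\n"
    return parts.scheme, parts.hostname, request, parse_qs(parts.query)


# Hypothetical URL, used only for illustration.
service, host, request, form_values = build_get_request(
    "http://www.example.com/cgi-bin/search?q=dns&lang=en"
)
print(service)       # http
print(host)          # www.example.com
print(form_values)   # {'q': ['dns'], 'lang': ['en']}
```

The `other_stuff` part arrives at the server untouched inside the request line; it is up to the server, or a program it runs, to interpret those values.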

26 Embellishments original design of HTTP just returns text to be displayed now includes pictures, sound, video,... –need helpers or plug-ins to display non-text content e.g., GIF, JPEG graphics; sound; movies forms filled in by user –need a program on the server to interpret the information (cgi-bin) cookies to remember information on client –HTTP is stateless: server doesn't save anything from one request to next –cookies are a way to remember information at the client active content: download code to run on the client –Javascript –Java applets –plug-ins –ActiveX

27 Forms and CGI programs "common gateway interface" –standard way to request the server to run a program –using information provided by the client via a form if the target file on server is an executable program and it has the right properties and permissions –e.g., in /cgi-bin directory and executable then run it on server to produce HTML to send back to client –using the contents of the form as input –output depends on client request: created on the fly, not just a file CGI programs can be written in any programming language –Perl, Python, PHP, Java, Ruby, …
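
A rough sketch of what a CGI-style program does with a submitted form: decode the form contents, then produce HTML on the fly. The field names `name` and `rating` are hypothetical, and the function takes the form body as a string argument; a real CGI program would be handed it by the web server.

```python
from html import escape
from urllib.parse import parse_qs


def handle_form(body):
    """Turn url-encoded form contents into an HTTP header plus an HTML page."""
    fields = parse_qs(body)
    name = fields.get("name", ["(no name)"])[0]
    rating = fields.get("rating", ["(no rating)"])[0]
    # escape() keeps user-supplied text from being interpreted as HTML.
    return (
        "Content-Type: text/html\r\n\r\n"
        f"<html><body><p>Thanks, {escape(name)}. "
        f"You rated this page: {escape(rating)}.</p></body></html>"
    )


response = handle_form("name=Ada+Lovelace&rating=Good")
```

The output depends entirely on the client's input, which is what "created on the fly, not just a file" means.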

28 Example form in HTML (dpd.mycpanel2.princeton.edu/mailform.html) <form METHOD="post" ACTION="http://dpd.mycpanel2.princeton.edu/zcgi-bin/mailform.cgi"> Your name: Your email: Please rate this page: Poor OK Good

29 Cookies HTTP is stateless: doesn't remember from one request to next cookies intended to deal with stateless nature of HTTP –remember preferences, manage "shopping cart", etc. cookie: one chunk of text sent by server to be stored on client –stored in browser while it is running (transient) –stored in client file system when browser terminates (persistent) when client reconnects to same domain, browser sends the cookie back to the server –sent back verbatim; nothing added –sent back only to the same domain that sent it originally –contains no information that didn't originate with the server in principle, pretty benign but heavily used to monitor browsing habits, for commercial purposes
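
The cookie round trip can be sketched with Python's standard `http.cookies` module. The cookie name, value, domains, and the `TinyBrowser` class are all hypothetical; a real browser applies many more rules (paths, expiry, security flags) than this toy jar does.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value, max_age):
    """Server side: build the value of a Set-Cookie header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age
    return cookie[name].OutputString()


class TinyBrowser:
    """Client side: store cookies per domain, send them back verbatim."""

    def __init__(self):
        self.jar = {}   # domain -> {cookie name: cookie value}

    def store(self, domain, set_cookie_value):
        parsed = SimpleCookie()
        parsed.load(set_cookie_value)
        for name, morsel in parsed.items():
            self.jar.setdefault(domain, {})[name] = morsel.value

    def cookie_header_for(self, domain):
        # Only cookies that came from this same domain are returned.
        cookies = self.jar.get(domain, {})
        return "; ".join(f"{name}={value}" for name, value in cookies.items())


browser = TinyBrowser()
browser.store("shop.example", make_set_cookie("cart_id", "abc123", 3600))
print(browser.cookie_header_for("shop.example"))    # cart_id=abc123
print(browser.cookie_header_for("other.example"))   # (empty line)
```

Nothing the browser sends back originated anywhere but the server, which is why cookies are benign in principle.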

30 Cookie crumbs fetch a page from xyz.com –it contains an image hosted at DoubleClick.com –this causes a page to be fetched from DoubleClick.com –which now knows your IP address and what page you were looking at DoubleClick sends back a suitable advertisement –with a cookie that identifies "you" at DoubleClick next time you fetch any page that contains a DoubleClick.com image –the last DoubleClick cookie is sent back to DoubleClick –the set of sites and images that you are viewing is used to - update the record of where you have been and what you have looked at - send back targeted advertising (and a new cookie)
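
The tracking mechanism can be sketched as a toy ad server that hands out an identifier on first contact and logs every page on which its image is embedded. The class, the identifier format, and the page names are invented for illustration, and the sketch reuses one cookie where the slide notes that a new one may be issued each time.

```python
import itertools


class AdServer:
    """Toy third-party server whose image is embedded on many sites."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.history = {}   # cookie value -> pages where the image was loaded

    def serve_image(self, referring_page, cookie=None):
        # First contact (or unknown cookie): invent an identifier for "you".
        if cookie is None or cookie not in self.history:
            cookie = f"visitor-{next(self._ids)}"
            self.history[cookie] = []
        self.history[cookie].append(referring_page)
        return cookie   # sent back to the browser along with the ad


ads = AdServer()
cookie = ads.serve_image("xyz.com/news")                   # no cookie yet
cookie = ads.serve_image("abc.com/shoes", cookie=cookie)   # browser returns it
print(ads.history[cookie])   # ['xyz.com/news', 'abc.com/shoes']
```

Neither xyz.com nor abc.com shares anything with the other; the common third party assembles the browsing record on its own.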

31 Advertising marketplace advertising exchanges –Yahoo Right Media, Doubleclick Ad Exchange, Facebook Atlas... a person uses a browser to request a web page web page "publisher" notifies exchange that advertising space on that page is available –publishers are typically portals or entertainment and news sites –publisher provides information about the person: past online activity, viewing and shopping habits, geographic location, demographics probably not actual identity (?) advertisers bid on the ad space –amount depends on person's attributes and location, advertiser's budget, etc. winner's advertisement is inserted into the page elapsed time: 10-100 milliseconds this happens for multiple advertisements on one page
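
A minimal sketch of the bidding step, assuming the simplest possible rule: the highest bid wins and the winner pays its own bid. The slide does not say how exchanges actually price the slot, and the advertisers, visitor attributes, and amounts below are all hypothetical.

```python
def run_auction(visitor, advertisers):
    """Ask each advertiser for a bid on this visitor; highest bid wins."""
    bids = {name: bid_for(visitor) for name, bid_for in advertisers.items()}
    bids = {name: amount for name, amount in bids.items() if amount > 0}
    if not bids:
        return None, 0.0
    winner = max(bids, key=bids.get)
    return winner, bids[winner]


# Hypothetical advertisers: each bids according to the visitor's attributes.
advertisers = {
    "shoe_store": lambda v: 2.50 if "running" in v["interests"] else 0.10,
    "local_gym": lambda v: 1.75 if v["location"] == "Princeton" else 0.0,
}

visitor = {"interests": ["running"], "location": "Princeton"}
winner, price = run_auction(visitor, advertisers)
print(winner, price)   # shoe_store 2.5
```

In practice this whole exchange has to complete within the 10-100 milliseconds quoted on the slide, once for every ad slot on the page.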

