XQuery and Hierarchical Naming Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems February 7, 2008.


2 Today
• Reminder: Homework 1 due 11:59PM
• XQuery and joins
• Addressing vs. naming
• Hierarchical names

3 XQuery’s Basic Form
• The model: bind nodes (or node sets) to variables; operate over each legal combination of bindings; produce a set of nodes
• “FLWOR” statement pattern:
  for {iterators that bind variables}
  let {collections}
  where {conditions}
  order by {order-conditions}
  return {output constructor}
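The evaluation model above can be pictured as nested loops over variable bindings. The following is a minimal Python sketch of that idea, not XQuery itself: the function name `flwor` and its parameters are hypothetical, the "let" clause is omitted, and the iterators are independent lists (in real XQuery a later iterator may depend on an earlier variable).

```python
from itertools import product

def flwor(iterators, where, order_key, build):
    """Evaluate a FLWOR-like pipeline over plain Python lists.

    iterators: dict of variable name -> candidate values (the "for" clause)
    where:     predicate over one binding dict (the "where" clause)
    order_key: sort key over one binding dict (the "order by" clause)
    build:     output constructor for one binding (the "return" clause)
    """
    names = list(iterators)
    # Each combination of values is one binding of all the variables.
    bindings = (dict(zip(names, combo)) for combo in product(*iterators.values()))
    kept = [b for b in bindings if where(b)]
    kept.sort(key=order_key)
    return [build(b) for b in kept]

result = flwor(
    {"x": [3, 1, 2], "y": [10, 20]},
    where=lambda b: b["x"] != 2,
    order_key=lambda b: (b["x"], b["y"]),
    build=lambda b: b["x"] * b["y"],
)
print(result)  # [10, 20, 30, 60]
```

Each surviving binding produces one output item, which is why a query with two iterators behaves like a join over their cross product.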

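4 Example XML Data
[Figure: tree view of the example document. Root, with an ?xml declaration and a dblp element. Under dblp: a mastersthesis (mdate, key, author, title, year, school; sample values 2002…, ms/Brown92, Kurt Brown, PRPL…, 1992, wisc), an inproceedings (mdate, key, author, title, year, crossref, ee; sample values conf/sigm../, Paul R., On…, sigmod, www…), and a university (name, key, country; sample values Wisconsin, wisc, USA).]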

5 XQuery and Joins
for $i in doc("dblp.xml")/dblp/inproceedings,
    $r in $i/crossref/text(),
    $c in doc("dblp.xml")/dblp/conf,
    $n in
where $c = $r
return { $i, $c }
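The join above pairs each paper with the conference its crossref names. A rough Python equivalent with the standard library's ElementTree might look like the sketch below. The miniature document, the paper keys, and the layout of the conf element are all assumptions made for illustration; the slide does not show them.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of dblp.xml; the <conf> layout is an assumption.
DBLP = """
<dblp>
  <inproceedings key="p1"><title>On Joins</title><crossref>sigmod</crossref></inproceedings>
  <inproceedings key="p2"><title>On Trees</title><crossref>vldb</crossref></inproceedings>
  <conf>sigmod</conf>
</dblp>
"""

def join_papers_to_confs(root):
    """Nested-loop value join: keep (paper, conf) pairs whose text values agree."""
    pairs = []
    for paper in root.findall("inproceedings"):   # plays the role of $i
        for ref in paper.findall("crossref"):     # plays the role of $r
            for conf in root.findall("conf"):     # plays the role of $c
                if conf.text == ref.text:         # where $c = $r
                    pairs.append((paper.get("key"), conf.text))
    return pairs

pairs = join_papers_to_confs(ET.fromstring(DBLP))
print(pairs)  # [('p1', 'sigmod')]
```

The second paper drops out because no conf element matches its crossref, which is the usual inner-join behavior.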

6 Some Uses for Join in XML
• Translation between values
  ▪ SSN → PennID
• Joining or combining information
  ▪ Amazon invoice info + UPS tracking info
• Restructuring information
  ▪ ….. → … …
• Here, we separate authors from books, then join them back in “upside-down” fashion

7 Changing Nesting of XML Content
Re-nesting XML trees is a common operation
Simply nest the query blocks and correlate them, similar to a join
for $u in doc("dblp.xml")/dblp/university
let $n := $u/name/text(),
    $k :=
where $u/country = "USA"
return { $n }
       { for $mt in $u/../mastersthesis,
             $inst in $mt/school/text()
         where $mt/year/text() = "1992" and _______________
         return $mt/title }
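As a sketch of the re-nesting idea, the Python below groups thesis titles under the university they belong to. Several things here are assumptions rather than content from the slide: the sample data, the choice to store the university key as an attribute, the output element names, and the correlation test (school text equals university key), which is only one plausible way to fill in the inner query's condition.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; storing the university key as an attribute is an assumption.
DATA = """
<dblp>
  <mastersthesis><title>Thesis A</title><year>1992</year><school>wisc</school></mastersthesis>
  <mastersthesis><title>Thesis B</title><year>1993</year><school>wisc</school></mastersthesis>
  <mastersthesis><title>Thesis C</title><year>1992</year><school>other</school></mastersthesis>
  <university key="wisc"><name>Wisconsin</name><country>USA</country></university>
</dblp>
"""

def theses_by_university(root, country="USA", year="1992"):
    """Outer loop over universities, inner correlated loop over theses."""
    out = ET.Element("result")
    for univ in root.findall("university"):
        if univ.findtext("country") != country:
            continue
        entry = ET.SubElement(out, "university", name=univ.findtext("name"))
        for thesis in root.findall("mastersthesis"):
            # Assumed correlation: the thesis's school matches the university's key.
            if thesis.findtext("year") == year and thesis.findtext("school") == univ.get("key"):
                ET.SubElement(entry, "title").text = thesis.findtext("title")
    return out

nested = theses_by_university(ET.fromstring(DATA))
print(ET.tostring(nested, encoding="unicode"))
```

In the input, theses and universities are siblings; in the output, titles sit inside their university, which is the "change of nesting" the slide describes.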

8 Collections & Aggregation in XQuery
• Given a collection, we can compute an average, count, etc. of its members:
{ for $paper in doc("dblp.xml")/dblp/inproceedings
  let $pauth := $paper/author   (: a collection :)
  return { $paper/title } { fn:count($pauth) } }

9 Sorting in XQuery
• We can order the sequence of “result tuples” output by the return clause:
for $x in doc("dblp.xml")/dblp/proceedings
order by $x/title/text()
return $x

10 Querying & Defining Tags
• Can get a node’s name by querying node-name():
for $x in doc("dblp.xml")/dblp/*
return node-name($x)
• Can construct elements and attributes using computed names:
for $x in doc("dblp.xml")/dblp/*,
    $year in $x/year,
    $title in $x/title/text()
return element { node-name($x) } { attribute { concat("year-", $year) } { $title } }

11 XQuery Summary
• Very flexible and powerful language for XML
  ▪ Focus is on database-style operations like joins
• Performs tasks that can’t be done with XPath or XSLT and that are tedious to program in Java:
  ▪ Integrating information from multiple sources
  ▪ Joins, based on correspondences of values
  ▪ Computing count, average, etc.
• Today, XQuery is available:
  ▪ In RDBMSs (SQL Server, Oracle, DB2) and XML DBMSs (MarkLogic)
  ▪ As the basis of research prototypes for “XQuery full text”
  ▪ As the basis of “XQueryP”, a Web Services/AJAX programming language based on XQuery but with programming language features
→ We will discuss data integration and middleware later in the course

12 Hierarchical Naming Schemes
Thus far, we’ve seen XPath as a hierarchical naming scheme
• “Content-based naming”: describe the structure and values of a tree structure
• Assumption: XML tree resides in (or is being sent to) one place
But hierarchy is often used for naming and location

13 How Do We Find Things on the Internet?
Generally, using one of three means:
• Addresses or locations: specify where something is, assuming that we understand how to navigate
  ▪ Just like a physical address, we may still need a map!
  ▪ In the Internet, addresses are typically IP addresses; the routers know the map
• Names: mapped into addresses via lookup services
  ▪ Best-known example on the Internet: DNS name
  ▪ Cell phone numbers, addresses, etc. are becoming names
• Content-based addressing/naming
  ▪ The actual data value is somehow used to find its location
  ▪ The basis of publish-subscribe systems and peer-to-peer architectures

14 The Simplest Way of Going from Names or Content → Locations
• Directory-based lookup protocols are very common
• Examples:
  ▪ Napster 1.0: peer-to-peer storage with central directory
  ▪ Inverted index: used to look up keywords in information retrieval
  ▪ DNS: distributed hierarchical directory
  ▪ LDAP: hierarchical Directory Information Tree

15 Napster 1.0, ca 2002
• Hybrid of peer-to-peer storage with central directory showing what’s currently available
  ▪ What are the trade-offs implicit in this model? Why did it fail?
[Figure: Napster.com holds the directory (entries jjackson-lame, bspears-oops); Peer1, Peer2, and Peer3 hold the files jjackson-lame.mp3, bspears-oops.mp3, and jjackson-lame.mp3.]
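To make the trade-offs concrete, here is a minimal sketch of a central directory in Python. The class and method names are hypothetical, and the assignment of files to particular peers is an assumption (the figure does not survive in enough detail to tell). Only the index is centralized; file transfer would happen directly between peers and is not modeled.

```python
from collections import defaultdict

class CentralDirectory:
    """Toy central index: maps a file name to the peers currently offering it."""

    def __init__(self):
        self._holders = defaultdict(set)

    def register(self, peer, filename):
        self._holders[filename].add(peer)

    def unregister_peer(self, peer):
        # A peer that goes offline has to be removed from every entry.
        for peers in self._holders.values():
            peers.discard(peer)

    def lookup(self, filename):
        return sorted(self._holders.get(filename, ()))

directory = CentralDirectory()
directory.register("Peer1", "jjackson-lame.mp3")
directory.register("Peer2", "bspears-oops.mp3")
directory.register("Peer3", "jjackson-lame.mp3")
before = directory.lookup("jjackson-lame.mp3")
directory.unregister_peer("Peer1")
after = directory.lookup("jjackson-lame.mp3")
print(before, after)  # ['Peer1', 'Peer3'] ['Peer3']
```

Lookups are a single step, but every query and every change in peer availability goes through one service, so that service is both a bottleneck and a single point of failure.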

16 Other Services with Similar Directory + Peer Architectures
• FolderShare, now owned by Microsoft
• Google Desktop Search with multiple machines
• BitTorrent trackers are quite similar (we’ll discuss BitTorrent more later)

17 Inverted Indices
• A “forward index”: documents to words
• The “inverted index”: words to word-occurrences
  ▪ The basis of most information retrieval engines, Google, etc.
  ▪ Can handle positional predicates
• … But how can we reconstruct previews?
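A small Python sketch may help show what "words to word-occurrences" and "positional predicates" mean in practice. The function names and sample documents are hypothetical, and tokenization is simply lowercasing and splitting on whitespace.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """docs: dict doc_id -> text. Returns word -> {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index

def phrase_match(index, first, second):
    """Doc ids where `second` occurs immediately after `first`."""
    hits = []
    for doc_id, positions in index.get(first, {}).items():
        following = set(index.get(second, {}).get(doc_id, []))
        if any(p + 1 in following for p in positions):
            hits.append(doc_id)
    return sorted(hits)

docs = {
    "d1": "domain name system",
    "d2": "name of the domain",
}
index = build_inverted_index(docs)
print(dict(index["name"]))                    # {'d1': [1], 'd2': [0]}
print(phrase_match(index, "domain", "name"))  # ['d1']
```

The index records only where each word occurs, not the surrounding text, so showing a preview snippet needs either the original documents or a forward index alongside it.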

18 Naming People and Devices: LDAP
• Lightweight Directory Access Protocol
  ▪ Hierarchical naming system that can be partitioned and replicated

19 LDAP’s Schema
LDAP information has an XML-like schema:
• A unique name in LDAP is called a Distinguished Name (“dn”) and consists of a sequence of attributes representing a hierarchy, from most-specific to least-specific (as in DNS names):
  ▪ o = organization; dc = domain component
  ▪ ou = organizational unit
  ▪ uid = user ID
  ▪ cn = common name
  ▪ c = country; st = state; l = locality
• Can also have objectClass, the type of entity
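Because a Distinguished Name runs from most-specific to least-specific, subtree membership reduces to a suffix comparison on its components. The Python sketch below illustrates this under simplifying assumptions: the helper names and the sample DNs are hypothetical, components are separated by plain commas, and escaped or quoted commas (which real DNs allow) are not handled.

```python
def parse_dn(dn):
    """Split a DN such as 'uid=zives,o=upenn,c=us' into (attribute, value) pairs."""
    pairs = []
    for component in dn.split(","):
        attr, _, value = component.strip().partition("=")
        pairs.append((attr.strip().lower(), value.strip()))
    return pairs

def is_under(dn, ancestor_dn):
    """True if dn lies in the subtree rooted at ancestor_dn (suffix match)."""
    child, ancestor = parse_dn(dn), parse_dn(ancestor_dn)
    if len(child) < len(ancestor):
        return False
    return child[len(child) - len(ancestor):] == ancestor

print(parse_dn("uid=zives, o=upenn, c=us"))
# [('uid', 'zives'), ('o', 'upenn'), ('c', 'us')]
print(is_under("uid=zives,o=upenn,c=us", "o=upenn,c=us"))     # True
print(is_under("uid=zives,o=upenn,c=us", "o=columbia,c=us"))  # False
```

The same suffix structure is what lets a directory be partitioned: a server can be made responsible for everything under one ancestor DN.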

20 LDAP Hierarchy
[Figure: example LDAP hierarchy, from Brad Marshall, LDAP Tutorial, quark.humbug.au/publications/ldap_tut.html]

21 Querying LDAP
LDAP queries are mostly attribute-value predicates:
• uid=zives; o=upenn; c = usa
• (|(cn=Susan Davidson)(cn=Zachary Ives)(cn=Val Tannen))
• objectclass=posixAccount
• (!(cn=Val Tannen))
How does this differ from XPath? How might we process these queries?
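One way to think about processing such queries is as a boolean expression tree evaluated against each entry. The Python sketch below takes that view. It does not parse LDAP filter strings: filters are written directly as nested tuples, the entries are made-up sample data, and matching is exact string equality (real directories often compare case-insensitively and use indexes instead of scanning every entry).

```python
def matches(entry, flt):
    """Evaluate a filter tree against one entry (dict: attribute -> list of values)."""
    op = flt[0]
    if op == "eq":
        _, attr, value = flt
        return value in entry.get(attr, [])
    if op == "and":
        return all(matches(entry, sub) for sub in flt[1:])
    if op == "or":
        return any(matches(entry, sub) for sub in flt[1:])
    if op == "not":
        return not matches(entry, flt[1])
    raise ValueError(f"unknown operator: {op}")

# Hypothetical directory entries.
entries = [
    {"cn": ["Susan Davidson"], "objectclass": ["posixAccount"]},
    {"cn": ["Zachary Ives"], "objectclass": ["posixAccount"]},
    {"cn": ["Val Tannen"], "objectclass": ["posixAccount"]},
    {"cn": ["Printer 1"], "objectclass": ["device"]},
]

either = ("or", ("eq", "cn", "Susan Davidson"), ("eq", "cn", "Zachary Ives"))
not_val = ("not", ("eq", "cn", "Val Tannen"))

names_either = [e["cn"][0] for e in entries if matches(e, either)]
names_not_val = [e["cn"][0] for e in entries if matches(e, not_val)]
print(names_either)   # ['Susan Davidson', 'Zachary Ives']
print(names_not_val)  # ['Susan Davidson', 'Zachary Ives', 'Printer 1']
```

Unlike an XPath expression, nothing in these predicates navigates the tree; position in the hierarchy is supplied separately, for example by the base DN of a search.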

22 The Backbone of Internet Naming: Domain Name System
• A simple, hierarchical name system with a distributed database; each domain controls its own names
[Figure: DNS name tree. Top-level domains edu and com; beneath them, domains such as columbia, upenn, berkeley, and amazon; beneath those, names such as www, cis, and sas.]

23 Top-Level Domains (TLDs)
Mostly controlled by Network Solutions, Inc. today
• .com: commercial
• .edu: educational institution
• .gov: US government
• .mil: US military
• .net: networks and ISPs (now also a number of other things)
• .org: other organizations
• 244 two-letter country suffixes, e.g., .us, .uk, .cz, .tv, …
• and a bunch of new suffixes that are not very common, e.g., .biz, .name, .pro, …

24 Finding the Root
Thirteen “root servers” store entries for all top level domains (TLDs)
DNS servers have a hard-coded mapping to root servers so they can “get started”
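The role of the hard-coded root mapping is easiest to see in a simulation of iterative resolution. The Python sketch below uses an in-memory table instead of real network queries. Everything in it is hypothetical: the server names, the delegation table, and the address (taken from a range reserved for documentation). Caching and record types are left out.

```python
# Hypothetical delegation data: each "server" knows only its own zone.
SERVERS = {
    "root": {"delegations": {"edu": "edu-server", "com": "com-server"}, "records": {}},
    "edu-server": {"delegations": {"upenn.edu": "upenn-server"}, "records": {}},
    "com-server": {"delegations": {}, "records": {}},
    "upenn-server": {"delegations": {}, "records": {"www.upenn.edu": "192.0.2.10"}},
}
ROOT_HINT = "root"  # stands in for the hard-coded list of root servers

def resolve(name):
    """Follow referrals down from the root; return (address, servers contacted)."""
    server, path = ROOT_HINT, []
    while True:
        path.append(server)
        zone = SERVERS[server]
        if name in zone["records"]:
            return zone["records"][name], path
        # Find a delegation whose suffix covers the queried name, if any.
        referral = next(
            (target for suffix, target in zone["delegations"].items()
             if name == suffix or name.endswith("." + suffix)),
            None,
        )
        if referral is None:
            return None, path
        server = referral

print(resolve("www.upenn.edu"))
# ('192.0.2.10', ['root', 'edu-server', 'upenn-server'])
print(resolve("www.amazon.com"))
# (None, ['root', 'com-server'])
```

Each server only needs to know its own names and whom to refer to for its children, which is how control stays with each domain while any name remains reachable from the root.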

25 Excerpt from DNS Root Server Entries
This file is made available by InterNIC registration services under anonymous FTP as
;     file /domain/named.root
;
; formerly NS.INTERNIC.NET
;
IN NS A.ROOT-SERVERS.NET.
A.ROOT-SERVERS.NET A
;
; formerly NS1.ISI.EDU
;
NS B.ROOT-SERVERS.NET.
B.ROOT-SERVERS.NET A
;
; formerly C.PSI.NET
;
NS C.ROOT-SERVERS.NET.
C.ROOT-SERVERS.NET A
(13 servers in total, A through M)

26 Supposing We Were to Build DNS
• How would we start? How is a lookup performed? (Hint: what do you need to specify when you add a client to a network that doesn’t do DHCP?)

27 Issues in DNS
• We know that everyone wants to be “my-domain”.com
  ▪ How does this mesh with the assumptions inherent in our hierarchical naming system?
• What happens if things move frequently?
• What happens if we want to provide different behavior to different requestors (e.g., Akamai)?

28 Next Time…
• We’ll look at alternative mechanisms for finding things:
  ▪ Publish-subscribe models
  ▪ Gossip protocols, such as in routers
  ▪ Flooding
• … and soon, peer-to-peer or content-based routing