Published by Ericka Jeffery. Modified over 10 years ago.
1
U-P2P: A Peer-to-peer System for Description and Discovery of Resource-sharing Communities
Aloke Mukherjee, Carleton University
August 28, 2003
2
Peer-to-peer File-sharing
  Exploit storage capability of the edge
  Balance load
  Robustness to failure
  Weaknesses: Search and Communities
3
Search Problem
  Lack of structured metadata
    Filenames, keyword matching
    Opaque identifiers
    Support for popular formats
  Ignoring structured metadata
    Implicit indicators
    Collaborative filtering
4
State of the Art: Search
  Metadata: Napster, Kazaa, Limewire, JxtaSearch
  Query Routing: Gnutella, Routing Indices, Limewire, Neurogrid
  Communities: JxtaSearch, Alpine, Associative P2P
  Search in DHTs: PIER, FASD, Inverted Indices
5
Community Problem
  Not simple to create a community for sharing a new file format
  Current state
    Different protocols/apps (gnutella, fasttrack, jxtasearch)
    Inadequate metadata (filename matching, limited schemas)
    Ad-hoc attempts aimed at specific domains
    Scattered and isolated: there is no easy way to discover communities
6
State of the Art: Communities
  Opaque: No existing rich metadata search, no way to add it
  Limited: Rich metadata search for some formats but no way to support new formats
  Implicit: Implicit indicators are used to identify communities, no way to specify explicitly
  Partial: Users can explicitly form groups but each grouping is in the eye of the beholder
  Unshared: Users can explicitly direct rich metadata queries to a community, but response format is not specified
7
Improving Search
  Standard metadata layer
  Explicit structured metadata
  All resources are XML files
  XML Schema used to describe format (e.g. MP3, design pattern)
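A minimal sketch of what explicit structured metadata allows that filename matching does not: querying a named field. The design-pattern resources, their field names (`name`, `author`, `intent`), and the sample values are assumptions made for illustration, not taken from U-P2P. Python's standard-library ElementTree, which supports only a small XPath subset, stands in for a real XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical resources: each shared item is an XML file whose fields
# follow a community schema (field names and values are assumptions).
RESOURCES = [
    "<designpattern><name>Singleton</name><author>Gang of Four</author>"
    "<intent>Restrict a class to one instance</intent></designpattern>",
    "<designpattern><name>Observer</name><author>Gang of Four</author>"
    "<intent>Notify dependents of state changes</intent></designpattern>",
]


def search(resources, field, value):
    """Return the <name> of every resource whose <field> text equals value."""
    hits = []
    for text in resources:
        root = ET.fromstring(text)
        # Match a direct child element by tag, then compare its text.
        node = root.find(f"./{field}")
        if node is not None and node.text == value:
            hits.append(root.findtext("name"))
    return hits


print(search(RESOURCES, "name", "Singleton"))
print(search(RESOURCES, "author", "Gang of Four"))
```

The same query function works for any field the schema declares, which is the point of making metadata explicit rather than encoding it in a filename.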
8
Schema instantiates resource singleton gang of four when creating a new class… ensure a class only has… make the class itself responsible… http://example.com/singleton.jpg
9
Automated interface generation
  [Diagram labels: resource, XML Schema, XSLT, resource create form, resource search form, resource view; the resource instantiates the schema]
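A rough sketch of the idea behind interface generation: once the format is described by a schema, a form can be derived from the schema instead of being written by hand. The slides name XSLT for this step; the version below swaps in plain Python over a hypothetical schema (the element names `name`, `author`, and `image` are assumptions), so it illustrates the principle rather than the U-P2P stylesheets.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema for a design-pattern resource (an assumption).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="designpattern">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="image" type="xs:anyURI"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def field_names(schema_text):
    """List the element names declared inside the top-level element."""
    root = ET.fromstring(schema_text)
    top = root.find(f"{XS}element")
    # iter() also yields the top-level element itself, so skip it.
    return [e.get("name") for e in top.iter(f"{XS}element") if e is not top]


def search_form(schema_text):
    """Render one text input per schema field as a plain HTML form."""
    rows = [f'<label>{n} <input name="{n}"></label>'
            for n in field_names(schema_text)]
    return "<form>\n" + "\n".join(rows) + "\n</form>"


print(search_form(SCHEMA))
```

Changing the schema changes the form with no further code, which is what lets a new community get create, search, and view pages without custom UI work.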
10
[Diagram labels: resource, XML Schema, XSL, resource create form, resource search form, resource view; the resource instantiates the schema]
11
[Diagram labels: resource, XML Schema, XSL, resource create form, resource search form, resource view; the resource instantiates the schema]
12
Community Creation and Discovery: What is a Community?
  Concrete object with defined tuple of attributes
  Simplest form: (format, protocol, …)
  Known examples: (mp3, napster), (video, kazaa)
  Examples that don’t exist: (design patterns, gnutella), (p2p papers, jxtasearch)
  Tuple is specified as an XML file
13
Simplifying Community Creation
  [Diagram labels: designpatterns, designpattern.xsd, gnutella, designpattern.stylesheet]
  User-designed communities
    Compose schema to describe format
    Compose community XML file
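A small sketch of a community as a tuple written out as an XML file. The values `designpatterns`, `designpattern.xsd`, and `gnutella` come from the slide; the XML element names (`community`, `name`, `schema`, `protocol`) and the `Community` class are assumptions, since the slides do not show the actual file layout.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

FIELDS = ("name", "schema", "protocol")


@dataclass(frozen=True)
class Community:
    name: str
    schema: str    # file describing the shared format
    protocol: str  # network used for sharing

    def to_xml(self):
        """Serialize the tuple as a flat XML document (assumed layout)."""
        root = ET.Element("community")
        for tag in FIELDS:
            ET.SubElement(root, tag).text = getattr(self, tag)
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text):
        """Rebuild the tuple from the XML produced by to_xml()."""
        root = ET.fromstring(text)
        return cls(*(root.findtext(tag) for tag in FIELDS))


patterns = Community("designpatterns", "designpattern.xsd", "gnutella")
print(patterns.to_xml())
```

Because the description is just a file, it can be published and searched like any other resource.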
14
Community as class mp3 mp3 community mp3 mp3 class
15
Metaclass analogy mp3 mp3 community mp3 mp3 class communityclass
16
Community discovery is File discovery
  MP3 community shares MP3 files
  Community community shares communities
  [Diagram labels: mp3, mp3 community, community community]
17
Simplifying Community Discovery
  A Community for Communities: The Root Community
  Communities are files shared in a real community
  Root Community includes schema for communities
  (format, protocol) = (community, centralized db)
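A toy model of the root community, assuming an in-memory list in place of the centralized database. The `Community` class and its `publish` and `search` methods are hypothetical names; the tuples are the ones given in the slides. The point it illustrates is that discovering a community uses the same search operation as discovering a file.

```python
from dataclasses import dataclass, field


@dataclass
class Community:
    name: str
    fmt: str
    protocol: str
    resources: list = field(default_factory=list)

    def publish(self, resource):
        self.resources.append(resource)

    def search(self, **criteria):
        """Return resources whose attributes match every criterion."""
        return [r for r in self.resources
                if all(getattr(r, k, None) == v for k, v in criteria.items())]


# The root community's shared resources are themselves communities.
root = Community("root", fmt="community", protocol="centralized db")
root.publish(Community("mp3", fmt="mp3", protocol="napster"))
root.publish(Community("designpatterns", fmt="design patterns",
                       protocol="gnutella"))

found = root.search(fmt="mp3")
print([c.name for c in found])
```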
18
Schema for Communities root community community.xsd central-db community.stylesheet The Root Community
19
What is U-P2P?
  A framework that breathes life into these ideas
    Explicit metadata search and creation for every Community
    Creation of Community tuples (format, protocol, etc.)
    Discovery of Community tuples
20
Design
21
Technologies
  Java
  Tomcat Servlet Container
  Java Server Pages (JSP) + Servlets
  XSLT (transforms), XPath (queries)
  Java components for XSLT, XPath (Xerces, Xalan)
  eXist XML Database
  Log4j (logging infrastructure), JUnit (unit testing)
22
Evaluation and Validation: Areas of Interest
  Publish and Search times as Community size increases
  Breaking down Publish and Search operations
  Community effect
  Multiple central servers
23
Publish
24
Search
25
Community Effect
  Average Publish Time
    Multiple communities: 356 ms
    Single community: 485 ms
26
Multiple Central Servers
27
Publish with Multiple Servers
  Server  Processor   Speed    OS
  1       Pentium 4   1.8 GHz  Windows 2000
  2       Pentium II  250 MHz  Linux (RH7)
  3       Celeron     1 GHz    Windows XP
28
Vs. Without Multiple Central Servers
  Server                        Avg. time to publish a file (750 files published)
  S1                            455 ms
  S2                            1355 ms
  S3                            645 ms
  S1, S2, S3 (load-balanced)    517 ms
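One way to read these figures is to compare the load-balanced time against each server on its own. The short calculation below uses only the numbers on the slide; the choice of comparison (ratio per server and the plain mean of the three single-server averages) is an assumption about what is useful to look at, not an analysis from the presentation.

```python
# Average publish times from the slide, in milliseconds.
single = {"S1": 455, "S2": 1355, "S3": 645}
load_balanced = 517

# Plain mean of the three single-server averages.
mean_single = sum(single.values()) / len(single)
print(f"mean of single-server averages: {mean_single:.1f} ms")

# Ratio below 1.0 means the load-balanced setup was faster than that server.
ratios = {name: load_balanced / ms for name, ms in single.items()}
for name, ratio in ratios.items():
    print(f"{name}: load-balanced takes {ratio:.2f}x its time")
```

On these numbers the load-balanced configuration is slower than the fastest server (S1) used alone, but faster than S2 or S3 alone and faster than the mean of the three.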
29
Contributions
  Standard Metadata Layer: All communities include support for explicit metadata search and creation
  User-designed Communities: Users can easily share new formats with full support for metadata
  Community for Communities: Prevents fragmented, isolated communities by providing metadata about communities and a standard method for discovering them
  Performance and Scalability Gains: Communities can improve performance and scalability vs. systems where resources are undifferentiated
30
Future Work
  Performance improvements
  Protocol independence (adapters for Gnutella, Freenet, etc.)
  Community-aware Gnutella routing
  More Community parameters (security, authentication, etc.)
31
Future Work continued
  Trust metrics (to differentiate between communities, metadata quality)
  Community evolution
  Inheritance and multiple inheritance for Communities
32
U-P2P Publications
  A. Mukherjee, B. Esfandiari, N. Arthorne, "U-P2P: A Peer-to-peer System for Description and Discovery of Resource-sharing Communities", ICDCS Workshops 2002: 701-705, July 2002.
  N. Arthorne, B. Esfandiari, A. Mukherjee, "U-P2P: A Peer-to-peer Framework for Universal Resource Sharing and Discovery", Proceedings of Freenix track of Usenix 2003, 29-38, June 2003.
  http://u-p2p.sourceforge.net
33
Backup slides
34
WebAdapter: User Interaction Model
35
Repository Design
36
Repository Design: Resource IDs
37
Repository Design: XML Database
  Requirements
    Flexibility to store wide variety of formats
    Handle powerful queries over all metadata
  XML Database better suited than RDBMS
    Difficult to map fields to rows and columns
  Chose eXist XML database
    Open source
    Written in Java
    Support for XML:DB API
38
Network Adapter Design
  Abstract interface to Peer-to-peer Network
    Routing search requests, handling results, handling incoming search requests, etc.
  Only implemented Hybrid model (Napster model)
  All peers can act as client and/or server
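A sketch of how an abstract network adapter with a hybrid (Napster-style) implementation might look. The class and method names (`NetworkAdapter`, `HybridAdapter`, `CentralIndex`, `publish`, `search`) are hypothetical and are not the U-P2P API, which is written in Java; an in-memory list stands in for the central server.

```python
from abc import ABC, abstractmethod


class NetworkAdapter(ABC):
    """Hypothetical interface hiding the peer-to-peer protocol in use."""

    @abstractmethod
    def publish(self, peer, resource_id, metadata): ...

    @abstractmethod
    def search(self, field, value): ...


class CentralIndex:
    """Stand-in for the central server of a hybrid network."""

    def __init__(self):
        self.entries = []  # (peer, resource_id, metadata) tuples


class HybridAdapter(NetworkAdapter):
    def __init__(self, index):
        self.index = index

    def publish(self, peer, resource_id, metadata):
        # Only metadata is registered centrally; the file stays on the peer.
        self.index.entries.append((peer, resource_id, dict(metadata)))

    def search(self, field, value):
        # The index answers with which peer holds each matching resource.
        return [(peer, rid) for peer, rid, meta in self.index.entries
                if meta.get(field) == value]


index = CentralIndex()
HybridAdapter(index).publish("alice", "r1", {"name": "Singleton"})
results = HybridAdapter(index).search("name", "Singleton")
print(results)
```

An adapter for another protocol would implement the same two methods, leaving the rest of the application unchanged; that is the protocol independence listed under future work.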
39
Network Adapter: Protocol
40
Evaluation and Validation: Challenges
  Finding large XML collections
    Berkeley Drosophila Genome Project: genome annotations
    Other sources: DBLP (CS papers), EDGAR (SEC filings), GeneOntology (gene-related concepts)
    Transforming DTDs to XML Schema (DTDXS package)
  Automation
    XML-RPC interface for publish and search
41
Publish: Breakdown of Operations
42
Publish: Client Timings
43
Publish: Server Timings
44
Network Adapter: Protocol
45
Search: Breakdown of Operations
46
Search: Total vs. Server Timings