1
Data-Intensive Systems
Michael Franklin
UC Berkeley
www.cs.berkeley.edu/~franklin
2
M. Franklin Visit Day 2001
Overview
Three current database projects at UCB CS.
–Telegraph: A highly-adaptive federated data management architecture.
–Data Centers: Profile-driven data staging and dissemination.
–GriPhyN: A global grid for large-scale physics and astronomy data.
3
1) The Berkeley Telegraph Project
Adaptive data management for Internet-scale dataflow processing.
–Dataflow-based scheduling.
–Cross-domain negotiation.
–“User-in-the-loop”
–Adaptation/learning over varying granularities:
    individual long-running jobs
    many similar short jobs
    continuous data flows and filters
–Driving applications:
    federated, www-based processing
    sensor networks
4
Telegraph: Wide Area Architecture
Architecture is based on data flow.
–adaptive routing on a per-tuple basis:
    between nodes in the wide area
    between “operators” on a given node
–marriage of query systems, application-level routing and machine learning
Each node uses cluster-based parallelism (“rivers”) for scalability and fault-tolerance.
Operator/service composition using “fjords”.
–Fjords provide caching, queueing, and fault tolerance.
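To make "adaptive routing on a per-tuple basis between operators" concrete, here is a minimal sketch. It is an illustration only, not Telegraph's implementation: the `Operator` class, the `route` and `run` helpers, and the "lowest observed pass rate first" policy are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Operator:
    """A hypothetical filter operator that tracks its own observed pass rate."""
    name: str
    predicate: Callable[[Dict], bool]
    seen: int = 0
    passed: int = 0

    def pass_rate(self) -> float:
        # Optimistic prior of 1.0 until the operator has seen any tuple.
        return self.passed / self.seen if self.seen else 1.0

    def apply(self, tup: Dict) -> bool:
        self.seen += 1
        ok = self.predicate(tup)
        if ok:
            self.passed += 1
        return ok


def route(tup: Dict, operators: List[Operator]) -> bool:
    """Route one tuple through all operators, re-deciding the order per tuple.

    Assumed policy: visit the operator with the lowest observed pass rate
    first, so tuples that will be dropped are dropped as early as possible.
    """
    for op in sorted(operators, key=Operator.pass_rate):
        if not op.apply(tup):
            return False  # tuple dropped; remaining operators are skipped
    return True


def run(tuples: List[Dict], operators: List[Operator]) -> List[Dict]:
    """Return the tuples that pass every operator."""
    return [t for t in tuples if route(t, operators)]
```

Because the order is recomputed for every tuple from running statistics, the routing adapts if the data changes partway through a stream. The set of tuples that survive is the same for any order; only the work done per tuple varies.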
5
Telegraph (status)
First (single-node) prototype is operational
–No XML support yet
–Separate fjords prototype also operational
First apps
–Facts and Figures on the Web
    Campaign contributions demo at fff.cs.berkeley.edu
    traffic sensor demo
–Architecture for introspective systems
–Data Recharging
Contacts: Joe Hellerstein, Mike Franklin
6
2) Data Centers: Exploiting User Profiles
A new NSF ITR project.
Goal: explore benefits of User Profiles for:
–automatic wide-area data management
–customized “context-aware” data delivery
Intelligent caching architecture collects, composes, and distributes data
Two-way synchronization for multi-user/multi-device data
7
User Profiles
Specify user interests in the form of “queries”.
Include information on priorities, for resource-constrained environments (e.g., mobile).
Specify user preferences for data sources, resolution, timeliness,…
Can fold in PIM data to obtain “context”.
Profiles are aggregated at “Information Broker” nodes to create a Content Delivery Network.
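One way to picture how profile priorities could be used in a resource-constrained setting is the sketch below. It is an assumption-laden illustration, not the project's design: the `Interest` record, the `select_for_delivery` helper, the byte budget, and the greedy highest-priority-first policy are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interest:
    """A hypothetical profile entry: a query plus a priority (larger = more important)."""
    query: str
    priority: int


def select_for_delivery(items: List[Tuple[Interest, int]], budget_bytes: int) -> List[str]:
    """Greedily pick results to send to a constrained device.

    items is a list of (interest, result_size_in_bytes) pairs. Results are
    considered from highest to lowest priority and sent only if they still
    fit in the remaining budget.
    """
    chosen = []
    remaining = budget_bytes
    for interest, size in sorted(items, key=lambda pair: -pair[0].priority):
        if size <= remaining:
            chosen.append(interest.query)
            remaining -= size
    return chosen
```

For example, with results of 500, 600, and 300 bytes at priorities 3, 2, and 1 and a 1000-byte budget, the sketch sends the priority-3 and priority-1 results, since the priority-2 result no longer fits once the first has been sent.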
8
XFilter: Dissemination of XML Data
The challenge is to efficiently and quickly match incoming XML documents against the potentially huge set of user profiles.
[Diagram labels: Data Sources, XML Conversion, XML Documents, Filter Engine, User Profiles, Filtered Data, Users]
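The matching problem can be sketched as follows. This is a simplified illustration and not XFilter's algorithm: the `ProfileIndex` class is hypothetical, profiles are assumed to be plain tag-only paths supported by Python's `xml.etree.ElementTree` (no predicates or attributes), and the only optimization shown is skipping profiles whose final tag never occurs in the document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class ProfileIndex:
    """Hypothetical index of user profiles, keyed by the last tag of each path."""

    def __init__(self) -> None:
        self._by_tag: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add(self, user: str, path: str) -> None:
        # path is a simple ElementTree path such as "./stock/price" or ".//symbol".
        last_tag = path.rstrip("/").split("/")[-1]
        self._by_tag[last_tag].append((user, path))

    def match(self, xml_text: str) -> Set[str]:
        """Return the users whose profile path matches the document."""
        root = ET.fromstring(xml_text)
        tags_in_doc = {el.tag for el in root.iter()}
        matched: Set[str] = set()
        # Only evaluate profiles whose final tag actually appears in the document.
        for tag in tags_in_doc:
            for user, path in self._by_tag.get(tag, []):
                if root.find(path) is not None:
                    matched.add(user)
        return matched
```

A short usage example, with made-up users and documents:

```python
index = ProfileIndex()
index.add("alice", "./stock/price")
index.add("bob", "./news/headline")
doc = "<quotes><stock><symbol>IBM</symbol><price>100</price></stock></quotes>"
print(index.match(doc))  # only alice's profile matches this document
```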
9
GriPhyN
Four major experiments that will generate Petabytes of information.
A Global “Grid” network
Data Management issues are paramount.
–Caching, Replication, Deep Storage, etc.
–Handling the store/recompute tradeoffs of virtual data.
–Improving responsiveness through query relaxation.
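The store/recompute tradeoff for virtual data can be illustrated with a deliberately simple cost comparison. This is a sketch under stated assumptions, not GriPhyN's policy: the `should_store` function and every parameter (a linear storage cost, a fixed cost per recomputation, a steady access rate, abstract cost units) are hypothetical.

```python
def should_store(
    size_gb: float,
    storage_cost_per_gb_day: float,
    recompute_cost: float,
    accesses_per_day: float,
    horizon_days: float,
) -> bool:
    """Return True if keeping a derived data product is cheaper than recomputing it.

    Assumed model: storage cost grows linearly with size and time, and every
    access to an unstored product triggers one full recomputation.
    """
    storage_total = size_gb * storage_cost_per_gb_day * horizon_days
    recompute_total = recompute_cost * accesses_per_day * horizon_days
    return storage_total < recompute_total
```

Under this model, a small, frequently accessed product that is expensive to derive is worth storing, while a large, rarely accessed product that is cheap to derive is better regenerated on demand.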
10
Issues/Challenges/Synergies
GriPhyN turns WWW scalability on its head:
–For WWW, the scalability issues are the number of users, sites, and data items.
–GriPhyN has many orders of magnitude fewer users, sites, and data items, but the data volume and computational requirements are massive.