1
Information Capture and Re-Use
Joe Hellerstein
2
Scenario
Ubiquitous computing is more than clients!
– sensors and their data feeds are key
– smart dust (MEMS sensors)
– biomedical monitoring devices (MEMS sensors)
– every item of value records its use/misuse (disposable computing)
– tacit information from human behavior
– video from surveillance cameras, broadcasts, etc.
3
There’s a Data Flood Coming
4
What does it look like?
– Never ends: interactivity required
– Big: data reduction/aggregation is key
– Unpredictable: this scale of devices and nets will not behave nicely
Key Technologies:
– CONTROL: early answers and interactivity; online aggregation for data reduction
– River/Eddy: massively parallel, adaptive dataflow
5
CONTROL
Continuous Output and Navigation Technology with Refinement On Line
Data-intensive jobs are long-running. How to give early answers and interactivity?
– Statistical estimators, and their performance implications
– online query processing algs: ripple joins
– online interactivity over feeds: data “juggle”
Appreciate interplay of massive data processing, stats, and UIs
Challenges: apply to sequence data, scale up
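The "early answers" idea can be illustrated with a running estimate that tightens as more records arrive. The sketch below is not CONTROL's code: it is a minimal illustration that assumes records arrive in random order and uses a normal-approximation confidence interval. The names `OnlineMean` and `run_demo`, the synthetic Gaussian data, and the 95% level are all assumptions made for the example.

```python
import math
import random


class OnlineMean:
    """Running mean with a normal-approximation confidence interval."""

    def __init__(self, z=1.96):
        self.z = z        # 1.96 is roughly a 95% interval under a normal approximation
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0     # running sum of squared deviations (Welford's method)

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def estimate(self):
        """Return (mean, half_width); half_width is inf until two records are seen."""
        if self.n < 2:
            return self.mean, math.inf
        variance = self.m2 / (self.n - 1)
        return self.mean, self.z * math.sqrt(variance / self.n)


def run_demo(seed=0, total=10000, report_every=2000):
    """Feed synthetic records and collect (records_seen, mean, half_width) reports."""
    rng = random.Random(seed)
    data = [rng.gauss(50.0, 10.0) for _ in range(total)]  # hypothetical sensor readings
    rng.shuffle(data)  # the interval is only meaningful for random-order delivery
    agg = OnlineMean()
    reports = []
    for i, x in enumerate(data, 1):
        agg.add(x)
        if i % report_every == 0:
            reports.append((i, *agg.estimate()))
    return reports


if __name__ == "__main__":
    for seen, mean, half_width in run_demo():
        print(f"{seen:6d} records: mean = {mean:.2f} +/- {half_width:.2f}")
```

A user watching these reports could stop the job as soon as the interval is narrow enough, which is the interactivity the slide asks for.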
6
River
We built the world’s fastest sorting machine
– On the “NOW”: 100 Sun workstations + SAN
– But it only beat the record under ideal conditions!
River: performance adaptivity for data flows on clusters
– simplifies management and programming
– perfect for sensor-based streams
Challenges: deploy over a wide area
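Why performance adaptivity matters can be seen in a toy model: when records are split evenly up front, one slow node sets the finish time, whereas routing each record to whichever consumer has room lets fast nodes absorb the slack. The sketch below is a simplified simulation, not River's implementation. The function names, the tick-based timing, the per-tick `rates`, and `queue_cap` are all hypothetical.

```python
def static_partition(num_records, rates):
    """Ticks to finish when records are divided evenly among consumers up front."""
    share = -(-num_records // len(rates))        # ceiling division
    return max(-(-share // r) for r in rates)    # the slowest consumer sets the pace


def adaptive_queue(num_records, rates, queue_cap=4):
    """Ticks to finish when records go to the consumer with the most free queue space.

    Returns (ticks, processed), where processed[i] counts records handled by consumer i.
    """
    queues = [0] * len(rates)
    processed = [0] * len(rates)
    remaining = num_records
    ticks = 0
    while remaining > 0 or any(queues):
        # Producer: hand out records until every bounded queue is full (back-pressure).
        while remaining > 0:
            i = min(range(len(rates)), key=lambda k: queues[k])
            if queues[i] >= queue_cap:
                break
            queues[i] += 1
            remaining -= 1
        # Consumers: each drains up to its own rate during this tick.
        for i, rate in enumerate(rates):
            done = min(rate, queues[i])
            queues[i] -= done
            processed[i] += done
        ticks += 1
    return ticks, processed


if __name__ == "__main__":
    rates = [4, 4, 4, 1]  # three healthy nodes and one slow node (made-up numbers)
    print("static partitioning:", static_partition(120, rates), "ticks")
    ticks, processed = adaptive_queue(120, rates)
    print("adaptive routing:   ", ticks, "ticks, per-consumer counts", processed)
```

In this model the slow consumer simply ends up with fewer records, so the flow degrades gracefully instead of waiting on the straggler.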
7
Eddy
How to order and reorder operators over time
key complement to River: adapt not only to the hardware, but to the processing rates
Challenges: scale up, consider parallel scheduling
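Reordering operators over time can be sketched with a router that sends each tuple to the filter currently rejecting the most input, updating its statistics as it goes. This is not the routing policy of the actual Eddy system. It substitutes a simple heuristic (an exponentially weighted pass rate per operator) to show the adaptive idea. The class name `Eddy` as used here, the `alpha` parameter, and the two example predicates are assumptions.

```python
class Eddy:
    """Routes each tuple through filters, trying the most selective one first."""

    def __init__(self, predicates, alpha=0.05):
        self.predicates = dict(predicates)   # operator name -> callable(item) -> bool
        self.alpha = alpha                   # weight given to the newest observation
        self.pass_rate = {name: 0.5 for name in self.predicates}
        self.evaluations = 0                 # total predicate calls, as a cost measure

    def process(self, item):
        """Return True if the item passes every predicate."""
        pending = set(self.predicates)
        while pending:
            # Pick the pending operator with the lowest recent pass rate (ties by name).
            name = min(pending, key=lambda n: (self.pass_rate[n], n))
            pending.remove(name)
            ok = self.predicates[name](item)
            self.evaluations += 1
            self.pass_rate[name] += self.alpha * (float(ok) - self.pass_rate[name])
            if not ok:
                return False
        return True


def fixed_order_evaluations(predicates, items):
    """Predicate calls needed when operators always run in their given order."""
    count = 0
    for item in items:
        for pred in predicates.values():
            count += 1
            if not pred(item):
                break
    return count


if __name__ == "__main__":
    preds = {
        "a_positive": lambda t: t[0] > 0,
        "b_positive": lambda t: t[1] > 0,
    }
    # The stream changes character halfway: first `a` rejects everything, then `b` does.
    items = [(0, 1)] * 1000 + [(1, 0)] * 1000
    eddy = Eddy(preds)
    for item in items:
        eddy.process(item)
    print("fixed order:", fixed_order_evaluations(preds, items), "evaluations")
    print("adaptive:   ", eddy.evaluations, "evaluations")
```

Because the pass rates decay toward recent behavior, the router switches to the other filter shortly after the stream changes, which a plan fixed at start-up cannot do.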
8
Telegraph: Putting it Together
Want to build next-gen global DB system. Capture and Re-Use Embodied in a vertical solution.
Marriage of:
– CONTROL, River & Eddy
– OceanStore + optionally-Xactional storage that handle new hardware realities, scale
– Federation in the wide area via Negotiation/Economics
– Combinations of browse/query/mine at UI
no magic bullet there! CONTROL is key.
9
Integration with other options
Integration
– Use Oceanic Data Utility for distribution, caching, protection of streams
– Use negotiation architectures to connect federated and stored streams
– Be data-intensive backbone to diverse clients
– Be a scalable platform for tacit knowledge extraction
Cooperation
– Tacit information as a feed
– Capture/merge classroom feeds
– Use UI design tools for device-independent, interactive stream-based apps
10
Plan for Success
One Year
– Implement River/Eddy over parallel cluster, deploy CONTROL modules
– Deploy data analysis apps over sequence data (MEMS/Web/Video)
Three Year
– Integrate w/ wide area storage & processing
– Get data-intensive Endeavour apps running on architecture (e.g. tacit knowledge mining)
– Develop UI tools for interacting with never-ending streams