Information Capture and Re-Use


1 Information Capture and Re-Use
Joe Hellerstein

2 Scenario
- Ubiquitous computing is more than clients!
- sensors and their data feeds are key
  - smart dust (MEMS sensors)
  - biomedical monitoring devices (MEMS sensors)
  - every item of value records its use/misuse (disposable computing)
  - tacit information from human behavior
  - video from surveillance cameras, broadcasts, etc.

3 There’s a Data Flood Coming

4 There’s a Data Flood Coming
- What does it look like?
  - Never ends: interactivity required
  - Big: data reduction/aggregation is key
  - Unpredictable: this scale of devices and nets will not behave nicely
- Key Technologies:
  - CONTROL: early answers and interactivity
    - online aggregation for data reduction
  - River/Eddy: massively parallel, adaptive dataflow

5 CONTROL Continuous Output and Navigation Technology with Refinement On Line
- Data-intensive jobs are long-running. How to give early answers and interactivity?
  - Statistical estimators, and their performance implications
  - online query processing algs: ripple joins
  - online interactivity over feeds: data “juggle”
- Appreciate interplay of massive data processing, stats, and UIs
- Challenges: apply to sequence data, scale up
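As a rough illustration of the "early answers" idea behind online aggregation, the sketch below keeps a running average over a stream and reports an approximate confidence interval that tightens as more rows arrive. It is not the CONTROL implementation: the function name `online_average`, the `report_every` parameter, and the synthetic data are assumptions made for this example, and the interval relies on a normal approximation that presumes rows arrive in random order.

```python
import math
import random


def online_average(values, z=1.96, report_every=1000):
    """Yield (rows_seen, running_mean, half_width) while scanning `values`.

    Hypothetical helper: uses Welford's update for mean and variance and a
    normal-approximation interval, which assumes randomly ordered input.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n > 1 and n % report_every == 0:
            variance = m2 / (n - 1)
            half_width = z * math.sqrt(variance / n)
            yield n, mean, half_width


# Synthetic stand-in for a sensor feed (an assumption, not data from the talk).
random.seed(0)
data = [random.gauss(50.0, 10.0) for _ in range(10_000)]
estimates = list(online_average(data))

for rows_seen, mean, half_width in estimates:
    print(f"{rows_seen:>6} rows: {mean:.2f} +/- {half_width:.2f}")
```

A user watching these lines can stop the job as soon as the interval is narrow enough, rather than waiting for the full scan to finish.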

6 River
- We built the world’s fastest sorting machine
  - On the “NOW”: 100 Sun workstations + SAN
  - But it only beat the record under ideal conditions!
- River: performance adaptivity for data flows on clusters
  - simplifies management and programming
  - perfect for sensor-based streams
- Challenges: deploy over a wide area
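The slide does not say how River achieves performance adaptivity, so the following is only a toy model of the general idea: a router that sends each record to whichever consumer is expected to finish it soonest, so a slow node receives less work instead of stalling the whole flow. The function `route_by_backlog`, the node names, and the rates are all hypothetical.

```python
def route_by_backlog(num_ticks, arrivals_per_tick, rates):
    """Simulate backlog-aware routing across consumers of unequal speed.

    `rates` maps a consumer name to the records it can drain per tick.
    Returns how many records each consumer was assigned. Toy model only.
    """
    backlog = {name: 0 for name in rates}
    assigned = {name: 0 for name in rates}
    for _ in range(num_ticks):
        for _ in range(arrivals_per_tick):
            # Pick the consumer with the smallest expected completion time.
            target = min(rates, key=lambda name: (backlog[name] + 1) / rates[name])
            backlog[target] += 1
            assigned[target] += 1
        # Each consumer drains up to its per-tick rate.
        for name, rate in rates.items():
            backlog[name] = max(0, backlog[name] - rate)
    return assigned


assigned = route_by_backlog(100, 5, {"fast": 4, "slow": 1})
print(assigned)
```

In this model the work split follows the consumers' relative speeds, whereas a static round-robin split would leave the slow node with a growing backlog.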

7 Eddy
- How to order and reorder operators over time
  - key complement to River: adapt not only to the hardware, but to the processing rates
- Challenges: scale up, consider parallel scheduling
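One way to picture reordering operators over time is a chain of filters that tracks how often each filter passes a tuple and tries the most selective one first. The sketch below is a simplified stand-in, not the Eddy routing policy itself: the class `AdaptiveFilterChain`, the smoothed pass-rate heuristic, and the example predicates are assumptions for illustration.

```python
class AdaptiveFilterChain:
    """Apply filters in an order that adapts to observed pass rates."""

    def __init__(self, predicates):
        # name -> [predicate, tuples_seen, tuples_passed]
        self.stats = {name: [pred, 0, 0] for name, pred in predicates.items()}

    def _pass_rate(self, name):
        _, seen, passed = self.stats[name]
        # Smoothed estimate so unseen filters start at 0.5.
        return (passed + 1) / (seen + 2)

    def current_order(self):
        # Filters that reject the most tuples go first.
        return sorted(self.stats, key=self._pass_rate)

    def process(self, item):
        """Return True if `item` passes every filter, updating statistics."""
        for name in self.current_order():
            entry = self.stats[name]
            entry[1] += 1
            if not entry[0](item):
                return False
            entry[2] += 1
        return True


chain = AdaptiveFilterChain({
    "even": lambda x: x % 2 == 0,    # passes half of the inputs
    "rare": lambda x: x % 100 == 0,  # passes one input in a hundred
})
kept = [x for x in range(1, 1001) if chain.process(x)]
print(chain.current_order(), len(kept))
```

Because later filters only see tuples that earlier ones passed, the rates here are conditional estimates; a real adaptive scheduler would also need to keep exploring other orders so it can notice when the data changes.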

8 Telegraph: Putting it Together
- Want to build next-gen global DB system.
  - Capture and Re-Use
  - Embodied in a vertical solution.
- Marriage of:
  - CONTROL, River & Eddy
  - OceanStore + optionally-Xactional storage that handle new hardware realities, scale
  - Federation in the wide area via Negotiation/Economics
  - Combinations of browse/query/mine at UI
    - no magic bullet there! CONTROL is key.

9 Integration with other options
- Use Oceanic Data Utility for distribution, caching, protection of streams
- Use negotiation architectures to connect federated and stored streams
- Be data-intensive backbone to diverse clients
- Be a scalable platform for tacit knowledge extraction
- Cooperation
  - Tacit information as a feed
  - Capture/merge classroom feeds
- Use UI design tools for device-independent, interactive stream-based apps

10 Plan for Success
- One Year
  - Implement River/Eddy over parallel cluster, deploy CONTROL modules
  - Deploy data analysis apps over sequence data (MEMS/Web/Video)
- Three Year
  - Integrate w/ wide area storage & processing
  - Get data-intensive Endeavour apps running on architecture (e.g. tacit knowledge mining)
  - Develop UI tools for interacting with never-ending streams

