Web Data Management
Raghu Ramakrishnan


1 Web Data Management Raghu Ramakrishnan

2 QUIQ Lessons
Structured data management powers scalable collaboration environments
ASP multi-tenancy
Massively distributed
Fine-grained permissions, hierarchical ACLs
RDBMSs were a lousy fit

3 Cloud Computing: Computing as a Service
[Diagram: Cloud Computing, split into CPU Intensive and Data Intensive, with these categories]
Analytic (e.g., SSDS, Hadoop)
Packaged Software
High-throughput (e.g., Condor)
“Transactional” Storage & Serving (e.g., PNUTS, S3, SSDS, UDB)

4 Implications
Data management as a service
  – Scientists and others who’ve resisted (installing, maintaining, and) using DBMSs will find it much easier to reap the benefits
  – “Data centers” and “Computing Centers” will come into vogue again
Hosted back-ends and RAD tools will make Web application development accessible to all
  – The Web is becoming open (e.g., OpenSocial, OpenID)
Ideas will be the most valuable currency, not the wherewithal to build complex systems
Paradigm shifts possible for how we do research in many fields
  – Build applications that embed your algorithms and test them directly in the field: computer scientists can interact directly with users (ironically, this would still be a breakthrough of sorts after four decades!)
  – Many other disciplines (e.g., sociology, microeconomics) can design and conduct online experiments involving unprecedented numbers of participants

5 PNUTS: DB in the Cloud
CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … )
[Figure: the same six-row table, shown three times]
  E  75656  C
  A  42342  E
  B  42521  W
  C  66354  W
  D  12352  E
  F  15677  E
Parallel database
Geographic replication
Indexes and views
Structured, flexible schema
Hosted, managed infrastructure
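To make the Parts example concrete, here is a minimal local sketch in Python. It uses an in-memory SQLite database purely as a stand-in (PNUTS is a hosted, distributed store and is not SQLite), it declares only the three columns the slide names because the slide elides the rest, and it assumes the three values in each figure row map to ID, StockNumber, and Status in that order. The `lookup` helper is a hypothetical name introduced for illustration.

```python
import sqlite3

# In-memory SQLite is only a local stand-in for the hosted store on the slide.
conn = sqlite3.connect(":memory:")

# Only the three columns named on the slide; the slide elides any others.
conn.execute("CREATE TABLE Parts (ID VARCHAR, StockNumber INT, Status VARCHAR)")

# Rows as they appear in the figure, assuming the order (ID, StockNumber, Status).
rows = [
    ("E", 75656, "C"),
    ("A", 42342, "E"),
    ("B", 42521, "W"),
    ("C", 66354, "W"),
    ("D", 12352, "E"),
    ("F", 15677, "E"),
]
conn.executemany("INSERT INTO Parts VALUES (?, ?, ?)", rows)


def lookup(part_id):
    """Hypothetical helper: fetch (StockNumber, Status) for one ID, or None."""
    cursor = conn.execute(
        "SELECT StockNumber, Status FROM Parts WHERE ID = ?", (part_id,)
    )
    return cursor.fetchone()


print(lookup("B"))
```

A single-record lookup by ID is the kind of access a serving store is built around; the replication and partitioning listed on the slide are not modeled here.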

6 Basic Consistency Model
Goal: Make it easier for applications to reason about updates and cope with asynchrony, as an alternative to “transactions” in an asynchronous world
What happens to a record with primary key “Brian”?
Guarantees:
Every reader will always see some consistent, but possibly stale, version
Readers can request a more up-to-date version, but may pay extra latency
  – Special case: Critical read (writer/readers see their own writes)
Writers can verify that the record is still at the version they expect
[Figure: timeline for the record, showing insert, update, and delete events with version numbers (v. 1, v. 2, ...) grouped into Generation 1, Generation 2, and Generation 3]
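The three guarantees can be illustrated with a small toy model in Python. This is a sketch under stated assumptions, not the PNUTS implementation: it models one record with a master copy and a single lagging replica, propagation is an explicit `sync()` call rather than a real asynchronous channel, and the names `Record`, `read_any`, `read_at_least`, and `write_if_version` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Copy:
    """One copy of a record's version history (toy model)."""
    versions: list = field(default_factory=list)  # versions[i] holds version i + 1

    @property
    def latest(self):
        return len(self.versions)


class Record:
    """A single record with a master copy and one lagging replica."""

    def __init__(self, value):
        self.master = Copy([value])
        self.replica = Copy([value])

    def sync(self):
        # Stand-in for asynchronous propagation from master to replica.
        self.replica.versions = list(self.master.versions)

    def read_any(self):
        # Cheapest read: a consistent but possibly stale version.
        return self.replica.latest, self.replica.versions[-1]

    def read_at_least(self, min_version):
        # Serve locally if fresh enough; otherwise go to the master,
        # which is where the extra latency would be paid.
        if self.replica.latest >= min_version:
            source = self.replica
        else:
            source = self.master
        return source.latest, source.versions[-1]

    def write_if_version(self, expected_version, value):
        # Apply the write only if the record is still at the expected version.
        if self.master.latest != expected_version:
            return None
        self.master.versions.append(value)
        return self.master.latest


rec = Record("v1-data")                  # record with primary key "Brian"
v2 = rec.write_if_version(1, "v2-data")  # version check passes
stale = rec.read_any()                   # replica has not caught up yet
fresh = rec.read_at_least(v2)            # critical read: writer sees its own write
lost = rec.write_if_version(1, "other")  # version check fails, write rejected
print(v2, stale, fresh, lost)
```

Passing the version returned by a write into `read_at_least` is how the critical-read special case falls out of the second guarantee in this model; the generations in the figure (delete followed by re-insert) are not modeled.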

7 Lots of Issues to Re-think
Massive distribution & replication
  – Asynchrony
  – Availability
  – Consistency
DBA to the world
  – Auto-tuning
  – Multi-tenancy
  – Access control (granularity, online ids)
  – Encryption
App-support
  – Caching

8 Querying the Web
Search will become more semantic: best-effort match-making between:
  – Query intent (NLP, query logs …)
  – Interpreted web content
Deep web has a lot of structured data
  – How we get a handle on it is an interesting problem
  – But this is only part of the problem … lots of data not here
Semantic web isn’t working
Site-wrapping doesn’t scale
Solutions?
  – Domain-wrapping
  – Mass collaboration
  – ??

