Spreadsheet As a Relational Database Engine Jerzy Tyszkiewicz Institute of Informatics University of Warsaw.

In the beginning there was data…

…and a query…
SELECT name, AVG(income) FROM Incomes GROUP BY name HAVING COUNT(*) > 3
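To see what this query returns, it can be run outside a spreadsheet. The sketch below is one way to do that, using Python's standard sqlite3 module. The function name and the sample rows are hypothetical and do not come from the talk.

```python
import sqlite3


def average_income_of_frequent_names(rows):
    """Run the slide's query on (name, income) rows in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Incomes (name TEXT, income REAL)")
    conn.executemany("INSERT INTO Incomes VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT name, AVG(income) FROM Incomes "
        "GROUP BY name HAVING COUNT(*) > 3"
    ).fetchall()
    conn.close()
    # Sort in Python so the output order is deterministic.
    return sorted(result)


# Hypothetical sample data: Ann has four income rows, Bob only two.
sample_rows = [
    ("Ann", 10), ("Ann", 20), ("Ann", 30), ("Ann", 40),
    ("Bob", 50), ("Bob", 70),
]
print(average_income_of_frequent_names(sample_rows))  # [('Ann', 25.0)]
```

Only names with more than three rows survive the HAVING clause, so Bob is dropped.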

…and a user
I want to do that in a spreadsheet!
I know Excel, I do not know Access
MS Office with Access is more expensive
There are no databases on the cloud
I’m afraid of real big databases
Illustration: ChrisL_AK, Flickr

Bill Gates spoke about that user…
"A lot of users today find the true databases complex enough that they simply go into either the word processor, with the table-type capabilities, or into the spreadsheet, which I'd say is a little more typical, and use that as their way of structuring data. And, of course, you get a huge discontinuity because, as you want to do database-type operations, the spreadsheet isn't set up for that. And so then you have to learn a lot of new commands and move your data into another location."

…in his keynote speech at SIGMOD ’98
"What we'd like to see is that even if you start out in the spreadsheet, there's a very simple way then to bring in software that uses that data in a richer fashion, and so you don't see a discontinuity when you want to move up and do new things. But that's very easy to say that. It's going to require some breakthrough ideas to really make that possible."

Google spreadsheet can do that
SQL-like syntax
comfortable interface
but no HAVING clause
no JOIN
no UNION, EXCEPT

Then there was more data…

…and another query…
SELECT Families.id, Families.name, AVG(Incomes.income) FROM Families JOIN Incomes ON Families.id = Incomes.id GROUP BY Families.id, Families.name HAVING COUNT(*) > 3
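The join query can be checked the same way. The sketch below runs it with Python's standard sqlite3 module. The function name, the column types and the sample families are assumptions made for illustration, not data from the talk.

```python
import sqlite3


def average_income_per_family(families, incomes):
    """Run the slide's join query on (id, name) and (id, income) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Families (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE Incomes (id INTEGER, income REAL)")
    conn.executemany("INSERT INTO Families VALUES (?, ?)", families)
    conn.executemany("INSERT INTO Incomes VALUES (?, ?)", incomes)
    result = conn.execute(
        "SELECT Families.id, Families.name, AVG(Incomes.income) "
        "FROM Families JOIN Incomes ON Families.id = Incomes.id "
        "GROUP BY Families.id, Families.name HAVING COUNT(*) > 3"
    ).fetchall()
    conn.close()
    # Sort in Python so the output order is deterministic.
    return sorted(result)


# Hypothetical sample data: family 1 has four incomes, family 2 only two.
sample_families = [(1, "Kowalski"), (2, "Nowak")]
sample_incomes = [(1, 100), (1, 200), (1, 300), (1, 400), (2, 500), (2, 600)]
print(average_income_per_family(sample_families, sample_incomes))
# [(1, 'Kowalski', 250.0)]
```

Each income row matches exactly one family here, which is a many-to-one join.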

…and still the same user
I want that again in a spreadsheet!
Illustration: ChrisL_AK, Flickr

Can spreadsheets do that?
Google spreadsheet can do that!
And OpenOffice!
And Gnumeric!
And Excel!
And almost every other spreadsheet, too!

General theory
Theorem: Every query in Relational Algebra can be implemented in a spreadsheet. Also, every query in SQL can be implemented in a spreadsheet.
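The slides state the theorem without showing the construction. The sketch below is not the author's method. It is only a rough intuition for how the GROUP BY ... HAVING query on Incomes might be computed with per-row helper columns, in the style of the spreadsheet functions COUNTIF and AVERAGEIF. The Python helpers imitate those functions, and all names and data are hypothetical.

```python
def countif(column, value):
    """Analogue of the spreadsheet formula COUNTIF(range, value)."""
    return sum(1 for cell in column if cell == value)


def averageif(criteria_column, value, value_column):
    """Analogue of AVERAGEIF(criteria_range, value, average_range)."""
    matches = [v for c, v in zip(criteria_column, value_column) if c == value]
    return sum(matches) / len(matches)


def group_by_having(names, incomes, min_count=3):
    """Emulate SELECT name, AVG(income) ... GROUP BY name HAVING COUNT(*) > min_count."""
    # Helper column: number of rows sharing this row's name.
    counts = [countif(names, name) for name in names]
    # Helper column: average income over rows sharing this row's name.
    averages = [averageif(names, name, incomes) for name in names]
    # Helper column: TRUE only on the first row of a group that passes HAVING.
    keep = [
        countif(names[:i], name) == 0 and counts[i] > min_count
        for i, name in enumerate(names)
    ]
    # Collect the flagged rows (a spreadsheet would need a further
    # compaction step to move them into a contiguous result range).
    return [(n, a) for n, a, k in zip(names, averages, keep) if k]


# Hypothetical sample data.
names = ["Ann", "Bob", "Ann", "Ann", "Ann", "Bob"]
incomes = [10, 50, 20, 30, 40, 70]
print(group_by_having(names, incomes))  # [('Ann', 25.0)]
```

Every helper column depends only on other columns of the sheet, which is the kind of cell-by-cell computation a spreadsheet performs.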

Main theoretical contribution
Spreadsheets can:
store relational data
execute SQL queries
Therefore: Spreadsheets are relational database engines

Performance in Excel
[Chart: time in seconds against size of Incomes in thousands; labels: no join, many-to-one join, many-to-many join, no Families]

Main practical contributions in answer to Bill Gates (Excel)
Spreadsheets can serve as low-end relational database engines
Small databases of a few thousand tuples can be used in practice
A method to offer databases on the cloud

Suggestions
Elements of database methodology can be transferred to spreadsheet design
Certain spreadsheet functions need optimization

Related research
Filling the gap between spreadsheets and databases from the database direction
We fill that gap from the spreadsheet direction

Thank you!