CS240A: Databases and Knowledge Bases Introduction


1 CS240A: Databases and Knowledge Bases Introduction
Carlo Zaniolo
Department of Computer Science
University of California, Los Angeles

4 Database Systems
- During the late 60s:
  - IMS and other hierarchical DBMSs
  - CODASYL-compliant DBMSs using the network model
- Relational DBMSs were proposed [by E.F. Codd] in the 70s
  - 10+ years of R&D led to relational DBMSs and SQL
  - Extraordinary success from both a research and a commercial viewpoint (IBM, Oracle, …)
  - Relational DBMSs were covered in CS143
- But starting in the mid 80s, DBMSs have faced major technical and commercial challenges, forcing a major evolution in these systems. This is the topic of CS240A!

5 DBMS Vendors
- IBM: System R, DB2
- Oracle
- MS SQL Server
- Smaller players: Sybase, Informix, Teradata/NCR

6 Changes and Challenges
- Expert systems, rule-based computing, and knowledge management:
  - Deductive databases and recursive queries
  - Active databases and rules
- New applications and data types (e.g., spatio-temporal and multimedia information):
  - Object-oriented databases
  - Datablades and extenders
- The Web and XML:
  - Publishing databases using XML
  - XQuery: the new query language for XML data
- Decision support, knowledge discovery, big data, machine learning, …, data science:
  - OLAP applications
  - Data mining

7 Evolution of SQL Standards
- SQL-89 and SQL2 (a.k.a. SQL-92): strictly relational.
- SQL-3: working documents discussing new specs for O-R systems, but also for recursion, active rules, OLAP, and OLAP functions.
- SQL:1999 and, with minor changes, SQL:2003.
- But evolution continues: user-defined indexes, user-defined aggregates, XML, etc.
- In this course we investigate how SQL and relational systems are being extended to face the new applications.
- We will often study languages other than SQL as a framework for research.

8 The Main Problem of SQL: Inadequate Expressive Power
- For instance, SQL cannot support the complex queries and recursion needed in several applications, such as Bill-of-Materials applications.
- Thus database applications are now developed in procedural languages with embedded SQL statements.
- An impedance mismatch between SQL and the host language (different data types and programming paradigms) slows down both the development of applications and their execution.
- Two approaches to solve the problem:
  - Making the query language more powerful: deductive databases
  - Extending programming languages with DB capabilities: this is the approach taken by OO DBMSs and OR DBMSs
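
To make the Bill-of-Materials example concrete, here is a minimal sketch of a parts explosion written as a recursive query embedded in a host language. It assumes Python's standard `sqlite3` module and the `WITH RECURSIVE` construct that entered the standard with SQL:1999; the `part` table, its columns, and the sample rows are hypothetical and are not taken from the course material.

```python
import sqlite3

# Hypothetical parts hierarchy: (assembly, component, quantity per assembly).
PART_ROWS = [
    ("bike", "wheel", 2),
    ("bike", "frame", 1),
    ("wheel", "spoke", 32),
    ("wheel", "rim", 1),
]


def all_subparts(root):
    """Return (component, total quantity) for every direct or indirect part of `root`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE part(assembly TEXT, component TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO part VALUES (?, ?, ?)", PART_ROWS)
    rows = conn.execute(
        """
        WITH RECURSIVE explode(component, qty) AS (
            -- base case: direct components of the root assembly
            SELECT component, qty FROM part WHERE assembly = ?
            UNION ALL
            -- recursive case: components of components, quantities multiplied
            SELECT p.component, e.qty * p.qty
            FROM explode AS e JOIN part AS p ON p.assembly = e.component
        )
        SELECT component, SUM(qty) FROM explode
        GROUP BY component ORDER BY component
        """,
        (root,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(all_subparts("bike"))
```

The sketch also shows the impedance mismatch in miniature: the query is an opaque string to the host language, and its result comes back as generic tuples that the program must unpack by position. The recursion terminates here only because the sample hierarchy is acyclic.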

9 Expressive Power: Relational Completeness
- All relational languages suffer from the same expressive-power problems:
  1. Relational Algebra,
  2. Domain Relational Calculus,
  3. Tuple Relational Calculus, and
  4. Non-recursive safe Datalog rules.
- These languages are equivalent in terms of expressive power, and programs (i.e., queries) written in one language are easily mapped into programs written in another.
- The notion of Relational Completeness (RC) defines the class of queries expressible using relational algebra or, equivalently, using safe relational calculus queries.
- RC was proposed in the 70s as a minimum requirement for all database query languages (one not met by most query languages at that time).
- But nowadays RC is not enough!
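
As an illustration of the equivalence claimed above, the following sketch writes one query twice: once as a composition of algebra operators and once as a calculus-style set comprehension (which also mirrors a single non-recursive Datalog rule). The relations `EMP` and `DEPT`, their contents, and the helper functions are hypothetical, chosen only for this example.

```python
# Hypothetical relations, represented as sets of tuples.
EMP = {("ann", "toys"), ("bob", "toys"), ("cat", "shoes")}  # (name, dept)
DEPT = {("toys", "la"), ("shoes", "sf")}                    # (dept, city)


def select(rel, pred):
    """Selection: keep the tuples that satisfy the predicate."""
    return {t for t in rel if pred(t)}


def project(rel, *cols):
    """Projection: keep only the listed column positions (duplicates vanish)."""
    return {tuple(t[c] for c in cols) for t in rel}


def join(r, s, r_col, s_col):
    """Equi-join on r[r_col] = s[s_col]; output tuples are concatenations."""
    return {a + b for a in r for b in s if a[r_col] == b[s_col]}


def algebra_query():
    # project_name( select_city='la'( EMP join DEPT ) )
    joined = join(EMP, DEPT, 1, 0)  # columns: name, dept, dept, city
    return project(select(joined, lambda t: t[3] == "la"), 0)


def calculus_query():
    # { n | there exists d such that EMP(n, d) and DEPT(d, 'la') }
    # As a Datalog rule: q(N) <- emp(N, D), dept(D, la).
    return {(n,) for (n, d) in EMP if (d, "la") in DEPT}


if __name__ == "__main__":
    print(sorted(algebra_query()), sorted(calculus_query()))
```

Both functions return the same set of names. Neither style, however, can express a query such as "all direct and indirect subparts" without some form of recursion, which is the limitation the slide points to.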

10 Datalog SQL’s Close Relations
1. QBE (Query by Example): two-dimensional rendering of domain calculus.
2. QUEL and SQL: in-line, keyword-based versions of tuple relational calculus, with extensions such as updates and aggregates.
3. Datalog: rule-oriented, logic-based refinement of domain calculus.
- Datalog is the best candidate for more powerful query languages because:
  - Its formal framework is based on first-order logic.
  - It supports the rule-based programming paradigm, which is key to expert systems and knowledge-based systems.
  - It is similar to Prolog, which is more procedural.
- Big Data has brought a renewed interest in Datalog.
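
To give a feel for what recursive rules add, here is a minimal sketch of naive bottom-up (fixpoint) evaluation of a two-rule ancestor program. The rule syntax in the docstring is generic Datalog notation, the `parent` facts are hypothetical, and the loop is a teaching device rather than the evaluation strategy of any particular system.

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of the recursive program

        anc(X, Y) <- parent(X, Y).
        anc(X, Z) <- anc(X, Y), parent(Y, Z).

    where `edges` holds the parent facts as (X, Y) pairs.
    """
    anc = set(edges)  # first rule: every parent fact is an ancestor fact
    while True:
        # second rule: join the current anc facts with parent on the shared Y
        derived = {(x, z) for (x, y) in anc for (y2, z) in edges if y == y2}
        new_facts = derived - anc
        if not new_facts:  # fixpoint reached: no rule derives anything new
            return anc
        anc |= new_facts


if __name__ == "__main__":
    PARENT = {("a", "b"), ("b", "c"), ("c", "d")}  # hypothetical facts
    print(sorted(transitive_closure(PARENT)))
```

Because facts are kept in a set and the loop stops as soon as an iteration adds nothing, the evaluation also terminates on cyclic data, where a naive top-down Prolog-style search with the same rules could loop.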

11 The Bigger Picture
- Assemblers, Operating Systems (early 60s …)
- Languages and Compilers (late 60s …)
- Information Management Systems and Database Management Systems (DBMS) (70s …)
- GUIs (80s …)
- Networks (60s) and the Web (90s) and beyond
- Year 2000 and beyond: big data analytics
- 2010 and so on: Datalog’s renaissance

12 Workplan and Grade Basis
Grade basis for CS240A:
- Midterm: 40%
- Homework and assignments: 10%
- Final projects and reports: 50% (XML 15%, DM 35%)
Take-home final, consisting of two projects:
- The first project will be about supporting temporal queries in XML and JSON.
- The second project will ask you to write decision support queries in SQL and DeALS.

