Databases and Data Access Mark Sapossnek CS 594 Computer Science Department Metropolitan College Boston University
Objectives Review database theory and history Review relational database concepts Learn about the evolution of data access technologies Learn about the ADO.NET namespaces and core classes Learn how to use ADO.NET classes in an application
Agenda Databases Relational Databases ADO.NET Overview ADO.NET Classes
Databases Databases Virtually all interesting applications require a structured, persistent data store E-Commerce: placing an order, fulfilling an order HR: Personnel data Sales CRM: Customer data Games Database needs vary with the type of application Transaction Processing/OLTP Business Intelligence/Data Warehouse/OLAP
Databases Database Requirements Can store, view and modify data Can move, copy and transform data Can backup and restore data Enforces data integrity rules Is scaleable and available High number of users Lots of data High throughput with low response time Is secure Facilitates application development
Databases Evolution of Database Technology File-based Hierarchical Network Relational (RDBMS) Object-oriented XML
Agenda Databases Theory and History Relational Databases ADO.NET Overview ADO.NET Classes
Relational Databases Tables Table (relation, entity) A collection of data about a specific type of thing Organized in rows and columns Column (attribute, field) Describes part of an entity (e.g. FirstName) Has a data type (e.g. integer, character, binary) Can be null Row (tuple, record) A single instance of data in a table Each row is unique AuthID FirstName LastName 1 Joe Smith 2 Diane Jones
Relational Databases Relating Data Tables can be related through primary/foreign key relationships (e.g., a book has an author) Primary key Guarantees the uniqueness of a row Can be composed of one or more columns Ensures entity integrity Foreign key Establishes logical relationship between tables One or more columns of a table that match the primary or alternate key of another table Referential integrity Relational databases do NOT get there name because they can relate data. They get their name from set theory: a relation is the set-theoretic term for a table. Primary/foreign key relationships are THE way data is related in a relational database.
Relational Databases Relating Data Schema diagram depicts tables, columns, primary keys, foreign keys Books BookID AuthID Title Type Authors AuthID FirstName LastName 1 ∞ Schema Diagram
Relational Databases Relating Data Books Table Primary Key BookID AuthID Title Type 1 2 My Life as a DBA Autobiography Database Handbook Reference PK/FK Relationship Foreign Key AuthID FirstName LastName 1 Joe Smith 2 Diane Jones Authors Table
Relational Databases Types of Relationships One-to-One (1:1) One row in table X matches one row in table Y A book has at most one Library of Congress entry One-to-Many (1:M) One row in table X matches 0+ rows in table Y A publisher publishes one or more books Many-to-Many (M:N) 1+ rows in table X matches 1+ rows in table Y An author writes one or more books; a book is written by one or more authors 1 1 Books LoC Entries 1 M Publishers Books M N Authors Books
Relational Databases M:N Relationships More complex Can result in very large tables (repeated data) Difficult to ensure data integrity The remedy: Create a third table The third table contains the primary key of the two original tables in a composite key Data is repeated in the third table, but not in the two original tables Authors 1 M BookAuth M 1 Books
Relational Databases M:N Relationships 1 ∞ Data is duplicated here 1 1 ∞ ∞
Relational Databases Normalization/Denormalization The process of breaking large tables into multiple smaller tables Goal: minimize redundant data, maximize correctness Improves performance for updates Desirable in transaction-based applications Denormalization The process of combining smaller tables into fewer larger tables Goal: improve performance Introduces redundant data Improves performance for reads Desirable in data warehouse applications
Relational Databases Joins A join is a way of combining data in multiple tables, usually by resolving primary key/foreign key relationships Product table Vendor table Product Cost Vendor Vendor State Contact Widget $10 Acme Acme MA Linda A. Thingy $5 Acme Blecco WA Adam P. Widget $8 Blecco Foobar $25 Blecco
Relational Databases Joins Result of a natural join Product Cost Vendor State Contact Widget $10 Acme MA Linda A. Thingy $5 Acme MA Linda A. Widget $8 Blecco WA Adam P. Foobar $25 Blecco WA Adam P.
Relational Databases Structured Query Language (SQL) Standard language for accessing a relational database, standardized by American National Standards Institute (ANSI); SQL-92 Open, but not really Common functions are mostly the same across products Most vendors have proprietary extensions Subsets of SQL Data Definition Language (DDL) Data Manipulation Language (DML) Data Control Language (DCL)
Relational Databases DDL Examples Used to create and modify database objects CREATE DATABASE Bookstore CREATE TABLE tBooks ( BookID INT IDENTITY(1,1) PRIMARY KEY, Title VARCHAR(30) NOT NULL, PubDate DATE NOT NULL, [Description] VARCHAR(50), Category INT NOT NULL )
Relational Databases DML Examples Select data to view SELECT * FROM tAuthors SELECT AuthID, FirstName, LastName FROM tAuthors SELECT AuthID, FirstName, LastName, Phone FROM tAuthors WHERE City = ‘Boston’ SELECT FirstName, LastName, Phone FROM tAuthors WHERE AuthID = 249
Relational Databases DML Examples Using SELECT to join tables SELECT AuthID, FirstName, LastName, Phone, BookID, Title, PubDate, Description FROM tAuthors, tBooks WHERE tAuthors.AuthID = tBooks.AuthID SELECT AuthID, FirstName, LastName, Phone, BookID, Title, PubDate, Description FROM tAuthors INNER JOIN tBooks ON tAuthors.AuthID = tBooks.AuthID
Relational Databases DML Examples Insert, update and delete data INSERT INTO tBooks (Title, PubDate, [Description], Category) VALUES (‘Database Design’, GETDATE(), ‘How to design a database’, 3) UPDATE tAuthors SET Phone = ‘617-555-1234’ WHERE AuthID = 5 DELETE FROM tAuthors WHERE AuthID = 5
Relational Databases DCL Examples Set security options on database objects GRANT INSERT, UPDATE, DELETE ON tAuthors TO Mary, John REVOKE CREATE TABLE FROM Joe DENY ALL ON tAuthors, tBooks TO Sally
Relational Databases Views A view is a virtual table Abstracts the underlying table structures Abstracts a (possibly complex) query Provides security abstraction from table In SQL Server 2000, a view can be Indexed Updated and inserted into
Relational Databases View Definition Example CREATE VIEW vwCustomerOrders AS SELECT o.OrderId, c.CompanyName FROM Customers c INNER JOIN Orders o ON c.CustomerID = O.CustomerID ORDER BY o.OrderId
Relational Databases View Usage Example SELECT * FROM vwCustomerOrders WHERE CompanyName = 'My Favorite Customer' OrderId CompanyName 101 My Favorite Customer 137 …
Relational Databases Stored Procedures A group of SQL statements that runs within the database Not part of SQL standard Provides greater performance Can control access to data Can accept parameters Can return data Output parameters Return values Result set
Relational Databases Stored Procedure Example CREATE PROCEDURE CustOrderHist @CustomerID nchar(5) AS SELECT ProductName, Total=SUM(Quantity) FROM Products P, [Order Details] OD, Orders O, Customers C WHERE C.CustomerID = @CustomerID AND C.CustomerID = O.CustomerID AND O.OrderID = OD.OrderID AND OD.ProductID = P.ProductID GROUP BY ProductName
Relational Databases Stored Procedure Examples exec CustOrderHist 'alfki' ProductName Total Aniseed Syrup 6 Chartreuse verte 21 ...
Relational Databases Stored Procedure Examples Use RETURN statement to return status 0 is default in SQL Server Can only be numeric Use OUTPUT parameters to return results RETURN 1 CREATE PROCEDURE MyProcedure @ReturnValue INT OUTPUT ... SELECT @ReturnValue = ColumnName FROM Table
Relational Databases Triggers Like stored procedures, triggers are code that runs within a database Not directly called by a user Executed when a specified data modification takes place (INSERT, UPDATE or DELETE) Enforces business rules FOR AFTER: trigger executes after triggering action completes FOR INSTEAD OF: trigger executes in place of triggering action
Relational Databases Transactions Transaction: a sequence of SQL statements that constitute a logical unit of work Must adhere to ACID properties Atomic: All statements execute successfully or all fail Consistent: Must leave the data in a consistent state when completed Isolated: Cannot see the modifications made by concurrent transactions Durable: Must be permanent when complete, even in the event of system failure
Relational Databases Concurrency Isolation levels Read Uncommitted Read Committed Repeatable Read Serializable Tradeoffs (concurrency vs. data integrity) Locking Ensures transactional integrity/database consistency Prevents users from seeing “phantom” data Can result in deadlocks
Agenda Databases Relational Databases ADO.NET Overview ADO.NET Classes
ADO.NET Overview Looking Back ODBC (Open Database Connectivity) Interoperability to a wide range of database management systems (DBMS) Widely accepted API Uses SQL as data access language DAO (Data Access Objects) Programming interface for JET/ISAM databases Uses automation (ActiveX, OLE automation) RDO (Remote Data Objects) Tighter coupling to ODBC Geared more to client/server databases (vs. DAO)
ADO.NET Overview Looking Back OLE DB Broad access to data, relational and other Built on COM Not restricted to SQL for retrieving data Can use ODBC drivers Low-level (C++) interface ADO (ActiveX Data Objects) Simple component-based, object-oriented interface Provides a programming model to OLE DB accessible outside of C++
ADO.NET Overview Looking Back Your Application ADO OLE DB ODBC Provider Simple Provider Native Provider ODBC OLE DB Provider OLE DB Provider ODBC Driver Text File Mainframe Database Database
ADO.NET Overview Looking Back ADO was designed as a connected, tightly coupled model Appropriate for client/server architectures Primarily relational (not hierarchical like XML) Object design is not well factored Too many ways to do the same thing Objects try to do too much Not originally designed for a distributed, n-tier environment
ADO.NET Overview What Is ADO.NET? ADO .NET is a collection of classes, interfaces, structures, and enumerated types that manage data access from relational data stores within the .NET Framework These collections are organized into namespaces: System.Data, System.Data.OleDb, System.Data.SqlClient, etc. ADO .NET is an evolution from ADO. Does not share the same object model, but shares many of the same paradigms and functionality!
ADO.NET Overview ADO.NET Goals Well-factored design Highly scaleable through a robust disconnected model Rich XML support (hierarchical as well as relational) Data access over HTTP Maintain familiar ADO programming model Keep ADO available via .NET COM interoperability
ADO.NET Overview Managed Providers Merges ADO and OLEDB into one layer Each provider contains a set of classes that implement common interfaces Initial managed provider implementations: ADO Managed Provider: provides access to any OLE DB data source SQL Server Managed Provider: provides optimal performance when using SQL Server Exchange Managed Provider: retrieve and update data in Microsoft Exchange
ADO.NET Overview Managed Providers Your Application ADO.NET Managed Provider SQL Managed Provider ADO Managed Provider OLE DB Provider SQL Server Database Database
ADO.NET Overview Data Access Styles Connected: Forward-only, read-only Application issues query then reads back results and processes them “Firehose” cursor DataReader object Disconnected Application issues query then retrieves and stores results for processing Minimizes time connected to database DataSet object The third main style is connected, scrollable cursor. ADO.NET does not currently support this style, though it may be supported in a future version. This can be implemented by using ADO.
ADO.NET Overview Data Binding Key component of Web Forms framework Flexible and easy to use Bind a control’s property to information in any type of data store Provides control over how data moves back and forth Simple controls for displaying a single value Complex controls for displaying a data structure We will cover this when we discuss ASP.NET and Web Forms. <asp:Label runat=server Text='<%# CustList(0).FirstName %>'/>
Agenda Database Theory and History Relational Database Concepts and Terminology ADO.NET Overview ADO.NET Classes
ADO.NET Classes IDbConnection Interface Creates a unique session with a data source Implemented by SqlDbConnection and OleDbConnection Functionality Open, close connections Begin transactions IDbTransaction provide Commit and Rollback methods Used in conjunction with IDbCommand and IDataAdapter objects Additional properties, methods and collections depend on the provider Very similar to ADO counterpart. Database access protocols tend to be chatty.
ADO.NET Classes IDbCommand Interface Represents a statement to be sent to a data source Usually, but not necessarily SQL Implemented by OleDbCommand and SqlCommand Functionality Define statement to execute Execute statement Pass and retrieve parameters Create a prepared (compiled) version of command ExecuteReader returns rows, ExecuteNonQuery doesn’t, ExecuteScalar returns single value Additional properties, methods and collections depend on the provider Very similar to ADO counterpart. Use ExecuteNonQuery whenever possible.
ADO.NET Classes IDataReader Interface Forward-only, read-only (“fire hose”) access to a stream of data Implemented by SqlDataReader and OleDbDataReader Created via ExecuteReader method of IDbCommand Operations on associated IDbConnection object disallowed until reader is closed
ADO.NET Classes System.Data.OleDb Namespace Managed provider for use with OLEDB providers SQLOLEDB (SQL Server) – use System.Data.SQL MSDAORA (Oracle) JOLT (Jet) OLEDB for ODBC providers OleDbConnection, OleDbCommand and OleDbDataReader classes Classes for error handling Classes for connection pooling
ADO.NET Classes DataReader Example string sConnString = “Provider=SQLOLEDB.1;” + “User ID=sa;Initial Catalog=Northwind;” + “Data Source=MYSERVER”; OleDbConnection conn = new OleDbConnection(sConnString); conn.Open(); string sQueryString = “SELECT CompanyName FROM Customers”; OleDbCommand myCommand = new OleDbCommand(sQueryString, conn); OleDbDataReader myReader = myCommand.ExecuteReader(); while (myReader.Read()) { Console.WriteLine(myReader.GetString(0)); } myReader.Close(); conn.Close();
ADO.NET Classes DataReader, Insert Demos
ADO.NET Classes System.Data Namespace Contains the core classes of the ADO.NET architecture Disconnected DataSet is central Supports all types of applications Internet based ASP.NET XML Windows forms based
ADO.NET Classes System.Data Namespace Contains classes used by or derived from managed providers IDbConnection, IDbCommand, IDbDataReader
ADO.NET Classes DataSet A collection of tables Has no knowledge of the source of the data Keeps track of all relationships among tables Rich programming model (has objects for tables, columns, relationships, and so on) Remembers original and current state of data Can dynamically modify data and metadata Native serialization format is XML Located in System.Data
ADO.NET Classes DataSet DataTable DataColumn This is a simplified view of the containment relationships in a DataSet. There are actually many other collections and types of objects involved. DataRow DataRelation
ADO.NET Classes System.Data.SqlClient Namespace Managed provider native to SQL Server Built on TDS (Tabular Data Stream) for high performance in SQL Server SqlConnection, SqlCommand and SqlDataReader classes Classes for Error handling Connection pooling (implicitly enabled by default ) System.Data.SqlTypes provides classes for native SQL Server data types
ADO.NET Classes IDataAdapter Interface Populates or sends updates to a DataSet Implemented by OleDbDataAdapter and SqlDataAdapter Not connection based Represents an asynchronous approach A superset of a command object Contains four default command objects for Select, Insert, Update, and Delete
ADO.NET Classes DataSet Example string sConnString = “Persist Security Info=False;” + “User ID=sa;Initial Catalog=Northwind;” + “Data Source=MYSERVER”; SqlConnection conn = new SqlConnection(sConnString); conn.Open(); string sQueryString = “SELECT CompanyName FROM Customers”; SqlDataAdapter myDSAdapter = new SqlDataAdapter(); DataSet myDataSet = new DataSet(); myDSAdapter.SelectCommand = new SqlCommand(sQueryString, conn); myDSAdapter.Fill(myDataSet); conn.Close();
ADO.NET Classes DataSet Demo
ADO.NET Classes Stored Procedure Demo
ADO.NET Classes DataTable In-memory object representing one table Columns Rows Schema defined by Columns collection Data integrity provided through Constraint objects Public events Modifying/deleting rows Modifying columns
ADO.NET Classes DataColumn Fundamental building block of a DataTable schema (contained in Columns collection) Defines what type of data may be entered (via DataType property) Other important properties include AllowNull, Unique, and ReadOnly Can contain Constraints (a collection on DataTable) Can contain Relations (collection on DataSet)
ADO.NET Classes DataRow Represents data in a DataTable (contained in Rows collection) Conforms to schema defined by DataColumns Properties for determining row state (e.g., new, changed, deleted, etc.) All additions/modifications “committed” with AcceptChanges method of DataTable
ADO.NET Classes DataRelation Relates two DataTables via DataColumns DataType value of both DataColumns must be identical Updates can be cascaded to child DataTables Modifications that invalidate the relation are disallowed
ADO.NET Classes Creating a DataSet in Code Create DataSet Define tables DataSet dataset = new DataSet(); dataset.DataSetName = “BookAuthors”; DataTable authors = new DataTable(“Author”); DataTable books = new DataTable(“Book”);
ADO.NET Classes Creating a DataSet in Code Define columns Define keys DataColumn id = authors.Columns.Add("ID", typeof(Int32)); id.AutoIncrement = true; authors.PrimaryKey = new DataColumn[] {id}; DataColumn name = new authors.Columns.Add("Name",typeof(String)); DataColumn isbn = books.Columns.Add("ISBN", typeof(String)); books.PrimaryKey = new DataColumn[] {isbn}; DataColumn title = books.Columns.Add("Title", typeof(String)); DataColumn authid = books.Columns.Add(“AuthID”,typeof(Int32)); DataColumn[] foreignkey = new DataColumn[] {authid};
ADO.NET Classes Creating a DataSet in Code Add the tables to the DataSet dataset.Tables.Add (authors); dataset.Tables.Add (books);
ADO.NET Classes Creating a DataSet in Code Add data and save the DataSet DataRow shkspr = authors.NewRow(); shkspr["Name"] = "William Shakespeare"; authors.Rows.Add(shkspr); DataRelation bookauth = new DataRelation("BookAuthors", authors.PrimaryKey, foreignkey); dataset.Relations.Add (bookauth); DataRow row = books.NewRow(); row["AuthID"] = shkspr["ID"]; row["ISBN"] = "1000-XYZ"; row["Title"] = "MacBeth"; books.Rows.Add(row); dataset.AcceptChanges();
ADO.NET Classes DataSet Creation Demo
ADO.NET Classes Typed DataSets Derived from base DataSet class Uses XML schema to generate new class Tables, columns, etc. compiled into new class ds.Customers.FirstName Untyped DataSet No built-in schema Tables, columns, etc. exposed only as collections ds.Tables[“Customers”].Rows[0][“FirstName”]
ADO.NET Classes Errors and Exceptions Error class Contains information on an error or warning returned by data source Created and managed by Errors class Errors class Contains all errors generated by an adapter Created by Exception class Exception class Created whenever an unhandled error occurs Always contains at least one Error instance
ADO.NET Classes Errors and Exceptions Example try { DataTable myTable = new DataTable(); myTable.Columns.Add(“myCol”); //whoops! } catch (DataException myException) { Console.WriteLine ("Message: " + myException.Message + "\n" + "Source: " + myException.Source + "\n" + “Stack Trace: " + myException.StackTrace + "\n");
ADO.NET Classes DataException Demo
Conclusion Database Theory and History Relational Database Concepts and Terminology ADO.NET Overview ADO.NET Classes
Resources Introducing ADO+ ADO.NET http://msdn.microsoft.com/msdnmag/issues/1100/adoplus/adoplus.asp ADO.NET http://msdn.microsoft.com/library/default.asp?URL=/library/dotnet/cpguide/cpconaccessingdata.htm ADO+ Guides the Evolution of the Data Species http://msdn.microsoft.com/library/techart/adoplus.htm ADO.NET for the ADO Programmer http://msdn.microsoft.com/library/techart/adonetdev.htm ADO Rocks and Rolls in .NET Applications http://msdn.microsoft.com/library/welcome/dsmsdn/data02222001.htm Meditating on OLE DB and .NET http://msdn.microsoft.com/library/welcome/dsmsdn/data03222001.htm
Resources Reading Data Reader Secrets Database-like Data Containers http://msdn.microsoft.com/library/welcome/dsmsdn/data04122001.htm Database-like Data Containers http://msdn.microsoft.com/library/default.asp?URL=/library/welcome/dsmsdn/data04122001.htm ADO http://msdn.microsoft.com/library/default.asp?URL=/library/psdk/dasdk/ados4piv.htm Universal Data Access http://www.microsoft.com/data/ SQL Server http://www.microsoft.com/sql/default.asp
Appendix ADO vs. ADO.NET ADO is a slower automation layer over OLE DB for use in Visual Basic, etc. ADO.NET provides direct, fast access to data from any language ADO.NET essentially has merged OLE DB and ADO into a single layer
Appendix ADO vs. ADO.NET Feature ADO ADO.NET Memory-resident Data Representation Uses RecordSet, which can contain one table Uses DataSet, which can contain one or more tables represented by DataTables Relationship Between Multiple Tables Require the JOIN query Supports the DataRelation object Data Visitation Scans RecordSet rows sequentially Uses a navigation paradigm for non-sequential access Disconnected Access Provided by RecordSet but typically supports connected access Communicates with standardized calls to the DataAdapter KEY MESSAGE: SLIDE BUILDS: None SLIDE SCRIPT: SLIDE TRANSISTION: ADDITIONAL INFORMATION FOR PRESENTER:
Appendix ADO vs. ADO.NET Feature ADO ADO.NET Programmability Uses Connection object to transmit commands Uses strongly typed programming characteristics of XML Sharing Disconnected Data Between Tiers or Components Uses COM marshalling to transmit disconnected Recordset Transmits a DataSet with an XML file Transmitting Data Through Firewalls Problematic because firewalls are typically configured to prevent system-level requests Supported, DataSet object use XML, which can pass through firewalls Scalability Database locks and active database connections for long durations Disconnected access to database data without retaining database locks KEY MESSAGE: SLIDE BUILDS: None SLIDE SCRIPT: SLIDE TRANSISTION: ADDITIONAL INFORMATION FOR PRESENTER: