CS457/557 Introduction – Chapters 1-2
Relevance of Databases DBs are a part of most decisions in an enterprise – Traditional DBs – Operational – Data Warehouses – Decision Support – NoSQL DBs – Information
Databases Databases play a critical role in? – Business, medicine, industry, etc., – everything? Databases can be? – Traditional, XML, Object-relational, multimedia, real-time, Web, VERY large What databases have you used recently?
Data vs. Databases Data – Recorded known facts, implicit meaning Database (DB) – Collection of related data – Logically coherent – Represents mini-world – Designed, built for specific purpose – Intended user group – Preconceived applications
DBMS Database Management System (DBMS) – Software – Create and maintain a DB – Define types of data – Store on disk controlled by DBMS – Manipulate data
DBMS cont’d Why a DBMS? – Program-data independence – Data abstraction – Conceptual representation – Meta data – Share data – Multiple views – Transaction processing – Higher overhead and increased complexity (Fig. 2.3) So why use a DBMS? – OPTIMIZATION
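A minimal sketch of multiple views and program-data independence, using Python's built-in sqlite3 module. The table, the view names (advisor_view, directory_view), and the sample rows are hypothetical and only serve the illustration. Two user groups each see their own view, and a later change to the stored table does not change what the directory users see.

```python
import sqlite3

# In-memory database; all names and rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (cwid INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Ana", "CS", 3.6), (2, "Ben", "Math", 3.1)],
)

# Multiple views: each user group gets only the columns it needs.
conn.execute("CREATE VIEW advisor_view AS SELECT cwid, name, gpa FROM student")
conn.execute("CREATE VIEW directory_view AS SELECT name, major FROM student")

advisor_rows = conn.execute("SELECT * FROM advisor_view ORDER BY cwid").fetchall()
directory_rows = conn.execute("SELECT * FROM directory_view ORDER BY name").fetchall()

# Program-data independence: the stored table gains a column,
# but a program that reads directory_view is unaffected.
conn.execute("ALTER TABLE student ADD COLUMN address TEXT")
directory_rows_after = conn.execute(
    "SELECT * FROM directory_view ORDER BY name"
).fetchall()

print(advisor_rows)
print(directory_rows == directory_rows_after)
```

The views play the role of external views here: the application code queries the view and never needs to know how the underlying table is laid out.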
Definitions Database System (DBS) – Data + DBMS DBS – Schema (meta-data) - DB description, schema diagram (Fig. 2.1) – Instance (actual data), initially empty – Three-schema architecture (Fig. 2.2) – External view – Conceptual – structure of DB, hides physical – Internal – physical storage, access paths
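The difference between schema and instance can be sketched with Python's built-in sqlite3 module. The course table and its one row are hypothetical. SQLite keeps the schema in its own catalog table (sqlite_master), so the description of the data can be read back separately from the data itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining the table creates the schema (the description of the DB).
conn.execute("CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT)")

def read_schema(connection):
    # sqlite_master is SQLite's catalog; it stores the CREATE statement text.
    return connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'course'"
    ).fetchone()[0]

def count_rows(connection):
    return connection.execute("SELECT COUNT(*) FROM course").fetchone()[0]

schema_before = read_schema(conn)
count_before = count_rows(conn)   # the instance is initially empty

# Loading data changes the instance, not the schema.
conn.execute("INSERT INTO course VALUES ('CS457', 'Hypothetical course title')")

schema_after = read_schema(conn)
count_after = count_rows(conn)

print(schema_before)
print(count_before, count_after)
```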
Data Model Describes the structure: records, types, relationships, constraints, basic operations DBMS based on data model Types: – High-level (conceptual) - ER, UML, OO – Low level (physical) - XML – Implementation (representational) combines conceptual and physical – Relational – NoSQL data models – Column, key-value, document stores
DBMS Languages DDL - data definition language DML - data manipulation language – High-level, nonprocedural – Set at a time – Interactive or embedded (host language) SQL most common/popular DB Language
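A small sketch of DDL and DML statements embedded in a host language, here Python with its built-in sqlite3 module. The account table and its rows are hypothetical. The UPDATE statement is set-at-a-time: it names a condition and the DBMS applies the change to every matching row, with no explicit loop in the host program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)"
)

# DML: insert rows (sample data is made up).
conn.executemany(
    "INSERT INTO account (owner, balance) VALUES (?, ?)",
    [("ana", 100.0), ("ben", 40.0), ("cy", 250.0)],
)

# DML, set at a time: one statement changes every row that matches.
cursor = conn.execute("UPDATE account SET balance = balance + 10 WHERE balance < 200")
rows_changed = cursor.rowcount

# DML, nonprocedural: say which rows are wanted, not how to find them.
low_balances = conn.execute(
    "SELECT owner, balance FROM account WHERE balance < 200 ORDER BY owner"
).fetchall()

print(rows_changed)
print(low_balances)
```

The same statements could be typed interactively into a SQL shell; embedding them in a host language only changes how they are submitted and how results come back.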
DBMS Software to create, query, manipulate data in the database Based on a particular data model Allows for program-data independence Provides language to define, manipulate data Contains meta data
Meta Data Data about the data “Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.” NISO
Meta Data Three categories of meta data (books as example): – Structural metadata: A way to define how objects are put together, for example, how pages are ordered to form chapters. – Administrative metadata: Information to help manage a resource, such as when and how it was created, types, and who has access – Descriptive metadata: A resource for discovery and identification, including elements such as title, abstract, author, and keywords.
Meta Data Structural – Student (Name, CWID, address, GPA, major) Administrative – Owner of data? Account#, when created, modified Descriptive: – Everything but the content – constraints, max/min values?
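As a rough sketch, the Student example can be declared in SQLite through Python's built-in sqlite3 module, and the metadata then read back from the DBMS itself. The column types and the GPA range of 0.0 to 4.0 are assumptions made for this illustration; the slide only lists the attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column types and the CHECK range are assumed, not taken from the slide.
conn.execute(
    """
    CREATE TABLE student (
        name    TEXT NOT NULL,
        cwid    INTEGER PRIMARY KEY,
        address TEXT,
        gpa     REAL CHECK (gpa BETWEEN 0.0 AND 4.0),
        major   TEXT
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

# A constraint is metadata too: the DBMS uses it to reject bad data.
try:
    conn.execute("INSERT INTO student (name, cwid, gpa) VALUES ('Ana', 1, 4.7)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(columns)
print(rejected)
```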
Meta Data Metadata associated with mobile phones: – Phone number of every caller – Time of call – Duration of call – Serial numbers of phones involved – Location of each participant – Telephone calling card numbers
Meta Data – According to the Guardian Metadata associated with emails: – Sender's name, email, and IP address – Recipient's name and email address – Date, time, and time zone – Mail client header formats – Unique identifier of email and related emails – Mail client login records with IP address – Subject of email
Meta Data Metadata associated with Facebook: – Username and unique identifier – User subscriptions – User device – Activity date, time, and time zone – User location – Username and profile bio information including: birthday, hometown, work history, interests
Meta Data Metadata associated with web browsers: – Activity including pages the user visits and when visited – User data and possibly user login details with auto-fill features – User IP address, internet service provider, device hardware details, operating system, and browser version – Cookies and cached data from websites
Meta Data What about medical records?
Additional Characteristics Interfaces Actors – DBA – Designers – Users Naïve or parametric - same info each time Casual - different info each time Sophisticated - implement own applications using databases Standalone – personal DBs using ready-made packages that store personal data
DB classifications Single-user vs. multi-user Centralized vs. distributed Homogeneous vs. heterogeneous Federated DBMS, multidatabase system
DBS Utilities Loading – into DB, conversion tools Backup – copy on durable mass storage DB storage reorganization – of files to improve performance Performance Monitoring – to reorganize, etc.
Extending traditional (relational) databases Need for more complex databases Images, videos, scientific Object-oriented databases Data mining (decision support systems), spatial Data on the web for e-commerce – XML Unstructured or semi-structured data Databases for cloud computing
Application packages Software packages work with database backends (>1 database) Web enabled Examples – Enterprise Resource Planning (ERP) Integrate data and processes of organization Production, sales, distribution, marketing, finance, human resources, etc. – Customer Relationship Management (CRM) Integrate customer information Marketing and customer support
Information Retrieval IR Databases traditionally used for – Banking, insurance, retail, finance, manufacturing, payroll Information retrieval used for – Books, manuscripts, library Searching based on key-words document processing – keywords, categorization, ranking documents
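A toy sketch of keyword search with document ranking, in plain Python. The three documents and the scoring rule (count how often the query keywords occur in each document) are assumptions chosen for brevity; real IR systems use more refined weighting.

```python
from collections import Counter

# Hypothetical mini-collection: document id -> text.
docs = {
    "d1": "database systems and database design",
    "d2": "library catalog of manuscripts",
    "d3": "design of a library database",
}

def rank(query, collection):
    """Return ids of matching documents, best match first."""
    terms = query.lower().split()
    scores = {}
    for doc_id, text in collection.items():
        counts = Counter(text.lower().split())
        # Score = total occurrences of the query keywords in the document.
        score = sum(counts[term] for term in terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

print(rank("database design", docs))
print(rank("library", docs))
```

Unlike a database query, which returns exactly the rows that satisfy a condition, this kind of search returns documents ordered by how well they match.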
Information Retrieval IR Advent of web, IR is exciting again! – Web pages have active objects, change dynamically – New strategies needed Big Data NoSQL
DB Management Issues This course 457/557 – Design/Model DBs Weird course – theory + applications – Relational: Query DBs, Algebra, Normalization We will use Oracle, MySQL – Intro to: Security, performance, transactions, NoSQL Grad course 609 – Redundancy – Integrity constraints and concurrency control (transactions) – Backup and recovery – In depth: performance, NoSQL