IS 325 Notes for Thursday September 7, 2017
Data as a Resource Proper delivery of information not only depends on the capabilities of the computer hardware and software but also on the organization’s ability to manage data as an important organizational resource.
Topics DBMS and DB Applications Database, Data, and System Administration Database professionals DBA Tasks Types of DBAs Test and Production Environments
DB Applications Data is the lifeblood of computerized applications In many ways, business today is data Using a DBMS is an efficient way to persist and manipulate data Database professionals are at the center of the development lifecycle
Database vs. DBMS Database: An organized store of data; data can be accessed by name DBMS: Software that enables users or programmers to share and manage data
Enterprise IT Infrastructure – a big picture
Application Development Lifecycle
Data Administration Concentrate on the business aspects Business lexicon logical data model Requirements gathering, analysis, and design Typical tasks include Identifying and cataloging business data Producing conceptual and logical data models Creating enterprise data model Setting data policies and standards Concerns more about metadata
Metadata Metadata is often described as data about data Definition Business name Abbreviation Type and length/accuracy Domain, or range of valid values
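The metadata items listed above can be pictured as one record per data element. The following is only a minimal sketch: the class name `MetadataEntry`, its field names, and the "Credit Hours" element are hypothetical and chosen for illustration, not taken from any particular data dictionary tool.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MetadataEntry:
    """One metadata record describing a single data element (hypothetical layout)."""
    business_name: str
    abbreviation: str
    definition: str
    data_type: type
    max_length: Optional[int] = None                   # length limit, if any
    valid_range: Optional[Tuple[float, float]] = None  # inclusive numeric domain

    def accepts(self, value) -> bool:
        """Return True if the value fits the declared type, length, and domain."""
        if not isinstance(value, self.data_type):
            return False
        if self.max_length is not None and len(str(value)) > self.max_length:
            return False
        if self.valid_range is not None:
            low, high = self.valid_range
            if not (low <= value <= high):
                return False
        return True


# Hypothetical data element: credit hours for a course.
credit_hours = MetadataEntry(
    business_name="Credit Hours",
    abbreviation="CR_HRS",
    definition="Number of credit hours a course is worth",
    data_type=int,
    valid_range=(1, 6),
)
```

Calling `credit_hours.accepts(3)` checks a candidate value against the metadata rather than against the data itself, which is the distinction the slide is drawing.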
Data Models – three levels Conceptual model: Outlines data requirements at a very high level; describes data mostly in business context Logical model: Provides in-depth details of data types, lengths, relationships, and cardinality Physical model: Defines the way data is organized on the physical medium
DBA vs. DA DBAs have to take care of the first two model levels (conceptual and logical) as well if no DA role is implemented in the organization
System Administration SAs are more concerned with the installation and setup of the DBMS Typical SA tasks include Underlying operating systems DBMS installation, modification, and support System configurations enabling the DBMS to work with other software systems
DBA Tasks Ensuring data and databases are useful, usable, available, and correct Typical DBA tasks include DB design and implementation Performance monitoring and tuning Availability DB security and authorization Backup and recovery Data integrity DBMS release migration
DB Design & Implementation Understand and adhere to sound relational design principles Relational theories and ER diagrams DBMS specifics Understanding conceptual/logical models and being able to transform to physical DB implementation Poor design can result in poor performance
Performance Monitoring & Tuning Performance = the rate at which the DBMS supplies info to its users Influenced by five factors Workload Throughput Resources Optimization Contention
Availability Multifaceted process Keep the DBMS up and running Minimize the downtime required for admin tasks Technologies and up-front planning can help
DB Security & Authorization Ensure data is available only to authorized users by granting privileges to different (groups of) users Actions need to be controlled Creating/altering DB objects and/or their structures Reading/modifying data from tables Starting/stopping DB and/or associated objects Running stored procedures or DB utilities
Backup & Recovery Be prepared to recover DB in the event of Improper shutdown of DB applications, due to Software error Human error Hardware failure Types of recovery Recover to current Point-in-time recovery Transaction recovery
Data Integrity Store the correct data in the correct way Physical integrity: domains and data types Semantic integrity: quality data with no redundancy Internal integrity: internal structures and code
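Physical integrity (domains and data types) is usually enforced by the DBMS itself. A minimal sketch using Python's built-in `sqlite3` module is shown here; the `enrollment` table, its columns, and the helper `try_insert` are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL,
        grade      TEXT    NOT NULL
                   CHECK (grade IN ('A', 'B', 'C', 'D', 'F'))  -- the domain
    )
    """
)


def try_insert(student_id, grade):
    """Insert one row; return False if the DBMS rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO enrollment (student_id, grade) VALUES (?, ?)",
                (student_id, grade),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when a CHECK or NOT NULL constraint is violated.
        return False
```

Here the domain rule lives in the table definition, so every application that writes to the table is held to the same rule.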
Database Pro Desired Skill Set SQL + programming languages (PL/SQL, others) System specific operations and practices Data modeling methodologies and tools Networking (client/server) O/S Programming (conventional and web-oriented) Transactional/messaging systems
Types of Database Pros System DBA DB architect DB analyst Data modeler Application DBA Task-oriented DBA Data warehouse administrator
Test & Production At least two (and perhaps three) separate environments must be created for quality DB implementation Testing (aka development) QA (aka staging) Production Differences They should share the same configuration They don’t need to be identical Testing DB may have only a subset of data
Multiple DB Environments
Traditional Administration Definitions Data Administration: A high-level function that is responsible for the overall management of data resources in an organization, including maintaining corporate-wide definitions and standards Database Administration: A technical function that is responsible for physical database design and for dealing with technical issues such as security enforcement, database performance, and backup and recovery
Traditional Data Administration Functions Data policies, procedures, standards Planning Data conflict (ownership) resolution Managing the information repository Internal marketing of DA concepts
Traditional Database Administration Functions Selection of DBMS and software tools Installing/upgrading DBMS Tuning database performance Improving query processing performance Managing data security, privacy, and integrity Data backup and recovery
Evolving Approaches to Data Administration Blend data and database administration into one role Fast-track development – monitoring development process (analysis, design, implementation, maintenance) Procedural DBAs–managing quality of triggers and stored procedures eDBA–managing Internet-enabled database applications PDA DBA–data synchronization and personal database management Data warehouse administration
Data Warehouse Administration Similar to DA/DBA roles Emphasis on integration and coordination of metadata/data across many data sources Specific roles: Support DSS applications Manage data warehouse growth Establish service level agreements regarding data warehouses and data marts
Open Source DBMSs An alternative to proprietary packages such as Oracle, Microsoft SQL Server, or Microsoft Access MySQL is an example of an open-source DBMS Less expensive than proprietary packages Source code is available for modification
Database Security Database Security: Protection of the data against accidental or intentional loss, destruction, or misuse Increased difficulty due to Internet access and client/server technologies
Locations of data security threats
Threats to Data Security Accidental losses attributable to: Human error Software failure Hardware failure Theft and fraud Improper data access: Loss of privacy (personal data) Loss of confidentiality (corporate data) Loss of data integrity Loss of availability (through, e.g. sabotage)
Internet security
Web Security Static HTML files are easy to secure Standard database access controls Place Web files in protected directories on server Dynamic pages are harder Control of CGI scripts User authentication Session security SSL for encryption Restrict number of users and open ports Remove unnecessary programs
Database Software Security Features Views or subschemas Integrity controls Authorization rules User-defined procedures Encryption Authentication schemes Backup, journalizing, and checkpointing
Views and Integrity Controls Views: Subset of the database that is presented to one or more users User can be given access privilege to a view without being given access privilege to the underlying tables Integrity Controls: Protect data from unauthorized use Domains–set allowable values Assertions–enforce database conditions
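A view as a restricted subset can be sketched with Python's built-in `sqlite3` module. The `employee` table, the `employee_directory` view, and the sample rows are hypothetical. Note one assumption: SQLite has no user privileges, so this only shows the view hiding a column; in a multi-user DBMS the access privilege would be granted on the view rather than on the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary REAL NOT NULL CHECK (salary >= 0)
    );
    INSERT INTO employee VALUES (1, 'Ana', 'Sales', 52000);
    INSERT INTO employee VALUES (2, 'Raj', 'IT', 61000);

    -- The view exposes everything except the salary column.
    CREATE VIEW employee_directory AS
        SELECT emp_id, name, dept FROM employee;
    """
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
```

A user who can query only `employee_directory` never sees `salary`, even though the data still lives in the underlying table.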
Authorization Rules Controls incorporated in the data management system Restrict: access to data actions that people can take on data Authorization matrix for: Subjects Objects Actions Constraints
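The authorization matrix described above (subjects, objects, actions, constraints) can be sketched as a simple lookup table. Everything named here is hypothetical: the subjects, objects, the credit-limit constraint, and the function `is_authorized` are illustrative assumptions, not the interface of any real DBMS.

```python
from typing import Callable, Dict, Optional, Tuple

# A constraint is an optional test on the details of the request.
Constraint = Optional[Callable[[dict], bool]]

# (subject, object, action) -> constraint; None means "no constraint".
AUTH_MATRIX: Dict[Tuple[str, str, str], Constraint] = {
    ("sales_dept", "customer_record", "insert"):
        lambda ctx: ctx.get("credit_limit", 0) <= 5000,
    ("order_entry", "customer_record", "read"): None,
    ("accounting", "order_record", "modify"): None,
}


def is_authorized(subject: str, obj: str, action: str,
                  context: Optional[dict] = None) -> bool:
    """Allow a request only if the matrix has a matching row
    and that row's constraint (if any) is satisfied."""
    key = (subject, obj, action)
    if key not in AUTH_MATRIX:
        return False  # anything not listed is denied
    constraint = AUTH_MATRIX[key]
    return constraint is None or constraint(context or {})
```

The design choice worth noticing is the default: a combination that does not appear in the matrix is denied rather than allowed.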
Authorization Matrix
Implementing Authorization Rules Authorization table for subjects (salespeople) Authorization table for objects (orders)
DBMS Privileges
Authentication Schemes Goal – obtain a positive identification of the user Passwords: First line of defense Should be at least 8 characters long Should combine alphabetic and numeric data Should not be complete words or personal information Should be changed frequently
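The password guidelines above can be turned into a simple checker. This is only a sketch: the function name `password_problems` and the tiny `COMMON_WORDS` list are hypothetical, and a real check against complete words or personal information would need a much larger dictionary.

```python
# Hypothetical, deliberately tiny word list for illustration only.
COMMON_WORDS = {"password", "welcome", "database"}


def password_problems(password: str) -> list:
    """Return the guidelines a password violates (empty list = passes)."""
    problems = []
    if len(password) < 8:
        problems.append("shorter than 8 characters")
    if not any(c.isalpha() for c in password):
        problems.append("no alphabetic characters")
    if not any(c.isdigit() for c in password):
        problems.append("no numeric characters")
    if password.lower() in COMMON_WORDS:
        problems.append("is a complete word")
    return problems
```

The "changed frequently" guideline is not covered here because it depends on stored password history, not on the password text.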
Strong Authentication Passwords are flawed: Users share them with each other They get written down, could be copied Automatic logon scripts remove need to explicitly type them in Unencrypted passwords travel the Internet
Strong Authentication Possible solutions: Two factor–e.g. smart card plus PIN Three factor–e.g. smart card, biometric, PIN Biometric devices–use of fingerprints, retinal scans, etc. for positive ID Third-party mediated authentication–using secret keys, digital certificates
Security Policies and Procedures Personnel controls Hiring practices, employee monitoring, security training Physical access controls Equipment locking, check-out procedures, screen placement Maintenance controls Maintenance agreements, access to source code, quality and availability standards Data privacy controls Adherence to privacy legislation, access rules
Database Recovery Mechanism for restoring a database quickly and accurately after loss or damage Recovery facilities: Backup Facilities Journalizing Facilities Checkpoint Facility Recovery Manager
Back-up Facilities Automatic dump facility that produces backup copy of the entire database Periodic backup (e.g. nightly, weekly) Cold backup–database is shut down during backup Hot backup–selected portion is shut down and backed up at a given time Backups stored in secure, off-site location
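As a small illustration of a backup copy of an entire database, Python's built-in `sqlite3` module provides `Connection.backup()`, which copies a database while the source connection stays open. The `orders` table and its row are hypothetical, and both databases are kept in memory here only so the sketch is self-contained; a real backup target would be a file stored in a secure, off-site location.

```python
import sqlite3

# Source database with one committed row.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total REAL)")
source.execute("INSERT INTO orders VALUES (1, 19.99)")
source.commit()

# Copy the entire database into a second connection.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate damage: the data is lost in the source after the backup.
source.execute("DELETE FROM orders")
source.commit()

# The backup still holds the state as of the time it was taken.
restored_rows = backup_copy.execute(
    "SELECT order_id, total FROM orders"
).fetchall()
remaining_rows = source.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Anything written after the backup was taken is not in the copy, which is why backups are combined with the logs described next.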
Journalizing Facilities Audit trail of transactions and database updates Transaction log–record of essential data for each transaction processed against the database Database change log–images of updated data Before-image–copy before modification After-image–copy after modification Produces an audit trail
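The before-image and after-image idea can be sketched with a toy key-value store that logs every change. The class names `ChangeLogEntry` and `JournaledStore` are hypothetical, and the sketch makes simplifying assumptions: a before-image of `None` means the key did not exist, and transactions do not interleave on the same key.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChangeLogEntry:
    txn_id: int
    key: str
    before_image: Optional[Any]  # value before modification (None = absent)
    after_image: Optional[Any]   # value after modification


class JournaledStore:
    """Toy store that records before/after images of every write."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.change_log: List[ChangeLogEntry] = []

    def write(self, txn_id: int, key: str, value: Any) -> None:
        # Log first, then apply the change.
        self.change_log.append(
            ChangeLogEntry(txn_id, key, self.data.get(key), value)
        )
        self.data[key] = value

    def undo(self, txn_id: int) -> None:
        """Back out one transaction by reapplying its before-images,
        newest change first."""
        for entry in reversed(self.change_log):
            if entry.txn_id != txn_id:
                continue
            if entry.before_image is None:
                self.data.pop(entry.key, None)
            else:
                self.data[entry.key] = entry.before_image


store = JournaledStore()
store.write(1, "balance", 100)
store.write(2, "balance", 70)
store.write(2, "fee", 5)
store.undo(2)  # transaction 2 is backed out; transaction 1 remains
```

Before-images support undoing changes, while after-images support reapplying them on top of a restored backup.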
Audit trails From the backup and logs, databases can be restored in case of damage or loss
Checkpoint Facilities DBMS periodically refuses to accept new transactions, so the system is in a quiet state Database and transaction logs are synchronized This allows the recovery manager to resume processing from a short period back, instead of repeating the entire day
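Why a checkpoint shortens recovery can be sketched as follows. The log format, the `CHECKPOINT` marker, and the function `recover` are hypothetical simplifications: the snapshot stands for the database as synchronized at the last checkpoint, and only log records written after that checkpoint are replayed.

```python
from typing import Dict, List, Tuple

CHECKPOINT = ("CHECKPOINT",)  # hypothetical marker record in the log


def recover(snapshot: Dict[str, int],
            log: List[tuple]) -> Tuple[Dict[str, int], int]:
    """Rebuild the state from the checkpoint snapshot.

    Returns the recovered state and the number of log records replayed.
    """
    # Find the position of the most recent checkpoint in the log.
    last_checkpoint = -1
    for position, record in enumerate(log):
        if record == CHECKPOINT:
            last_checkpoint = position

    state = dict(snapshot)
    replayed = 0
    # Replay only what happened after the checkpoint.
    for _, key, value in log[last_checkpoint + 1:]:
        state[key] = value
        replayed += 1
    return state, replayed


log = [("SET", "a", 1), ("SET", "b", 2), CHECKPOINT, ("SET", "a", 5)]
snapshot_at_checkpoint = {"a": 1, "b": 2}
recovered_state, records_replayed = recover(snapshot_at_checkpoint, log)
```

In this sketch only one of the three updates has to be replayed, because the first two were already reflected in the database when the checkpoint was taken.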