Download presentation
Presentation is loading. Please wait.
Published byMelvin Evans Modified over 9 years ago
1
1 Module 3 The concept of data processing Major issues in database management
2
2 Learning Objectives Explain the importance of implementing data resource management processes and technologies in an organization. Understand the advantages of a database management approach to managing the data resources of a business.
3
3 Learning Objectives (continued) Explain how database management software helps business professionals and supports the operations and management of a business. Illustrate each of the following concepts: Major types of databases Data warehouses and data mining Logical data elements Fundamental database structures Database access methods Database development
4
4 Section I Managing Data Resources
5
5 Data Resource Management A managerial activity Applies information systems technology to managing data resources to meet needs of business stakeholders.
6
6 Foundation Data Concepts Levels of data Character Single alphabetical, numeric, or other symbol Field Groupings of characters Represents an attribute of some entity
7
7 Foundation Data Concepts (continued) Records Related fields of data Collection of attributes that describe an entity Fixed-length or variable-length
8
8 Foundation Data Concepts (continued) Files (table) A group of related records Classified by Primary use Type of data permanence
9
9 Foundation Data Concepts (continued) Database Integrated collection of logically related data elements Consolidates records into a common pool of data elements Data is independent of the application program using them and type of storage device
10
10 Foundation Data Concepts (continued) Logical Data Elements
11
11 Types of Databases Operational Supports business processes and operations Also called subject-area databases, transaction databases, and production databases
12
12 Types of Databases (continued) Distributed Replicated and distributed copies or parts of databases on network servers at a variety of sites. Done to improve database performance and security
13
13 Types of Databases (continued) External Available for a fee from commercial sources or with or without charge on the Internet or World Wide Web Hypermedia Hyperlinked pages of multimedia
14
14 Data Warehouses and Data Mining Data warehouse Stores data extracted from operational, external, or other databases of an organization Central source of “structured” data May be subdivided into data marts
15
15 Data Warehouses and Data Mining (continued) Data mining A major use of data warehouse databases Data is analyzed to reveal hidden correlations, patterns, and trends
16
16 Database Management Approach Consolidates data records and objects into databases that can be accessed by many different application programs
17
17 Database Management Approach (continued) Database Management System Software interface between users and databases Controls creation, maintenance, and use of the database
18
18 Database Management Approach (continued)
19
19 Database Management Approach (continued) Database Interrogation Query Supports ad hoc requests Tells the software how you want to organize the data SQL queries Graphical (GUI) & natural queries
20
20 Database Management Approach (continued) Report Generator Turns results of query into a useable report Database Maintenance Updating and correcting data
21
21 Database Management Approach (continued) Application Development Data manipulation language Data entry screens, forms, reports, or web pages
22
22 Implementing Data Resource Management Database Administration Develop and maintain the data dictionary Design and monitor performance of databases Enforce database use and security standards
23
23 Implementing Data Resource Management (continued) Data Planning Corporate planning and analysis function Developing the overall data architecture
24
24 Implementing Data Resource Management (continued) Data Administration Standardize collection, storage, and dissemination of data to end users Focused on supporting business processes and strategic business objectives May include developing policy and setting standards
25
25 Implementing Data Resource Management (continued) Challenges Technologically complex Vast amounts of data Vulnerability to fraud, errors, and failures
26
26 Section II Technical Foundations of Database Management
27
27 Database Structures Hierarchical Treelike One-to-many relationship Used for structured, routine types of transaction processing
28
28 Database Structures (continued) Network More complex Many-to-many relationship More flexible but doesn’t support ad hoc requests well
29
29 Database Structures (continued) Relational Data elements stored in simple tables Can link data elements from various tables Very supportive of ad hoc requests but slower at processing large amounts of data than hierarchical or network models
30
30 Database Structures (continued) Multi-Dimensional A variation of the relational model Cubes of data and cubes within cubes Popular for online analytical processing (OLAP) applications
31
31 Database Structures (continued)
32
32 Database Structures (continued) Object-oriented Key technology of multimedia web-based applications Good for complex, high-volume applications
33
33 Database Structures (continued)
34
34 Accessing Databases Key fields (primary key) A field unique to each record so it can be distinguished from all other records in a table
35
35 Accessing Databases (continued) Sequential access Data is stored and accessed in a sequence according to a key field Good for periodic processing of a large volume of data, but updating with new transactions can be troublesome
36
36 Accessing Databases (continued) Direct access Methods Key transformation Index Indexed sequential access
37
37 Database Development Data dictionary Directory containing metadata (data about data) Structure Data elements Interrelationships Information regarding access and use Maintenance & security issues
38
38 Database Development (continued) Data Planning & Database Design Planning & Design Process Enterprise model Entity relationship diagrams (ERDs) Data modeling Develop logical framework for the physical design
39
39 Discussion Questions How should an e-business enterprise store, access, and distribute data & information about their internal operations & external environment? What roles do database management, data administration, and data planning play in managing data as a business resource?
40
40 Discussion Questions (continued) What are the advantages of a database management approach to organizing, accessing, and managing an organization’s data resources? What is the role of a database management system in an e-business information system?
41
41 Discussion Questions (continued) Databases of information about a firm’s internal operations were formerly the only databases that were considered to be important to a business. What other kinds of databases are important for a business today? What are the benefits and limitations of the relational database model for business applications?
42
42 Discussion Questions (continued) Why is the object-oriented database model gaining acceptance for developing applications and managing the hypermedia databases at business websites? How have the Internet, intranets, extranets, and the World Wide Web affected the types and uses of data resources available to business end users?
43
43 Real World Case 1 – IBM versus Oracle What key business strategies did Janet Perna implement to help IBM catch up to Oracle in the database management software market? What is the business case for both IBM’s and Oracle’s product strategy for their database software?
44
44 Real World Case 1 (continued) Which approach would you recommend to a company seeking a database system today? What do you see as the key factor to IBM’s success?
45
45 Real World Case 1 (continued) The case states that “database software has become more of a commodity.” Do you agree?
46
46 Real World Case 2 – Experian Automotive How do the database software tools discussed in this case help companies exploit their data resources? What is the business value of the automotive database created by Experian?
47
47 Real World Case 2 (continued) What other business opportunities could you recommend to Experian that would capitalize on their automotive database? The case states that Experian’s automotive database “has raised the hackles of privacy advocates.” What legitimate privacy concerns and safeguard suggestions might be raised about this database and its use?
48
48 Real World Case 3 – Shell Exploration Why do companies still have problems with the quality of the data resources stored in their business information systems? What is a “data silo?”
49
49 Real World Case 3 (continued) How do data warehouse approaches help companies like Shell and OshKosh meet their data resource management challenges? What business benefits can companies derive from a data warehouse approach?
50
50 Real World Case 4 – BlueCross BlueShield & Warner Bros. What is a storage area network? Why are so many companies installing SANs? What are the reasons for the quick payback on SAN investments?
51
51 Real World Case 4 (continued) What are the challenges and alternatives to SANs as a data storage technology? What are some advantages of SANs?
52
52 Real World Case 5 – Sherwin-Williams & Krispy Kreme Tips for Managing External Data Purchase external data from a reliable source that will do most of the refining for you and will work with you on contingency plans. Run a test load first. A load of test data can pave the way for accurate production loads.
53
53 Real World Case 5 (continued) Managing external data (continued) Don’t collect data until business and IT staff have agreed on the amount, frequency, format, and content of the data you need. Don’t acquire more data or use more data sources than you really need.
54
54 Real World Case 5 (continued) Managing external data (continued) Don’t mingle external and homegrown data without adding unique identifiers to each record, in case you need to pull it out. Don’t overestimate the data’s integrity. Nothing beats direct customer contact and tactical details behind the data.
55
55 Real World Case 5 (continued) What challenges in acquiring and using data from external sources are identified in this case? Do you prefer the Sherwin-Williams or Krispy Kreme approach to acquiring external data?
56
56 Real World Case 5 (continued) What other sources of external data might a business use to gain valuable marketing and competitive intelligence?
57
57 CS 317 - Data Management and Information Processing
58
58 What Is a Database System? Database: a very large, integrated collection of data. Models a real-world enterprise Entities (e.g., teams, games) Relationships (e.g., The Forty-Niners are playing in The Superbowl) More recently, also includes active components, often called “business logic”. (e.g., the BCS ranking system) A Database Management System (DBMS) is a software system designed to store, manage, and facilitate access to databases.
59
59 Database Systems: Then
60
60 Database Systems: Today From Friendster.com on-line tour
61
61 Other Ways Databases Make Life Better? “Players could finally sign up for the Star Wars Galaxies game last week as Sony opened up registration to the public.” “Once players got in to the game they found that the game servers were offline because of database problems.” “Some players spent hours tuning their in-game characters only to find that crashes deleted all their hard work.” Source: BBC News Online, July 1, 2003.
62
62 Other databases you may use
63
63 Is the WWW a DBMS? Fairly sophisticated search available crawler indexes pages on the web Keyword-based search for pages But, currently data is mostly unstructured and untyped search only: can’t modify the data can’t get summaries, complex combinations of data few guarantees provided for freshness of data, consistency across data items, fault tolerance, … Web sites typically have a DBMS in the background to provide these functions. The picture is changing New standards e.g., XML, Semantic Web can help data modeling Research groups (e.g., at Berkeley) are working on providing some of this functionality across multiple web sites. =
64
64 “Search” vs. Query What if you wanted to find out which actors donated to John Kerry’s presidential campaign? Try “actors donated to john kerry” in your favorite search engine.
65
65 A “Database Query” Approach
66
66 Why Study Databases?? Shift from computation to information always true for corporate computing Web made this point for personal computing more and more true for scientific computing Need for DBMS has exploded in the last years Corporate: retail swipe/clickstreams, “customer relationship mgmt”, “supply chain mgmt”, “data warehouses”, etc. Scientific: digital libraries, Human Genome project, NASA Mission to Planet Earth, physical sensors, grid physics network DBMS encompasses much of CS in a practical discipline OS, languages, theory, AI, multimedia, logic Yet traditional focus on real-world apps ?
67
67 What’s the intellectual content? representing information data modeling languages and systems for querying data complex queries with real semantics* over massive data sets concurrency control for data manipulation controlling concurrent access ensuring transactional semantics reliable data storage maintain data semantics even if you pull the plug * semantics: the meaning or relationship of meanings of a sign or set of signs
68
68 Describing Data: Data Models A data model is a collection of concepts for describing data. A schema is a description of a particular collection of data, using a given data model. The relational model of data is the most widely used model today. Main concept: relation, basically a table with rows and columns. Every relation has a schema, which describes the columns, or fields.
69
69 Levels of Abstraction Views describe how users see the data. Conceptual schema defines logical structure Physical schema describes the files and indexes used. (sometimes called the ANSI/SPARC model) Physical Schema Conceptual Schema View 1View 2View 3 DB Users
70
70 Example: University Database Conceptual schema: Students(sid: string, name: string, login: string, age: integer, gpa:real) Courses(cid: string, cname:string, credits:integer) Enrolled(sid:string, cid:string, grade:string) External Schema (View): Course_info(cid:string,enrollment:intege r) Physical schema: Relations stored as unordered files. Index on first column of Students. Physical Schema Conceptual Schema View 1View 2View 3 DB
71
71 Data Independence Applications insulated from how data is structured and stored. Logical data independence: Protection from changes in logical structure of data. Physical data independence: Protection from changes in physical structure of data. Q: Why are these particularly important for DBMS? Physical Schema Conceptual Schema View 1View 2View 3 DB
72
72 Queries, Query Plans, and Operators System handles query plan generation & optimization; ensures correct execution. SELECT eid, ename, title FROM Emp E WHERE E.sal > $50K SELECT E.loc, AVG(E.sal) FROM Emp E GROUP BY E.loc HAVING Count(*) > 5 SELECT COUNT DISTINCT (E.eid) FROM Emp E, Proj P, Asgn A WHERE E.eid = A.eid AND P.pid = A.pid AND E.loc <> P.loc Issues: view reconciliation, operator ordering, physical operator choice, memory management, access path (index) use, … EmployeesProjectsAssignments Emp Select Emp Group(agg) HavingEmp Count distinct Asgn Join Join Proj
73
73 Concurrency Control Concurrent execution of user programs: key to good DBMS performance. Disk accesses frequent, pretty slow Keep the CPU working on several programs concurrently. Interleaving actions of different programs: trouble! e.g., account-transfer & print statement at same time DBMS ensures such problems don’t arise. Users/programmers can pretend they are using a single-user system. (called “Isolation”) Thank goodness! Don’t have to program “very, very carefully”.
74
74 Transactions: ACID Properties Key concept is a transaction: a sequence of database actions (reads/writes). DBMS ensures atomicity (all-or-nothing property) even if system crashes in the middle of a Xact. Each transaction, executed completely, must take the DB between consistent states or must not run at all. DBMS ensures that concurrent transactions appear to run in isolation. DBMS ensures durability of committed Xacts even if system crashes. Note: can specify simple integrity constraints on the data. The DBMS enforces these. Beyond this, the DBMS does not understand the semantics of the data. Ensuring that a single transaction (run alone) preserves consistency is largely the user’s responsibility!
75
75 Ensuring Transaction Properites DBMS ensures atomicity (all-or-nothing property) even if system crashes in the middle of a Xact. DBMS ensures durability of committed Xacts even if system crashes. Idea: Keep a log (history) of all actions carried out by the DBMS while executing a set of Xacts: Before a change is made to the database, the corresponding log entry is forced to a safe location. After a crash, the effects of partially executed transactions are undone using the log. Effects of committed transactions are redone using the log. trickier than it sounds!
76
76 The Log The following actions are recorded in the log: Ti writes an object: the old value and the new value. Log record must go to disk before the changed page! Ti commits/aborts: a log record indicating this action. Log is often duplexed and archived on “stable” storage. All log related activities (and in fact, all concurrency control related activities such as lock/unlock, dealing with deadlocks etc.) are handled transparently by the DBMS.
77
77 Structure of a DBMS A typical DBMS has a layered architecture. The figure does not show the concurrency control and recovery components. Each database system has its own variations. Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB These layers must consider concurrency control and recovery
78
78 Advantages of a DBMS Data independence Efficient data access Data integrity & security Data administration Concurrent access, crash recovery Reduced application development time So why not use them always? Expensive/complicated to set up & maintain This cost & complexity must be offset by need General-purpose, not suited for special-purpose tasks (e.g. text search!)
79
79 …must understand how a DBMS works Databases make these folks happy... DBMS vendors, programmers Oracle, IBM, MS, Sybase, … End users in many fields Business, education, science, … DB application programmers Build enterprise applications on top of DBMSs Build web services that run off DBMSs Database administrators (DBAs) Design logical/physical schemas Handle security and authorization Data availability, crash recovery Database tuning as needs evolve
80
80 Summary (part 1) DBMS used to maintain, query large datasets. can manipulate data and exploit semantics Other benefits include: recovery from system crashes, concurrent access, quick application development, data integrity and security. Levels of abstraction provide data independence Key when dapp/dt << dplatform/dt
81
81 Summary, cont. DBAs, DB developers the bedrock of the information economy DBMS R&D represents a broad, fundamental branch of the science of computation
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.