1
Introduction to Computing Lecture # 13
2
Outline
Managing Files: Basic Concepts
Database Definition
Database Management Systems
Database Models
Data Mining
3
Managing Files: Basic Concepts
Data storage hierarchy: the levels of data stored in a computer, from highest to lowest:
Database
Files
Records
Fields
Characters (bytes)
Bits
4
Managing Files: Basic Concepts
Database – an organized collection of integrated files.
File – a collection of related records.
Record – a collection of related fields. Often called a row.
Field – a unit (individual piece) of data consisting of one or more characters (bytes). Often called a column.
Character (byte) – a letter, number, or special character.
Key field – a field that is chosen to uniquely identify a record so that it can be easily retrieved and processed.
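A minimal Python sketch of the hierarchy above, assuming a hypothetical student file (the field names and values are invented for illustration). It treats a file as a list of records, a record as a set of named fields, and uses a key field to retrieve exactly one record.

```python
# A "file": a collection of related records (all data here is made up).
students = [
    {"student_id": 101, "name": "Aisha", "major": "Physics"},
    {"student_id": 102, "name": "Bilal", "major": "History"},
]


def find_by_key(file_records, key_field, key_value):
    """Return the record whose key field equals key_value, or None."""
    for record in file_records:
        if record[key_field] == key_value:
            return record
    return None


# student_id is the key field: it uniquely identifies each record.
record = find_by_key(students, "student_id", 102)

name_field = record["name"]             # a field: one piece of data
first_char = name_field[0]              # a character
bits = format(ord(first_char), "08b")   # the eight bits that encode it
```

Walking down from `students` to `bits` follows the same levels the slide lists: file, record, field, character, bits.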
5
Managing Files: Basic Concepts
[Figure: a sample table with a field name, a record, and a field labeled.]
6
Database Definition
A structured set of data held in a computer. (Pocket Oxford Dictionary)
An organized collection of related (integrated) files. (Williams and Sawyer)
A database is a collection of related data or facts. (Peter Norton)
7
Database Management Systems
Database management system (DBMS) – programs that control the structure of a database and access to the data. (Williams and Sawyer)
A DBMS is a collection of programs that control the database. (Peter Norton)
Advantages of DBMSs:
File sharing
Reduced data redundancy. Data redundancy – a situation in which the same data fields appear in many different files, often in different formats.
Improved data integrity. Data integrity – a measure of how accurate, consistent, and up-to-date data is.
Increased security
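One way a DBMS can improve data integrity is by rejecting rows that break declared rules. The sketch below uses Python's built-in sqlite3 module as a small stand-in for a DBMS; the customer table and its columns are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for a full DBMS here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"   # key field: must be unique
    " email TEXT NOT NULL UNIQUE)"        # must be present and unique
)
conn.execute("INSERT INTO customer VALUES (1, 'a@example.com')")


def try_insert(row):
    """Attempt an insert; report whether the DBMS accepted the row."""
    try:
        conn.execute("INSERT INTO customer VALUES (?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_key_ok = try_insert((1, "b@example.com"))  # reuses key 1
missing_email_ok = try_insert((2, None))             # violates NOT NULL
valid_ok = try_insert((3, "c@example.com"))          # satisfies all rules
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Because the rules live in the database rather than in each program that uses it, every program sharing the data is held to the same checks.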
8
Database Models
Just as files can be organized in different ways, so databases can be organized in ways to best fit their use. The four most common arrangements are:
Hierarchical
Network
Relational
Object-oriented
9
Database Models
Hierarchical database – fields or records are arranged in related groups, resembling a family tree, with child (lower-level) records subordinate to parent (higher-level) records.
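The family-tree arrangement can be sketched in Python with a mapping from each child record to its single parent. The company, departments, and employees below are hypothetical, and a real hierarchical DBMS stores and navigates such links very differently.

```python
# Each child record has exactly one parent record (names are made up).
parent_of = {
    "Sales": "Company",
    "Engineering": "Company",
    "Alice": "Sales",
    "Bob": "Engineering",
}


def path_to_root(node):
    """Follow parent links upward until a record with no parent is reached."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


def children_of(parent):
    """List the records directly subordinate to the given parent."""
    return sorted(child for child, p in parent_of.items() if p == parent)


alice_path = path_to_root("Alice")
company_children = children_of("Company")
```

Since every child has only one parent, there is exactly one path from any record up to the top of the tree.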
10
Database Models
Network database – similar to a hierarchical database, but each child record can have more than one parent record.
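The difference from the hierarchical model can be sketched by letting each child map to a set of parents instead of a single one. The students and courses below are hypothetical, and this is only an illustration of the idea, not of how a network DBMS is implemented.

```python
# Each child record may have several parent records (names are made up).
parents_of = {
    "Alice": {"Databases", "Networks"},   # one student, two courses
    "Bob": {"Databases"},
}


def children_of(parent):
    """List every child record linked to the given parent record."""
    return sorted(
        child for child, parents in parents_of.items() if parent in parents
    )


alice_parent_count = len(parents_of["Alice"])
database_students = children_of("Databases")
network_students = children_of("Networks")
```

Here the record for Alice is reachable from two different parents, which a strict family-tree structure would not allow.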
11
Database Models
Relational database – a database that relates (connects) data in different files through the use of a key field, or common data element.
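A short sketch of relating two files through a key field, using Python's built-in sqlite3 module. The customer and purchase tables and their contents are hypothetical; the common data element is customer_id, which appears in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER,              -- common data element
        item        TEXT
    );
    INSERT INTO customer VALUES (1, 'Aisha'), (2, 'Bilal');
    INSERT INTO purchase VALUES
        (10, 1, 'laptop'), (11, 2, 'printer'), (12, 1, 'mouse');
    """
)

# Connect each purchase to its customer through the shared key field.
rows = conn.execute(
    "SELECT c.name, p.item "
    "FROM customer AS c "
    "JOIN purchase AS p ON p.customer_id = c.customer_id "
    "ORDER BY p.purchase_id"
).fetchall()
```

The customer's name is stored once, in the customer table, and each purchase refers to it by key instead of repeating it.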
12
Database Models
SQL (Structured Query Language) – the standard language used to create, modify, maintain, and query relational databases.
SQL is often pronounced “sequel.” How did this acronym get such an unlikely pronunciation? The language was developed at IBM in the 1970s under the name SEQUEL (Structured English Query Language), with a later version called SEQUEL/2. The name was afterwards shortened to SQL, but the original pronunciation stuck.
E. F. Codd is considered the “father” of relational database management systems, the most common model of databases. His article “A Relational Model of Data for Large Shared Data Banks” was published in the June 1970 issue of “Communications of the ACM.”
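The four uses of SQL named above (create, modify, maintain, query) can be sketched with a few statements run through Python's built-in sqlite3 module. The book table and its rows are hypothetical, and other DBMS products accept slightly different SQL dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create: define the structure of a table.
conn.execute(
    "CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT, copies INTEGER)"
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [("111", "SQL Basics", 2), ("222", "Data Mining", 0), ("333", "Networks", 5)],
)

# Modify: change data that is already stored.
conn.execute("UPDATE book SET copies = copies + 3 WHERE isbn = ?", ("222",))

# Maintain: remove a record that is no longer needed.
conn.execute("DELETE FROM book WHERE isbn = ?", ("333",))

# Query: ask which titles are currently available.
available = conn.execute(
    "SELECT title, copies FROM book WHERE copies > 0 ORDER BY title"
).fetchall()
```

Each statement says what result is wanted and leaves it to the DBMS to decide how to find or change the data.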
13
Database Models
Object-oriented database – a database that uses “objects” (software written in small, reusable chunks) as elements within database files.
An object consists of:
Data in any form, and
Instructions on the actions to take on the data
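The two parts of an object, data plus instructions for acting on that data, can be sketched with a small Python class. The Account class and the dictionary used as a store are hypothetical; a real object-oriented DBMS would also handle persistence and querying.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Data held by the object.
    owner: str
    balance: int = 0

    # Instructions on the actions to take on that data.
    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


# A plain dictionary stands in for the database of stored objects.
object_store = {"acct-1": Account(owner="Aisha")}
object_store["acct-1"].deposit(50)
stored_balance = object_store["acct-1"].balance
```

The stored element is the whole object, so the rule about valid deposits travels with the data instead of living in a separate program.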
14
Survey of Database Systems
Databases for individuals:
Manage aspects of your life
Organize information for hobbies or school
Microsoft Access is the most popular
Common corporate DBMSs:
Oracle
DB2
Microsoft SQL Server
MySQL
15
Survey of Database Systems
Oracle:
Most popular enterprise-level DBMS
Very flexible storage system
Can be very complex
Platform independent
Offers a wide range of solutions
DB2:
Venerable IBM database
Only database using pure SQL
16
Survey of Database Systems
Microsoft SQL Server:
Fastest growing DBMS
Originally ran only on Microsoft platforms (Linux has been supported since SQL Server 2017)
Eight different versions exist
Extremely scalable architecture
Software can grow with the data
MySQL:
Leading DBMS for Linux
Very inexpensive
Features are those needed in business
Often faster than other DBMSs
Platform independent
An interesting article contrasting the strengths and weaknesses of these products can be found at
17
Data Mining
Data mining (DM) – the computer-assisted process of sifting through and analyzing vast amounts of data in order to extract meaning and discover new knowledge. (Sifting: the act of separating.)
Searches for trends and patterns
Makes predictions on events
Supplies ideas for improving business
Data mining begins with acquiring data and preparing it for what is known as the data warehouse, through the following steps:
Data sources
Data fusion and cleansing
Data and meta-data
Data warehouse
18
Data Mining: Data sources
Data may come from a number of sources:
Point-of-sale transactions in flat files on mainframes
Databases of all kinds
Other sources, e.g., news articles, online articles, etc.
Data from data warehouses
19
Data Mining Data fusion and cleansing
Data from diverse sources must be fused (joined together), then put through a process known as data cleansing, or scrubbing, because the raw data may be of poor quality, full of errors and inconsistencies. In short: put together the data from the various sources, then “scrub” it to eliminate errors and inconsistencies.
Fusion = an occurrence that involves the production of a union.
Scrubbing = the act of cleaning.
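A minimal sketch of fusion followed by scrubbing, in plain Python. The two sources, their rows, and the cleaning rules (trim and capitalize names, drop rows whose amount cannot be read as a number) are assumptions chosen for illustration; real cleansing rules depend on the data.

```python
# Two hypothetical sources with inconsistent formatting and one bad value.
store_sales = [
    {"customer": " Aisha ", "amount": "19.99"},
    {"customer": "bilal", "amount": "5"},
]
web_sales = [
    {"customer": "AISHA", "amount": "n/a"},   # unreadable amount
    {"customer": "Bilal", "amount": "7.50"},
]


def fuse(*sources):
    """Join rows from all sources, remembering where each row came from."""
    fused = []
    for source_name, rows in sources:
        for row in rows:
            fused.append({**row, "source": source_name})
    return fused


def scrub(rows):
    """Normalize names and drop rows whose amount is not a number."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # discard the inconsistent row
        clean.append(
            {
                "customer": row["customer"].strip().title(),
                "amount": amount,
                "source": row["source"],
            }
        )
    return clean


fused_rows = fuse(("store", store_sales), ("web", web_sales))
clean_rows = scrub(fused_rows)
customers = sorted({row["customer"] for row in clean_rows})
```

After scrubbing, " Aisha " and "AISHA" would be recognized as the same customer, and the row with the unreadable amount is gone.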
20
Data Mining Data and meta-data
Cleaned-up data and meta-data (data about data).
The cleansing process yields both the cleaned-up data and a variation of it called meta-data. Meta-data shows the origins of the data, the transformations it has undergone, and summary information about it, which makes it more useful than the cleansed but unintegrated, unsummarized data.
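As a rough illustration, meta-data can be pictured as a small record kept alongside the cleaned data. The sketch below assumes hypothetical cleaned sale amounts and invented source and transformation names; it records the three things the slide mentions: origins, transformations, and summary information.

```python
from statistics import mean

# Hypothetical cleaned-up data: sale amounts in whole currency units.
cleaned_amounts = [20, 5, 5]

# Meta-data: data about the data above (all names are illustrative).
metadata = {
    "origins": ["store point-of-sale file", "web orders"],
    "transformations": [
        "trimmed and capitalized customer names",
        "dropped rows with unreadable amounts",
    ],
    "summary": {
        "row_count": len(cleaned_amounts),
        "total": sum(cleaned_amounts),
        "mean": mean(cleaned_amounts),
    },
}
```

Someone receiving only `cleaned_amounts` could not tell where the numbers came from or what was removed; the meta-data answers those questions.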
21
Data Mining Data warehouse
A special database of cleaned-up data and meta-data. Both the data and the meta-data are sent to the data warehouse.
22
Data Mining
Some applications of data mining:
Marketing: Marketers use data mining tools to mine point-of-sale databases of retail stores, which contain facts for thousands of products in hundreds of geographic areas. By understanding customer preferences and buying patterns, marketers hope to target consumers’ individual needs.
Health: A coach in the U.S. Gymnastics Federation used a data mining system called IDIS to discover what long-term factors contributed to athletes’ performance, so as to know what problems to treat early on.
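To give a feel for the marketing application, the sketch below looks for a simple buying pattern in point-of-sale data: which pair of products is bought together most often. The baskets are invented, and counting pairs is only one very basic pattern search, not the method used by any particular data mining tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical point-of-sale transactions: the items in each purchase.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
```

With these baskets the most frequent pair is bread and butter, the kind of pattern a marketer might use to decide which products to promote together.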
23
End
Questions?