Information and Security Analytics
Lecture #1, Unit #1: Data Management: Overview
Dr. Bhavani Thuraisingham
May 27, 2010


Objective of the Unit
- This unit provides an overview of developments in data management. It also gives an overview of data management, information management, and knowledge management, and illustrates a framework.
- Reference: Data Management Systems: Evolution and Interoperation, Thuraisingham, CRC Press, 1997

Outline of the Unit
- What is Data Management?
- Developments in Data Management
- Current Status and Trends
- Note on Data Administration
- Data Management, Information Management, and Knowledge Management

What is Data Management?
- One proposal: Data Management = Database System Management + Data Administration
- Includes data analysis, data administration, database administration, auditing, data modeling, database system development, and database application development
- The tutorial will focus mainly on the database system aspects of data management

Developments in Database Systems
- Network and hierarchical database systems
- Relational database systems, transaction processing, distributed database systems
- Heterogeneous database integration, migrating legacy databases
- Next-generation database systems: object-oriented, deductive, data warehousing, data mining, multimedia database systems, Internet database

Current Status
[Figure: database systems, multimedia database systems, data warehousing systems, data mining systems, sensor database systems, and heterogeneous database systems. Limited integration between the different types of systems; often stovepiped by technology.]

Vision for Database Management
[Figure]

Some Outstanding Problems
[Figure: problem areas and the issues associated with them]
- Problem areas: heterogeneous database integration, multimedia database management, real-time database management, integration with other technologies, migrating legacy applications
- Issues named: semantic heterogeneity, inferencing, transaction processing, integrity, security, data model, index strategies, synchronization, data manipulation, quality of service, operating system services, active databases, distributed processing, mass storage, information management, knowledge management, modernization, enterprise modeling, schema transformation, integration

Some Current Trends in Data Management
- Heterogeneous database integration
  - Query, transactions, semantics, security and integrity
- Migrating legacy databases
  - Fine-grained encapsulation, distributed objects
- Multimedia databases
  - Query, model, quality of service, index
- Data warehousing
  - Building a warehouse, query
- Data mining
  - Multimedia databases, web data mining
- Data management for collaboration
  - Architecture, transactions
- Web databases and digital libraries
  - Query, transactions, index, security

Interoperability of Heterogeneous Database Systems
[Figure]

Note on Data Administration
- Identifying the data
  - Data may be in files, paper, databases, etc.
- Analyzing the data
  - Is the data of good quality?
  - Is the data complete?
- Data standardization
  - Should one standardize all the data elements and metadata?
  - Repositories for handling semantic heterogeneity?
- Data modeling
  - Structure the data, model the data and the processes

Data, Information and Knowledge Management
- Data Management
  - Data: stored in databases, files, or some other media
  - Data management includes modeling, storing, retrieving, and analyzing the data
- Information Management
  - Information is what is obtained by making sense of the data, e.g., data with context
  - Information management is about modeling, storing, retrieving, and analyzing the information
- Knowledge Management
  - Knowledge is what is obtained when the information is understood; it enables one to take actions
  - Knowledge management is about utilizing the knowledge to improve the business of an organization

Data, Information and Knowledge Management: Alternative View: MITRE Model 1999/2000
[Figure: layered model listing Communication, Network, Operating System, Middleware; Data Management; Information Management; Knowledge Management; Decision Support]

Information and Security Analytics
Lecture #1, Unit #2: Database Systems
Dr. Bhavani Thuraisingham
May 27, 2010

Objective of the Unit
- This unit will provide an overview of the concepts and developments in database systems
- Reference: Data Management Systems: Evolution and Interoperation, Thuraisingham, CRC Press, 1997

Outline of the Unit
- Concepts in database systems
- Types of database systems

Concepts in Database Systems
- Definition of a database system
- Early systems
- Metadata
- Architectural issues
  - Schema, functional
- DBMS design issues
- Other issues
  - Database design, administration

Database System
- Consists of a database, hardware, a Database Management System (DBMS), and users
- The database is the repository for persistent data
- Hardware consists of secondary storage volumes, processors, and main memory
- The DBMS handles all users' access to the database
- Users include application programmers, end users, and the Database Administrator (DBA)
- Need: reduce redundancy, avoid inconsistency, share data, enforce standards, apply security restrictions, maintain integrity, balance conflicting requirements
- We have used the definition of a database management system given in C. J. Date's book (Addison Wesley, 1990)

An Example Database System
[Figure, adapted from C. J. Date, Addison Wesley, 1990]

Early Systems: Hierarchical and Network Database Systems
[Figure: SUPPLIERS, SUPPLIES, and PARTS records arranged under the hierarchical data model and under the network data model]

Metadata
- Metadata describes the data in the database
  - Example: Database D consists of a relation EMP with attributes SS#, Name, and Salary
- The metadatabase stores the metadata
  - Could be physically stored with the database
- The metadatabase may also store constraints and administrative information
- Metadata is also referred to as the schema or data dictionary
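
The EMP example above can be made concrete with a small sketch. It uses Python's built-in sqlite3 module purely for illustration; the column types are assumptions, since the slide names only the attributes SS#, Name, and Salary. SQLite keeps its metadata in the same file as the data, which matches the remark that the metadatabase could be physically stored with the database.

```python
import sqlite3

# In-memory database standing in for "Database D" (illustrative only).
conn = sqlite3.connect(":memory:")

# Column types are assumptions; the slide lists only the attribute names.
conn.execute('CREATE TABLE EMP ("SS#" INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)')

# Query the metadata: PRAGMA table_info returns one row per column
# as (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(EMP)")]

# sqlite_master is SQLite's catalog table, i.e. its data dictionary.
catalog = conn.execute(
    "SELECT name, type FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(columns)
print(catalog)
```

Other systems expose the same idea through different catalogs, so the two metadata queries here are specific to SQLite.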

Three-level Schema Architecture: Details
[Figure: Users A1, A2, A3 work through External Schema A (External Model A); Users B1, B2 work through External Schema B (External Model B). External/Conceptual Mappings A and B connect the external models to the Conceptual Schema (Conceptual Model). A Conceptual/Internal Mapping connects the conceptual model to the Internal Schema (Internal Model) and the stored database.]

Functional Architecture
[Figure: User Interface Manager, Query Manager, Transaction Manager, Schema (Data Dictionary) Manager (metadata), Security/Integrity Manager, File Manager, Disk Manager; grouped under Data Management and Storage Management]

DBMS Design Issues
- Query processing
  - Optimization techniques
- Transaction management
  - Techniques for concurrency control and recovery
- Metadata management
  - Techniques for querying and updating the metadatabase
- Security/integrity maintenance
  - Techniques for processing integrity constraints and enforcing access control rules
- Storage management
  - Access methods and index strategies for efficient access to the database

Other Issues
- Database design
  - Generally a two-step process
    - Semantic data model to capture the entities of the application and the relationships between the entities
    - Generate the conceptual schema; theory of normal forms for relational databases
  - Research on object-oriented approaches for database design
- Database administration
  - Creating and deleting databases, backup and recovery, enforcing policies, auditing, etc.

Types of Database Systems
- Relational database systems
- Object database systems
- Deductive database systems
- Other
  - Real-time, secure, parallel, scientific, temporal, wireless, functional, entity-relationship, sensor/stream database systems, etc.

Relational Database: Informal Overview
- A relational database is a collection of tables, also called relations
- Each table has one or more columns, also called attributes
- Each table has zero or more rows, also called tuples
- Elements of a row take values from a pool of legal values
- The values of one or more columns in a row uniquely identify the row. These columns form an identifier (also called a key)
- One identifier is designated as the unique identifier (also called the primary key)
- Relational databases are queried using a language called SQL (Structured Query Language)

Relational Database: Example

Relation S:
S#   SNAME   STATUS   CITY
S1   Smith   20       London
S2   Jones   10       Paris
S3   Blake   30       Paris
S4   Clark   20       London
S5   Adams   30       Athens

Relation P:
P#   PNAME   COLOR   WEIGHT   CITY
P1   Nut     Red     12       London
P2   Bolt    Green   17       Paris
P3   Screw   Blue    17       Rome
P4   Screw   Red     14       London
P5   Cam     Blue    12       Paris
P6   Cog     Red     19       London

Relation SP:
S#   P#   QTY
S1   P1   300
S1   P2   200
S1   P3   400
S1   P4   200
S1   P5   100
S1   P6   100
S2   P1   300
S2   P2   400
S3   P2   200
S4   P2   200
S4   P4   300
S4   P5   400

SQL: Data Manipulation
- Select, Update, Delete, Insert
- Examples:

    SELECT S.S#, S.STATUS
    FROM S
    WHERE S.CITY = 'Paris'

    SELECT *
    FROM S

    SELECT S.*, P.*
    FROM S, P
    WHERE S.CITY = P.CITY

    UPDATE P
    SET COLOR = 'Yellow',
        WEIGHT = WEIGHT + 5,
        CITY = NULL
    WHERE P# = 'P2'
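
The statements above can be tried against the S and P relations from the example slide. The following is a minimal sketch using Python's built-in sqlite3 module; the choice of SQLite and the column types are assumptions made for illustration. In SQLite, column names containing `#` have to be written in double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Column types are assumptions; the slides give only names and sample values.
cur.execute('CREATE TABLE S ("S#" TEXT PRIMARY KEY, SNAME TEXT, STATUS INTEGER, CITY TEXT)')
cur.execute('CREATE TABLE P ("P#" TEXT PRIMARY KEY, PNAME TEXT, COLOR TEXT, WEIGHT INTEGER, CITY TEXT)')

cur.executemany("INSERT INTO S VALUES (?, ?, ?, ?)", [
    ("S1", "Smith", 20, "London"), ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"), ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
])
cur.executemany("INSERT INTO P VALUES (?, ?, ?, ?, ?)", [
    ("P1", "Nut", "Red", 12, "London"), ("P2", "Bolt", "Green", 17, "Paris"),
    ("P3", "Screw", "Blue", 17, "Rome"), ("P4", "Screw", "Red", 14, "London"),
    ("P5", "Cam", "Blue", 12, "Paris"), ("P6", "Cog", "Red", 19, "London"),
])

# Suppliers located in Paris.
paris_suppliers = cur.execute(
    'SELECT "S#", STATUS FROM S WHERE CITY = ? ORDER BY "S#"', ("Paris",)
).fetchall()

# Supplier/part pairs located in the same city (join on CITY).
colocated = cur.execute(
    'SELECT S."S#", P."P#" FROM S, P WHERE S.CITY = P.CITY ORDER BY S."S#", P."P#"'
).fetchall()

# Update part P2: new color, heavier, city unknown.
cur.execute(
    'UPDATE P SET COLOR = ?, WEIGHT = WEIGHT + 5, CITY = NULL WHERE "P#" = ?',
    ("Yellow", "P2"),
)
p2 = cur.execute('SELECT COLOR, WEIGHT, CITY FROM P WHERE "P#" = ?', ("P2",)).fetchone()

print(paris_suppliers)
print(len(colocated))
print(p2)
```

Passing values as `?` parameters rather than splicing them into the SQL text is a design choice of this sketch, not something the slides prescribe.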

Features of Object-Oriented Database Systems Suitable for Advanced Applications
- Objects (support for large and variable-sized data blocks)
- Class hierarchy (reusability)
- Instance variables, composite and complex objects (complex data structures)
- Methods and message passing (object encapsulation)
- Pointer swizzling (performance)
- Tighter integration with programming languages (application program support)
- Special mechanisms for long transactions and concurrency control, multimedia information management, schema management, version management, storage management

Concepts in Object Database Systems
- Objects: every entity is an object
  - Example: Book, Film, Employee, Car
- Class
  - Objects with common attributes are grouped into a class
- Attributes or instance variables
  - Properties of an object class inherited by the object instances
- Class hierarchy
  - Parent-child class hierarchy
- Composite objects
  - Book object with paragraphs, sections, etc.
- Methods
  - Functions associated with a class

Example Class Hierarchy
[Figure: Document class with attributes ID, Name, Author, Publisher, methods Method1: Print-doc-att(ID) and Method2: Print-doc(ID), and instances D1, D2. Book subclass with attribute # of Chapters and instance B1. Journal subclass with attribute Volume # and instance J1.]

Example Composite Object
[Figure: a Composite Document Object made up of Section 1 Object and Section 2 Object, with Paragraph 1 Object and Paragraph 2 Object beneath them]
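
The class hierarchy and composite object described above can be sketched in an object-oriented programming language. The Python below is only an illustration of the concepts, not an object database: the attribute names follow the figure, while the method body, the sample values, and the placement of both paragraphs under Section 1 are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    """Parent class: attributes shared by all documents."""
    doc_id: str
    name: str
    author: str
    publisher: str

    def print_doc_att(self) -> str:
        # Stand-in for the figure's Print-doc-att(ID) method.
        return f"{self.doc_id}: {self.name} by {self.author} ({self.publisher})"


@dataclass
class Book(Document):
    """Subclass: inherits Document attributes and adds its own."""
    num_chapters: int = 0


@dataclass
class Journal(Document):
    volume: int = 0


@dataclass
class Paragraph:
    text: str


@dataclass
class Section:
    title: str
    paragraphs: List[Paragraph] = field(default_factory=list)


@dataclass
class CompositeDocument:
    """Composite object: built from section objects, which hold paragraph objects."""
    sections: List[Section] = field(default_factory=list)

    def paragraph_count(self) -> int:
        return sum(len(section.paragraphs) for section in self.sections)


# Hypothetical sample data.
b1 = Book("B1", "Example Book", "A. Author", "Example Press", num_chapters=12)
doc = CompositeDocument([
    Section("Section 1", [Paragraph("Paragraph 1"), Paragraph("Paragraph 2")]),
    Section("Section 2"),
])

print(b1.print_doc_att())
print(doc.paragraph_count())
```

Here `Book` reuses the attributes and method of `Document` through inheritance, and `CompositeDocument` shows an object whose value is itself a structure of other objects.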

Deductive Database Systems
- Database systems augmented with inference engines to deduce new data from existing data and rules
- Example
  - Rule: a parent of a parent is a grandparent
  - Data: John is Jane's parent; Jane is Robert's parent
  - From the above, infer that John is Robert's grandparent
- Loose and tight coupling architectures between the database system and the inference engine
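
The grandparent example above can be sketched in a few lines. This is a toy illustration of applying one rule to stored facts, not a description of how any particular deductive database or inference engine is built; the function name and the tuple representation of facts are assumptions.

```python
# Stored facts: (parent, child) pairs taken from the slide.
parent = {("John", "Jane"), ("Jane", "Robert")}


def grandparents(parent_facts):
    """Apply the rule: if X is a parent of Y and Y is a parent of Z,
    then X is a grandparent of Z."""
    return {
        (x, z)
        for (x, y1) in parent_facts
        for (y2, z) in parent_facts
        if y1 == y2
    }


derived = grandparents(parent)
print(derived)
```

The derived fact is not stored anywhere; it is computed from the data and the rule on demand, which is the essence of the deductive approach.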

Current Status
- Database systems are a mature technology; there are numerous products and prototypes
- Much work followed in distributed and heterogeneous databases
- Current directions include web database management as well as data management support for novel applications including e-commerce, bioinformatics, and geoinformatics
- Work still continues on developing new kinds of database systems, including stream/sensor database systems

Information and Security Analytics
Lecture #1, Unit #3: Distributed and Heterogeneous Database Systems
Dr. Bhavani Thuraisingham
May 27, 2010

Objective of the Unit
- This unit provides an overview of concepts in distributed and heterogeneous databases. In particular, definitions and functions are discussed.
- References:
  - Data Management Systems: Evolution and Interoperation, Thuraisingham, CRC Press, 1997
  - Heterogeneous Information Exchange and Organizational Hubs, Kluwer, 2002, Editors: Bestougeff, Dubois and Thuraisingham

Outline of the Unit
- Distributed database systems
  - Architecture, data distribution, functions
- Heterogeneous database integration
- Federated database management
- Client-server database management
- Migrating legacy databases
- Current status and directions

A Definition of a Distributed Database System
- A collection of database systems connected via a network
- The software that is responsible for interconnection is a Distributed Database Management System (DDBMS)
- Each DBMS executes local applications and should be involved in at least one global application (Ceri and Pelagatti)
- Homogeneous environment

Architecture
[Figure: three sites connected by a communication network. Site 1 has Distributed Processor 1, DBMS 1, and Database 1; Site 2 has Distributed Processor 2, DBMS 2, and Database 2; Site 3 has Distributed Processor 3, DBMS 3, and Database 3.]

Distributed Processor
[Figure: Distributed Query/Update Processor, Distributed Transaction Manager, Distributed Metadata Management, Integrity/Security Manager, Network Interface, Local DBMS Interface]

Data Distribution
[Figure: employee and department relations distributed over two sites. Site 1 holds EMP1 (SS#, Name, Salary, D#) and DEPT1 (D#, Dname, MGR); Site 2 holds EMP2 and DEPT2 with the same attributes. Sample employees include John, Paul, James, Jill, Mary, Jane, and Mathew; sample departments include Math, Physics, C. Sci., English, and French.]

Distributed Database Functions
- Distributed query processing
  - Optimization techniques across the databases
- Distributed transaction management
  - Techniques for distributed concurrency control and recovery
- Distributed metadata management
  - Techniques for managing the distributed metadata
- Distributed security/integrity maintenance
  - Techniques for processing integrity constraints and enforcing access control rules across the databases

Query Processing Example (Concluded)
[Figure: DBMS 1, DBMS 2, and DBMS 3 connected by a network, each with a DQP (Distributed Query Processor). Fragments shown with their sizes: EMP1 (20), EMP2 (30), DEPT2 (20), EMP3 (50), DEPT3 (30).]
- Query at site 1: join EMP and DEPT on D#
- Move EMP2 to site 3; merge EMP1, EMP2, and EMP3 to form EMP
- Move DEPT2 to site 3; merge DEPT2 and DEPT3 to form DEPT
- Join EMP and DEPT; move the result to site 1
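
The merge-then-join strategy above can be sketched with plain Python lists standing in for the fragments. All rows here are hypothetical, and the network transfers are reduced to list concatenation; a real distributed query processor would also weigh fragment sizes and transfer costs when choosing where to perform the join.

```python
# Hypothetical fragments. EMP rows are (SS#, Name, D#); DEPT rows are (D#, Dname).
emp1 = [(1, "John", 10), (2, "Paul", 20)]
emp2 = [(9, "Mathew", 50)]
emp3 = [(12, "Ann", 20)]
dept2 = [(50, "Math")]
dept3 = [(10, "C. Sci."), (20, "English")]


def merge(*fragments):
    """Rebuild a relation from its horizontal fragments (a union of the rows)."""
    rows = []
    for fragment in fragments:
        rows.extend(fragment)
    return rows


def join_on_dept(emp, dept):
    """Nested-loop join of EMP and DEPT on the department number D#."""
    return [
        (ss, name, dname)
        for (ss, name, emp_dno) in emp
        for (dept_dno, dname) in dept
        if emp_dno == dept_dno
    ]


# Steps performed at site 3 after the fragments have been moved there.
emp = merge(emp1, emp2, emp3)
dept = merge(dept2, dept3)
result = sorted(join_on_dept(emp, dept))  # this result is then sent to site 1

print(result)
```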

Transaction Processing Example
[Figure: Site 1 is the coordinator for transaction Tj. Sites 2, 3, and 4 are participants running subtransactions Tj2, Tj3, and Tj4.]
- Issues: concurrency control, recovery, data replication
- Two-phase commit:
  - The coordinator asks the participants whether they are ready to commit
  - If all participants agree, the coordinator sends a request for the participants to commit
- The DTM (Distributed Transaction Manager) is responsible for executing the distributed transaction
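
The two phases described above can be sketched as follows. This is a simplified, single-process illustration: the class and function names are assumptions, and it leaves out the logging, timeouts, and failure recovery that a real distributed transaction manager needs.

```python
class Participant:
    """A site running one subtransaction of the distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # whether this site will vote to commit
        self.state = "active"

    def prepare(self):
        # Phase 1: answer the coordinator's "ready to commit?" question.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    if all(votes):
        for p in participants:                   # phase 2: global commit
            p.commit()
        return "committed"
    for p in participants:                       # phase 2: global abort
        p.abort()
    return "aborted"


sites_ok = [Participant("Site 2"), Participant("Site 3"), Participant("Site 4")]
outcome_ok = two_phase_commit(sites_ok)

sites_bad = [Participant("Site 2"), Participant("Site 3", can_commit=False), Participant("Site 4")]
outcome_bad = two_phase_commit(sites_bad)

print(outcome_ok, outcome_bad)
```

A single "no" vote forces every site to abort, which is what keeps the subtransactions atomic as a group.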

Interoperability of Heterogeneous Database Systems
[Figure: Database System A (Relational), Database System B (Object-Oriented), and Database System C (Legacy) connected by a network]
- Transparent access to heterogeneous databases for both users and application programs
- Query and transaction processing

Technical Issues on the Interoperability of Heterogeneous Database Systems
- Heterogeneity with respect to data models, schema, query processing, query languages, transaction management, semantics, integrity, and security policies
- Interoperability based on client-server architectures
- Federated database management
  - Collection of cooperating, autonomous, and possibly heterogeneous component database systems, each belonging to one or more federations

Different Data Models
[Figure: Nodes A, B, C, and D connected by a network; the databases at the nodes use the relational, network, object-oriented, and hierarchical models]
- Developments: tools for interoperability; commercial products
- Challenges: global data model

Schema Integration and Transformation: An Approach
[Figure: the schemas describing the relational, network, hierarchical, and object-oriented databases are each transformed into a generic schema. The generic schemas are integrated into a global schema, over which External Schemas I, II, and III are defined.]
- Challenges: selecting an appropriate generic representation; maintaining consistency during transformations; schema evolution

Semantic Heterogeneity
- Semantic heterogeneity occurs when there is a disagreement about the meaning or interpretation of the same data
[Figure: Object O is stored in the databases at Node A and Node B. One node interprets Object O as a passenger ship; the other interprets it as a submarine.]
- Challenges: standard definitions; repositories

Federated Database Management
[Figure: Database Systems A, B, and C grouped into Federation F1 and Federation F2]
- Cooperating database systems, yet maintaining some degree of autonomy

Autonomy
[Figure: Components A, B, and C, showing a local request, a request from another component, and communication through the federation]
- Component A does not communicate with component C
- Component A honors the local request first
- Challenges: adapt techniques to handle autonomy, e.g., transaction processing, schema integration; transition research to products

Schema Integration and Transformation in a Federated Environment
[Figure, adapted from Sheth and Larson, ACM Computing Surveys, September 1990: Local Schemas 1 and 2; Component Schemas for Components A, B, and C; Generic Schemas for Components A, B, and C; Export Schema for Component A, Export Schemas I and II for Component B, Export Schema for Component C; Federated Schemas for FDS-1 and FDS-2; External Schemas 1.1, 1.2, 2.1, and 2.2]

Security Policy Integration
[Figure]

Federated Data and Policy Management
[Figure: Component Data/Policy for Agencies A, B, and C; each agency provides Export Data/Policy, which feeds the Data/Policy for the Federation]

Client-Server Architecture: Example
[Figure: Client from Vendor A and Client from Vendor B connected over a network to Server from Vendor C and Server from Vendor D, each server with a database]

Remote Database Access (RDA) Model
[Figure: RDA Client, RDA Client Interface, RDA Service Provider, RDA Server Interface, RDA Server, Database]
- The interface between the client and the service provider can operate in synchronous or asynchronous modes

Example Three-Tier Architecture
[Figure: Client (user interface processing); Intermediate (distributed processor, business rules, logic); Server (local DBMS); connected by a network]

Object-based Interoperability
[Figure: Client Object and Server Object communicating through an Object Request Broker]
- Example Object Request Broker: Object Management Group's (OMG) CORBA (Common Object Request Broker Architecture)

Javasoft's RMI (Remote Method Invocation)
[Figure: Clients, RMI, Business Objects, Java-based Servers]

Microsoft's Open Database Connectivity
[Figure: Microsoft Applications C and D access Microsoft's ODBC, which uses the ODBC Driver for DBMS A (Vendor A, Database A) and the ODBC Driver for DBMS B (Vendor B, Database B)]

Overview: Migrating Legacy Systems
- Many of the current systems and applications may become obsolete
- Need an approach to migrate these systems to new architectures
- Evolutionary approach: incremental transition of today's systems into more flexible systems
- Extensible system architecture ultimately replaces today's hardware and software architecture
- Open systems approach, standards

Migrating Legacy Database and Applications
- Build a business model in a sub-domain and relate data to existing databases and systems
- Wrap existing systems to provide access as needed
- Incorporate middle-tier services and begin migrating workflow
- Gradually migrate business logic and rely on business objects for end-user systems

Migrating Business Logic
[Figure: client tier (data entry, visualization, word processing); middle tier (container, business objects, business logic, distribution services, CORBA); server tier (existing databases, existing systems, existing processes, EDI, artifacts)]

Data Sharing
[Figure: client tier (forms, map display, 4D display); middle tier (data model, data access, mediation); server tier (databases, existing systems)]

Application vs. Database Migration
- Extract the schema from the legacy code
  - Use reengineering tools
- Extract metadata associated with the data
- Deal with incomplete data and fill in the gaps
- Build schemas in the target system from the extracted schema
- Build the database

Example: Legacy Migration using Objects
CTAPS - Contingency Theater Automated Planning System
[Figure: layered object architecture]
- Application Interfaces: Targeting, Planning/ATO, Collection Mgt, ...
- Domain Interfaces: MCG&I, Messaging, Weather, ...
- Common Facilities: User Interface, Compound Data, System & Task Mgt, ...
- Object Services: Security, Concurrency, Transactions, ...
- Object Request Broker

Example Lessons Learned: Experience with CORBA
- CORBA provides an evolvable system integration platform
- CORBA provides a path for legacy migration
  - Applications can be coarsely wrapped as CORBA objects, providing 100% reuse
    - Wrapping is a relatively straightforward technique
    - Need to dig to uncover hidden dependencies
    - Does not address duplication of common functions
  - Applications can be reengineered to replace duplicated functions with CORBA-based common services
    - Substantially more difficult than coarse wrapping

Example: Migration using Objects for Real-time Systems
[Figure: Hardware (Display Processor & Refresh Channels, Consoles (14), Navigation, Sensors, Data Links); Real Time Operating System; Infrastructure Services; Data Mgmt. and Data Xchg.; MSI App and Future Apps; Multi-Sensor Tracks and Sensor Detections; Data Analysis Programming Group (DAPG); Technology provided by Project]
- The interface to DAPG, etc., will be simulated for the project demonstration

Current Status and Directions
- Developments
  - Several prototypes and some commercial products
  - Tools for schema integration and transformation
  - Standards for interoperable database systems
- Challenges being addressed
  - Semantic heterogeneity
  - Autonomy and federation
  - Global transaction management
  - Integrity and security
- New challenges
  - Scale
  - Web data management

Information and Security Analytics
Lecture #1, Unit #4: Data Warehousing
Dr. Bhavani Thuraisingham
The University of Texas at Dallas
May 28, 2010

Outline
- Data Warehousing
- Data Warehouse to Data Mining

What is a Data Warehouse?
- A data warehouse is a:
  - Subject-oriented
  - Integrated
  - Nonvolatile
  - Time-variant
  - Collection of data in support of management's decisions
  - From: Building the Data Warehouse by W. H. Inmon, John Wiley and Sons
- Integration of heterogeneous data sources into a repository
- Summary reports, aggregate functions, etc.

Example Data Warehouse
[Figure: an Oracle DBMS for Employees, a Sybase DBMS for Projects, and an Informix DBMS for Medical feed a data warehouse holding data correlating employees with medical benefits and projects]
- The warehouse could be any DBMS; it is usually based on the relational data model
- Users query the warehouse

Some Data Warehousing Technologies
- Heterogeneous database integration
- Statistical databases
- Data modeling
- Metadata
- Access methods and indexing
- Language interface
- Database administration
- Parallel database management

Data Warehouse Design
- An appropriate data model is key to designing the warehouse
- Higher-level model in stages
  - Stage 1: Corporate data model
  - Stage 2: Enterprise data model
  - Stage 3: Warehouse data model
- Middle-level data model
  - Possibly one model for each subject area in the higher-level model
- Physical data model
  - Includes features such as keys in the middle-level model
- Need to determine appropriate levels of granularity of data in order to build a good data warehouse

Distributing the Data Warehouse
- Issues similar to distributed database systems
[Figure: in the non-distributed warehouse, the Central Bank, Branch A, and Branch B all use a single Central Warehouse. In the distributed warehouse, there is a Central Warehouse plus a Branch A Warehouse and a Branch B Warehouse.]

Multidimensional Data Model
[Figure]

Indexing for Data Warehousing
- Bit-maps
- Multi-level indexing
- Storing parts or all of the index files in main memory
- Dynamic indexing
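
The first item above, bitmap indexing, can be illustrated with a minimal sketch. It assumes one bitmap per distinct column value, stored as a Python integer whose bit i is set when row i holds that value; the helper names are hypothetical, and the sample columns reuse the CITY and STATUS values of relation S from the earlier example. Production warehouses typically add compression and other optimizations that are not shown here.

```python
def build_bitmap_index(values):
    """Map each distinct value to an integer bitmap; bit i is set if row i has that value."""
    index = {}
    for pos, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << pos)
    return index


def matching_rows(bitmap):
    """Return the row positions whose bit is set in the bitmap."""
    return [pos for pos in range(bitmap.bit_length()) if (bitmap >> pos) & 1]


# Columns of relation S (rows S1..S5 are positions 0..4).
city = ["London", "Paris", "Paris", "London", "Athens"]
status = [20, 10, 30, 20, 30]

city_index = build_bitmap_index(city)
status_index = build_bitmap_index(status)

# CITY = 'London' AND STATUS = 20 becomes a bitwise AND of two bitmaps.
london_and_20 = matching_rows(city_index["London"] & status_index[20])

# CITY = 'Paris' OR CITY = 'Athens' becomes a bitwise OR.
paris_or_athens = matching_rows(city_index["Paris"] | city_index["Athens"])

print(london_and_20, paris_or_athens)
```

Because predicates combine through cheap bitwise operations, bitmap indexes suit the low-cardinality columns and read-mostly workloads common in warehouses.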

Metadata Mappings
[Figure]

Data Mining
- Also known as: knowledge mining, knowledge discovery in databases, data archaeology, data dredging, database mining, knowledge extraction, data pattern processing, information harvesting, siftware
- The process of discovering meaningful new correlations, patterns, and trends, often previously unknown, by sifting through large amounts of data using pattern recognition technologies and statistical and mathematical techniques (Thuraisingham 1998)