Data Warehousing at Acxiom
Paul Montrose

Agenda
- Acxiom overview
- Data warehouses
- Transactional databases
- Hybrid databases
- What’s new/future innovations
- Summary
- Questions and answers

Acxiom Overview
At Acxiom, we create and deliver Customer and Information Management Solutions that enable many of the largest, most respected companies in the world to build great relationships with their customers. Acxiom achieves this by blending data, technology and services to provide the most advanced customer information infrastructure available in the marketplace today.

Acxiom Overview
Acxiom customizes industry-specific solutions to solve the unique business issues of the Automotive, Financial Services, Government Services, Healthcare, Insurance, Media, Retail, Technology, Telecommunications, and Travel and Leisure industries.
Every solution that Acxiom offers is built from our core competencies:
- CDI/Technology
- Data
- Database
- Consulting and Analytics
- Privacy Leadership
- IT Outsourcing

Acxiom Overview
Customer and Information Management Solutions for marketing, risk and IT help companies:
- Improve acquisition, retention, cross sell, up sell and channel management
- Improve authorization, increase collections and reduce fraud
- Increase operational efficiencies and improve end-user satisfaction

Data Warehouses
The characteristics of an Acxiom data warehouse generally are:
- Large multi-terabyte databases
- Large periodic sequential data loads
- Denormalized database schema
- Sequential reads/full table scans
- Few or no indices
- Little or no transaction logging
- Robust periodic backup solutions
- Performance measured in megabytes or gigabytes per second (MBPS, GBPS)
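
Because a warehouse workload is dominated by full table scans, throughput translates directly into elapsed time. The arithmetic can be sketched as below; the table size and the two throughput figures are hypothetical values chosen for illustration, not Acxiom measurements.

```python
def full_scan_seconds(table_bytes: float, throughput_bytes_per_s: float) -> float:
    """Time to read a table once, end to end, at a sustained sequential rate."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return table_bytes / throughput_bytes_per_s


TB = 10**12  # decimal terabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes

# Hypothetical figures: a 5 TB table scanned at 1 GB/s versus 4 GB/s.
slow = full_scan_seconds(5 * TB, 1 * GB)
fast = full_scan_seconds(5 * TB, 4 * GB)
print(f"{slow / 60:.1f} min at 1 GB/s, {fast / 60:.1f} min at 4 GB/s")
```

Under these assumptions, quadrupling the sustained bandwidth cuts the scan from well over an hour to about twenty minutes, which is why MBPS/GBPS is the headline metric for this class of database.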

Data Warehouses
- The processing platform is generally a large global-class server or a cluster of servers running UNIX.
- The storage subsystem is very fast, with wide bandwidth and high levels of redundancy, which makes it possible to move large amounts of sequential data in a very short time.
- The database is a large vertical database: denormalized, with few tables, but those tables are very long (sometimes several billion rows) and hold sorted data.
- The data is striped across the storage in a manner that prevents physical hot spots and takes advantage of the wide bandwidth.
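
The effect of striping on hot spots can be illustrated with a minimal round-robin sketch. The mapping below (block number modulo disk count) is one common striping scheme assumed here for illustration; the slides do not say which scheme or stripe size Acxiom uses, and the function names are hypothetical.

```python
from collections import Counter


def stripe_location(block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    return block % num_disks, block // num_disks


def disk_load(first_block: int, count: int, num_disks: int) -> Counter:
    """Count how many blocks of one sequential read land on each disk."""
    return Counter(
        stripe_location(b, num_disks)[0]
        for b in range(first_block, first_block + count)
    )


# A sequential read of 64 blocks over 8 disks touches every disk equally,
# so no single disk becomes a hot spot and all disks contribute bandwidth.
load = disk_load(first_block=1000, count=64, num_disks=8)
print(sorted(load.items()))
```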

[Figure: Data Warehouses (IBM)]

Transactional Databases
The characteristics of an Acxiom transactional database generally are:
- Small, usually no larger than a few terabytes
- Random and simultaneous inserts, updates, deletes, and queries
- Random reads and writes
- Normalized database schema
- Transaction logging and archiving with incremental and periodic backup solutions
- Generally sub-second response required per transaction, taking concurrency into account
- Performance measured in transactions per second (TPS) and I/O latency
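
TPS, response time and concurrency are tied together by Little's law (in-flight transactions = throughput × time per transaction). A small sketch of that relationship follows; the session counts and the 0.25 s response time are hypothetical numbers, and the model assumes every session is always busy with exactly one transaction.

```python
def sustained_tps(concurrent_sessions: int, seconds_per_txn: float) -> float:
    """Throughput reached when every session is always running one transaction."""
    if seconds_per_txn <= 0:
        raise ValueError("seconds_per_txn must be positive")
    return concurrent_sessions / seconds_per_txn


def sessions_needed(target_tps: float, seconds_per_txn: float) -> float:
    """Little's law: transactions in flight = throughput x time per transaction."""
    return target_tps * seconds_per_txn


# Hypothetical workload: 100 busy sessions at 0.25 s per transaction.
print(sustained_tps(100, 0.25))
# Sessions that must be in flight to sustain 1000 TPS at the same response time.
print(sessions_needed(1000, 0.25))
```

The practical reading is that a latency increase at the storage layer either lowers TPS or forces more concurrent work onto the server, which is why both figures are tracked for this class of database.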

Transactional Databases
- The processing platform is generally a medium/enterprise-class server.
- The storage subsystem is very fast, with low latency, nominal bandwidth and high levels of redundancy, which makes it possible to move small amounts of selected data quickly.
- The database is a normalized database that uses lookup tables.
- The data is stored randomly within a table but striped across the storage to prevent physical hot spots.

[Figure: Transactional Databases (IBM)]

Hybrid Databases
The characteristics of an Acxiom hybrid database generally are:
- Medium sized, usually three to ten terabytes
- Random and simultaneous inserts, updates, deletes, and queries
- Random and sequential reads and writes
- Loosely normalized database schema
- Indices used sparingly
- Usually a batch maintenance process
- Transaction logging and archiving with incremental and periodic backup solutions
- Generally sub-second response required per transaction, taking concurrency into account
- Performance measured in TPS, I/O latency, and MBPS

Hybrid Databases
- The processing platform is generally a medium-sized global-class server.
- The storage subsystem is very fast, with wide bandwidth and high levels of redundancy, which makes it possible to move large amounts of random and sequential data in a very short time.
- The database is a large vertical database: loosely normalized, with few tables, but those tables are very long (sometimes more than a billion rows) and hold sorted data.
- The data is striped across the storage in a manner that prevents physical hot spots and takes advantage of the wide bandwidth.

[Figure: Hybrid Databases (IBM)]

What’s New/Future Innovations
Grid or scale-out environments:
- Use low-cost, commodity-based servers
- Use low-cost or no-cost operating systems
- Many servers can work on one problem, with more aggregate processing power than one large server, for less money
- Not locked into a single vendor or supplier
- When adding a new node, able to use current technology at a lower price
- Need to understand and factor in peripheral costs such as network, administration and data center

[Figure: two panels, "Parallel Grid" and "Clustered Grid", each showing DB and OS layers running on IBM pSeries servers]

Distributed Grid Database
- Shared-nothing environment: each partition has its own resources, allowing near-unlimited scalability (up to 999 partitions).
- Centralized management of the partitioned environment.
- Data is distributed equally across all partitions.
- Any partition can receive connections and distribute queries among the other nodes.
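
The shared-nothing behaviour described above can be sketched in a few lines. This is a toy model, not the product on the slide: the `Grid` and `Partition` classes are hypothetical, and hashing the key with CRC32 is an assumed distribution scheme, since the slides do not say how rows are assigned to partitions.

```python
import zlib


class Partition:
    """One shared-nothing node: it owns its rows and scans only those rows."""

    def __init__(self, pid, grid):
        self.pid = pid
        self.grid = grid
        self.rows = []  # private storage, never read directly by other partitions

    def local_count(self, predicate):
        """Count matching rows held by this partition only."""
        return sum(1 for row in self.rows if predicate(row))

    def count(self, predicate):
        """Act as coordinator: fan the query out to all partitions, sum the partials."""
        return sum(p.local_count(predicate) for p in self.grid.partitions)


class Grid:
    def __init__(self, num_partitions):
        self.partitions = [Partition(i, self) for i in range(num_partitions)]

    def owner(self, key):
        """Pick the owning partition from a hash of the distribution key."""
        return self.partitions[zlib.crc32(key.encode()) % len(self.partitions)]

    def insert(self, key, row):
        self.owner(key).rows.append(row)


grid = Grid(num_partitions=4)
for i in range(10_000):
    grid.insert(f"cust-{i}", {"id": i})

# Row counts per partition: hashing spreads the rows roughly evenly.
print([len(p.rows) for p in grid.partitions])
# Any partition can coordinate the query and returns the same answer.
print(grid.partitions[0].count(lambda r: r["id"] % 2 == 0))
print(grid.partitions[3].count(lambda r: r["id"] % 2 == 0))
```

Because each partition scans only its own rows, adding partitions shrinks the per-node work, which is the property the slide relies on for scalability.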

Summary
- Understand the process in which the database is to be used, and fashion a solution that meets the requirements and customer expectations.
- Even though a DBA may be responsible only for the database, many factors, such as operating system and hardware configuration, affect how the database functions and are therefore a concern to the DBA.
- A DBA must relate the database to its environment to achieve an optimized solution.
- A large multi-terabyte database is not a scary monster. It is the same as dealing with a smaller database; just add a few more zeros.

Questions?