1
Informix Formation
Chetana Mehta (chetana@pspl.co.in)
PSPL, Pune
2
Outline
• Overview of Formation
• PSPL’s role
• Future work
3
Data Warehouse Architecture
[Architecture diagram. Components: Data Warehouse Engine, Optimized Loader, Extraction, Cleansing, Analyze, Query, Metadata Repository. Data sources: Relational Databases, Legacy Data, Purchased Data, ERP Systems.]
4
What is ETL?
• Extract data from existing operational and legacy data sources, transform it, and load it into the warehouse.
• Issues:
  ◦ Sources of data for the warehouse
  ◦ Data quality at the sources
  ◦ Merging different data sources
  ◦ Data transformation
  ◦ How to propagate updates (on the sources) to the warehouse
  ◦ Terabytes of data to be loaded
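The extract, transform, and load steps can be illustrated with a minimal Python sketch. It is not how Formation works internally; the CSV sample, the `sales` table, and the rule that rows with a missing amount are dropped are all assumptions made up for this illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an operational or legacy extract.
SOURCE_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,
3,Carol,7.25
"""

def extract(text):
    """Read rows from a CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Cleanse rows: trim names and drop rows whose amount is missing."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # assumed data-quality rule, for illustration only
        cleaned.append((int(row["id"]), row["name"].strip(), float(amount)))
    return cleaned

def load(rows, conn):
    """Bulk-insert the cleansed rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

def run_etl():
    """Run the three steps against an in-memory SQLite 'warehouse'."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(SOURCE_CSV)), conn)
    return conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
```

Calling `run_etl()` returns the two rows that survive cleansing. At real warehouse volumes the load step would use a bulk loader rather than row inserts.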
5
Overview of Formation
• ETL Tool
• User-friendly
• Scalable
6
Operators
• Join: Hash, Non-equi, Nested loop, Sort-merge
• Aggregate/GroupBy
• Sort
• Deduplicate
• Surrogate Key
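To make three of these operators concrete, here is a small in-memory Python sketch of a hash join, a deduplicate step, and surrogate key assignment. The function names and the row-as-dictionary representation are hypothetical and do not reflect Formation's actual operator interfaces.

```python
from collections import defaultdict

def hash_join(left, right, key):
    """Equi-join: build a hash table on `right`, then probe it with `left`."""
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)
    # Each left row is merged with every right row sharing its key value.
    return [{**l, **r} for l in left for r in table.get(l[key], [])]

def deduplicate(rows, key):
    """Keep only the first row seen for each key value."""
    seen = set()
    out = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out

def add_surrogate_keys(rows, natural_key, start=1):
    """Assign one dense integer surrogate key ('sk') per distinct natural key."""
    mapping = {}
    out = []
    for row in rows:
        sk = mapping.setdefault(row[natural_key], start + len(mapping))
        out.append({**row, "sk": sk})
    return out
```

For example, joining orders `[{"cust": "A", "amt": 5}, {"cust": "C", "amt": 1}]` with customers `[{"cust": "A", "city": "Pune"}]` on `"cust"` keeps only the order for customer A, extended with its city. The hash join holds the whole build side in memory, which is one reason an ETL tool also offers sort-merge and nested-loop variants.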
7
Performance Subsystem
• Periodic statistics
• Summary statistics
  ◦ Operator summary
  ◦ Group summary
  ◦ Performance hints
8
Periodic Statistics
• No. of records pushed/pulled
• Memory used
• Disk reads/writes
• Temporary space used
9
Summary Statistics
• No. of records pulled/pushed
• Record size
• Time when first/last record sent/received
• No. of unique keys/groups
• Ratio of output size to input size
• Selectivity
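A few of these statistics can be collected by wrapping an operator with counters, as in the Python sketch below. The `OperatorStats` class and `filter_with_stats` function are hypothetical names. Defining selectivity as records pushed divided by records pulled is also an assumption, since the slides do not give a formula.

```python
import time

class OperatorStats:
    """Hypothetical collector for per-operator summary statistics."""

    def __init__(self):
        self.pulled = 0             # records received from upstream
        self.pushed = 0             # records sent downstream
        self.first_received = None  # time the first record was received
        self.last_sent = None       # time the last record was sent
        self.keys = set()           # distinct key values seen on input

    def on_pull(self, key):
        if self.first_received is None:
            self.first_received = time.monotonic()
        self.pulled += 1
        self.keys.add(key)

    def on_push(self):
        self.pushed += 1
        self.last_sent = time.monotonic()

    def summary(self):
        return {
            "records_pulled": self.pulled,
            "records_pushed": self.pushed,
            "unique_keys": len(self.keys),
            # Assumed definition: fraction of input records that pass through.
            "selectivity": self.pushed / self.pulled if self.pulled else 0.0,
        }

def filter_with_stats(rows, key, predicate):
    """Run a simple filter operator while recording its statistics."""
    stats = OperatorStats()
    out = []
    for row in rows:
        stats.on_pull(row[key])
        if predicate(row):
            out.append(row)
            stats.on_push()
    return out, stats
```

A filter that passes two of four input rows reports a selectivity of 0.5 under this definition. Byte-level measures such as record size and the output-to-input size ratio would need the record's serialized form, which this sketch does not model.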
10
Performance Hints
• Ideal memory size
• Suggested memory size
• Parallelizing
11
Future work
• Memory cognizant optimization
• Parametric query optimization
• Operator ordering
• XML extensions