“Add Derived Data to Your DBMS [performance tuning] Strategy”
Group 3
Andrew Hall
Zihong Huang

Relationship to our course:
●Performance tuning is the focus for weeks 2-6
●We learned many tricks in chapters 17, 18, 19
●Derived data is another trick commonly used in Data Warehouses!

Motivation
●So why would derived data be needed in a DBMS?
o Performance (think of materialized views)
o Quick responses

Types of Derived Data
●Aggregates
●Text analytics
●Calculated scores
●ETL (extract, transform and load)
●Adjusted data
We’ll talk about just one of these: aggregates.

Aggregation: Materialized Views

Base table:
    CREATE TABLE country (
        name char(50),
        year char(4),
        population decimal(11),
        primary key (name, year)
    );

Traditional selection with aggregates:
    SELECT name, AVG(population)
    FROM country
    GROUP BY name;

Aggregation via views:
    CREATE VIEW Pop_View AS
    SELECT name, AVG(population) AS average_population
    FROM country
    GROUP BY name;

    SELECT * FROM Pop_View;

Better performance if the view is materialized! A plain view only stores the query, so the aggregate is recomputed on every SELECT; a materialized view stores the pre-computed result.
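
To make the materialized case concrete, here is a minimal sketch in PostgreSQL syntax, assuming the country table defined on this slide. The name Pop_MView is ours, not from the source, and other products spell this differently (see the vendor slide).

```sql
-- Sketch only: PostgreSQL syntax; Pop_MView is a hypothetical name.
-- The query runs once here and its result rows are stored on disk.
CREATE MATERIALIZED VIEW Pop_MView AS
SELECT name, AVG(population) AS average_population
FROM country
GROUP BY name;

-- Reads the stored rows; no aggregation happens at query time.
SELECT * FROM Pop_MView;

-- The stored result goes stale when country changes,
-- so it has to be recomputed explicitly.
REFRESH MATERIALIZED VIEW Pop_MView;
```

The trade-off is freshness: reads are fast, but in PostgreSQL the stored averages reflect the table only as of the last refresh.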

Aggregate Examples
●Course registration
o The available seats in a class
●Number of patients prescribed blood-thinning drugs
●Amount of Kemps milk sold at Cub Foods each month
●Total number of flights and the average percentage of filled seats in those flights
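
The course registration example above could be sketched as follows, again in PostgreSQL syntax. The course and registration tables and all column names are hypothetical, invented only for illustration.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE course (
    course_id integer PRIMARY KEY,
    capacity  integer NOT NULL
);

CREATE TABLE registration (
    student_id integer,
    course_id  integer REFERENCES course (course_id),
    PRIMARY KEY (student_id, course_id)
);

-- Derived data: seats left per class, computed once and stored.
-- LEFT JOIN keeps classes that have no registrations yet
-- (COUNT(r.student_id) is 0 for them).
CREATE MATERIALIZED VIEW available_seats AS
SELECT c.course_id,
       c.capacity - COUNT(r.student_id) AS seats_left
FROM course c
LEFT JOIN registration r ON r.course_id = c.course_id
GROUP BY c.course_id, c.capacity;

-- Recompute after registrations change.
REFRESH MATERIALIZED VIEW available_seats;
```

A lookup such as SELECT seats_left FROM available_seats WHERE course_id = 1; then avoids counting registrations on every page load, at the cost of showing the count as of the last refresh.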

Companies/Products Supporting Materialized Views
●Oracle
●PostgreSQL
●IBM DB2 (materialized query tables)
●Microsoft SQL Server (indexed views)
Questions?

References
1. Monash, Curt. “Add Derived Data To Your DBMS Strategy.” InformationWeek. N.p., n.d. Web. 10 Feb
2. “Materialized View.” Wikipedia, the free encyclopedia. 31 Jan. Wikipedia. Web. 11 Feb