Turbocharge your Data Warehouse Queries with Columnstore Indexes Len Wyatt Program Manager Microsoft Corporation DBI313.

1 Turbocharge your Data Warehouse Queries with Columnstore Indexes Len Wyatt Program Manager Microsoft Corporation DBI313

3 Demo: Columnstores speed up queries

5 Overview of Columnstore Index

6 … C1 C2 C3 C4 C5 C6

7 Row group; segments: C1 C2 C3 C4 C5 C6

8 OrderDateKey  ProductKey  StoreKey  RegionKey  Quantity  SalesAmount
  20101107      106         01        1          6         30.00
  20101107      103         04        2          1         17.00
  20101107      109         04        2          2         20.00
  20101107      103         03        2          1         17.00
  20101107      106         05        3          4         20.00
  20101108      106         02        1          5         25.00
  20101108      102         02        1          1         14.00
  20101108      106         03        2          5         25.00
  20101108      109         01        1          1         10.00
  20101109      106         04        2          4         20.00
  20101109      106         04        2          5         25.00
  20101109      103         01        1          1         17.00

9 OrderDateKey  ProductKey  StoreKey  RegionKey  Quantity  SalesAmount
  20101107      106         01        1          6         30.00
  20101107      103         04        2          1         17.00
  20101107      109         04        2          2         20.00
  20101107      103         03        2          1         17.00
  20101107      106         05        3          4         20.00
  20101108      106         02        1          5         25.00

  OrderDateKey  ProductKey  StoreKey  RegionKey  Quantity  SalesAmount
  20101108      102         02        1          1         14.00
  20101108      106         03        2          5         25.00
  20101108      109         01        1          1         10.00
  20101109      106         04        2          4         20.00
  20101109      106         04        2          5         25.00
  20101109      103         01        1          1         17.00

10 OrderDateKey: 20101107 20101108
   ProductKey: 106 103 109 103 106
   StoreKey: 01 04 03 05 02
   RegionKey: 1 2 2 2 3 1
   Quantity: 6 1 2 1 4 5
   SalesAmount: 30.00 17.00 20.00 17.00 20.00 25.00

   OrderDateKey: 20101108 20101109
   ProductKey: 102 106 109 106 103
   StoreKey: 02 03 01 04 01
   RegionKey: 1 2 1 2 2 1
   Quantity: 1 5 1 4 5 1
   SalesAmount: 14.00 25.00 10.00 20.00 25.00 17.00
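The layout on slides 8 to 10 can be sketched in a few lines of Python. This is only an illustration, not how SQL Server implements columnstores: the row data is the twelve rows from slide 8, while the function names `to_row_groups` and `to_segments` and the toy group size of 6 are assumptions chosen to match the split shown on slide 9.

```python
# Column names and rows as shown on slide 8.
COLUMNS = ["OrderDateKey", "ProductKey", "StoreKey", "RegionKey", "Quantity", "SalesAmount"]
ROWS = [
    (20101107, 106, "01", 1, 6, 30.00),
    (20101107, 103, "04", 2, 1, 17.00),
    (20101107, 109, "04", 2, 2, 20.00),
    (20101107, 103, "03", 2, 1, 17.00),
    (20101107, 106, "05", 3, 4, 20.00),
    (20101108, 106, "02", 1, 5, 25.00),
    (20101108, 102, "02", 1, 1, 14.00),
    (20101108, 106, "03", 2, 5, 25.00),
    (20101108, 109, "01", 1, 1, 10.00),
    (20101109, 106, "04", 2, 4, 20.00),
    (20101109, 106, "04", 2, 5, 25.00),
    (20101109, 103, "01", 1, 1, 17.00),
]


def to_row_groups(rows, group_size):
    """Split the table horizontally into consecutive groups of rows."""
    return [rows[i:i + group_size] for i in range(0, len(rows), group_size)]


def to_segments(row_group):
    """Split one row group vertically: one list of values per column."""
    return {name: [row[i] for row in row_group] for i, name in enumerate(COLUMNS)}


# Hypothetical toy group size; it only mirrors the two halves on slide 9.
row_groups = [to_segments(group) for group in to_row_groups(ROWS, group_size=6)]

# A query that needs only SalesAmount reads only that column's segments.
total_sales = sum(sum(group["SalesAmount"]) for group in row_groups)
```

Because each column is stored separately, the final aggregate never touches the other five columns.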

11 OrderDateKey: 20101107 20101108
   ProductKey: 106 103 109 103 106
   StoreKey: 01 04 03 05 02
   RegionKey: 1 2 2 2 3 1
   Quantity: 6 1 2 1 4 5
   SalesAmount: 30.00 17.00 20.00 17.00 20.00 25.00

   OrderDateKey: 20101108 20101109
   ProductKey: 102 106 109 106 103
   StoreKey: 02 03 01 04 01
   RegionKey: 1 2 1 2 2 1
   Quantity: 1 5 1 4 5 1
   SalesAmount: 14.00 25.00 10.00 20.00 25.00 17.00

12 StoreKey: 01 04 03 05 02 | 02 03 01 04 01
   RegionKey: 1 2 2 2 3 1 | 1 2 1 2 2 1
   Quantity: 6 1 2 1 4 5 | 1 5 1 4 5 1
   OrderDateKey: 20101107 20101108 | 20101108 20101109
   ProductKey: 106 103 109 103 106 | 102 106 109 106 103
   SalesAmount: 30.00 17.00 20.00 17.00 20.00 25.00 | 14.00 25.00 10.00 20.00 25.00 17.00

13 StoreKey: 01 04 03 05 02 | 02 03 01 04 01
   RegionKey: 1 2 2 2 3 1 | 1 2 1 2 2 1
   Quantity: 6 1 2 1 4 5 | 1 5 1 4 5 1
   OrderDateKey: 20101107 20101108 | 20101108 20101109
   ProductKey: 106 103 109 103 106 | 102 106 109 106 103
   SalesAmount: 30.00 17.00 20.00 17.00 20.00 25.00 | 14.00 25.00 10.00 20.00 25.00 17.00

15 Batch object: column vectors, bitmap of qualifying rows
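The batch object on slide 15 can be pictured with a small Python sketch. It assumes only what the slide names (column vectors plus a bitmap of qualifying rows); the class `Batch` and the helpers `apply_filter` and `sum_column` are hypothetical names, and the real engine's internals are not described in this transcript.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    """A set of rows held as one vector per column plus a qualifying-rows bitmap."""
    columns: dict                                    # column name -> list of values
    qualifying: list = field(default_factory=list)  # one boolean per row

    def __post_init__(self):
        # Every row qualifies until a filter says otherwise.
        if not self.qualifying:
            row_count = len(next(iter(self.columns.values())))
            self.qualifying = [True] * row_count


def apply_filter(batch, column, predicate):
    """Clear bitmap entries for rows that fail the predicate; no rows are copied."""
    vector = batch.columns[column]
    batch.qualifying = [q and predicate(v) for q, v in zip(batch.qualifying, vector)]


def sum_column(batch, column):
    """Aggregate one column vector over the rows still marked as qualifying."""
    return sum(v for q, v in zip(batch.qualifying, batch.columns[column]) if q)


# Values taken from the first row group on slide 10.
batch = Batch({"RegionKey": [1, 2, 2, 2, 3, 1], "Quantity": [6, 1, 2, 1, 4, 5]})
apply_filter(batch, "RegionKey", lambda region: region == 2)
quantity_in_region_2 = sum_column(batch, "Quantity")
```

In this sketch a filter only flips bits in the bitmap, so later operators can keep working on the same column vectors.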

16 Make sure most of the work of the query happens in batch mode

17 Loading Columnstores Effectively

23 Optimizing database and index design

27 Date      LicenseNum  Measure
   20120301  XYZ123      100
   20120302  ABC777      200

   Date      LicenseId  Measure
   20120301  1          100
   20120302  2          200

   LicenseId  LicenseNum
   1          XYZ123
   2          ABC777
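The transformation on slide 27, where a string column in the fact table is replaced by an integer key plus a lookup table, might be sketched as follows. The function name `normalize_license` and the choice to number keys from 1 in order of first appearance are assumptions made to reproduce the slide's values.

```python
# Fact rows as shown on slide 27: (Date, LicenseNum, Measure).
fact_rows = [
    (20120301, "XYZ123", 100),
    (20120302, "ABC777", 200),
]


def normalize_license(rows):
    """Replace the string LicenseNum with an integer LicenseId and build the lookup table."""
    ids = {}   # LicenseNum -> LicenseId, assigned in order of first appearance
    fact = []
    for date, license_num, measure in rows:
        license_id = ids.setdefault(license_num, len(ids) + 1)
        fact.append((date, license_id, measure))
    dimension = [(license_id, license_num) for license_num, license_id in ids.items()]
    return fact, dimension


fact, dimension = normalize_license(fact_rows)
```

Here `fact` holds the (Date, LicenseId, Measure) rows and `dimension` holds the (LicenseId, LicenseNum) pairs from the slide.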

28 Optimizing queries

32 Common workarounds

33 Demo: Example need for a workaround

35 Make sure most of the work of the query happens in batch mode

36 select m.Title, COUNT(p.IP) PurchaseCount
   from Media m
   left outer join Purchase p on p.MediaId = m.MediaId
   group by m.Title
   order by COUNT(p.IP) desc;
   -- 6.4 sec elapsed, 55 CPU-seconds
   with T (Title, PurchaseCount) as (
       select m.Title, COUNT(p.IP) PurchaseCount
       from Media m
       join Purchase p on p.MediaId = m.MediaId
       group by m.Title
   )
   select distinct m.Title, ISNULL(T.PurchaseCount, 0) as PurchaseCount
   from Media m
   left outer join T on m.Title = T.Title
   order by ISNULL(T.PurchaseCount, 0) desc;
   -- 0.2 sec elapsed, 1.9 CPU-seconds
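A quick way to convince yourself that the two queries on slide 36 return the same rows is to run both on a small data set. The sketch below uses Python's built-in `sqlite3` module as a stand-in for SQL Server, so it checks result equivalence only and says nothing about batch mode or timings. The sample rows are made up, and `coalesce` replaces T-SQL's `ISNULL` because SQLite has no function of that name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Media (MediaId integer primary key, Title text);
    create table Purchase (MediaId integer, IP text);
    -- Made-up sample data: title 'C' has no purchases.
    insert into Media values (1, 'A'), (2, 'B'), (3, 'C');
    insert into Purchase values (1, '10.0.0.1'), (1, '10.0.0.2'), (2, '10.0.0.3');
""")

# Outer join with aggregation, as in the first query on the slide.
original = conn.execute("""
    select m.Title, count(p.IP) as PurchaseCount
    from Media m left outer join Purchase p on p.MediaId = m.MediaId
    group by m.Title
    order by count(p.IP) desc
""").fetchall()

# Inner join and aggregation first, then an outer join to add the zero-count titles.
rewritten = conn.execute("""
    with T (Title, PurchaseCount) as (
        select m.Title, count(p.IP) as PurchaseCount
        from Media m join Purchase p on p.MediaId = m.MediaId
        group by m.Title
    )
    select distinct m.Title, coalesce(T.PurchaseCount, 0) as PurchaseCount
    from Media m left outer join T on m.Title = T.Title
    order by coalesce(T.PurchaseCount, 0) desc
""").fetchall()
```

The rewrite moves the heavy join and aggregation into an inner join over the fact table and leaves only the small `Media` to `T` join as an outer join.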

37 select p.Date, count(*)
   from Purchase p
   where p.MediaId in (select MediaId from MediaStudyGroup)
   group by p.Date
   order by p.Date;
   --or--
   select p.Date, count(*)
   from Purchase p
   where exists (select m.MediaId from MediaStudyGroup m where m.MediaId = p.MediaId)
   group by p.Date
   order by p.Date;
   -- 3.0 sec elapsed, 32 CPU-seconds
   select p.Date, count(*)
   from Purchase p
   join MediaStudyGroup m on p.MediaId = m.MediaId
   group by p.Date
   order by p.Date;
   -- 0.05 sec elapsed, 0.3 CPU-seconds
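The three queries on slide 37 can be compared on a toy data set with Python's `sqlite3` module, used here only as a stand-in for SQL Server. One assumption is worth stating loudly: the join form matches the IN and EXISTS forms only when `MediaId` is unique in `MediaStudyGroup`, which the slide does not say, so the sketch declares it as a primary key. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Purchase (Date integer, MediaId integer);
    -- Assumption: MediaId is unique in MediaStudyGroup.
    create table MediaStudyGroup (MediaId integer primary key);
    insert into Purchase values
        (20120301, 1), (20120301, 2), (20120301, 3),
        (20120302, 1), (20120302, 3);
    insert into MediaStudyGroup values (1), (2);
""")

in_rows = conn.execute("""
    select p.Date, count(*) from Purchase p
    where p.MediaId in (select MediaId from MediaStudyGroup)
    group by p.Date order by p.Date
""").fetchall()

exists_rows = conn.execute("""
    select p.Date, count(*) from Purchase p
    where exists (select m.MediaId from MediaStudyGroup m where m.MediaId = p.MediaId)
    group by p.Date order by p.Date
""").fetchall()

join_rows = conn.execute("""
    select p.Date, count(*) from Purchase p
    join MediaStudyGroup m on p.MediaId = m.MediaId
    group by p.Date order by p.Date
""").fetchall()
```

If `MediaStudyGroup` could contain duplicate `MediaId` values, the join would count a purchase once per duplicate, so the rewrite should only be used when uniqueness is guaranteed.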

38 create view vPurchase as
       select * from Purchase
       union all
       select * from DeltaPurchase;
   select p.date, d.DayNumOfMonth, count(*)
   from vPurchase as p, Date d
   where p.Date = d.DateId
   group by p.date, d.DayNumOfMonth;
   -- Batch mode, 0.1 sec elapsed
   select p.date, d.DayNumOfMonth, m.Genre, count(*)
   from vPurchase p, Date d, Media m
   where p.Date = d.DateId and m.MediaId = p.MediaId
   group by p.date, d.DayNumOfMonth, m.Genre
   -- Row mode, 19 sec elapsed

39 with MainSummary (date, DayNumOfMonth, Genre, c) as (
       select p.date, d.DayNumOfMonth, m.Genre, count(*) c
       from Purchase p, Date d, Media m
       where p.Date = d.DateId and m.MediaId = p.MediaId
       group by p.date, d.DayNumOfMonth, m.Genre
   ),
   DeltaSummary (date, DayNumOfMonth, Genre, c) as (
       select p.date, d.DayNumOfMonth, m.Genre, count(*) c
       from DeltaPurchase p, Date d, Media m
       where p.Date = d.DateId and m.MediaId = p.MediaId
       group by p.date, d.DayNumOfMonth, m.Genre
   ),
   CombinedSummary (date, DayNumOfMonth, Genre, c) as (
       --union all across the output of the two queries
       select * from MainSummary
       UNION ALL
       select * from DeltaSummary
   )
   --group by to aggregate the data.
   select t.date, t.DayNumOfMonth, t.Genre, sum(c) as c
   from CombinedSummary as t
   group by t.date, t.DayNumOfMonth, t.Genre;
   -- Batch mode, 0.3 sec elapsed
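The rewrite on slide 39 aggregates each table separately and then sums the partial counts, instead of aggregating over the UNION ALL view from slide 38. The sketch below checks on made-up rows that both forms give the same result. It uses Python's `sqlite3` module as a stand-in for SQL Server, so it shows equivalence only, not batch versus row mode. The `order by` clauses are added here just to make the comparison deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Purchase (Date integer, MediaId integer);
    create table DeltaPurchase (Date integer, MediaId integer);
    create table Date (DateId integer primary key, DayNumOfMonth integer);
    create table Media (MediaId integer primary key, Genre text);
    -- Made-up sample data.
    insert into Date values (20120301, 1), (20120302, 2);
    insert into Media values (1, 'Horror'), (2, 'Comedy');
    insert into Purchase values (20120301, 1), (20120301, 1), (20120302, 2);
    insert into DeltaPurchase values (20120301, 1), (20120302, 1);
    create view vPurchase as
        select * from Purchase union all select * from DeltaPurchase;
""")

# Aggregate over the UNION ALL view (second query on slide 38).
via_view = conn.execute("""
    select p.Date, d.DayNumOfMonth, m.Genre, count(*)
    from vPurchase p, Date d, Media m
    where p.Date = d.DateId and m.MediaId = p.MediaId
    group by p.Date, d.DayNumOfMonth, m.Genre
    order by 1, 2, 3
""").fetchall()

# Aggregate each table on its own, then combine the partial counts (slide 39).
via_partials = conn.execute("""
    with MainSummary (Date, DayNumOfMonth, Genre, c) as (
        select p.Date, d.DayNumOfMonth, m.Genre, count(*)
        from Purchase p, Date d, Media m
        where p.Date = d.DateId and m.MediaId = p.MediaId
        group by p.Date, d.DayNumOfMonth, m.Genre
    ),
    DeltaSummary (Date, DayNumOfMonth, Genre, c) as (
        select p.Date, d.DayNumOfMonth, m.Genre, count(*)
        from DeltaPurchase p, Date d, Media m
        where p.Date = d.DateId and m.MediaId = p.MediaId
        group by p.Date, d.DayNumOfMonth, m.Genre
    ),
    CombinedSummary (Date, DayNumOfMonth, Genre, c) as (
        select * from MainSummary union all select * from DeltaSummary
    )
    select t.Date, t.DayNumOfMonth, t.Genre, sum(c)
    from CombinedSummary as t
    group by t.Date, t.DayNumOfMonth, t.Genre
    order by 1, 2, 3
""").fetchall()
```

The pattern works because counts are additive: summing per-table counts for a group gives the count over the combined rows.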

40 select count(*) from Purchase;
   -- 1.0 sec elapsed, 15 CPU-seconds
   with CountByDate (Date, c) as (
       select Date, count(*)
       from Purchase
       group by Date
   )
   select sum(c) from CountByDate;
   -- 0.06 sec elapsed, 0.3 CPU-seconds
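The scalar-aggregate workaround on slide 40 can be checked the same way. This minimal sketch uses Python's `sqlite3` module as a stand-in for SQL Server and three made-up rows; it demonstrates that the grouped form returns the same total, not that it runs faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Purchase (Date integer, MediaId integer);
    -- Made-up sample data.
    insert into Purchase values (20120301, 1), (20120301, 2), (20120302, 1);
""")

# Aggregate without GROUP BY, as in the first query on the slide.
direct = conn.execute("select count(*) from Purchase").fetchone()[0]

# Group first, then sum the per-group counts.
via_groups = conn.execute("""
    with CountByDate (Date, c) as (
        select Date, count(*) from Purchase group by Date
    )
    select sum(c) from CountByDate
""").fetchone()[0]
```

One difference to keep in mind: on an empty table `count(*)` returns 0, while `sum(c)` over zero groups returns NULL, so callers may need to wrap the rewritten form to handle that case.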

41 select p.Date,
          count(distinct p.UserId) as UserIdCount,
          count(distinct p.MediaId) as MediaIdCount
   from Purchase p, Media m
   where p.MediaId = m.MediaId and m.Category in ('Horror')
   group by p.Date;
   -- 26 sec elapsed, 31 CPU-seconds

42 with DistinctMediaIds (Date, MediaIdCount) as (
       select p.Date, count(distinct p.MediaId) as MediaIdCount
       from Purchase p, Media m
       where p.MediaId = m.MediaId and m.Category in ('Horror')
       group by p.Date
   ),
   DistinctUserIds (Date, UserIdCount) as (
       select p.Date, count(distinct p.UserId) as UserIdCount
       from Purchase p, Media m
       where p.MediaId = m.MediaId and m.Category in ('Horror')
       group by p.Date
   )
   select m.Date, m.MediaIdCount, u.UserIdCount
   from DistinctMediaIds m
   join DistinctUserIds u on m.Date = u.Date
   -- 0.5 sec elapsed, 6 CPU-seconds
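Slides 41 and 42 split a query with two `count(distinct ...)` aggregates into one CTE per distinct count, joined on the grouping column. The sketch below runs both forms on made-up rows using Python's `sqlite3` module as a stand-in for SQL Server. It verifies only that the results agree. Note that the rewritten query returns its two count columns in the opposite order, so the comparison reorders them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Media (MediaId integer primary key, Category text);
    create table Purchase (Date integer, UserId integer, MediaId integer);
    -- Made-up sample data; media 3 is filtered out by the Category predicate.
    insert into Media values (1, 'Horror'), (2, 'Horror'), (3, 'Comedy');
    insert into Purchase values
        (20120301, 10, 1), (20120301, 10, 2), (20120301, 11, 1),
        (20120301, 12, 3), (20120301, 13, 1), (20120302, 10, 1);
""")

# Both distinct counts in one query (slide 41): (Date, UserIdCount, MediaIdCount).
single_query = conn.execute("""
    select p.Date,
           count(distinct p.UserId) as UserIdCount,
           count(distinct p.MediaId) as MediaIdCount
    from Purchase p, Media m
    where p.MediaId = m.MediaId and m.Category in ('Horror')
    group by p.Date
    order by p.Date
""").fetchall()

# One CTE per distinct count (slide 42): (Date, MediaIdCount, UserIdCount).
split_query = conn.execute("""
    with DistinctMediaIds (Date, MediaIdCount) as (
        select p.Date, count(distinct p.MediaId)
        from Purchase p, Media m
        where p.MediaId = m.MediaId and m.Category in ('Horror')
        group by p.Date
    ),
    DistinctUserIds (Date, UserIdCount) as (
        select p.Date, count(distinct p.UserId)
        from Purchase p, Media m
        where p.MediaId = m.MediaId and m.Category in ('Horror')
        group by p.Date
    )
    select m.Date, m.MediaIdCount, u.UserIdCount
    from DistinctMediaIds m join DistinctUserIds u on m.Date = u.Date
    order by m.Date
""").fetchall()

# Put the split result into the single query's column order for comparison.
split_reordered = [(date, users, media) for date, media, users in split_query]
```

Both CTEs group by the same `Date` values under the same filter, so the inner join on `Date` loses no rows.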

43 Summary
