Joins and Cardinality Demystified Elizabeth Snow-Trenkle Rocky Mountain Cognos User Group Meeting May 17, 2013
Topics Introduction to Joins 1..1 “The Inner Join” 0..1 “The Outer Join” Classifying the “Fact” Table 1..n “The Fact Table” 0..n “The Outer Fact Table” Considerations
Introduction to Joins What is cardinality within Cognos? Relationship between tables: Traditionally, inner and outer joins Cognos introduces Fact Detection
Consider the following tables: Underlying Tables Consider the following tables:
Requirement/Goal This report shows Patient Id with associated Charges and Payments.
1..1 “The Inner Join” Returns rows when there is at least one match in both tables. REG to REG_DISTRICT
1..1 “The Inner Join” Expected Actual Notice missing patients and totals are incorrect Actual
Will count 2x payments because there are 2 charges. 1..1 “The Inner Join” Examine the SQL select Patient.PatientId as PatientId, XSUM(charge.ChargeAmt for Patient.PatientId ) as ChargeAmt, XSUM(payments.PaymentAmt for Patient.PatientId ) as PaymentAmt from Test.Test.dbo.Patient Patient, Test.Test.dbo.charge charge, Test.Test.dbo.payments payments where (Patient.PatientId = charge.PatientId) and (Patient.PatientId = payments.PatientId) group by Patient.PatientId Dr Will count 2x payments because there are 2 charges.
0..1 “The Outer Join” Returns all rows from the left table in conjunction with matching rows from the right table. If there are no matching columns in the right table, the outer join returns NULL values.
0..1 “The Outer Join” Expected Actual
Will count 2x payments because there are 2 charges. 0..1 “The Outer Join” Examine the SQL select Patient.PatientId as PatientId, XSUM(charge.ChargeAmt for Patient.PatientId ) as ChargeAmt, XSUM(payments.PaymentAmt for Patient.PatientId ) as PaymentAmt from Test.Test.dbo.Patient Patient left outer join Test.Test.dbo.charge charge on (Patient.PatientId = charge.PatientId) Test.Test.dbo.payments payments on (Patient.PatientId = payments.PatientId) group by Patient.PatientId Will count 2x payments because there are 2 charges.
Classify the “Fact” Table Center of a star schema Only facts and keys; attributes come from dimension tables Multi-Fact queries are possible, but require a conformed dimension. In this case, Patient is the conformed dimension. Make them key players. Drive priorities – even between business units Assignments – user validation AND acceptance testing, training, champion, etc Let them earn sweat equity.
1..n “The Fact Table” Expected & Actual
1..n “The Fact Table” Why does changing from 1..n to 1..1 sometimes resolve the issue? If you want a report that shows only patients with Charges AND Payments Why does changing it from 1..1 to 1..n sometimes resolve the issue? If you want a report that only shows patients with Charges OR Payments
1..n “The Fact Table” Examine the SQL in Report Studio -- (It’s too long!) The Fact Table creates two inner join “queries,” then performs a full outer join “query” between the two.
0..n “The Outer Fact Table” Expected & Actual
0..n “The Outer Fact Table” Examine the SQL in Report Studio -- (It’s too long, again!) The Outer Fact Table creates two outer join “queries”, then performs a full outer join “query” between the two.
Considerations Tuning: Indexes should be evaluated based on the sub-queries they create Ignore the very misleading FM Relationship Impact description: 1..n/0..n means MUCH more and has BIGGER impact/ramifications
Conclusion Which do we use – 1..1, 0..1, 1..n, 0..n? Know the reporting requirements Dimensional data warehouses should typically leverage 0..n or 1..n cardinality Know the data, how to present it, and the capabilities
Questions? Final slide