Effectively Validate Query/Report: Strategy and Tool Steven Luo Sr. System Analyst Barnes & Noble Session id: 36993
Agenda PART I: Introduction – Why, when, and how to validate PART II: Strategy – Basic strategy – Advanced strategy PART III: Tool – Test script – Engine - sqlUnit PART IV: Conclusion
Why to Validate? Assure Data Accuracy – Managers need accurate data to make strategic decisions – A company’s sales team needs accurate data to launch market campaign Assure ETL (E xtraction, Transformation, and Loading ) Process Correctness – ETL tools – SQL scripts
When to Validate? Any of the following types of testing needs validation: – Unit testing – Integration testing – System testing – Acceptance testing – Maintenance & regression testing
How to Validate? Validation Automation Black Box Basic Strategy Advanced Strategy White Box Basic Strategy Advanced Strategy Manual Black BoxWhite Box
Agenda PART I: Introduction – Why, when, and how to validate? PART II : Strategy – Basic strategy – Advanced strategy PART III: Tool – Test script – Engine - sqlUnit PART IV: Conclusion
Validating Report/Query A Query or Report is actually a result set, so validating needs to answer two questions: Are you getting the result set right? – right data in each cell Are you getting the right result set? – exact number of records
A Typical Process of Generating Reports flat files staging tables materialized view T0T1Tn ETL steps repor t Stored procedure …...
Basic Strategy Assert that each (cell) in the result set matches the data in the source table – Number – String – Date – Result Set, etc. Sampling should be used if a result set is big Assert the right number of records in the result set Check duplicated…
Advanced Strategy Try to uncover invalid data item – Boundary Validation Count, sum, max, min, x not in table, “between..and” on whole resultset or certain partitions. – Special Value (constraints) Validation isXXX() and notXXX() e.g. IsNull, notNull, notNegative, notZero – Business Rule Validation dept1.sale> dept2.sale
Agenda PART I: Introduction – Why, when, and how to validate ? PART II : Strategy – Basic Strategy – Advanced Strategy PART III: Tool – Test Script – Engine - sqlUnit PART IV: Conclusion
Why Use a Validation Tool? Automate the validating process – Reduces the cost, time and effort Reuse the procedures – Write once, run many times on QA box and/or production box. Re-factor SQL Share by group (save to PVCS)
Tool Open Source – jUnit, etc. – Steven Feuerstein’s utPL/SQL My tool -- sqlUnit
sqlUnit Overview The framework consists of two major parts – test scripts – test engine implemented in java stored procedure with PL/SQL interface Implement 2 types of strategy – basic strategy – advanced strategy Record the validating results Monitor long-running validation process
sqlUnit Overview (2) From the Engine Perspective – run all your test cases defined in your PL/SQL package From the User Perspective – write all test scripts – start the Engine
Architecture Test scripts Engine Assert util Database ……
Test Script (1) Write Test script in PL/SQL Define test package Call APIs Define test procedures – Test procedure MUST begin with ‘test’ – Setup() : – Teardown(): clean up
Test Script (2) Use your business knowledge to get expected data and actual data – Get raw or original data from source table such as POS, Daily Sales, etc. – Get data from a report call API fetchCursorData(…), or using cursor directly, e.g. cc := my_test_pkg.get_ref_cursor('SCOTT'); loop fetch cc into value1,value2,..., valuek; exit when cc%notfound ; if(...) then sqlunit.assert(‘desc’, value1, 100); end if; end loop;
APIs for Basic Strategy procedure runTestCases(testPackageName varchar2) procedure assert(description varchar2, num1 number, num2 number) procedure assert(description varchar2, str1 varchar2, str2 varchar2) procedure assert(description varchar2, a1 STRING_ARRAY, a2 STRING_ARRAY) procedure assert(description STRING_ARRAY, a1 STRING_ARRAY, a2 STRING_ARRAY) procedure assertQuery(description varchar2, query1 varchar2, query2 varchar2) procedure fetchRefCursorInto(pname in varchar2, parameters in STRING_ARRAY, fetchfields in out STRING_ARRAY, uniqField in varchar2, uniqValue in varchar2 ) function getCountForQuery(sqlstr varchar2) return number function getCountForProcedure(sqlstr varchar2) return number
Test Script Template
Apply Basic Strategy A tool should pick up: – m column(s) in a row of a result set m: between 1 and number of columns – m column(s) in n rows of a result set m: between 1 and number of columns n: between 2 and number of rows Cells to be validated: – cells: between 1 and m*n
Apply Advanced Strategy Advanced Strategy – Boundary Object – Special value (Constraint)Object – Business Rule Object
Test Script for Advance Strategy procedure test_adv_1 is obj sqlunitBoundary := sqlunitBoundary(NULL,NULL,NULL,NULL); begin obj.setTestingQuery('my_test_pkg.get_ref_cursor(''SCOTT'')'); obj.setCountCriteria(2); obj.setQueryCriteria('c1 is not null'); obj.setQueryType(1); //1 : store procedure. 0: sql query obj.checkBoundary; exception when others then dbms_output.put_line('exception! '); end test_adv_1;
Example for Validating Two Queries
How Engine Works Users start the Engine by calling … exec sqlunit.runtestcases('VALIDATEPACKAGE'); Engine calls back test scripts by calling the following: set_up; test_1; tear_down; set_up; test_2; tear_down;... set_up; test_n; tear_down; Test scripts call Framework APIs …. sqlunit.assert(...)
Sequence Diagram testPackagesqlUnit/engine Runtestcases() setup test_1 tear_down assert DB setup test_2 assert tear_down recordResult View results
Record the Test Result Use package name as testing result table name.
Monitoring Validation Process At Engine level, by instrumentation DBMS_APPLICATION_INFO into Engine, you can monitor the progress At test script, you can instrument DBMS_APPLICATION_INFO to test scripts too. Get the progress information from v$session_longop in other session
Test Scripts Guideline Be simple Don’t use the same sql that generates the report Use Business/domain knowledge
Conclusion Data accuracy is very important Use automated validating tool whenever possible
Limitation of validation “ Program testing can be used to show the presence of bugs, but never to show their absence” --E. W. Dijkstra
References Asimkumar Munshi, Testing a Data Warehouse Application white paper B. Hailpern and P. Santhanam Software debugging, testing, and verification IBM System Journal Vol. 41, No. 1, 2002 Thomas Kyte, Expert One on One Oracle, Wrox, 2001
A Q & Q U E S T I O N S A N S W E R S