Data Preparation in the Quadstone System Version 5
  Thursday, 17th February am PST / 10.30am EST / 3.30pm GMT / CET
© 2005 Quadstone
Data Preparation in the Quadstone System V5
  Presenter: Joshua Lewis, Quadstone Consultant
  Overview: Interactive data preparation: sorting, aggregating, joining, and deriving
  Audience: Experienced Quadstone System users looking to undertake ad-hoc preparation of data
  Format:
    A live demo with slides for signposting
    Follow-up exercises in the form of a workbook and dataset
  Duration: 1 hour, including Q&A
Interactive data preparation
  Wizards for each action
    Right-click in Quadstone System Explorer (QSE), or
    Click on (or drag a file to) the Quadstone System Shortcut Bar
    Choose parameters as appropriate
  Best practice: keep an audit trail
A simple data preparation process
  [Diagram: Transaction data, Customer data, Customer IDs, "To be filled", and Analysis dataset, linked by the steps SORT, MEASURE, JOIN, DERIVE]
Joining foci in QSE
  Join Fields adds fields from a secondary focus to a primary focus.
  Prerequisites:
    Key fields in both foci, of exactly the same datatype
    The records in both foci must be sorted by the key fields
  Subtly different from Import Fields in Decisionhouse
Field derivation
  [Figure: a table with fields CustomerID, StartDate, Age, and Gender (including a NULL value) is extended by Derivation with a new field, MonthsTenure]
Deriving new fields
  Requires:
    A focus (need not be sorted)
    A derivations (.tml) file containing TML descriptions of the fields to be derived
  Create a new TML file via right-click in QSE, then right-click again to Edit
  Best practice: develop and debug the FDL expressions interactively in Decisionhouse before embedding them within the TML file
Derivation syntax
    create Tenure := countwholemonths(StartDate, today());
    create YoungMan := Age < 30 and Gender = 1;
  Creates one output record per input record
  The first example counts the number of months since a person became a customer; the second creates a flag to identify a specific segment of customers
  General syntax: create <field name> := <FDL expression>;
  See online help: Field Derivation Language (FDL) reference
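For readers more familiar with general-purpose languages, the two derivations can be sketched in Python. This is an illustration only, not Quadstone code: the helper names count_whole_months and derive are hypothetical, and the month-counting rule (a month counts once the same day-of-month is reached) and the NULL handling (None in, None out) are assumptions rather than documented FDL behaviour.

```python
from datetime import date
from typing import Optional


def count_whole_months(start: Optional[date], end: date) -> Optional[int]:
    """Complete months elapsed from start to end; None (NULL) stays None.

    Assumed rule: a month is complete once the day-of-month of `start`
    has been reached. The real FDL function may use a different rule.
    """
    if start is None:
        return None
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


def derive(record: dict, today: date) -> dict:
    """Return a copy of one input record with Tenure and YoungMan added."""
    out = dict(record)
    out["Tenure"] = count_whole_months(record["StartDate"], today)
    age, gender = record["Age"], record["Gender"]
    # Assumption: a NULL input makes the flag NULL rather than False.
    if age is None or gender is None:
        out["YoungMan"] = None
    else:
        out["YoungMan"] = age < 30 and gender == 1
    return out
```

As on the slide, one output record is produced per input record; the input need not be sorted because each record is processed on its own.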
Sorting foci
  Required when combining foci and grouping records
  Sort will usually, but not always, be on customer ID
  Best practice: check sort order first, to see if a sort is needed
  Best practice: sort once, upstream (adding keys if needed), to minimize time-consuming re-sorting
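The "check sort order first" advice amounts to a single pass over the records, which is much cheaper than a sort. A minimal Python sketch of that idea follows; the function name is_sorted_by and the list-of-dicts record layout are assumptions for illustration and say nothing about how the Quadstone System performs the check.

```python
from typing import Iterable, Sequence


def is_sorted_by(records: Iterable[dict], keys: Sequence[str]) -> bool:
    """Return True if records are in non-decreasing order of the key fields."""
    previous = None
    for record in records:
        current = tuple(record[k] for k in keys)
        # One out-of-order neighbour is enough to know a sort is needed.
        if previous is not None and current < previous:
            return False
        previous = current
    return True
```

Only when this check fails is it worth paying for a full sort, for example sorted(records, key=lambda r: r["CustomerID"]).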
Transaction measurement
  [Figure: Rollup (aggregation) of a transaction table with fields CustomerID, Date, TransType (e.g., ATM, SD), and Value into a table with fields CustomerID, MostRecentDate, MostFreqTrans, and AverageValue]
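The rollup can be pictured as collapsing each customer's transactions into a single summary record. The Python sketch below illustrates that idea only: the function name rollup and the list-of-dicts layout are assumptions, the input is assumed to be sorted by CustomerID already, and the tie-breaking rule for the most frequent transaction type (first type seen wins) is a property of this sketch, not of the Quadstone System.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter
from statistics import mean


def rollup(transactions):
    """Collapse transactions (sorted by CustomerID) to one record per customer."""
    summary = []
    # groupby only groups adjacent records, hence the sorted-input requirement.
    for customer_id, group in groupby(transactions, key=itemgetter("CustomerID")):
        rows = list(group)
        type_counts = Counter(r["TransType"] for r in rows)
        summary.append({
            "CustomerID": customer_id,
            "MostRecentDate": max(r["Date"] for r in rows),
            "MostFreqTrans": type_counts.most_common(1)[0][0],
            "AverageValue": mean(r["Value"] for r in rows),
        })
    return summary
```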
Simple aggregation
  Requires:
    A focus (sorted by the grouping key field, e.g., CustomerID)
    Selection of an appropriate key field (for grouping records)
    An aggregations (.tml) file containing TML descriptions of the aggregations
  Ignore the Functions and Statistics options
  Example TML files are in the online help and the ext/demo/dbc folder of your installation
Aggregation syntax
    create NumberOfPurchases := count();
    create ValueOfPurchases := sum(Amount);
  Processes each group of transaction records that share the same CustomerID (grouping key), to create one output record per CustomerID
  The first example counts the number of transactions for each customer; the second sums the values in the Amount field for each customer
  General syntax: create <field name> := <aggregation function>(<field>);
  See online help: Transaction Measurement Language (TML) reference
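The grouping behaviour behind count() and sum(Amount) can be mimicked in Python as a rough analogy. The sketch below is not TML: the function name measure and the list-of-dicts layout are assumptions, and the input is assumed to be sorted by the grouping key, because itertools.groupby (like a single-pass aggregation) only groups adjacent records.

```python
from itertools import groupby
from operator import itemgetter


def measure(transactions, key="CustomerID"):
    """One output record per key value, with a count and a sum of Amount."""
    results = []
    for key_value, group in groupby(transactions, key=itemgetter(key)):
        rows = list(group)
        results.append({
            key: key_value,
            "NumberOfPurchases": len(rows),                        # count()
            "ValueOfPurchases": sum(r["Amount"] for r in rows),    # sum(Amount)
        })
    return results
```

If the input were not sorted, the same CustomerID could appear in several separate groups, which is one way to see why the sort is a prerequisite.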
Joining foci
  Customers.ftr (primary, sorted)
    ID  Age
    A   56
    B   23
    C   31
  Visits.ftr (secondary, sorted)
    ID  TotalVisits
    A   3
    C   2
    D   4
  Join on ID gives Customers.ftr
    ID  Age  TotalVisits
    A   56   3
    B   23   Null
    C   31   2
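The example behaves like a left join carried out in a single pass over two sorted inputs: every record of the primary focus is kept, unmatched records receive Null, and secondary records with no match (ID D) are dropped. A minimal Python sketch of that logic follows. The function name join_fields, the list-of-dicts layout, and the assumption that keys are unique in the secondary input are illustrative choices, not a description of the Quadstone implementation.

```python
def join_fields(primary, secondary, key, fields):
    """Add `fields` from `secondary` to `primary`; both sorted by `key`.

    Unmatched primary records get None; unmatched secondary records are ignored.
    Assumes key values are unique within `secondary`.
    """
    result = []
    cursor = iter(secondary)
    sec = next(cursor, None)
    for rec in primary:
        # Skip secondary records whose key sorts before the current primary key.
        while sec is not None and sec[key] < rec[key]:
            sec = next(cursor, None)
        matched = sec is not None and sec[key] == rec[key]
        out = dict(rec)
        for field in fields:
            out[field] = sec[field] if matched else None
        result.append(out)
    return result


customers = [{"ID": "A", "Age": 56}, {"ID": "B", "Age": 23}, {"ID": "C", "Age": 31}]
visits = [{"ID": "A", "TotalVisits": 3}, {"ID": "C", "TotalVisits": 2}, {"ID": "D", "TotalVisits": 4}]
joined = join_fields(customers, visits, "ID", ["TotalVisits"])
```

Because neither input is ever rewound, the single pass is only correct when both inputs are sorted by the key, which matches the prerequisite for Join Fields.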
Combining datasets
  Append fields: abut equal-length datasets
  Join fields: match on common key(s)
  Merge records: interleave using common key(s)
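To make the difference between abutting and interleaving concrete, here is a small Python sketch of those two operations. It is an analogy under stated assumptions: the function names append_fields and merge_records are hypothetical, datasets are modelled as lists of dicts, and the inputs to merge_records are assumed to be sorted by the key already.

```python
import heapq
from operator import itemgetter


def append_fields(left, right):
    """Abut two equal-length datasets side by side, record by record."""
    if len(left) != len(right):
        raise ValueError("datasets must have the same number of records")
    # Record i of the result carries the fields of left[i] and right[i].
    return [{**a, **b} for a, b in zip(left, right)]


def merge_records(left, right, key):
    """Interleave two datasets, each sorted by `key`, into one sorted dataset."""
    return list(heapq.merge(left, right, key=itemgetter(key)))
```

Appending widens the dataset (more fields, same record count), while merging lengthens it (same fields, more records).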
Metadata
  You can import metadata from a previous dataset using a template focus
    Includes all derivations, selections, binnings, interpretations, and comments
    Allows re-use of metadata developed interactively in Decisionhouse
  You can import metadata in XML form
    Allows metadata (e.g., a data dictionary) to be maintained externally (converted to XML)
    Currently supports comments only
Advanced data preparation
  Advanced TML aggregation syntax (filter, split)
  Data Build Manager
  Data Build Commands
    [Diagram: command groups arranged around FOCUS]
    MEASURING: qsderive, qsmeasure, qsmeasuretrack
    COMBINING: qsjoin, qsappendfields, qsmerge
    IMPORTING: qsdbaccess, qsimportdb, qsgenfdd, qsimportflat, qsimportfocus
    REPORTING: qsdescribe, qsaudit, qsdtsnapshot, qsscsnapshot, qsxt, qsxt2spec, [qsinfo]
    EXPORTING: qsdbcreatetable, qsdbinsert, qsdbupdate, qsexportflat
    MANAGING: qscopy, qslink, qsmove, qsremove
    TRANSFORMING: qssort, qsrenamefields, qsselect
    ENHANCING: qsimportmetadata, qsupdate, [qsinterp], [qsexportmetadata]
    Also shown: qsbuild, qstml, qssettings
Where to find out more
  Quadstone System Help; for example:
    Working with flat files, database tables, and foci
    Transaction Measurement Language (TML) reference
    Field Derivation Language (FDL) reference
    Quadstone System data-build command and TML reference
  Examples of TML: ext/demo/dbc folder of your installation
  More example TML and data: Quadstone System Support website:
  Advanced Data Preparation training course: contact
Questions and answers
After the webinar
  These slides, a workbook and data are available via
  Any problems or questions, please contact
Upcoming webinars
  See
  If there’s a webinar topic you’d like to see, please let us know via
  Pragmatic Scorecarding: March 17, :00 UK/Ireland, 15:00 Central European, 9am Eastern
  Pragmatic Scorecarding: March 18, 2005, 9am Pacific, 11am Central, 12 noon Eastern, 5pm UK/Ireland
  The Quadstone Portal: April 14, :00 UK/Ireland, 15:00 Central European, 9am Eastern
  The Quadstone Portal: April 15, 2005, 9am Pacific, 11am Central, 12 noon Eastern, 5pm UK/Ireland
Your feedback
  Please