Conquering Data Conversion Projects
Who is that furry guy anyway?
Austin Zellner = presenter
15+ years in Information Technology
Multiple large data migration projects
Recognized patterns in these projects

Have you ever read Dune?
Data = life force of information systems
Data migration projects are the least well understood by management
High Risk / Low tolerance for error

Phases of Data Conversion
Discovery
Mapping
Programming
Conversion
Post Conversion

Discovery Phase
Define Success = Guiding Principle
Need to know “When”
Need to understand options and Scope

Discovery Phase – Gathering your forces
Get an inventory of your Knowledge
Knowledge:
-Documentation
-Experts
-System Self Documentation
-Source Code

Discovery Phase – Intelligence Gathering
Based on Guiding Principle:
-Generate transactions in source system
-Generate transactions in target system
-Identify tables / data modified in source
-Identify tables / data modified in target
This will give a rough estimate of what needs to be touched

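One way to identify which tables a test transaction modifies is to snapshot every table before and after it and compare. The sketch below assumes a small SQLite test copy of the system; the table names and the `snapshot` / `touched_tables` helpers are hypothetical, and reading every row is only practical on a test-sized database.

```python
import sqlite3

def snapshot(conn):
    """Return {table_name: all_rows} for every table in the database."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    return {t: conn.execute(f'SELECT * FROM "{t}"').fetchall() for t in tables}

def touched_tables(before, after):
    """List tables whose contents differ between two snapshots."""
    return sorted(t for t in after if before.get(t) != after[t])

# Hypothetical test system with three tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE audit (note TEXT)")

before = snapshot(conn)
# Stand-in for "generate a transaction" through the application.
conn.execute("INSERT INTO orders VALUES (1, 9.5)")
conn.execute("INSERT INTO audit VALUES ('order 1 created')")
after = snapshot(conn)

print(touched_tables(before, after))  # ['audit', 'orders']
```

Running the same comparison against the target system gives the second half of the rough estimate.
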
Discovery Phase – Siege Points
Identify access to target system
-Direct DB Access = Flexible / High Risk
-Import Tools = Consistent / Lower Risk
-Data Entry = Slow / Least Risk

Discovery Phase – Changing Business Practices
May need to change business practices
-Change code
-Change business process
-Convert data to support

Discovery Phase – Going from Apples to Oranges
Differing data models add Risk
-Structural differences
-Conceptual differences
Must be accounted for in timeline

Discovery Phase – Watch out for Fuzziness
Fuzzy data = values relative to the user
-Fields overridden
-Value means different things at different times
-Fields co-opted by departments

Discovery Phase – Transaction Types
Treat Transaction Types as distinct
May require specific mapping
Source code is valuable for understanding them

Discovery Phase – Calculating Mapping Phase
Calculating Mapping Phase (in hours)
Each field in source and target = 0.25
Knowledge Modifiers:
-No documentation / source = x2
-Apples to Oranges = x3

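The mapping estimate can be sketched as a small function. This assumes the two modifiers compound when both apply, which the slide does not state; the function and parameter names are illustrative only.

```python
def mapping_hours(source_fields, target_fields,
                  undocumented=False, apples_to_oranges=False):
    """Estimate Mapping Phase hours: 0.25 per field, then modifiers."""
    hours = 0.25 * (source_fields + target_fields)
    if undocumented:          # no documentation / source code: x2
        hours *= 2
    if apples_to_oranges:     # differing data models: x3
        hours *= 3
    return hours

print(mapping_hours(120, 80))                       # 50.0
print(mapping_hours(120, 80, undocumented=True,
                    apples_to_oranges=True))        # 300.0
```
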
Discovery Phase – Calculating Programming Phase
Calculating Programming Phase (in hours)
Each table / view = 5
Modifiers:
-New to any tools = x2
-Direct to DB with Triggers = x3
QA = Total x2

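The programming estimate follows the same shape. This sketch reads "QA = Total x2" as doubling the modified total to cover QA, and assumes the modifiers compound; both readings are assumptions, and the names are illustrative.

```python
def programming_hours(tables_and_views, new_to_tools=False,
                      direct_db_with_triggers=False, include_qa=True):
    """Estimate Programming Phase hours: 5 per table/view, then modifiers."""
    hours = 5 * tables_and_views
    if new_to_tools:               # team is new to any of the tools: x2
        hours *= 2
    if direct_db_with_triggers:    # direct DB access with triggers: x3
        hours *= 3
    if include_qa:                 # QA doubles the total (assumed reading)
        hours *= 2
    return hours

print(programming_hours(10))                                  # 100
print(programming_hours(10, new_to_tools=True,
                        direct_db_with_triggers=True))        # 600
```
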
Discovery Phase – More stuff you have no control over
Buck stops here designated
Provide pros / cons for decisions
May require secrecy

Discovery Phase – End of the Beginning
What you should have:
-Agreement on what “success” is
-Delivery date based on calculations
-Buck stops here designated
-Test system matching source / target
-Real test data
-QA Team identified

Mapping Phase – Overview
Mapping is getting the data from the source system to the target so that the “information” is preserved
Automation <> Understanding

Mapping Phase – Two types of data
Business data = describes record
System data = record’s state in system

Mapping Phase – Practical Approach
From Target Perspective
A field level listing of tables
For each field:
-Business data: identify the source value that fills it
-System data: identify the logic that fills it

Mapping Phase – Duplicate Data
If merging two datasets, watch for dupes
2 types:
-Duplicate Keys = renumbering
-Duplicate Values = merging
In both cases, requires creating XRef

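For the duplicate key case, the XRef can be as simple as a map from each incoming key to the key it will carry in the merged dataset. The sketch below assumes integer keys and renumbers only the collisions; `build_key_xref` is a hypothetical helper, not part of any tool named in the talk.

```python
def build_key_xref(existing_keys, incoming_keys):
    """Map each incoming key to a key that is free in the merged dataset."""
    used = set(existing_keys)
    # Start renumbering above every key seen in either dataset.
    next_key = max(used | set(incoming_keys), default=0) + 1
    xref = {}
    for key in incoming_keys:
        if key in used:
            xref[key] = next_key   # collision: assign a fresh key
            used.add(next_key)
            next_key += 1
        else:
            xref[key] = key        # no collision: keep the original key
            used.add(key)
    return xref

xref = build_key_xref(existing_keys=[1, 2, 3], incoming_keys=[2, 3, 4])
print(xref)  # {2: 5, 3: 6, 4: 4}
```

Every foreign key that pointed at an incoming record then has to be translated through the same XRef during conversion.
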
Mapping Phase – System Maintenance Screens
Systems often have “global” configuration screens
These screens often explain what obscure codes mean
Easier to set these values manually in the target system than to convert them

Mapping Phase – When you are done
Should have the following at end:
-Master document showing field mappings from source to target
-Xref for converted keys, values
-Notes on logical conversion of system data from source to target

Programming Phase – Overview
Bringing records from source to target
Simplicity is key
Build safeguards to protect against accidental launch

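One possible safeguard against accidental launch is to make the operator type the target system's name a second time before the program will run. This is only a sketch of the idea; the `--target` and `--confirm` flags and the `prod_erp` name are hypothetical.

```python
import argparse

def parse_args(argv):
    """Parse the command line for a conversion program."""
    parser = argparse.ArgumentParser(description="Run a table conversion")
    parser.add_argument("--target", required=True,
                        help="name of the target system")
    parser.add_argument("--confirm", default="",
                        help="must repeat the target name exactly")
    return parser.parse_args(argv)

def launch_allowed(args):
    """Allow the run only if the operator re-typed the target name."""
    return args.confirm == args.target

args = parse_args(["--target", "prod_erp", "--confirm", "prod_erp"])
if launch_allowed(args):
    print("confirmed, starting conversion")
else:
    print("refusing to run: --confirm does not match --target")
```
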
Programming Phase – Example Code
One program per table
Establish connection to source and target
Read in from source
-For each column, is there anything to be modified? -> Yes -> log old / new value
-Write to target
Loop until done

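The outline above can be sketched for a single table as follows. It assumes SQLite connections on both sides and a hypothetical `customer` table; the `transform` rule (trimming whitespace) is a placeholder for whatever the mapping document specifies.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("convert_customer")

def transform(column, value):
    """Placeholder column rule: trim whitespace from text values."""
    return value.strip() if isinstance(value, str) else value

def convert_customer(source, target):
    """Copy the customer table row by row, logging every changed value."""
    source.row_factory = sqlite3.Row
    converted = 0
    for row in source.execute("SELECT id, name, status FROM customer"):
        new_row = {}
        for column in row.keys():
            old = row[column]
            new = transform(column, old)
            if new != old:  # log old / new value for later auditing
                log.info("customer id=%s %s: %r -> %r",
                         row["id"], column, old, new)
            new_row[column] = new
        target.execute(
            "INSERT INTO customer (id, name, status) "
            "VALUES (:id, :name, :status)", new_row)
        converted += 1
    target.commit()
    return converted

# Demo with two in-memory databases standing in for source and target.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE customer "
               "(id INTEGER PRIMARY KEY, name TEXT, status TEXT)")
source.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                   [(1, " Ada ", "A"), (2, "Grace", "I")])
count = convert_customer(source, target)
```

Keeping one small program like this per table makes each conversion easy to rerun and its log easy to read on its own.
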
Conversion Phase – Preparing
Plan as if major outage
-Coordinate with departments affected
-Have black out period for system use
-No new records into system after cutoff
Central Gatekeeper designated

Conversion Phase – Conversion Time
Gatekeeper verifies backups in place
-All batch jobs / sql tasks / etc. stopped
-Initiate the conversion process
-Monitor logs for success
-Once complete, begin test transactions
-Once signed off, start processes
-Allow users back in

Post Conversion – Work left to do
Watch for broken records
-Spot fix individual, batch groups
Take care around special events
-Month end, year end
-New transactions, closing transactions
The Spice Must Flow