GENERIC ETL DESIGN
VARADARAJAN VASU
SENIOR PROJECT MGR/ARCHITECT
POLARIS SOFTWARE LAB
OBJECTIVE
- The application area is divided into ETL and Reporting.
- Major operations: Select / Insert / Update / Delete.
- Replace the existing primitive methods used for ETL design and automation.
- The system should be intelligent enough to do all jobs on behalf of users.
- Build a comprehensive solution once and use it across verticals.
PERT PROCESS
- PERT stands for PROGRAM EXECUTION on REMOTE TERMINALS.
- It is different from the Program Evaluation Review Technique used by SEI.
- The technology is used in Client/Server architecture.
PERT PROCESS FLOW
PERT START
1. SYSTEM INTELLIGENT CHECKS - PARAMETERISED
   - FREE SPACE CHECK
   - ORACLE PROCESSES CHECK
   - EXECUTABLE PRESENCE CHECK
   - PROCEDURE VALIDITY CHECK
   - CHECK FOR PARALLEL RUN
   - CHECK FOR RESTARTABILITY
2. DATE CHANGE - PARAMETERISED
3. DETERMINE STAGING RUN INFORMATION - PARAMETERISED
4. STAGE REFRESH LOADER
5. GATHER FINAL REFRESH INFORMATION - PARAMETERISED
6. FINAL REFRESH LOADER
7. DATA VALIDATION CHECKS - PARAMETERISED
8. MAKE SYSTEM READY FOR NEXT DAY RUN - PARAMETERISED
SUCCESS
PERT END
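The eight-step flow can be pictured as an ordered list of parameterised steps run one after another, stopping at the first failure. The sketch below is only an illustration under that assumption. The names `run_flow`, `placeholder` and `PERT_STEPS` are hypothetical, and every step action is a stand-in rather than the real loader or check.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, str]
Step = Tuple[str, Callable[[Params], bool]]


def run_flow(steps: List[Step], params: Params) -> List[str]:
    """Run steps in order; abort at the first step that reports failure."""
    completed: List[str] = []
    for name, action in steps:
        if not action(params):
            raise RuntimeError(f"PERT aborted at step: {name}")
        completed.append(name)
    return completed


def placeholder(params: Params) -> bool:
    # Hypothetical stand-in: a real step would read its settings from params.
    return True


# Step names follow the slide; the actions are assumptions for illustration.
PERT_STEPS: List[Step] = [
    ("SYSTEM INTELLIGENT CHECKS", placeholder),
    ("DATE CHANGE", placeholder),
    ("DETERMINE STAGING RUN INFORMATION", placeholder),
    ("STAGE REFRESH LOADER", placeholder),
    ("GATHER FINAL REFRESH INFORMATION", placeholder),
    ("FINAL REFRESH LOADER", placeholder),
    ("DATA VALIDATION CHECKS", placeholder),
    ("MAKE SYSTEM READY FOR NEXT DAY RUN", placeholder),
]
```

Calling `run_flow(PERT_STEPS, {"run_date": "2024-01-01"})` returns the names of the steps that completed. Keeping the parameters in one dictionary is one simple way to make each step "parameterised" without changing its code.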
SYSTEM INTELLIGENT CHECKS - Examples
- SPACE CHECK
- OBJECTS VALIDITY CHECK
- EXECUTABLES VALIDITY CHECK
- PROCESS RUNNING CHECK
- PREVENT SUCCESS RUN
- PREVENT PARALLEL RUN
- RESTARTABILITY
- HANDLE UNAVOIDABLE INTERRUPTS FROM OS
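Three of these checks (space check, executables check, prevent parallel run) can be sketched with the Python standard library. This is a minimal illustration, not the presenter's implementation. The function names and the lock-file approach are assumptions, and the slides do not say how the original system performs these checks.

```python
import os
import shutil
from typing import List


def has_free_space(path: str, min_free_bytes: int) -> bool:
    """Space check: True if the filesystem holding path has enough free bytes."""
    return shutil.disk_usage(path).free >= min_free_bytes


def missing_executables(names: List[str]) -> List[str]:
    """Executables check: return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def acquire_run_lock(lock_path: str) -> bool:
    """Prevent parallel run: create the lock file atomically.

    Returns False when the file already exists, meaning another run holds it.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for troubleshooting
    return True


def release_run_lock(lock_path: str) -> None:
    """Remove the lock file at the end of a run."""
    os.remove(lock_path)
```

A lock file left behind by a crashed run would block later runs, so a production version would likely also need a way to detect and clear stale locks.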
OPERATION READINESS - Examples
- ARCHIVE
- INDEXING
- COMMUNICATING WITH EXTERNAL PARTIES
- MAILING
- COMPILING ETL EXECUTION STATISTICS
- MOVING OBJECTS TO RESPECTIVE LOCATION
- ANALYZING
- CLEANUP EXERCISE
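As one illustration of the archive and cleanup items, the sketch below moves the processed files of a run into a dated archive folder. The function name `archive_files` and the `YYYYMMDD` folder layout are assumptions made for this example, since the slides do not describe the actual archive layout.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import List


def archive_files(src_dir: Path, archive_root: Path, run_date: date) -> List[Path]:
    """Move every regular file in src_dir into archive_root/YYYYMMDD."""
    target = archive_root / run_date.strftime("%Y%m%d")
    target.mkdir(parents=True, exist_ok=True)
    moved: List[Path] = []
    for item in sorted(src_dir.iterdir()):
        if item.is_file():  # subdirectories are left in place
            destination = target / item.name
            shutil.move(str(item), str(destination))
            moved.append(destination)
    return moved
```

After the call the source directory holds no loose files, which leaves it ready for the next day's run.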
SALIENT FEATURES OF PERT
- SPACE CHECK
- PROCEDURE OBJECTS VALIDITY CHECK
- EXECUTABLES VALIDITY CHECK
- PREVENT SUCCESS RUN
- PREVENT PARALLEL RUN
- RESTARTABILITY
- PROVISION TO SCHEDULE FOR UPCOMING RUN FREQUENCIES
- BETTER ERROR LOGGING
- HANDLE UNAVOIDABLE INTERRUPTS FROM OS
- LOAD CHECK FOR STAGING AND FINAL
- PROVISION FOR MANUAL RUN
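A load check for staging and final can be as simple as comparing row counts between the table that was read and the table that was loaded. The sketch below assumes that interpretation. It uses Python's built-in `sqlite3` only as a stand-in for the Oracle database mentioned in the flow, and the function names are hypothetical.

```python
import sqlite3


def row_count(conn: sqlite3.Connection, table: str) -> int:
    """Return the number of rows in table."""
    # Table names cannot be bound as SQL parameters, so accept only
    # plain identifiers before placing the name into the statement.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def load_check(conn: sqlite3.Connection, source: str, target: str) -> bool:
    """Load check: True when source and target hold the same number of rows."""
    return row_count(conn, source) == row_count(conn, target)
```

The same function could be called once for source against staging and once for staging against final. A count comparison only detects missing or extra rows, so it complements the data validation checks and does not replace them.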
ADVANTAGES
- Design is dynamic in nature.
- A new facility can be plugged in within limited time.
- Avoids redundancy in coding and testing efforts.
- The "sleeping beauty" is cost effectiveness.
- Restart facility resumes from the point of abort during data extraction and population.
- The ETL solution can be reused for other similar ETL applications.
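The restart facility can be sketched as a checkpoint file that records which steps have finished, so a rerun skips them and resumes at the step that aborted. This is a minimal illustration under that assumption. The name `run_with_restart` and the JSON checkpoint format are hypothetical and are not taken from the original system.

```python
import json
from pathlib import Path
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_with_restart(steps: List[Step], checkpoint: Path) -> List[str]:
    """Run steps in order, skipping those recorded as done in the checkpoint."""
    done: List[str] = (
        json.loads(checkpoint.read_text()) if checkpoint.exists() else []
    )
    executed: List[str] = []
    for name, action in steps:
        if name in done:
            continue  # finished in an earlier, aborted run
        action()  # an exception here leaves the checkpoint at the last good step
        done.append(name)
        checkpoint.write_text(json.dumps(done))
        executed.append(name)
    if checkpoint.exists():
        checkpoint.unlink()  # clean finish: the next run starts from step one
    return executed
```

If the second of two steps raises, the checkpoint file lists only the first. Calling the function again with the same checkpoint path then runs only the second step. Removing the checkpoint after a clean finish is a design choice that keeps the next day's run from skipping anything.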
CHALLENGES
- Requirements Gathering
- Database Design
- Performance in Execution
CASE STUDY
NEAR REAL TIME EDW POPULATION
CASE STUDY PROCESSING
THANK YOU