Published by Arlette Lefèvre. Modified over 6 years ago.
1
AUDIT AND VALIDATION TESTING FOR BIG DATA APPLICATIONS
Ravi Shukla, Specialist Senior, Deloitte Consulting Pvt. Ltd.
2
Abstract
In today’s world, we are awash in a flood of data. Across a broad range of application areas, data is being collected at an unprecedented scale. Decisions that were previously based on guesswork can now be based on the data itself. Organizations see big data analytics as a means to reduce cost and improve coordination, quality, and outcomes. To allow for better management, organizations are migrating data held in different systems to a distributed file system. This paper highlights the significance of the Audit & Validation testing approach in the big data application landscape.
3
What is Big Data
Big data describes the large volume of data, both structured and unstructured, that inundates a business on a day-to-day basis. Analyzing it helps organizations make better decisions and gain accurate insights that lead to strategic business moves.
4
What is Big Data (continued)
Big Data can be classified based on the 3 V’s:
5
Testing challenges in Big Data Analytics
The key challenges in testing big data applications are the following:
6
Audit & Validation process – How it can be a solution?
The Audit and Validation (A&V) process enables the verification and validation of data flowing into big data systems. Under the A&V process, the testing team sets up its own processes that apply the same transformation logic as the development team responsible for extracting data from different sources and loading large datasets into target systems.
7
Audit & Validation process – How it can be a solution?
The extracts generated by the Audit and Validation process are compared with those generated by the development team. The process categorizes the testing as follows:
Auditing test results: This step validates the extract criteria and tests whether all the data has been extracted from the source systems and loaded into the target big data system.
Validating test results: This step verifies the transformation logic applied to the data during conversion, to ensure that the transformation rules have been applied correctly to the source data extracted for load into the target systems.
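The two categories can be illustrated with a small sketch. It is only an illustration under assumed names: the record layout, the `id` key, the `audit` and `validate` helpers, and the upper-casing rule are hypothetical and are not taken from the presentation.

```python
def audit(source_rows, target_rows, key="id"):
    """Audit: check that every source record reached the target."""
    source_keys = {row[key] for row in source_rows}
    target_keys = {row[key] for row in target_rows}
    return {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        "missing_in_target": sorted(source_keys - target_keys),
        "unexpected_in_target": sorted(target_keys - source_keys),
    }


def validate(source_rows, target_rows, transform, key="id"):
    """Validation: re-apply the transformation rule independently and
    return the keys whose loaded record differs from the expected one."""
    expected = {row[key]: transform(row) for row in source_rows}
    actual = {row[key]: row for row in target_rows}
    return sorted(k for k in expected.keys() & actual.keys()
                  if expected[k] != actual[k])


# Hypothetical transformation rule: country codes are stored in upper case.
def to_upper_country(row):
    return {**row, "country": row["country"].upper()}


source = [
    {"id": 1, "country": "us"},
    {"id": 2, "country": "in"},
    {"id": 3, "country": "fr"},
]
# Simulated target load: record 3 was dropped, record 2 was not transformed.
target = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "in"},
]

audit_report = audit(source, target)
mismatches = validate(source, target, to_upper_country)
print(audit_report)
print(mismatches)
```

Keeping the audit (completeness) and validation (correctness) checks in separate functions mirrors the split described above: a load can pass one check and still fail the other.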
8
Audit & Validation Process
1. The first level of testing is performed as soon as the data is extracted.
2. The second level of testing is performed after the data is loaded into the big data system.
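One possible way to picture the two levels is as two checkpoints that each compare a dataset fingerprint before and after a stage. The sketch below is an assumption about how such checkpoints could be built, not the presenter's implementation: the `fingerprint` and `checkpoint` helpers and the sample records are hypothetical, and it assumes no transformation is applied between the stages being compared.

```python
import hashlib
import json


def fingerprint(rows):
    """Return (record count, order-independent SHA-256 digest) for a dataset."""
    canonical = sorted(json.dumps(row, sort_keys=True) for row in rows)
    digest = hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()
    return len(rows), digest


def checkpoint(name, before, after):
    """Compare the fingerprints of two stages and report the result."""
    passed = fingerprint(before) == fingerprint(after)
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
    return passed


source = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}]
# Same records in a different order: extraction preserved the data.
extract = [{"id": 2, "amount": 25}, {"id": 1, "amount": 10}]
# Simulated load error: the amount of record 2 was corrupted.
loaded = [{"id": 1, "amount": 10}, {"id": 2, "amount": 52}]

level_1 = checkpoint("Level 1 (after extraction)", source, extract)
level_2 = checkpoint("Level 2 (after load)", extract, loaded)
```

Running a check at each level narrows down where a defect was introduced: here the extraction stage passes and the load stage fails.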
9
Case Study: Netflix
Netflix, the world’s leading internet television network, uses big data analytics to analyze billions of bytes of data across more than 150,000 applications daily in real time.
Netflix uses big data analytics for:
Predicting viewing habits
Improving search quality
Finding the next smash hit series
High quality experience
Recommendation engines
Improved ratings
Netflix has made use of Audit and Validation techniques for:
Ensuring that data from different sources, such as devices and program searches, is collected and loaded into the target systems.
Performing correct data analytics for predicting user viewing habits and ensuring that recommendations are based on users’ search criteria.
10
Relevance of Audit and Validation process
[Diagram labels: Heterogeneous Data, Audit & Validation, Refined & Validated Data]
Enabling organizations to measure and know more about their businesses
In organizations with a huge amount of data movement, a quality check that the data has moved as expected from source systems to target big data applications becomes imperative. The A&V approach provides a checkpoint to ensure that the data loaded into big data systems such as Hadoop is accurate and consistent with the data sent from the different source systems.
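When data arrives from several source systems, such a checkpoint could reconcile the number of records each system reports having sent against the number found in the target. The following is a minimal sketch under assumptions: the `source_system` field, the `reconcile_by_source` helper, and the system names `crm` and `billing` are hypothetical and are not part of the presentation.

```python
from collections import Counter


def reconcile_by_source(sent_counts, loaded_rows, source_field="source_system"):
    """Compare records sent per source system with records found in the target."""
    loaded_counts = Counter(row[source_field] for row in loaded_rows)
    report = {}
    for system in sorted(set(sent_counts) | set(loaded_counts)):
        sent = sent_counts.get(system, 0)
        loaded = loaded_counts.get(system, 0)
        report[system] = {"sent": sent, "loaded": loaded, "difference": loaded - sent}
    return report


# Counts reported by each (hypothetical) source system.
sent_counts = {"crm": 3, "billing": 2}
# Records found in the target; one billing record is missing.
loaded_rows = [
    {"id": 1, "source_system": "crm"},
    {"id": 2, "source_system": "crm"},
    {"id": 3, "source_system": "crm"},
    {"id": 4, "source_system": "billing"},
]

report = reconcile_by_source(sent_counts, loaded_rows)
discrepancies = {s: r for s, r in report.items() if r["difference"] != 0}
print(discrepancies)
```

Reporting the difference per source system, rather than one overall total, keeps a shortfall in one system from being hidden by a surplus in another.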
11
Benefits of Audit and Validation
Independent validation ensures that the correct number of records is available in the target big data systems for analysis
Reduces maintenance costs, as the massive amount of data available for analysis allows organizations to spot issues and predict when they might occur
Allows organizations to perform risk analysis
Ensures data quality
Results in accurate data analysis
12
Conclusion
Big Data testing is highly important for today’s businesses. If the right test strategies are embraced and best practices are followed, defects can be identified at early stages and overall testing costs can be reduced while achieving high Big Data quality. The Audit and Validation process empowers testing teams to accurately determine whether there are any inconsistencies in the data flowing into the system, and helps organizations take corrective measures when discrepancies are identified. Audit and Validation provides a capability to analyze very large data sets quickly and cost-effectively.
14
Author Biography
Ravi Shukla
Ravi is a software professional with 12 years of industry experience, primarily on data warehousing projects and in the healthcare domain. He currently works as a Test Program Manager for a leading healthcare provider based in California, USA. He is an avid traveler and loves hiking and playing volleyball.
15
Thank You!!!