TESTING OF BIG DATA & PREDICTIVE ANALYTICS
Asvin Kumar, Associate Consultant, Capgemini
ABSTRACT
This whitepaper is drafted considering the scope of automation in testing practice. There is always a dilemma about how Big Data and predictive analytics can be used in day-to-day activities to make life simpler. Data science is about creating a process that lets you chart out new ways of thinking about novel problems, or about using existing data creatively with a pragmatic approach. Testing Big Data applications requires a specific mindset, skill set, deep understanding of the technologies, and a rational approach to data science. Big Data from a tester's perspective is an interesting subject: understanding how Big Data evolved, what it is meant for, and why Big Data applications need to be tested is fundamentally important.
BIG DATA AND PREDICTIVE ANALYTICS
There is no single rule, unique method, or set of tools for Big Data study, mainly because of the huge volume, complexity, and heterogeneity of such data sets. There are fundamental gaps in our knowledge of how to view this high-dimensional space. To maintain equilibrium in Big Data science, significant experimentation is required to develop the core principles and sound methods needed to achieve more precise scientific insights from Big Data sets.
BUSINESS CHALLENGES
Abundant need for live integration: With information arriving from multiple data sources in the market, it has become imperative to expedite live integration of information. This forces companies to keep their data clean and reliable, but in practice there is no such thing as perfectly clean data. Predictive analytics comes into play once data mining is in place. As time progresses, the systems that handle big data will learn to reject the junk across data sets, which can only be achieved when algorithms are designed to learn from past experience with related data (a minimal sketch of this idea follows this section).
Instantaneous data collection and deployment: The influence of predictive analytics and its ability to drive decisive actions have pushed companies to adopt instantaneous data collection solutions. These decisions bring great business impact by leveraging insights from the minute patterns in large data sets. Applications and data sets need to be tested and certified for live deployment.
Real-time challenges: Big Data applications are built to handle the tremendous level of data processing involved in a given data set. Critical errors in the architecture governing the design of a Big Data application can lead to catastrophic situations. Uncompromising testing is needed, involving smarter data sampling and sorting techniques coupled with high-end performance testing.
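As a minimal, illustrative sketch of the idea that systems can learn from past experience to reject junk records: a simple classifier is trained on historically labelled records and then used to filter an incoming batch. The features, thresholds, and data below are hypothetical, not from the whitepaper.

```python
# Hypothetical sketch: learn from past labelled records to reject "junk" rows
# before live integration. Feature layout and values are illustrative only.
from sklearn.linear_model import LogisticRegression

# Past experience: each record summarised by two hypothetical features
# (fraction of missing fields, deviation from the expected value range),
# labelled 1 = junk, 0 = clean.
history_features = [
    [0.00, 0.1], [0.05, 0.2], [0.10, 0.1],   # clean records
    [0.60, 0.9], [0.75, 0.8], [0.90, 1.0],   # junk records
]
history_labels = [0, 0, 0, 1, 1, 1]

model = LogisticRegression()
model.fit(history_features, history_labels)

# New batch arriving from a live integration feed.
incoming = [[0.02, 0.15], [0.80, 0.95]]
keep = [row for row, label in zip(incoming, model.predict(incoming)) if label == 0]
print(keep)   # only the record that looks clean survives the filter
```

In a real pipeline the same pattern would be applied at much larger scale, but the point of the sketch is only the feedback loop: past labelled data trains the filter that screens new data.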
LEARNING TRENDS IN BIG DATA
Deep learning from high volumes of data: High volumes of data present a great challenge for deep learning, as they directly increase running-time complexity. The sheer volume of data often makes it unmanageable to train a deep learning algorithm with a central processor and storage. Ultimately, to make future deep learning systems scale to Big Data, one needs to develop high-performance computing infrastructure together with theoretically sound parallel learning algorithms or novel architectures.
Deep learning for high variety of data: Data today comes in all types of formats from a variety of sources, often with different distributions. For example, the rapidly growing multimedia data coming from the web and mobile devices includes huge collections of still images, video and audio streams, graphics and animations, and unstructured text, each with different characteristics.
Deep learning for high velocity of data: Data is generated at extremely high speed and needs to be processed in a timely manner. One solution for learning from such high-velocity data is online learning, which learns one instance at a time; the true label of each instance soon becomes available and can be used to refine the model (a minimal sketch of this follows this section).
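Below is a minimal sketch of the online-learning idea just described, assuming a scikit-learn environment. The simulated stream, feature layout, and labelling rule are hypothetical placeholders for a real high-velocity feed.

```python
# Hypothetical sketch: incremental (online) learning on a simulated high-velocity
# stream, updating the model one instance at a time as its true label arrives.
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier()              # linear model trained incrementally
classes = np.array([0, 1])
rng = np.random.default_rng(0)

def stream(n_events):
    """Simulate a feed: one instance at a time, with its label arriving shortly after."""
    for _ in range(n_events):
        x = rng.normal(size=(1, 3))
        y = np.array([int(x[0, 0] + x[0, 1] > 0)])   # hypothetical ground truth
        yield x, y

for x, y in stream(1000):
    # Refine the model with each instance as its true label becomes available.
    model.partial_fit(x, y, classes=classes)

print(model.predict(rng.normal(size=(5, 3))))
```

The design choice worth noting is that the model never sees the full data set at once; it is refined continuously, which is what makes the approach suitable for data that arrives faster than it can be stored and batch-processed.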
EMERGING TESTING TRENDS IN BIG DATA
Instant deployment testing: Most Big Data applications today are developed to support predictive analytics, which relies on instant data collection and deployment. Since these forecasts can have a substantial impact on business decisions, comprehensive application testing is critical so that instantaneous deployment goes off without a hitch.
Scalability testing: As mentioned above, Big Data necessarily means huge volumes, so scalability testing plays an increasingly important role in the overall testing process. In support of this task, the application's architecture should be tested with smart data samples, and it should be able to scale up without compromising performance (a minimal sampling sketch follows this section).
Security testing: Security testing is another emerging trend for Big Data applications. Because Big Data is usually drawn from a variety of sources, and is often confidential, security is essential. To ensure data security and personal privacy in an age when hacking threats are all too common, different testing mechanisms are applied to different layers of the application. For example, ransomware attacks reportedly affected over a billion computer systems globally in 2017.
Performance testing: Big Data applications work with live data for real-time analytics, so performance is key. Performance testing goes hand in hand with other types of testing, including scalability testing and live integration testing.
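As one possible interpretation of "smart data samples" for scalability testing, the sketch below uses reservoir sampling to draw a fixed-size, uniformly random test sample from an arbitrarily large record stream without loading it all into memory, then runs simple validation checks against it. The record source and field names are hypothetical.

```python
# Hypothetical sketch: reservoir sampling for test-data selection, followed by
# simple schema/size checks of the kind a Big Data validation suite might run.
import random

def reservoir_sample(records, k, seed=42):
    """Return k uniformly random records from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            j = rng.randint(0, i)
            if j < k:
                sample[j] = rec
    return sample

# Stand-in for a huge source feed (e.g. rows exported from a distributed store).
source = ({"id": i, "amount": i * 0.5} for i in range(1_000_000))

sample = reservoir_sample(source, k=1_000)

# Basic validation checks on the sample.
assert len(sample) == 1_000
assert all("id" in rec and "amount" in rec for rec in sample)   # schema check
print(sample[:3])
```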
CONCLUSION AND RECOMMENDATION
To be successful, Big Data testers have to learn the components of the Big Data ecosystem from scratch. Since the market has not yet produced fully automated testing tools for Big Data validation, the tester has no option but to acquire the same skill set as the Big Data developer when it comes to leveraging Big Data technologies such as Hadoop. This requires a tremendous mindset shift for both testers and testing units within organizations. To stay competitive, companies should invest in Big Data-specific training and in developing automation solutions for Big Data validation.
REFERENCES
Apica Systems, "Trends of Big Data Testing", web reference.
Ngiam et al., "Multimodal Deep Learning", Proc. 28th Int. Conf. on Machine Learning, 2011.
AUTHOR BIOGRAPHY
Asvin Kumar works as an Associate Consultant at Capgemini, Hyderabad, with 4.6 years of experience in testing practice, mainly in the investment banking and retail domains. He holds a Bachelor's degree in Electrical and Electronics engineering and is ISTQB certified.
Thank You!!!