TESTING OF BIG DATA & PREDICTIVE ANALYTICS


Asvin Kumar, Associate Consultant, Capgemini

ABSTRACT
This whitepaper was drafted with the scope of automation in testing practice in mind. There is a recurring dilemma about how Big Data and predictive analytics can be used in day-to-day activities to make life simpler. Data science is about creating a process that lets you chart new ways of thinking about novel problems, or use existing data creatively but pragmatically. Testing Big Data applications requires a specific mindset and skill set, a deep understanding of the technologies involved, and a rational approach to data science. From a tester's perspective, it is fundamentally important to understand how Big Data evolved, what Big Data is meant for, and why Big Data applications need to be tested.

BIG DATA AND PREDICTIVE ANALYTICS
There is no single rule, method, or set of tools for studying Big Data, mainly because such data sets are huge in volume, highly complex, and heterogeneous. There are still fundamental gaps in our knowledge of how to view high-dimensional spaces. To put Big Data science on a stable footing, significant experimentation is required to develop the core principles of sound methods that yield more precise scientific insights from Big Data sets.

BUSINESS CHALLENGES
Need for live integration: With information arriving from many different data sources, it has become imperative to integrate it live. This pushes companies to maintain clean and reliable data, yet in practice there is no such thing as perfectly clean data. Predictive analytics comes into play once data mining is in place. Over time, systems that handle Big Data will learn to reject the junk in their data sets, but this can only be achieved when algorithms are designed to learn from past experience with related data.
Instantaneous data collection and deployment: The influence of predictive analytics and its ability to drive decisive action have pushed companies to adopt instantaneous data collection solutions. The resulting decisions can have great business impact because they leverage insights from minute patterns in large data sets. Applications and data sets therefore need to be tested and certified for live deployment.
Real-time challenges: Big Data applications are built to handle the tremendous volume of processing that a given data set involves. Critical errors in the architecture underlying these applications can lead to catastrophic failures. Uncompromising testing is needed, combining smarter data sampling and sorting techniques with high-end performance testing units.
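The slides do not say which "smarter data sampling" technique is meant. As one illustrative possibility, reservoir sampling (Algorithm R) draws a fixed-size, uniformly random test sample from a stream whose total length is not known in advance. The function name `reservoir_sample` and its parameters are assumptions made for this sketch, not part of the original presentation.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from stream (Algorithm R).

    Illustrative helper only; the name and signature are hypothetical.
    """
    rng = random.Random(seed)  # a seed makes the test sample reproducible
    sample: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an old one
            # only if the slot falls inside the reservoir.
            j = rng.randint(0, i)  # randint includes both endpoints
            if j < k:
                sample[j] = item
    return sample
```

A tester could call `reservoir_sample(records, 1000, seed=7)` on an iterator over incoming records to obtain a repeatable sample for validation, without loading the full data set into memory.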

LEARNING TRENDS IN BIG DATA
Deep learning from high volumes of data: High volumes of data present a major challenge for deep learning because they lead directly to high running-time complexity. The sheer volume of data often makes it impractical to train a deep learning algorithm with a single central processor and storage. Ultimately, to build future deep learning systems that scale to Big Data, one needs to develop systems based on high-performance computing infrastructure, together with theoretically sound parallel learning algorithms or novel architectures.
Deep learning for high variety of data: Data today comes in all types of formats from a variety of sources, probably with different distributions. For example, the rapidly growing multimedia data coming from the Web and mobile devices includes a huge collection of still images, video and audio streams, graphics and animations, and unstructured text, each with different characteristics.
Deep learning for high velocity of data: Data is generated at extremely high speed and needs to be processed in a timely manner. One solution for learning from such high-velocity data is online learning. Online learning learns from one instance at a time, on the assumption that the true label of each instance soon becomes available and can be used to refine the model.
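The online learning loop described above (predict on one instance, receive its true label, refine the model) can be sketched with a simple perceptron. This is a minimal illustration rather than a deep learning model; the class `OnlinePerceptron` and the helper `run_stream` are hypothetical names introduced here.

```python
from typing import Iterable, List, Sequence, Tuple


class OnlinePerceptron:
    """Binary classifier (labels 0/1) updated one instance at a time."""

    def __init__(self, n_features: int, learning_rate: float = 1.0) -> None:
        self.weights: List[float] = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x: Sequence[float]) -> int:
        score = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return 1 if score > 0 else 0

    def learn_one(self, x: Sequence[float], y: int) -> int:
        """Predict first, then refine the model once the true label y arrives.

        Returns the prediction made before the update.
        """
        y_hat = self.predict(x)
        error = y - y_hat  # 0 when correct, +1 or -1 when wrong
        if error != 0:
            for i, xi in enumerate(x):
                self.weights[i] += self.learning_rate * error * xi
            self.bias += self.learning_rate * error
        return y_hat


def run_stream(model: OnlinePerceptron,
               stream: Iterable[Tuple[Sequence[float], int]]) -> int:
    """Feed (features, label) pairs to the model; return the mistake count."""
    mistakes = 0
    for x, y in stream:
        if model.learn_one(x, y) != y:
            mistakes += 1
    return mistakes
```

The model never needs the whole data set in memory: each instance is seen once and then discarded, which is what makes the approach attractive for high-velocity data.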

EMERGING TESTING TRENDS IN BIG DATA
Instant deployment testing: Most Big Data applications today are developed for predictive analytics, which relies on instant data collection and deployment. Since these forecasts can have a substantial impact on business decisions, comprehensive application testing is critical so that instantaneous deployment goes off without a hitch.
Scalability testing: Big Data necessarily means huge volumes, so scalability testing plays an increasingly important role in the overall testing process. To support it, the application's architecture should be tested with smart data samples, and it should be able to scale up without compromising performance.
Security testing: Security testing is another emerging trend for Big Data applications. Because Big Data is usually drawn from a variety of sources and is often confidential, security is essential. To ensure data security and personal privacy in an age when hacking threats are all too common, different testing mechanisms are applied to different layers of the application. For example, the WannaCry ransomware attack in 2017 affected hundreds of thousands of computer systems worldwide.
Performance testing: Big Data applications work with live data for real-time analytics, so performance is key. Performance testing goes hand in hand with other types of testing, including scalability testing and live integration testing.
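As a rough sketch of the scalability testing idea above, the harness below runs a processing step against samples of increasing size and checks that throughput does not collapse as the data grows. All names (`measure_scaling`, `throughput_holds`, `ScalingResult`) and the 0.5 threshold are assumptions made for illustration; a real Big Data test would run against the actual cluster and workload.

```python
import time
from typing import Any, Callable, List, NamedTuple, Sequence


class ScalingResult(NamedTuple):
    size: int
    seconds: float
    records_per_second: float


def measure_scaling(process: Callable[[Any], Any],
                    make_sample: Callable[[int], Any],
                    sizes: Sequence[int],
                    timer: Callable[[], float] = time.perf_counter) -> List[ScalingResult]:
    """Time process() on samples of each size (sizes given in ascending order)."""
    results: List[ScalingResult] = []
    for size in sizes:
        sample = make_sample(size)  # sample creation is not timed
        start = timer()
        process(sample)
        elapsed = timer() - start
        rate = size / elapsed if elapsed > 0 else float("inf")
        results.append(ScalingResult(size, elapsed, rate))
    return results


def throughput_holds(results: Sequence[ScalingResult], min_ratio: float = 0.5) -> bool:
    """True if throughput at the largest size is at least min_ratio of the
    throughput at the smallest size. The threshold is an arbitrary example."""
    baseline = results[0].records_per_second
    return results[-1].records_per_second >= min_ratio * baseline
```

For instance, `measure_scaling(sorted, lambda n: list(range(n, 0, -1)), [1_000, 10_000, 100_000])` times Python's built-in sort on growing inputs. The `timer` parameter exists so that the harness itself can be tested deterministically with a fake clock.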

CONCLUSION AND RECOMMENDATION
To be successful, Big Data testers have to learn the components of the Big Data ecosystem from scratch. Because the market now offers fully automated testing tools for Big Data validation, the tester has no option but to acquire the same skill set as the Big Data developer when it comes to leveraging Big Data technologies such as Hadoop. This requires a tremendous shift in mindset for both testers and testing units within organizations. To stay competitive, companies should invest in Big Data-specific training and in developing automation solutions for Big Data validation.

REFERENCES
Apicasystem, "Trends of big data testing", web reference.
Ngiam, "Multimodal deep learning", Proc. 28th Int. Conf. Mach. Learn., 2011.

AUTHOR BIOGRAPHY
Asvin Kumar works as an associate consultant at Capgemini, Hyderabad, with 4.6 years of experience in testing practice, mainly in the investment banking and retail domains. He holds a bachelor's degree in Electrical and Electronics and is ISTQB certified.

Thank You!!!