CS 791z Graduate Topics on Software Engineering


Carlos E. Otero and Adrian Peter (2015), Research Directions for Engineering Big Data Analytics Software, IEEE Intelligent Systems 30 (1): 13-19.
University of Nevada, Reno
Department of Computer Science & Engineering

Outline
- Introduction
- Actionable Intelligence
- Big Data Analytics Software Engineering
  - The Requirements Problem
  - The Design Problem
  - The Construction Problem
  - The Testing Problem
- The Road Ahead

Engineering Big Data Analytics Software

Introduction
- What is big data?
- What is big data software?
- Can this software be engineered?

Introduction: What is big data?
- Infrastructure perspective: 3V+U (high volume, velocity, and variety, plus unpredictability)
- Analytics perspective: data so large that it contains significant low-probability events that are otherwise absent from regular data
- Business perspective: data that offers opportunities for gaining actionable intelligence
- Also (infrastructure): data so large that current typical methods cannot process it

Introduction: Now, what is big data software?
Definition: software that supports the time-constrained processing of continuous information flows to provide actionable intelligence.
Characteristics:
- Includes both infrastructure software and analytics software (big throughput + big analytics software)
- Time-constrained (a late response is a wrong response)
- Continuous information flow (see big data's 3V and data streams)
- Actionable intelligence: produces information directly usable for strategic, operational, or tactical goals
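The "late response is wrong response" characteristic can be illustrated with a small sketch. It is not taken from the paper: the names `process_stream`, `Result`, `analyze`, and `deadline_s` are hypothetical, and the deadline check is a simplified assumption about how a time constraint might be enforced on a continuous flow of events.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class Result:
    value: Optional[float]  # None when the answer arrived too late to be useful
    on_time: bool


def process_stream(
    events: Iterable[float],
    analyze: Callable[[float], float],
    deadline_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Result]:
    """Apply `analyze` to each event; discard any answer that misses the deadline."""
    for event in events:
        start = clock()
        value = analyze(event)
        elapsed = clock() - start
        if elapsed <= deadline_s:
            yield Result(value, True)
        else:
            # A late response is treated as a wrong response: drop the value.
            yield Result(None, False)
```

The clock is injectable so that the deadline behavior can be exercised deterministically; a real system would more likely cancel the computation at the deadline rather than measure it afterwards.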

Actionable Intelligence
Actionable intelligence goes beyond finding and summarizing data; it stresses the discovery of patterns that can be used to predict concepts, events, trends, opinions, etc., to support decision-makers.
Three levels of actionable intelligence (ai):
- L1: Supervised ai
- L2: Semi-supervised ai
- L3: Unsupervised ai

Intelligence production process
[Figure: single-hop intelligence production process]

Intelligence production process
[Figure: multi-hop intelligence production process]

Big Data Analytics Software Engineering
Challenges encountered when engineering big data software pertain to:
- Requirements specification
- Design
- Construction
- Testing

Big Data Analytics Software Engineering
The Requirements Problem: in essence, it consists of dealing with time constraints and concept drift (changes in the statistical properties of the concept to be learned).
[Figure panels: Regular software; Real-time software; Big-data software]
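To make concept drift concrete, here is a minimal sketch of a drift check on a numeric stream. It is an illustration only, not a method from the paper: `MeanShiftDetector`, `window`, and `threshold` are hypothetical names, and comparing window means is one of the simplest possible drift signals.

```python
from collections import deque
from statistics import fmean


class MeanShiftDetector:
    """Flags drift when the recent-window mean departs from the reference mean."""

    def __init__(self, window, threshold):
        self.window = window
        self.threshold = threshold
        self.reference = []                 # first `window` values: the "old" concept
        self.recent = deque(maxlen=window)  # sliding window over the newest values

    def update(self, x):
        # Fill the reference window first; no drift can be reported yet.
        if len(self.reference) < self.window:
            self.reference.append(x)
            return False
        self.recent.append(x)
        if len(self.recent) < self.window:
            return False
        return abs(fmean(self.recent) - fmean(self.reference)) > self.threshold
```

Feeding the detector six values of 1 followed by values of 5 (with `window=3`, `threshold=1.0`) reports drift as soon as the recent window mean moves more than 1.0 away from the reference mean.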

Big Data Analytics Software Engineering
- The addition of p (period of validity) requires a paradigm shift in how software contracts are thought about
- Mainstream applications for big data software encompass software for national security, government policy-making, and safety-critical applications
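One possible reading of the period of validity p is that a result carries an expiry window and must not be acted on outside it. The sketch below shows that reading only; `TimedResult`, `produced_at`, and `is_valid` are hypothetical names, and the paper's formal contract definition may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedResult:
    value: str
    produced_at: float  # seconds, on whatever clock the caller uses
    p: float            # period of validity in seconds (assumed unit)

    def is_valid(self, now: float) -> bool:
        """The result may be acted on only inside [produced_at, produced_at + p]."""
        return self.produced_at <= now <= self.produced_at + self.p
```

For example, a result produced at t = 100 with p = 30 is usable at t = 120 but not at t = 131, even though the computation that produced it was correct.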

Big Data Analytics Software Engineering
The Design Problem includes:
- Usability
- Performance
- Reliability: the state of the art is insufficient to verify reliability; suggested approaches include adapting old and creating new architecture tactics (e.g., change detection components, detectors of software misbehavior)
- Availability
- Security: challenges include adversarial machine learning, execution-phase attacks, training-phase attacks, and denial of service
- Interoperability
- Scalability
- Testability, etc.

Big Data Analytics Software Engineering
The Construction Problem encompasses:
- Big data software requires more than just programming skills
- Lack of tools and frameworks that support assistive development (e.g., tools such as Excel, Octave, R, and WEKA are not big data tools)
- The need for multi-processing technologies such as CUDA, GPUs, MapReduce, and Hadoop (new solutions are being created, e.g., Mahout, Spark (built on Scala), and Storm)
- The need for distributed database technologies underlying analytics computations (e.g., NoSQL, SciDB)
- The requirement for multimedia processing, which increases complexity
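For readers unfamiliar with the MapReduce model mentioned above, the following single-process sketch shows its three phases on a word count. It is a teaching illustration with hypothetical function names, not Hadoop or Spark code; a real framework distributes the map and reduce calls across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key."""
    return key, sum(values)


def word_count(documents):
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Because each `map_phase` call sees only one document and each `reduce_phase` call sees only one key, both phases can run in parallel, which is what makes the model suitable for large data sets.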

Big Data Analytics Software Engineering
The Testing Problem encompasses:
- Big data processing is impractical for humans, and many defects can go unnoticed
- Untrustworthy data, bad assumptions, incorrect mathematics
- Big data software belongs to the class of non-testable software
- A possible approach: metamorphic testing
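Metamorphic testing checks relations between outputs of related inputs instead of comparing against a known correct answer. The sketch below assumes a simple `mean` function as a stand-in for an analytics routine; the function names and the two relations chosen (permutation invariance and scaling) are illustrative assumptions, not examples from the paper.

```python
import math
import random


def mean(xs):
    """System under test: stands in for an analytics routine with no known oracle."""
    return sum(xs) / len(xs)


def check_permutation_invariance(f, xs, rng):
    """Relation 1: reordering the input must not change the output."""
    shuffled = xs[:]
    rng.shuffle(shuffled)
    return math.isclose(f(xs), f(shuffled), rel_tol=1e-9)


def check_scaling(f, xs, k):
    """Relation 2: scaling every input by k must scale the output by k."""
    return math.isclose(f([k * x for x in xs]), k * f(xs), rel_tol=1e-9)
```

Neither check needs the true mean of the data. A faulty implementation such as one that adds a constant offset would still pass the permutation check but fail the scaling check, which is why several relations are usually combined.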

The Road Ahead
SE research needed on:
- Architectures, tools, frameworks
- Visualization and infrastructure software
- Usability of adaptive learning systems
- Parallel processing
- Data architecture and database management