Ishik University, Faculty of Science, Dep. of Information Technology. By: Raz Dara Mohammad Amin
What is BIG DATA? Big data is being generated by everything around us, at all times. Insights from big data can enable all employees to make better decisions: deepening customer engagement, optimizing operations, preventing threats and fraud, and capitalizing on new sources of revenue.
What is BIG DATA? It is creating a culture in which business and IT leaders must join forces to realize value from all data. But the escalating demand for insights requires a fundamentally new approach to architecture, tools, and practices.
What is BIG DATA? More than 300 petabytes daily! A petabyte (PB) is 10^15 bytes of data (or 1,000 TB).
Why are we interested in BIG DATA? Data is the new "OIL": a resource that is difficult to extract, comes in many types, and is difficult to refine. However, if obtained, it has the potential to profoundly affect every aspect of our lives.
Why are we interested in BIG DATA? Deriving new insights from previously untouched data and integrating those insights into current processing. The application of new tools to do more analytics, on more data, for more people.
Characteristics of BIG DATA We need to understand the characteristics of BIG DATA first.
Characteristics of BIG DATA: The 4 V's
Volume: data will grow 50x by 2020 (compared to 2010).
Velocity: data is generated at much higher rates as more and more devices become digital.
Variety: 80% of data is unstructured, and it comes in many different forms.
Veracity: the accuracy of data, or its conformity to facts, is of great importance.
BIG DATA Analytics: The Picture. The major software components and their roles.
BIG DATA Ecosystem: Basics. The challenges: growing storage needs; handling massive quantities of structured and unstructured data; the cost of processing; fault tolerance. Hadoop is an open-source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment. It runs on commodity hardware (PCs connected to a network), supports distributed file storage via HDFS with replication, and provides processing with fault tolerance.
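Hadoop exposes HDFS to applications through a Java FileSystem API. Below is a minimal sketch of reading a file from HDFS; the NameNode address (hdfs://localhost:9000) and the file path are assumptions for illustration, not values from the slides.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the HDFS NameNode; host and port are assumed.
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path; replace with a file that exists on your cluster.
        Path file = new Path("/data/example.txt");
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(fs.open(file)))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Each read is served transparently from replicated blocks.
                System.out.println(line);
            }
        }
    }
}
```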
Storage & Processing! On a single machine, data size is limited by local storage, and processing power is bound by CPU capability. Cost, feasibility, advantage?
Storage & Processing: shared distributed storage, distributed processing.
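The canonical illustration of distributed processing over shared distributed storage is Hadoop's WordCount MapReduce job: mappers run on the nodes holding each HDFS block, and reducers aggregate the partial results. A minimal sketch, following the standard Apache Hadoop tutorial example:

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: runs where the data lives; emits (word, 1) for every token.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the per-word counts produced by all mappers.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```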
Hadoop, HDFS and others? * ACID (Atomicity, Consistency, Isolation, Durability): the transaction guarantees of traditional relational databases.
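For context, ACID guarantees come from transactional relational databases; HDFS, by contrast, is a file system for bulk storage and does not provide them. A minimal JDBC sketch of an atomic transfer, assuming a hypothetical PostgreSQL database "bank" with an "accounts" table:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class AcidTransferExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection URL; requires the PostgreSQL JDBC driver.
        try (Connection conn =
                 DriverManager.getConnection("jdbc:postgresql://localhost/bank")) {
            conn.setAutoCommit(false); // group both updates into one atomic unit
            try (PreparedStatement debit = conn.prepareStatement(
                     "UPDATE accounts SET balance = balance - ? WHERE id = ?");
                 PreparedStatement credit = conn.prepareStatement(
                     "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
                debit.setInt(1, 100);
                debit.setInt(2, 1);
                debit.executeUpdate();
                credit.setInt(1, 100);
                credit.setInt(2, 2);
                credit.executeUpdate();
                conn.commit();   // durability: both updates persist together
            } catch (Exception e) {
                conn.rollback(); // atomicity: on failure, neither update applies
                throw e;
            }
        }
    }
}
```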
BIG DATA & Computer Science: Data Structures; Programming Languages (Java, Python, Scala, R); Artificial Intelligence (Data Mining, NLP, Machine Learning); Networking; etc.
Who are the players?
Thank You