Dep. of Information Technology By: Raz Dara Mohammad Amin

Slides:



Advertisements
Similar presentations
Big Data Workflows N AME : A SHOK P ADMARAJU C OURSE : T OPICS ON S OFTWARE E NGINEERING I NSTRUCTOR : D R. S ERGIU D ASCALU.
Advertisements

Fraud Detection in Banking using Big Data By Madhu Malapaka For ISACA, Hyderabad Chapter Date: 14 th Dec 2014 Wilshire Software.
BIG DATA – WHAT’S THE BIG DEAL The call would start soon, please be on mute. Thanks for your time and patience.
Data Mining on the Web via Cloud Computing COMS E6125 Web Enhanced Information Management Presented By Hemanth Murthy.
Big Data. What is Big Data? Big Data Analytics: 11 Case Histories and Success Stories
CIS 9002 Kannan Mohan Department of CIS Zicklin School of Business, Baruch College.
Lecture 3: Sun: 16/4/1435 Distributed Computing Technologies and Middleware Lecturer/ Kawther Abas CS- 492 : Distributed system.
W HAT IS H ADOOP ? Hadoop is an open-source software framework for storing and processing big data in a distributed fashion on large clusters of commodity.
Introduction to Apache Hadoop Zibo Wang. Introduction  What is Apache Hadoop?  Apache Hadoop is a software framework which provides open source libraries.
+ Big Data IST210 Class Lecture. + Big Data Summary by EMC Corporation ( More videos that.
Summarized by: Reza Gorgan Mohammadi Artificial Intelligence Laboratory (ISLAB) September 2014 Charectristics of Large Scale.
CISC 849 : Applications in Fintech Namami Shukla Dept of Computer & Information Sciences University of Delaware iCARE : A Framework for Big Data Based.
What we know or see What’s actually there Wikipedia : In information technology, big data is a collection of data sets so large and complex that it.
IoT Meets Big Data Standardization Considerations
Big Data Analytics Platforms. Our Team NameApplication Viborov MichaelApache Spark Bordeynik YanivApache Storm Abu Jabal FerasHPCC Oun JosephGoogle BigQuery.
Group : Lê Minh Hi ế u Phan Th ị Thanh Th ả o Ph ạ m Hoàng Long Nguy ễ n Huy Hùng 1 BIG DATA – NoSQL Topic 1.
Easy-to-Use RedFlag System Delivers Notifications via Phone, , Text, Social Media, and More to Improve Effectiveness of Your Communications COMPANY.
{ Tanya Chaturvedi MBA(ISM) Hadoop is a software framework for distributed processing of large datasets across large clusters of computers.
Axis AI Solves Challenges of Complex Data Extraction and Document Classification through Advanced Natural Language Processing and Machine Learning MICROSOFT.
BUSINESS INTELLIGENCE & ADVANCED ANALYTICS DISCOVER | PLAN | EXECUTE JANUARY 14, 2016.
Copyright © 2016 Pearson Education, Inc. Modern Database Management 12 th Edition Jeff Hoffer, Ramesh Venkataraman, Heikki Topi CHAPTER 11: BIG DATA AND.
BIG DATA. The information and the ability to store, analyze, and predict based on that information that is delivering a competitive advantage.
BIG DATA. Big Data: A definition Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database.
What is the Big Data Challenge? Organizations are seeking solutions that combine the real-time analytics capabilities of SAP HANA and accessibility to.
BIG DATA BIGDATA, collection of large and complex data sets difficult to process using on-hand database tools.
Leverage Big Data With Hadoop Analytics Presentation by Ravi Namboori Visit
A Tutorial on Hadoop Cloud Computing : Future Trends.
Big Data-An Analysis. Big Data: A definition Big data is a collection of data sets so large and complex that it becomes difficult.
Data Analytics (CS40003) Introduction to Data Lecture #1
Big Data & Test Automation
CNIT131 Internet Basics & Beginning HTML
SNS COLLEGE OF TECHNOLOGY
Big Data Enterprise Patterns
MapReduce Compiler RHadoop
Sushant Ahuja, Cassio Cristovao, Sameep Mohta
Viewing Data-Driven Success Through a Capability Lens
Introduction to Distributed Platforms
By Chris immanuel, Heym Kumar, Sai janani, Susmitha
ANOMALY DETECTION FRAMEWORK FOR BIG DATA
Fundamentals of Information Systems, Sixth Edition
Prepared by: Assistant prof. Aslamzai
Tutorial: Big Data Algorithms and Applications Under Hadoop
Chapter 14 Big Data Analytics and NoSQL
Azure-Powered beaconsmind Suite Connects with CRM and POS Systems and Offers Dashboards with Data Insights to Boost Sales and Customer Loyalty MICROSOFT.
TABLE OF CONTENTS. TABLE OF CONTENTS Not Possible in single computer and DB Serialised solution not possible Large data backup difficult so data.
Big Data Intro.
Rahi Ashokkumar Patel U
Hadoop Clusters Tess Fulkerson.
© 2016 Global Market Insights, Inc. USA. All Rights Reserved Fuel Cell Market size worth $25.5bn by 2024Low Power Wide Area Network.
Hadoop Market
Apache Spark Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing Aditya Waghaye October 3, 2016 CS848 – University.
Built on the Powerful Microsoft Azure Platform, Lievestro Delivers Care Information, Capacity Management Solutions to Hospitals, Medical Field MICROSOFT.
A Must to Know - Testing IoT
Chapter 1 Database Systems
Dell Data Protection | Rapid Recovery: Simple, Quick, Configurable, and Affordable Cloud-Based Backup, Retention, and Archiving Powered by Microsoft Azure.
Adra ACCOUNTS: Transaction Matching Software Powered by the Microsoft Azure Cloud That Helps Optimize the Accounting and Finance Processes MICROSOFT AZURE.
Computer Fundamentals
TRANSFORMATION OF INFORMATION TECHNOLOGY IN INSURANCE BUSINESS MODELS
Overview of big data tools
AIMS for BizTalk, Built on the Microsoft Azure Platform, Empowers Enterprises to Automate Insight and Analytics and Boost Value Creation MICROSOFT AZURE.
Big Data Young Lee BUS 550.
FileFacets Information Governance Solution Performs High-Quality Automated Enterprise Content Management Migration, Built on Azure MICROSOFT AZURE APP.
Improve Patient Experience with Saama and Microsoft Azure
INNOvation in TRAINING BUSINESS ANALYSTS HAO HElEN Zhang UniVERSITY of ARIZONA
Zoie Barrett and Brian Lam
SERVER INNOVATION ACCELERATES IT TRANSFORMATION
Big Data Analysis in Digital Marketing
School Districts Can Analyze and Report on Data Across Multiple Systems with EdWire, a Powerful Integration Solution that Utilizes Microsoft Azure MICROSOFT.
Pitch Deck.
Big Data.
Presentation transcript:

Dep. of Information Technology By: Raz Dara Mohammad Amin Ishik University Faculty of Science Dep. of Information Technology By: Raz Dara Mohammad Amin

What is BIG DATA? Insights from big data can enable all employees to make better decisions—deepening customer engagement, optimizing operations, preventing threats and fraud, and capitalizing on new sources of revenue. Big data is being generated by everything around us at all times.

What is BIG DATA? It is creating a culture in which business and IT leaders must join forces to realize value from all data. Insights from big data can enable all employees to make better decisions—deepening customer engagement, optimizing operations, preventing threats and fraud, and capitalizing on new sources of revenue. But escalating demand for insights requires a fundamentally new approach to architecture, tools and practices.

What is BIG DATA? More than 300 Petabyte daily! A petabyte (PB) is 1015 bytes of data (or 1,000 TB)

Why are we interested in BIG DATA? Data is the new “OIL” resource, difficult to extract, it comes in many types, difficult to refine However, it if obtained it has the potential to profoundly affect every aspect of our lives.

Why are we interested in BIG DATA?

Why are we interested in BIG DATA? Driving new insights from previously untouched data & integrate that insight into current processing. The application of new tools to do more analytics on more data for more people.

Characteristics of BIG DATA We need to understand the characteristics of BIG DATA first.

Characteristics of BIG DATA Volume: data will be x50 in 2020 (compared to 2010) The 4 V’s Velocity: data is generated at much higher rates as more devices are becoming digital. Variety: 80% of the data is unstructured having different forms. Veracity: accuracy or conformity of facts is of great importance.

BIG DATA Analytics: The Picture The major software components and their role

BIG DATA Echosystem: Basics Growing storage problems. Handling massive quantity of structured and unstructured data. Cost of processing. Fault tolerance. Hadoop is an open source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment. It uses commodity hardware (PCs connected to network). Supports distributed file storage via HDFS with replication. Processing with fault tolerance.

Storage & Processing ! Size limited by the size of local storage and processing power is bound by the CPU capabilities. Cost, Feasibility, Advantage?

Storage & Processing Shared distributed storage , distributed processing

Hadoop, HDFS and others?

Hadoop, HDFS and others? * ACID (Atomicity, Consistency, Isolation, Durability)

BIG DATA & Computer Science Data Structures. Programming Languages (Java, Python, Scala, R) Artificial Intelligence (Data Mining, NLP, Machine Learning) Networking. Etc.

Who are the players?

Thank You