Big Data, Learning Analytics and Education Aleksanda Klašnja-Milićević Mirjana Ivanović.

Slides:



Advertisements
Similar presentations
Supporting End-User Access
Advertisements

Big Data Management and Analytics Introduction Spring 2015 Dr. Latifur Khan 1.
Chapter 9 DATA WAREHOUSING Transparencies © Pearson Education Limited 1995, 2005.
DATA WAREHOUSING.
Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Copyright © 2006 by The McGraw-Hill Companies,
Brenda Woods John Williams Daniel Bailey Breia Stamper.
Amadeus Travel Intelligence ‘Monetising’ big data sets
BUSINESS INTELLIGENCE/DATA INTEGRATION/ETL/INTEGRATION AN INTRODUCTION Presented by: Gautam Sinha.
LÊ QU Ố C HUY ID: QLU OUTLINE  What is data mining ?  Major issues in data mining 2.
Module 3: Business Information Systems
This presentation was scheduled to be delivered by Brian Mitchell, Lead Architect, Microsoft Big Data COE Follow him Contact him.
What is Business Intelligence? Business intelligence (BI) –Range of applications, practices, and technologies for the extraction, translation, integration,
Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization.
Data Mining on the Web via Cloud Computing COMS E6125 Web Enhanced Information Management Presented By Hemanth Murthy.
Ch 4. The Evolution of Analytic Scalability
IT – DBMS Concepts Relational Database Theory.
Research on cloud computing application in the peer-to-peer based video-on-demand systems Speaker : 吳靖緯 MA0G rd International Workshop.
USING HADOOP & HBASE TO BUILD CONTENT RELEVANCE & PERSONALIZATION Tools to build your big data application Ameya Kanitkar.
1.Knowledge management 2.Online analytical processing 3. 4.Supply chain management 5.Data mining Which of the following is not a major application.
Ihr Logo Data Explorer - A data profiling tool. Your Logo Agenda  Introduction  Existing System  Limitations of Existing System  Proposed Solution.
Data Warehouse & Data Mining
Chapter 5 Lecture 2. Principles of Information Systems2 Objectives Understand Data definition language (DDL) and data dictionary Learn about popular DBMSs.
CIS 9002 Kannan Mohan Department of CIS Zicklin School of Business, Baruch College.
` tuplejump The data engineering platform. A startup with a vision to simplify data engineering and empower the next generation of data powered miracles!
Introduction to Hadoop and HDFS
Contents HADOOP INTRODUCTION AND CONCEPTUAL OVERVIEW TERMINOLOGY QUICK TOUR OF CLOUDERA MANAGER.
An Introduction to HDInsight June 27 th,
Information Systems & Enhancing Decision Making for the Digital Firm
Chapter 3 DECISION SUPPORT SYSTEMS CONCEPTS, METHODOLOGIES, AND TECHNOLOGIES: AN OVERVIEW Study sub-sections: , 3.12(p )
1 Reviewing Data Warehouse Basics. Lessons 1.Reviewing Data Warehouse Basics 2.Defining the Business and Logical Models 3.Creating the Dimensional Model.
5 - 1 Copyright © 2006, The McGraw-Hill Companies, Inc. All rights reserved.
Big Data Analytics Large-Scale Data Management Big Data Analytics Data Science and Analytics How to manage very large amounts of data and extract value.
6.1 © 2010 by Prentice Hall 6 Chapter Foundations of Business Intelligence: Databases and Information Management.
Advanced Database Course (ESED5204) Eng. Hanan Alyazji University of Palestine Software Engineering Department.
By N.Gopinath AP/CSE. There are 5 categories of Decision support tools, They are; 1. Reporting 2. Managed Query 3. Executive Information Systems 4. OLAP.
1 Technology in Action Chapter 11 Behind the Scenes: Databases and Information Systems Copyright © 2010 Pearson Education, Inc. Publishing as Prentice.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Foundations of Information Systems in Business. System ® System  A system is an interrelated set of business procedures used within one business unit.
Big traffic data processing framework for intelligent monitoring and recording systems 學生 : 賴弘偉 教授 : 許毅然 作者 : Yingjie Xia a, JinlongChen a,b,n, XindaiLu.
Pertemuan 16 Materi : Buku Wajib & Sumber Materi :
CISC 849 : Applications in Fintech Namami Shukla Dept of Computer & Information Sciences University of Delaware iCARE : A Framework for Big Data Based.
IoT Meets Big Data Standardization Considerations
Big Data Analytics Platforms. Our Team NameApplication Viborov MichaelApache Spark Bordeynik YanivApache Storm Abu Jabal FerasHPCC Oun JosephGoogle BigQuery.
Types of Information Systems Basic Computer Concepts Types of Information Systems  Knowledge-based system  uses knowledge-based techniques that supports.
Chapter 1 DECISION SUPPORT SYSTEMS AND BUSINESS INTELLIGENCE Skip subsections: 1.1, 1.2, 1.8, 1.10.
Big Data Yuan Xue CS 292 Special topics on.
Copyright © 2016 Pearson Education, Inc. Modern Database Management 12 th Edition Jeff Hoffer, Ramesh Venkataraman, Heikki Topi CHAPTER 11: BIG DATA AND.
Chapter 8: Data Warehousing. Data Warehouse Defined A physical repository where relational data are specially organized to provide enterprise- wide, cleansed.
SAP BI – The Solution at a Glance : SAP Business Intelligence is an enterprise-class, complete, open and integrated solution.
BUSINESS INTELLIGENCE. The new technology for understanding the past & predicting the future … BI is broad category of technologies that allows for gathering,
BIG DATA. Big Data: A definition Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database.
Abstract MarkLogic Database – Only Enterprise NoSQL DB Aashi Rastogi, Sanket V. Patel Department of Computer Science University of Bridgeport, Bridgeport,
Managing Data Resources File Organization and databases for business information systems.
Leverage Big Data With Hadoop Analytics Presentation by Ravi Namboori Visit
Big Data-An Analysis. Big Data: A definition Big data is a collection of data sets so large and complex that it becomes difficult.
Data Platform and Analytics Foundational Training
SNS COLLEGE OF TECHNOLOGY
Introduction C.Eng 714 Spring 2010.
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Data Warehousing and Data Mining
Ch 4. The Evolution of Analytic Scalability
Overview of big data tools
Supporting End-User Access
Big Data Young Lee BUS 550.
Charles Tappert Seidenberg School of CSIS, Pace University
Big DATA.
Customer 360.
UNIT 6 RECENT TRENDS.
Big Data.
Presentation transcript:

Big Data, Learning Analytics and Education Aleksanda Klašnja-Milićević Mirjana Ivanović

Introduction  Big data - large, massive volume of data difficult to be processed by traditional and standard software and database techniques.  Term - Web search companies - extract useful information from extremely large distributed aggregations of loosely-structured data.  Big data - great potential for improving business in companies, healthcare, e-learning, and in wide range of data-driven industries and provide support for more intelligent decisions. 2

Definitions  Understanding and definition of big data:  depends not only on a massive amount of data but  also have to consider what the technology and which size of big data that technology could handle  2001 by Laney - ‘Big data can be characterized by the three Vs:  ‘volume’ - very large volume of data (expressed in terms of terabytes, records, transactions, tables, files),  ‘velocity’ - rapid generation (expressed in terms of batch, near time, real time, streams) and  ‘variety’ - various modalities (expressed in terms of structured, unstructured, semi-structured, or of all mentioned)’. 3

Definitions  In 2011, an International Data Corporation (IDC) - ‘big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling the high- velocity capture, discovery, and/or analysis.’  According to this big data are summarized as four Vs:  Volume (very large volume of data),  Variety (various modalities),  Velocity (rapid generation) and  Value (huge value but very low density). 4Vs definition has been widely accepted as it indicates serious problem in big data: how to find out values from datasets with a massive scale, diverse types, and hasty generation. 4

Big data analytics  It is the process of collecting, organizing and analysing large sets of data with intention to discover patterns of data and some other useful and important information.  An instrument to discover the knowledge that is behind the analysing data.  To analyse such a huge volume of data, big data analytics have to use specialized:  high-performance analytics techniques and tools for data mining,  data optimization,  predictive analytics,  forecasting and so on. 5

Trends in Contemporary Education Environments  Higher education institutions - more data than ever before.  Analysis of the amount of learning data:  which learners are at risk of dropping out or need additional support to increase their success, and confidence, in the learning process.  new or novel approaches are required to understand the patterns of value that exist within the data.  Lot of exploration and researches aim to handle the data with the proper techniques and new tools to produce real time solution, prediction in this certain area. 6

more effective self-learningimproving learners experience and knowledgemore effective evidence-based decision makingstrategic response to change global trendsenable teachers to identify intercessions,create useful peer groups andfree up class time for creativeness and problem solving  Big data can advance higher education practice: Trends in Contemporary Education Environments 7

future uses of learning sequences on final learners’ grades students’ knowledge behavior Prediction discover important relationships between students and students’ scores Structure discovery discover relationship between the usability of the course materials and the students’ learning performances Relationship mining distil data in different ways and for different purposes for further use in human management Distillation of data 8

General process of mining insights from big data  Typical measurements:  cover time spent,  number of logins,  number of mouse clicks,  number of accessed resources,  number of finished coursework, etc. Data Information Knowledge Practical value Collecting Classification Summarizing Synthesizing Evaluation Desicion making 9

Capabilities of an e-learning system Major capabilities of an e-learning system Functionality Interactivity Response time The e-learning system is designed to allow access to the system at remote locations, providing anytime and anywhere access to course content, which is critical for promoting the use of e-learning systems 10

Why Associate Big Data and Learning Analytics Together? 11

Big data signifies the interpretation of a wide array of administrative and effective data. Recognize potential concerns related to academic programming, research, teaching and learning. Progress in order to predict future performance. Collected processes designed at estimating institutional performance. Benefits of Big Data and Learning Analytics 13

Educational performances Feedback: predict learner outcomes such as: dropping out, needing extra help, or being capable of more demanding assignments. Motivation: If the big data is appropriately implemented, learners possibly become invested in entering data to the process because they understand the impact of how it works. Personalization: allowing designers to personalize courses to adjust their learners’ individual needs. developers to promote the standard for effective and exceptional e-learning courses. 14

Efficiency: save many hours of time and effort when it comes to the achievement of our goals and strategies that we need to achieve them. Collaboration: cooperation, teamwork, and interdisciplinarity thought processes. Tracking: teachers to understand the real patterns of learners more effectively by allowing them to track a learner’s experience in an e- learning course. Understanding the learning process: which parts of a course or exam were difficult or easy pages revisited often, sections recommended to peers, preferred learning styles, and the time of day when learning operates at its best. Educational performances 15

The Application Development Framework  For any big data platform - an application development framework which can make simpler:  the process of development,  execution,  testing, and debugging appropriate educational software components  and building blocks.  Such type of framework should contain:  model development tools and techniques;  capability for program loading, implementation, and process scheduling;  the system configuration and management tools. 16

Several technologies to handle big data  Map-Reduce,  Hadoop,  NoSQL,  PIG and  Hive.  Databases for managing data of this size are generically known as NoSQL databases - “not only SQL".  NoSQL databases are simpler than SQL databases in terms of data manipulation abilities, but can handle larger amounts of data. 17

Several technologies to handle big data  One of the most known and significant tools commonly used by most data management systems is Hadoop.  Hadoop is an open-source software framework for storage and processing big data in a distributed fashion on large clusters of commodity hardware, using simple programming models.  Principally, it achieves two tasks: large data storage and faster processing. 18

Several technologies to handle big data  Hive  Data warehouse framework built on top of Hadoop  Used for ad hoc querying with an SQL type query language - more complex analysis  Designed for batch processing, not online transaction processing  Does not offer real-time queries  Pig  High-level data-flow language (Pig Latin) and  Execution framework whose compiler produces sequences of Map- Reduce programs (for execution within Hadoop)  Each Pig Latin program consists of multiple steps, each step being a type of query  Support ad-hoc data analysis  Data can be queried directly without the need for importing into tables 19

several technologies to handle big data  Mahout  project for building scalable machine learning libraries, with most algorithms built on top of Hadoop and using the Map- Reduce paradigm  HBase  distributed, scalable, big data store that runs on top of HDFS  provides random, real time access to big data  was created for hosting very large tables with billions of rows and millions of columns  Tables in HBase can serve as the input and output for Map- Reduce jobs run in Hadoop  HBase is very different from traditional relational databases, like MySQL, PostgreSQL, Oracle, etc. 20

The architectural design of ecosystem  The general principles of system design undertake the following requirements: Support for large volumes and multi-structured data sets. Platform independent deployment on the learner side. An easy-to-use learner interface. Incorporated analytic modules that allow learners to complete course quickly and answer their own questions. Embedded statistical functions as well as allow custom statistical codes. User-friendliness, high-level flexibility, and scalability by using high performance computing and cloud computing resources. 21

Architecture framework Data capture and collection module maps, aggregates and cleans data from different sources and prepares data for ETL (Extractions, Transformations, and Loading) process. The ETL module includes three main functions: data integration to relevant tables, data transformation and loading of data specific for analysis. The ETL process populates the databases with data that is analysis ready allowing more confident analysis. HADOOP the different data sources are managed in the Hadoop platform and are stored in a compatible repository. The Hadoop platform allows easy management of high volume and various data. The analysis engines execute standard and predefined procedures in order to enable complex statistical analysis. The presentation layer provides a user-friendly graphical interface. Learners, teachers and researchers could easily retrieve information via this interface without the need for the in-depth data analysis knowledge, programming skills, or database schema background. 23

Architecture framework 22 Analysis engine

Challenges of Implementation Big Data in Education  Financial expenses - many institutions view analytics as an expensive effort rather than as an investment.  Monitor the entire process - from defining the important questions to developing data models for designing and delivering alerts, recommendations, and reports.  Experienced database administrators and designers capable of warehousing and incorporating data through multiple files and formats are a necessity.  In addition to the expertise needed to develop databases, instructional designers working with faculty will need to understand learner behaviors that are appropriate to the application at hand. 24

Challenges of Implementation Big Data in Education  Privacy, data profiling, and the rights of learners in terms of recording their individual behaviors.  Some important issues must be taken into consideration:  Should learners be told that their activity is being tracked?  How much information should be provided to students, faculty, parents, issuers of scholarships and others?  How should faculty members react?  Do learners have an obligation to seek assistance? 25

Conclusions  All of education organizations need to have a vision for how it will take advantage of big data. From improving learners experience and knowledge trough enhanced academic studying. More effective evidence-based decision making.Strategic response to changing global trends. Promises to turn complex, often unstructured data into actionable information. Provides scientists with the opportunity to comprehend the meanings of these data and how they can be analyzed in a significant, effective, and consistent manner that contributes to both theory and practice. 26

Future work  Educational institutions - trying to balance: faculty expectations various federal privacy laws the institutions own philosophy of learner development 27

Future work  Understand the dynamic nature of academic success and retention, offer an environment for open dialogue, and improve practices and strategies to address these issues.  Interesting additional features, which are worth for further research, can be: optimizing presented architecture with other learning services, identifying additional tactics, automated them and integration with cloud platform. 28