Telco Big Data (TBD) Example:

Slides:



Advertisements
Similar presentations
AirPlace Kyriakos Georgiou Athina Paphitou Maria Christodoulou
Advertisements

C-Store: Self-Organizing Tuple Reconstruction Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY Apr. 17, 2009.
The JOURNEY Active Network Model Maximilian Ott et al. IEEE Journal on Selected Areas in Communications, vol.19, no. 3, March 2001.
An Interactive Visualization of Super-peer P2P Networks Peiqun (Anthony) Yu.
Scaling Distributed Machine Learning with the BASED ON THE PAPER AND PRESENTATION: SCALING DISTRIBUTED MACHINE LEARNING WITH THE PARAMETER SERVER – GOOGLE,
Probabilistic Aggregation in Distributed Networks Ling Huang, Ben Zhao, Anthony Joseph and John Kubiatowicz {hling, ravenben, adj,
Live Re-orderable Accordion Drawing (LiveRAC) Peter McLachlan, Tamara Munzner Eleftherios Koutsofios, Stephen North AT&T Research Symposium August, 2007.
Dieter Pfoser, LBS Workshop1 Issues in the Management of Moving Point Objects Dieter Pfoser Nykredit Center for Database Research Aalborg University, Denmark.
Tracking Moving Objects in Anonymized Trajectories Nikolay Vyahhi 1, Spiridon Bakiras 2, Panos Kalnis 3, and Gabriel Ghinita 3 1 St. Petersburg State University.
On-Demand Media Streaming Over the Internet Mohamed M. Hefeeda, Bharat K. Bhargava Presented by Sam Distributed Computing Systems, FTDCS Proceedings.
Green Space Services for Local Monitoring Aratos Technologies S.A.
Research on cloud computing application in the peer-to-peer based video-on-demand systems Speaker : 吳靖緯 MA0G rd International Workshop.
Panayiotis G. Andreou, George Constantinou, Demetrios Zeinalipour-Yazti, George Samaras Department of Computer Science, University of Cyprus Panayiotis.
Storage Allocation in Prefetching Techniques of Web Caches D. Zeng, F. Wang, S. Ram Appeared in proceedings of ACM conference in Electronic commerce (EC’03)
Business Data Communications, Stallings 1 Chapter 1: Introduction William Stallings Business Data Communications 6 th Edition.
Profiles and Multi-Topology Routing in Highly Heterogeneous Ad Hoc Networks Audun Fosselie Hansen Tarik Cicic Paal Engelstad Audun Fosselie Hansen – Poster,
Big Data. What is Big Data? Big Data Analytics: 11 Case Histories and Success Stories
Focused Matrix Factorization for Audience Selection in Display Advertising BHARGAV KANAGAL, AMR AHMED, SANDEEP PANDEY, VANJA JOSIFOVSKI, LLUIS GARCIA-PUEYO,
Introduction to Hadoop and HDFS
Chapter © 2006 The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/ Irwin Chapter 7 IT INFRASTRUCTURES Business-Driven Technologies 7.
Content Clustering Based Video Quality Prediction Model for MPEG4 Video Streaming over Wireless Networks Asiya Khan, Lingfen Sun & Emmanuel Ifeachor 16.
Session 2a, 10th June 2008 ICT-MobileSummit 2008 Copyright E3 project, BUPT Autonomic Joint Session Admission Control using Reinforcement Learning.
Your university or experiment logo here Caitriana Nicholson University of Glasgow Dynamic Data Replication in LCG 2008.
Wireless Sensor Networks In-Network Relational Databases Jocelyn Botello.
An Architecture for Distributed High Performance Video Processing in the Cloud 作者 :Pereira, R.; Azambuja, M.; Breitman, K.; Endler, M. 出處 :2010 IEEE 3rd.
Kiew-Hong Chua a.k.a Francis Computer Network Presentation 12/5/00.
Kiran Barn, PI Users Conference 2000Slide 1 Kiran Barn, Principal Engineer. Slide 1Kiran Barn, PI Users Conference 2000.
Daniele D’Agostino CNR - IMATI - Sezione di Genova
Web Log Data Analytics with Hadoop
Scalable Terrain Rendering Data Management Infrastructure Ricardo Veguilla March 7, 2007.
Students: Aiman Md Uslim, Jin Bai, Sam Yellin, Laolu Peters Professors: Dr. Yung-Hsiang Lu CAM 2 Continuous Analysis of Many CAMeras The Problem Currently.
Dagstuhl Seminar 10042, Demetris Zeinalipour, University of Cyprus, 26/1/2010 1/17 ERCIM Spring Meeting 2013, June 6, 2013, Nicosia,
An Energy-Efficient Approach for Real-Time Tracking of Moving Objects in Multi-Level Sensor Networks Vincent S. Tseng, Eric H. C. Lu, & Kawuu W. Lin Institute.
Tackling I/O Issues 1 David Race 16 March 2010.
Gorilla: A Fast, Scalable, In-Memory Time Series Database
The Application of Big Data and Learning Analytics in Universities Management: Practices from the Open University of China 魏顺平 博士 Shunping Wei May. 22nd,
Authors: Christos Stergiou Andreas P. Plageras Kostas E. Psannis
Pilot Kafka Service Manuel Martín Márquez. Pilot Kafka Service Manuel Martín Márquez.
Efficient Exploration of Telco Big Data with Compression and Decaying
MERANTI Caused More Than 1.5 B$ Damage
Organizations Are Embracing New Opportunities
Information Systems in Organizations
Authors: Jiang Xie, Ian F. Akyildiz
Towards Real-Time Road Traffic Analytics using Telco Big Data
Demetrios Zeinalipour-Yazti (Univ. of Cyprus)
S SPATE: Compacting and Exploring Telco Big Data Constantinos Costa1 , Georgios Chatzimilioudis1, Demetris Zeinalipour-Yazti2,1, Mohamed F. Mokbel3.
Steven Whitham Jeremy Woods
Algorithms for Big Data Delivery over the Internet of Things
Quality-aware Aggregation & Predictive Analytics at the Edge
SpatialHadoop: A MapReduce Framework for Spatial Data
AirPlace Indoor Positioning Platform for Android Smartphones
به نام خدا Big Data and a New Look at Communication Networks Babak Khalaj Sharif University of Technology Department of Electrical Engineering.
Spatio-Temporal Query Processing in Smartphone Networks
Spatio-Temporal WiFi Localization
FMS: Managing Crowdsourced Indoor Signals with the Fingerprint Management Studio Marileni Angelidou1, Constantinos Costa1, Artyom Nikitin2, Demetrios.
Data Lifecycle Review and Outlook
Research Topic Error Concealment Techniques in H.264/AVC for Wireless Video Transmission Vineeth Shetty Kolkeri EE Graduate,UTA.
On Spatial Joins in MapReduce
Predicting Traffic Dmitriy Bespalov.
Hadoop Technopoints.
TRANSFORMATION OF INFORMATION TECHNOLOGY IN INSURANCE BUSINESS MODELS
Big Data Young Lee BUS 550.
Comparative Evaluation of SOM-Ward Clustering and Decision Tree for Conducting Customer-Portfolio Analysis By 1Oloyede Ayodele, 2Ogunlana Deborah, 1Adeyemi.
CHAPTER 7: Information Visualization
Exploring Latent Features for Memory-Based QoS Prediction in Cloud Computing Yilei Zhang 17/05/2011.
Spatial Databases: Spatio-Temporal Databases
Big DATA.
CHAPTER 14: Information Visualization
Spatio-Temporal Histograms
© 2016 Global Market Insights, Inc. USA. All Rights Reserved Global Hyperscale Data Center Market Size to exceed $65bn by 2024.
Presentation transcript:

Telco Big Data (TBD) Example: TBD-DP: Telco Big Data Visual Analytics with Data Postdiction Constantinos Costa1, Andreas Charalampous1, Andreas Konstantinidis1,2, Demetrios Zeinalipour-Yazti1 and Mohamed F. Mokbel3 University of Cyprus, Cyprus1 | Frederick University, Cyprus2 Qatar Computing Research Institute, Qatar and University of Minnesota, USA3 Motivation Data Decay Data Decaying refers to the “progressive loss of detail in information as data ages with time until it has completely disappeared”. M. L. Kersten, “Big data space fungus,” in CIDR’15. Benefits: Retain same data exploration capabilities. Save enormous amounts of storage and I/O. Data Prediction: aims to make a statement about the future value of some tuple using a ML model. Data Postdiction (DP): aims to recover the past value of some tuple, which has been decayed for efficiency purposes, using a ML model. Proposed in MDM’18*! The expansion of mobile networks and IoT (IP- enabled hardware) have contributed to an explosion of data inside Telecommunication Companies (Telcos) Data Postdiction (DP) The Big data era lead us to a point where organizations collect more than they can! Scale: Global volume of stored data doubling every 2 years. 44 zettabytes of data in 2020 (44 billion TB of data). Costs: Costs for data storage decline only at a rate of less than 1/5 per year. Datacenter Journal, https://goo.gl/o4MnJp. An 1 PB Hadoop cluster, using between 125 and 250 nodes, costs ~1M USD. Time: A linear scan of 1PB would require ~15 hours. Telco Big Data (TBD) Example: 5TBs/day for 10M clients (i.e., 2PB/year) soon [1] "Efficient Exploration of Telco Big Data with Compression and Decaying", Constantinos Costa, Georgios Chatzimilioudis, Demetrios Zeinalipour-Yazti, Mohamed F. Mokbel, Proceedings of the IEEE 33rd International Conference on Data Engineering (ICDE '17), IEEE Computer Society, pp. 1332-1343, April 19-22, San Diego, CA, USA, ISBN: 978-1-5090-6543-1,  2017. [2] "Towards Real-Time Road Traffic Analytics using Telco Big Data", Constantinos Costa, Georgios Chatzimilioudis, Demetrios Zeinalipour-Yazti, Mohamed F. Mokbel, Proceedings of the 11th Intl. Workshop on Real-Time Business Intelligence and Analytics, collocated with VLDB 2017 (BIRTE ‘17) , IEEE Computer Society,  pp. 5:1--5:5, August 28, Munich, Germany, ISBN: 978-1-4503-5425-7/17/08, 2017. TBD-DP: Operator Overview TBD-DP: Graphical User Interface The TBD-DP operator is implemented inside the spatio-temporal SPATE architecture [1]. The interface enables users to carry out high resolution visual analytics, without consuming enormous amounts of storage. The savings are quantified numerically with bar charts and visually with heatmaps. TBD-DP Operator Overview Construction algorithm: construct a DP-tree (B), for a percentage (f) of the historic data (D), denoted as D’. Delete D’ and retain B for data recovery. Recovery algorithm: use {B, D-D’} to recover any past data blocks, upon demand. D = full data resolution and B = model data resolution. TBD-DP Video: https://tbd.cs.ucy.ac.cy/ TBD Evaluation: Space Capacity TBD Evaluation: Control Experiments Observations: TBD-DP provides 25%, 50% better space capacity than COMPRESSION and RAW, respectively. TBD-DP outperforms the SAMPLING approach by 50% in terms of NRMSE, on average, in all datasets. COMPRESSION approach provides an optimal NRMSE = 0 (GZIP is lossless, but requires more space). TBD-DP could have been configured with a decay factor f=100% rather than f=50%, yielding even less space (i.e., decay everything, retain only model approach). Observations: TBD-DP maintains a similar storage capacity for different learning models. TBD-DP+LSTM combination clearly outperforms the other 2 combinations providing around 75% less error, on average. The Decay Factor (f) shows how aggressive the postdiction process should be. F=100% (100% of RAW data deleted - postdicted). F=0% (No postdiction – same with RAW data). Accuracy still there! * "Decaying Telco Big Data with Data Postdiction", C. Costa, A. Charalampous, A. Konstantinidis, D. Zeinalipour-Yazti and M. F. Mokbel, Proceedings of the 19th IEEE International Conference on Mobile Data Management (MDM '18), June 25 - June 28, 2018, AAU, Aalborg, Denmark, 2018. Contact: dmsl@cs.ucy.ac.cy https://dmsl.cs.ucy.ac.cy/