Published by Marvin Lyons. Modified over 6 years ago.
1
Big Data Caching for Networking: Moving from Cloud to Edge
Engin Zeydan, Ejder Baştuğ, Mehdi Bennis, Manhal Abdel Kader, Ilyas Alper Karatepe, Ahmet Salih Er, and Mérouane Debbah
Engin Zeydan, Ilyas Alper Karatepe, and Ahmet Salih Er are with Turk Telekom; Ejder Baştuğ is with CentraleSupélec; Manhal Abdel Kader is with Amadeus IT Group; Mehdi Bennis is with the University of Oulu; Mérouane Debbah is with Huawei France R&D.
Color key: blue, Turk Telekom; green, Huawei France R&D.
2
Outline
Introduction; Big data analytics for 5G: requirements, challenges, and benefits; Big data aided cache-enabled architecture; Big data platform for analysis; Case study & experiment; Conclusions
3
Introduction
Wireless data traffic is growing tremendously due to pervasive mobile devices, ubiquitous social networking, and similar trends, compelling mobile operators to look for innovative ways to manage their increasingly complex networks and scarce backhaul resources. The major driver of this backhaul problem is wireless video-on-demand traffic, in which users access content asynchronously, whenever they wish.
4
Introduction
Mobile cellular networks are moving toward fifth generation (5G) wireless networks, which introduce several new techniques [5]. In contrast to base-station-centric architectures designed for dumb mobile terminals, where requests are satisfied reactively, 5G will be user-centric, context-aware, and proactive in nature. 5G techniques: ultra-dense networks, massive multiple-input multiple-output (MIMO), millimeter-wave communication, edge caching, and device-to-device communications.
5
Introduction
Big data brings a new kind of information set to network planning, one that can be interconnected to better understand users' behavior and network characteristics. This work investigates the exploitation of big data in mobile cellular networks from a proactive caching point of view and proposes a proactive caching architecture. Note: this responds to the surge of social and mobile applications and to the characteristics of 5G.
6
Introduction Real-world example:
A large amount of data is processed on a big data platform; the data is collected from one of the major mobile networks in Turkey, which has 17 million subscribers. The traces are collected from several BSs over time intervals of several hours.
7
Introduction
The focus is on a Hadoop-based big data processing platform inside a mobile operator's core network, used to validate the performance gains of caching with real data trials. Using machine learning tools to predict content popularity yields further improvements in users' quality of experience and backhaul offloading. Note: Hadoop is a cluster system, a cloud platform that can store and manage large amounts of data and can store, process, and analyze it at the same time.
9
Big data analytics for 5G: Trends
Backhaul intra-traffic is becoming larger than the inter-traffic between backhaul elements and end users. In fact, the actions of user terminals trigger various interactions with hundreds of servers, routers, and switches inside the backhaul and core network. This is contrary to the traditional carrier network architecture, which assumes that clients and wireless access nodes are the bottlenecks. Note: although wireless communication capability improved greatly from 2G to 4G, the backhaul connection of cellular networks has not improved correspondingly, even though it plays a very important role in transmission. For example, for an original user HTTP request of 1 KB, the internal data traffic may be as much as 930 times larger.
10
Big data analytics for 5G: Trends
Managing these big-data-driven networks in cloud environments is a pressing issue. Mobile edge computing is another emerging technology, in which edge devices provide cloud-computing-like capabilities within the radio access network to carry out functions such as communication, storage, and control.
11
Big data analytics for 5G: Challenge
However, deploying distributed cloud computing capabilities near each BS site in 5G networks may also increase the deployment cost. Moreover, to model and predict spatio-temporal user behavior in user-centric 5G networks, network traffic arriving at a centralized location needs to be scaled out horizontally across servers and racks.
12
Big data analytics for 5G: meets caching
Hadoop's distributed data processing engine allows users' behavior to be analyzed from enormous amounts of streaming data, and strategic contents to be cached proactively at the edge. This eases backhaul traffic and improves users' quality of experience by reducing latency.
14
Big data aided cache enabled architecture
Motivated by highly predictable human behavior, the proposed architecture collects contextual information (e.g., users' viewing history and location information) and predicts users' spatio-temporal demand in order to proactively and judiciously cache selected contents at the network edge.
16
Big Data Aided Cache-enabled BSs
Assume a small cell network composed of $N$ small cells, with backhaul link capacities $C_n$ and wireless link capacities $C'_n$. Assume $C_n < C'_n$, reflecting a limited backhaul capacity scenario. A set of users requests a total of $D$ contents during a time duration $T$ from a library $\mathcal{F} = \{1, \ldots, F\}$, where each content $f$ in this library has a size $L_f$ with $L_{\min} < L_f < L_{\max}$ and a finite bit rate requirement $B_f$ during its delivery.
17
Big Data Aided Cache-enabled BSs
To offload the capacity-limited backhaul, small BSs are equipped with finite storage capacity and cache a subset of contents from the content library. A joint optimization of the content popularity matrix (denoted by $P$) and the content cache placement at specific small cells is required:
$$P = \begin{pmatrix} p_{11} & \cdots & p_{1n} \\ \vdots & \ddots & \vdots \\ p_{k1} & \cdots & p_{kn} \end{pmatrix}$$
Columns: contents. Rows: users or BSs (depending on the scenario). Note: the next step therefore requires both the popularity matrix and the content cache placement.
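As a rough illustration of what the popularity matrix holds, the sketch below counts requests per (BS, content) pair and normalizes each row. The trace format (pairs of BS index and content index) and the row normalization are assumptions made for this example; the slides do not specify how the entries of P are scaled.

```python
from collections import Counter

def popularity_matrix(requests, num_bs, num_contents):
    """Build a row-normalized popularity matrix.

    requests: iterable of (bs_index, content_index) pairs (hypothetical format).
    Rows are BSs, columns are contents; every non-empty row sums to 1.
    """
    counts = Counter(requests)
    matrix = []
    for bs in range(num_bs):
        row = [counts[(bs, f)] for f in range(num_contents)]
        total = sum(row)
        # A BS with no requests gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Toy trace: BS 0 sees content 0 twice and content 2 once; BS 1 sees content 1 once.
requests = [(0, 0), (0, 0), (0, 2), (1, 1)]
P = popularity_matrix(requests, num_bs=2, num_contents=3)
```

With rows as users instead of BSs, the same function applies if the first index of each pair is a user index.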
18
Big Data Aided Cache-enabled BSs
This cache placement problem is intractable, so it is assumed to be handled with greedy or approximate approaches [11, 12].
[11] M. Ji, G. Caire, and A. F. Molisch, “Wireless Device-to-Device Caching Networks: Basic Principles and System Performance,” IEEE JSAC, vol. 34, no. 1, Jan. 2016, pp. 176–89.
[12] K. Poularakis et al., “Caching and Operator Cooperation Policies for Layered Video Content Delivery,” Proc. IEEE INFOCOM’16, San Francisco, CA, Apr
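A minimal sketch of a greedy placement for a single small BS, assuming popularity is given per content and storage is measured in the same unit as content size. The function name and the choice to skip a content that does not fit and keep trying smaller ones are assumptions of this example, not details taken from [11, 12].

```python
def greedy_cache(popularity, sizes, capacity):
    """Return the indices of contents cached at one BS.

    Contents are visited from most to least popular; a content is stored
    only if it still fits in the remaining storage.
    """
    order = sorted(range(len(popularity)), key=lambda f: popularity[f], reverse=True)
    cached, used = [], 0.0
    for f in order:
        if used + sizes[f] <= capacity:
            cached.append(f)
            used += sizes[f]
    return cached

# Four contents with illustrative popularities and sizes (arbitrary units).
popularity = [0.5, 0.1, 0.3, 0.1]
sizes = [4.0, 1.0, 3.0, 2.0]
cached = greedy_cache(popularity, sizes, capacity=6.0)
```

Here content 0 is stored first, content 2 no longer fits, and the smaller content 1 fills part of the remaining space.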
20
Big data platform for analysis
The platform stores users' data traffic and extracts useful information for proactive caching decisions. It has four requirements: huge data volume processing in less time; cleansing, parsing, and formatting data; data analysis; and statistical analysis and visualization.
21
Big data platform for analysis
Huge Data Volume Processing in Less Time: The data streaming interface is mirrored through network analysis tools, and the collected raw data needs to be exported into a big data storage platform.
22
Big data platform for analysis
Cleansing, Parsing, and Formatting Data: First, the raw data needs to be cleaned. Second, the relevant fields are extracted from the raw data. Third, the parsed data needs to be encoded appropriately for storage inside the platform. Note: before any machine learning algorithm is applied, the data must be clean and usable, because some attributes in the raw data may be empty or wrong.
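The three steps can be sketched on a toy trace as follows. The tab-separated input layout and the validity rules (non-empty fields, positive integer size) are assumptions for illustration; the field names follow the FRAME-TIME, HTTP-URL, and SIZE fields used later in the deck.

```python
def clean_and_parse(raw_lines):
    """Drop invalid records and keep timestamp, URL and size as a dict."""
    records = []
    for line in raw_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # malformed record
        frame_time, url, size = fields
        if not frame_time or not url:
            continue  # empty attribute
        try:
            size_bytes = int(size)
        except ValueError:
            continue  # non-numeric size
        if size_bytes <= 0:
            continue
        # Encode the parsed fields in a fixed schema for storage.
        records.append({"FRAME-TIME": frame_time, "HTTP-URL": url, "SIZE": size_bytes})
    return records

raw = [
    "1426939200.10\thttp://example.com/a\t1024",
    "1426939201.55\t\t2048",                      # missing URL
    "1426939202.00\thttp://example.com/b\tn/a",   # invalid size
    "broken line",                                # wrong number of fields
]
records = clean_and_parse(raw)
```

Only the first toy record survives; the other three are rejected by the cleaning rules.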
23
Big data platform for analysis
Data Analysis: Using the formatted data in the platform, different data analytics techniques can be applied to the header and/or payload information of both the control and data planes, using high-level query languages such as Hive Query Language (HiveQL) and Pig Latin. The goal here is to find relationships between control and data packets. For example, the Mobile Subscriber Integrated Services for Digital Network Number (MSISDN) of users, which is present in control packets but not in data packets, is mapped to the requested content, which is present only in data packets, through successive map-reduce operations.
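The MSISDN-to-content mapping can be pictured as a reduce-side join. The sketch below emulates one map step and one reduce step in plain Python; the shared key `session_id` is hypothetical, since the slides do not say which field links control packets to data packets, and a real deployment would express this in HiveQL, Pig Latin, or a Hadoop job.

```python
from collections import defaultdict

def map_phase(control_packets, data_packets):
    """Emit (session_id, tagged value) pairs from both packet types."""
    for pkt in control_packets:
        yield pkt["session_id"], ("msisdn", pkt["msisdn"])
    for pkt in data_packets:
        yield pkt["session_id"], ("url", pkt["url"])

def reduce_phase(pairs):
    """Group by session_id and pair each MSISDN with the URLs of its session."""
    groups = defaultdict(lambda: {"msisdn": [], "url": []})
    for key, (tag, value) in pairs:
        groups[key][tag].append(value)
    joined = []
    for key in sorted(groups):
        for msisdn in groups[key]["msisdn"]:
            for url in groups[key]["url"]:
                joined.append((msisdn, url))
    return joined

# Toy packets; identifiers and URLs are made up for the example.
control = [{"session_id": "s1", "msisdn": "user-A"},
           {"session_id": "s2", "msisdn": "user-B"}]
data = [{"session_id": "s1", "url": "http://example.com/a"},
        {"session_id": "s1", "url": "http://example.com/b"},
        {"session_id": "s3", "url": "http://example.com/c"}]
joined = reduce_phase(map_phase(control, data))
```

Sessions that appear on only one side (s2 and s3 here) produce no output, as in an inner join.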
24
Big data platform for analysis
Statistical Analysis and Visualization: After the analysis that predicts spatio-temporal user behavior for proactive caching decisions is done, the results can be presented as graphs and tables for easier understanding.
26
Case Study & Experiment
Platform: The platform is based on Hadoop and consists of Cloudera's Distribution Including Apache Hadoop (CDH4) on four nodes. Each node has an Intel Xeon CPU E running at 2.6 GHz, 32 cores, 132 GB of RAM, and a 20 TB hard disk. The network bandwidth is 200 Mb/s at peak hours.
27
Case Study & Experiment
Data: The traces come from 10 regional core areas in Turkey, which carry 15 billion packets in the uplink and over 20 billion packets in the downlink daily (equivalent to almost 80 TB of total data). Approximately 7 hours of traffic is collected (12 p.m. to 7 p.m. on Saturday, 21 March 2015).
28
Case Study & Experiment
Post-processing: A final-traces table/file is obtained, which includes the arrival time (FRAME-TIME), requested content (HTTP-URL), and content size (SIZE).
29
Case Study & Experiment
Numerical setup: Assume that D contents are requested from the processed data over a time interval of 6 h 47 min. The FRAME-TIME, HTTP-URL, and SIZE fields are taken from the traces. The requests are pseudo-randomly assigned to N BSs. The list of simulation parameters is summarized in Table 1.
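The pseudo-random assignment step might look like the sketch below. The uniform choice, the fixed seed, and the values 1000 and 16 are assumptions for illustration; the actual parameters are those of Table 1, which is not reproduced in this transcript.

```python
import random

def assign_requests(num_requests, num_bs, seed=0):
    """Assign each request to a BS index chosen uniformly at random.

    A dedicated Random instance with a fixed seed keeps the run reproducible.
    """
    rng = random.Random(seed)
    return [rng.randrange(num_bs) for _ in range(num_requests)]

# Hypothetical sizes: 1000 requests spread over 16 small BSs.
assignment = assign_requests(num_requests=1000, num_bs=16, seed=42)
```

Re-running with the same seed reproduces the same assignment, which makes comparisons between caching methods repeatable.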
31
Case Study & Experiment
Two methods are examined in the numerical setup. Ground truth: the P matrix is constructed by considering all available information in the final-traces table. Collaborative filtering (CF): ratings are picked uniformly at random to train the estimation of the P matrix; the remaining missing entries/ratings in the traces are then predicted via regularized SVD (singular value decomposition). Caching strategic contents: the most popular contents are stored greedily at the small BSs until no storage space remains [1]. Note: collaborative filtering uses similarity statistics to find neighboring users with similar preferences or interests.
[1] E. Bastug et al., “Big Data Meets Telcos: A Proactive Caching Perspective,” J. Commun. and Networks, Special Issue on Big Data Networking-Challenges and Applications, vol. 17, no. 6, Dec. 2015, pp. 549–57.
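One common reading of "regularized SVD" in collaborative filtering is a low-rank factorization fitted to the observed entries by stochastic gradient descent with an L2 penalty. The sketch below follows that reading; the rank, learning rate, regularization weight, epoch count, and toy ratings are all assumptions, and the slides do not state which solver or settings the authors used.

```python
import math
import random

def factorize(ratings, num_rows, num_cols, rank=2, lr=0.05, reg=0.02, epochs=200, seed=0):
    """Fit P ~ U V^T on observed (row, col, value) triples with L2 regularization."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(num_rows)]
    V = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(num_cols)]
    for _ in range(epochs):
        for i, j, r in ratings:
            err = r - sum(U[i][k] * V[j][k] for k in range(rank))
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    return U, V

def predict(U, V, i, j):
    """Predicted rating for row i and column j, observed or missing."""
    return sum(a * b for a, b in zip(U[i], V[j]))

def fit_error(U, V, ratings):
    """Root mean square error over the observed training entries."""
    return math.sqrt(sum((r - predict(U, V, i, j)) ** 2 for i, j, r in ratings) / len(ratings))

# Toy observed ratings (row = BS or user, column = content); other entries are missing.
observed = [(0, 0, 1.0), (0, 1, 0.8), (0, 3, 0.1),
            (1, 0, 0.5), (1, 2, 0.2),
            (2, 1, 0.16), (2, 2, 0.08), (2, 3, 0.02)]
U0, V0 = factorize(observed, 3, 4, epochs=0)   # untrained baseline
U, V = factorize(observed, 3, 4)               # trained factors
rmse_before = fit_error(U0, V0, observed)
rmse_after = fit_error(U, V, observed)
missing_estimate = predict(U, V, 1, 1)         # entry never observed in training
```

The trained factors fill in entries such as row 1, column 1 that were never observed, which is how the CF method completes the P matrix.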
32
Storage size: 0% means no caching; 100% corresponds to 17.7 GB. Satisfaction: users' request satisfaction.
33
Load = Popularity x Size
Ground truth outperforms the CF method since it has complete information of the content ratings.
34
Ratio: $\sum_n C_n / \sum_n C'_n$. Evolution of users' request satisfaction with respect to the backhaul capacity ratio. Increasing the backhaul link capacity yields higher satisfaction in both the ground truth and CF approaches.
35
RMSE: root mean square error; it expresses the error between CF and the ground truth.
The more training is done, the smaller the difference between CF and the ground truth.
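For reference, the RMSE between a CF estimate and the ground truth can be computed as in this short sketch; the function name and the example values are illustrative only.

```python
import math

def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))

# Made-up popularity values for two contents.
cf_estimate = [3.0, 0.0]
ground_truth = [0.0, 4.0]
error = rmse(cf_estimate, ground_truth)  # sqrt((9 + 16) / 2)
```

An RMSE of zero means the CF estimate matches the ground truth exactly on the compared entries.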
37
Conclusions
This work introduced a proactive caching architecture for 5G wireless networks that processes a huge amount of available data on a big data platform and leverages machine learning tools for prediction. Future work: investigate the proposed big data analysis framework in a real-time fashion.
38
Thank you for listening.