NetAI 2018, Budapest, Hungary
Adaptive Multiple Non-negative Matrix Factorization for Temporal Link Prediction in Dynamic Networks
Kai Lei1, Meng Qin1, Bo Bai2,*, Gong Zhang2
1 ICNLAB, SECE, Peking University Shenzhen Graduate School
2 Future Network Theory Lab, 2012 Labs, Huawei Technologies Co. Ltd

Good afternoon, everyone. I'm glad to be here to deliver this presentation. I'm Meng Qin from Peking University Shenzhen Graduate School. The title of our paper is "Adaptive Multiple Non-negative Matrix Factorization for Temporal Link Prediction in Dynamic Networks", and it is joint work of Peking University and Huawei.
Outline
Motivation
Problem Definition
Methodology
Experimental Evaluation
Conclusion

Dynamics prediction tasks of various network systems can be generally formulated as the temporal link prediction problem.

In this work, we generally formulate the dynamics prediction problem of various network systems as the temporal link prediction task and propose a novel link prediction model to effectively tackle this problem. My presentation includes these five parts.
Motivation
Dynamics of network systems: a significant factor that hinders systems' performance
The prediction of mobility, topology & traffic: an effective technique to tackle such dynamics
e.g. mobility prediction in cellular networks [N. B. Prajapati, et al. '18]
e.g. traffic prediction in data center networks [A. Mozo, et al. '18], etc.

First, let me introduce the motivation of this paper. As we know, dynamics has been considered a significant factor that hinders the performance of various network systems, and the prediction of a network system's mobility, topology, and traffic can serve as an effective technique to tackle this problem. For example, in cellular networks the prediction of users' mobility can help reduce bandwidth resource consumption while achieving better QoS. In data center networks, traffic prediction can help to effectively schedule highly parallel flows. Generally, if the dynamics of a network system can be accurately predicted, resources can be effectively pre-allocated to avoid performance degradation due to resource shortage.
Motivation (Cont.)
Dynamics prediction of network systems
e.g. [A. Mozo, et al. '18]: CNN-based model to forecast short-term traffic load in data center networks
e.g. [L. Nie, et al. '18]: deep belief neural network & compressive sensing to predict traffic in wireless mesh backbone networks
e.g. [N. B. Prajapati. '18]: hidden Markov model to predict users' location in mobile cellular networks

There are several state-of-the-art techniques for predicting the dynamics of different network systems. For instance, a convolutional neural network can be used to predict the short-term traffic load of data center networks. The combination of a deep belief neural network and compressive sensing can be utilized to predict traffic in wireless mesh backbone networks. Users' location in cellular networks can also be predicted by means of a hidden Markov model. However, these state-of-the-art techniques only focus on a specific application scenario and cannot be generalized to other scenarios.
Motivation (Cont.)
Most prediction tasks of mobility, traffic & topology can be generally formulated as the temporal link prediction problem!
Constructing dynamic networks: abstract the entities and the corresponding relations
e.g. data center networks: each switch si becomes a node vi; a data transmission relation between si and sj becomes an edge; the traffic becomes the edge weight
e.g. vehicle mobility networks: each vehicle becomes a node; if d(vi, vj) ≤ δ then Aij = Aji = 1

In fact, most prediction tasks of mobility, traffic, and topology can be generally formulated as the temporal link prediction task from the view of complex network analysis. In other words, we can construct an abstracted dynamic network to describe the relation changes between each pair of entities in the network system and build a temporal link prediction model to generally tackle the dynamics prediction problem. For example, in a data center we can abstract each switch as a node, and the data transmission relation between a certain pair of switches can be represented as the corresponding edge. Moreover, the edge weight can be used to describe the traffic. For a vehicle mobility network, we can also construct an abstracted unweighted network based on the distance between each pair of vehicles. Based on the abstracted dynamic network, we can further introduce our temporal link prediction model to tackle the dynamics prediction task of various network systems.
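As a concrete sketch of this construction, the distance rule for the vehicle network (Aij = Aji = 1 when d(vi, vj) ≤ δ) could be implemented as follows; `build_snapshot` and the toy coordinates are illustrative, not taken from the paper:

```python
import numpy as np

def build_snapshot(positions, delta):
    """Build one unweighted adjacency matrix from node positions.

    positions: (N, 2) array of coordinates at a single time slice.
    delta: distance threshold; nodes within delta of each other are linked.
    """
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)      # pairwise distances d(vi, vj)
    A = (dist <= delta).astype(float)         # Aij = Aji = 1 if d(vi, vj) <= delta
    np.fill_diagonal(A, 0.0)                  # no self-loops
    return A

# toy example: three vehicles on a line; only the first two are within delta
pos = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
A = build_snapshot(pos, delta=2.0)
```

Repeating this per time slice yields the successive snapshots {Aτ-l, …, Aτ} that the model consumes.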
Problem Definition
The temporal link prediction problem (consider an undirected network with a fixed node set)
Definition: Ãτ+1 = f(Aτ-l, Aτ-l+1, …, Aτ-1, Aτ),
where {Aτ-l, …, Aτ-1} are the historical network snapshots, Aτ is the current network snapshot, Ãτ+1 is the prediction result for the next snapshot, At is the adjacency matrix of time slice t, and f is the model to be constructed.

Before formally introducing our model, let me give the problem definition of the temporal link prediction task. Generally, we can divide a dynamic network into multiple successive static network snapshots, and the static topology of each network snapshot can be described by its adjacency matrix. Given the previous l network snapshots and the current network snapshot, the goal of temporal link prediction is to predict the topology of the next time slice. In this work, we consider the case of an undirected network with a fixed node set; we will consider the more challenging case with a varying node set in our future work. The formal definition is given by the equation above, where f is the model we construct in this work.
Methodology
NMF-based network embedding
Non-negative Matrix Factorization (NMF) [D. Lee, Nature '99]
At: adjacency matrix; Xt ≥ 0: network representation matrix; Yt ≥ 0: auxiliary matrix
Objective function: min ||At − Yt XtT||F2 s.t. Xt ≥ 0, Yt ≥ 0
At (N×N) ≈ Yt (N×K) × XtT (K×N), with K < N (low-dimensional hidden space)
Introduces a low-dimensional hidden space (K dimensions)
Preserves the topological characteristics of a single snapshot At

Our temporal link prediction model is based on the non-negative matrix factorization framework. For a certain network snapshot, we factorize the N×N adjacency matrix into two non-negative low-dimensional N×K matrices X and Y. This process introduces a low-dimensional hidden space for the network snapshot, in which the significant topological characteristics are preserved. This property is consistent with some network embedding models, where each node in the network is represented as a low-dimensional vector. Here we refer to Xt as the network representation matrix, as each row of this matrix can serve as the corresponding node vector, while we refer to Yt as the auxiliary matrix. The objective function of this NMF problem is as above.
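A single NMF component of this kind can be solved with the classical multiplicative update rules of Lee & Seung; the sketch below is a minimal illustration of factorizing one snapshot At ≈ Yt XtT, with the iteration count, random seed, and `eps` smoothing term chosen arbitrarily:

```python
import numpy as np

def nmf(A, K, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative N x N matrix A as A ~= Y @ X.T with X, Y >= 0.

    Uses the Lee-Seung multiplicative updates for the Frobenius objective
    ||A - Y X^T||_F^2; updates keep the factors non-negative by construction.
    """
    rng = np.random.default_rng(seed)
    N = A.shape[0]
    Y = rng.random((N, K))
    X = rng.random((N, K))
    for _ in range(n_iter):
        Y *= (A @ X) / (Y @ X.T @ X + eps)    # multiplicative update for Y
        X *= (A.T @ Y) / (X @ Y.T @ Y + eps)  # multiplicative update for X
    return X, Y
```

Each row of the returned X is a K-dimensional node representation, matching the embedding view on this slide.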
Methodology (Cont.)
Unified temporal link prediction model
Define NMF component t: corresponding to a single network snapshot At
Use a linear combination of the NMF components {τ-l, …, τ-1, τ}
Respective Yt for each NMF component: encodes the hidden information of each single network snapshot
Shared X for all NMF components: encodes the hidden information of the dynamic network (with multiple successive snapshots)

For the temporal link prediction task that considers multiple successive network snapshots, we first define the NMF problem of a single network snapshot At as the NMF component corresponding to time slice t. A basic version of the unified model's objective function can then be obtained by combining the corresponding NMF components. In this basic version, each NMF component has its own auxiliary matrix Yt, which encodes the hidden information of a single network snapshot, while all the NMF components share one network representation matrix X, which encodes the hidden information of the whole dynamic network.
Methodology (Cont.)
Unified temporal link prediction model (Cont.)
Use ρt ∈ [0, 1] to adjust NMF component t's contribution
Assumption about the time factor: snapshots close to the current one (Aτ) contribute more, those far from it (Aτ-l) contribute less
Utilize a weighted exponential decaying penalty for the time factor

More importantly, the time factor of the dynamic network should also be taken into account. Here, we use the parameter ρt to control the relative contribution of NMF component t in the unified model, and we adopt the reasonable assumption that NMF components close to the current snapshot should contribute more than those far from it. We then utilize a weighted exponential decaying penalty to reflect the time factor of the dynamic network.
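The paper defines the exact weighted exponential decaying penalty; purely for illustration, a decay of the assumed form ρt ∝ exp(−θ(τ − t)), which gives recent snapshots larger weights, could look like this (`theta` is a hypothetical decay factor, not a value from the paper):

```python
import numpy as np

def decay_weights(l, theta=0.5):
    """Exponentially decaying weights for snapshots tau-l .. tau (oldest first).

    Assumed form: rho_t proportional to exp(-theta * (tau - t)), so the
    current snapshot tau gets the largest weight; normalized to sum to 1.
    """
    ages = np.arange(l, -1, -1)       # tau - t for t = tau-l, ..., tau
    rho = np.exp(-theta * ages)
    return rho / rho.sum()

w = decay_weights(3)  # weights for 4 snapshots, oldest first
```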
Methodology (Cont.)
Unified temporal link prediction model (Cont.)
Reduce the complexity / number of parameters: introduce the adaptive parameter ρt = ρt(X, Xt)
Adaptively controls NMF component t via the similarity between a single snapshot and the dynamic network
Euclidean distance-induced similarity (with max-min normalization)
X: hidden info. of the dynamic network; Xt: hidden info. of network snapshot t

There remains a problem: a separate ρt for each NMF component may still introduce considerable complexity, as we would need to adjust the parameter for each component. To effectively reduce this complexity, we turn the fixed parameter ρt into a function of X and Xt; this modification allows the contribution of the corresponding NMF component in the unified model to be set adaptively. We refer to the modified ρt as the adaptive parameter, defined as follows. In the model, the network representation matrix X is shared by all the NMF components and encodes the hidden information of the whole dynamic network, while Xt is the solution of the corresponding NMF component and encodes the hidden information of a single network snapshot. We can use the similarity between X and Xt to control NMF component t: a larger similarity means a larger contribution of the corresponding NMF component in the unified model. Here, we adopt the Euclidean distance-induced similarity as the metric.
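A sketch of such an adaptive parameter: the specific similarity 1/(1 + ||X − Xt||F) is an assumption standing in for the paper's Euclidean distance-induced similarity, followed by the max-min normalization mentioned on this slide:

```python
import numpy as np

def adaptive_params(X, X_list):
    """Adaptive parameters from the similarity between the shared X and each Xt.

    Assumed similarity: sim_t = 1 / (1 + ||X - Xt||_F), an illustrative
    Euclidean distance-induced choice, then max-min normalized into [0, 1]
    so the most similar snapshot contributes most.
    """
    sims = np.array([1.0 / (1.0 + np.linalg.norm(X - Xt)) for Xt in X_list])
    lo, hi = sims.min(), sims.max()
    if hi == lo:                      # degenerate case: all equally similar
        return np.ones_like(sims)
    return (sims - lo) / (hi - lo)

X = np.ones((4, 2))
rho = adaptive_params(X, [X.copy(), X + 1.0, X + 5.0])
```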
Methodology (Cont.)
The Adaptive Multiple NMF (AM-NMF) model
Objective function (in the s-th iteration)
Properly initialize {X, Yτ-l, …, Yτ}, then utilize certain rules to continuously update {X, Yτ-l, …, Yτ} until convergence
Initialization: X(0) ← Xτ* (solution of NMF component τ); Yt(0) ← Yt* (solution of NMF component t, τ-l ≤ t ≤ τ)
The Xt* of each component are also used to compute the adaptive parameters

The final version of the objective function combines the NMF components from τ-l to τ, each with its own adaptive parameter, and we add another parameter α to comprehensively control the combination of the multiple NMF components. For most NMF-based unified models, the general solving strategy is to appropriately initialize the unknown quantities and use certain updating rules to continuously update their values. In this work, we utilize the solutions of the single NMF components to initialize the unknown quantities: for NMF component t, the optimal auxiliary matrix Yt* initializes the corresponding variable in the unified model, while the network representation matrix Xt* is used to calculate the corresponding adaptive parameter. In particular, we use the solution Xτ* to initialize the shared X in the unified model.
Methodology (Cont.)
The AM-NMF model (Cont.)
Construct the prediction result Ãτ+1: conduct the inverse process of NMF from the hidden space
The base method: directly use the solution {X, Yτ}; no additional parameters
Katz-refining (KR): use the Katz index [Leo Katz '53] to refine the result; needs the parameters {β, θ} to be adjusted

After obtaining the solution of the AM-NMF model, the prediction result for the next network snapshot can be constructed by conducting the inverse process of NMF from the hidden space. We introduce two alternative methods: the base method and the Katz-refining method. The base method obtains the prediction result directly from Yτ and X without adjusting any other parameters. The Katz-refining method considers more of the hidden information learned by the model and uses the Katz index to enhance the similarity between each pair of nodes, but additional effort is needed to adjust the parameters β and θ to achieve better prediction performance.
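The base reconstruction and the standard Katz index can be sketched as below; how the paper's Katz-refining combines the two via the parameter θ is not reproduced here, and `beta` is only an illustrative value (the Katz series converges for β < 1/λmax(A)):

```python
import numpy as np

def predict_base(X, Y_tau):
    """Base method: invert the factorization, A_hat = Y_tau @ X.T."""
    return Y_tau @ X.T

def katz_index(A, beta=0.05):
    """Standard Katz similarity: S = sum_{k>=1} beta^k A^k = (I - beta A)^{-1} - I.

    Counts walks of all lengths between node pairs, exponentially damped by
    beta; valid only when beta < 1 / lambda_max(A).
    """
    I = np.eye(A.shape[0])
    return np.linalg.inv(I - beta * A) - I

# toy check on a 3-node path graph: nodes 0 and 2 are two hops apart,
# so their Katz score is positive even though A[0, 2] = 0
A_path = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
S = katz_index(A_path, beta=0.1)
```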
Methodology (Cont.)
Superiority of the AM-NMF model
Reduces the number of parameters to be adjusted: ρt adaptively adjusts the contribution of NMF component t, leaving only one parameter α to be adjusted
Considers the intrinsic correlation between a single network snapshot and the dynamic network
Can be easily extended to a hybrid network embedding model that integrates other heterogeneous information, e.g. higher-order topology (i.e. motifs) [J. Leskovec et al. Science '16] and node attributes; one only needs to define the corresponding NMF components

As a brief summary, the superiority of our AM-NMF model is threefold. First, the introduction of the adaptive parameter effectively reduces the number of parameters to be adjusted: the relative contributions of the different NMF components are determined automatically, and we only need to adjust the single parameter α. Second, the adaptive parameter also helps the unified model further explore the intrinsic correlation between each single network snapshot and the dynamic network, resulting in better prediction performance. Moreover, AM-NMF can be easily modified into a hybrid network embedding model that integrates other heterogeneous information, such as higher-order topology (motifs) and node attributes. In that case, we only need to define the corresponding NMF components, and the relative contributions of the different information sources are then adjusted automatically.
Methodology (Cont.)
The AM-NMF algorithm
Input: {Aτ-l, …, Aτ}, {Xτ-l*, Yτ-l*, …, Xτ-1*, Yτ-1*}
For the current time slice τ, assume the solutions of the previous NMF components {τ-l, …, τ-1} have been saved
Get the solution {Xτ*, Yτ*} of NMF component τ via the standard NMF process [D. Lee, Nature '99]; to partly avoid local minima, repeat this initialization step at least 10 times
Initialize {X, Yτ-l, …, Yτ}, then alternately conduct the Y-Process and X-Process until convergence
Output: Ãτ+1

We now summarize the algorithm of the AM-NMF model for one temporal link prediction step. Typically, when the system comes to a new time slice, we obtain the solution of the new NMF component and save the result, as it will be reused later. Namely, we assume that the solutions of the previous NMF components have been obtained and saved, so for the initialization step we only need to solve the NMF problem of the current NMF component. In our model, each single NMF component corresponds to a standard NMF process: for NMF component τ, we randomly initialize Xτ and Yτ and use the two standard multiplicative updating rules to continuously update their values until convergence. As this solving strategy cannot guarantee a globally optimal solution, we recommend running this step at least 10 times and choosing the result with the minimum objective value. After finishing the initialization, we alternately conduct the defined Y-Process and X-Process to update the unknown quantities until convergence. Finally, the prediction result is obtained by conducting the inverse process of NMF.
Methodology (Cont.)
The AM-NMF algorithm (Cont.)
Y-Process: update Yt (τ-l ≤ t ≤ τ) with {Yp} (p ≠ t) and X fixed
Objective function → partial derivative with respect to Yt → additive rule (gradient descent) → multiplicative rule

In the Y-Process, we utilize a certain updating rule to update each auxiliary matrix Yt with the other variables fixed. We use a gradient-descent-based approach to derive the updating rule, and we convert the corresponding additive updating rule into a multiplicative form, which converges much faster. For more details of the derivation, please refer to our paper.
Methodology (Cont.)
The AM-NMF algorithm (Cont.)
X-Process: update X with {Yt} (τ-l ≤ t ≤ τ) fixed
Objective function → partial derivative with respect to X → multiplicative updating rule

In the X-Process, we use the derived rule to update the shared X in the unified model with the other variables fixed. The updating rule is derived in the same way as in the Y-Process.
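To make the two processes concrete, here is a sketch of alternating multiplicative updates for the simplified objective J = Σt ρt ||At − Yt XT||F2; this objective and the resulting rules are an illustrative assumption, since the paper's full objective also involves the adaptive parameters and α. With fixed weights, ρt cancels inside each component's own Y-update, while the X-update mixes all components:

```python
import numpy as np

def am_nmf_updates(A_list, rho, X, Y_list, n_iter=100, eps=1e-9):
    """Alternate Y-Process and X-Process for J = sum_t rho_t ||A_t - Y_t X^T||_F^2.

    Illustrative assumption: rho is a fixed weight vector here, whereas the
    paper's adaptive parameters depend on X and the per-snapshot solutions.
    """
    for _ in range(n_iter):
        # Y-Process: update each Y_t with X fixed (rho_t cancels in its own term)
        for A, Y in zip(A_list, Y_list):
            Y *= (A @ X) / (Y @ X.T @ X + eps)
        # X-Process: update the shared X with all Y_t fixed,
        # mixing the components according to their weights rho_t
        num = sum(r * (A.T @ Y) for r, A, Y in zip(rho, A_list, Y_list))
        den = X @ sum(r * (Y.T @ Y) for r, Y in zip(rho, Y_list))
        X *= num / (den + eps)
    return X, Y_list
```

Both updates keep all factors non-negative, mirroring the multiplicative rules on these two slides.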
Experimental Evaluation
Datasets
KAIST: human mobility network (position)
BJ-Taxi: vehicle mobility network (position)
UCSB: wireless mesh network (link quality)
NumFab: data center network (flow)
(N: number of nodes; T: number of time slices; K: dimension of the hidden space)
Preprocessing: extract multiple successive network snapshots
Unweighted networks (KAIST, BJ-Taxi): calculate the distance between each pair of nodes; Aij = Aji = 1 if dij = dji ≤ δ
Weighted networks (UCSB, NumFab): use the link quality / flow as the corresponding edge weight

We apply the temporal link prediction model to four real and simulated datasets of different network systems. Among the four datasets, KAIST and BJ-Taxi are position datasets of human mobility and vehicle mobility, UCSB is a link quality dataset of a wireless mesh network, and NumFab is a simulated flow dataset of a data center network. We pre-process each dataset to extract multiple successive network snapshots. During the pre-processing, we abstract KAIST and BJ-Taxi as unweighted networks, while we abstract UCSB and NumFab as weighted networks.
Experimental Evaluation (Cont.)
Parameter Analysis
Convergence of the AM-NMF algorithm with the varying parameters ρt
Prediction process of one network snapshot of KAIST (uniformly set l = 10)
Variation curves of the (AM-NMF) objective function & all the adaptive parameters: convergence curve of the objective function; convergence curves of the adaptive parameters

We first use the datasets to explore the effect of the adaptive parameters. Different from other NMF-based unified models, the value of each adaptive parameter may vary during the iterative updating process, so the convergence of our solving strategy needs to be specially discussed. We take the prediction process of one network snapshot of the KAIST dataset as an example. In each updating iteration, we recorded the value of the model's objective function and all the adaptive parameters. According to the results, the objective function's value decreases dramatically in the first several iterations and quickly converges, which means the convergence of our solving strategy is still ensured even with non-fixed parameters. On the other hand, the values of all the adaptive parameters change slightly in the first iterations and remain stable in the rest, which also verifies the effect of the adaptive parameters, as different NMF components in the unified model may have different contributions.
Experimental Evaluation (Cont.)
Performance Evaluation
Evaluation metrics
Prediction of unweighted networks: Area Under the ROC Curve (AUC) (ROC: Receiver Operating Characteristic, true positive rate vs. false positive rate)
Prediction of weighted networks: average error rate / Mean Square Error (MSE)

Finally, we evaluate the performance of the AM-NMF model on the four pre-processed datasets for the prediction of both unweighted and weighted networks. In the experiments, we use the AUC metric to evaluate the prediction performance on unweighted networks, while we use the average error rate as the metric for weighted networks.
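The two metrics can be sketched with plain NumPy; `auc_score` uses the pairwise-comparison definition of AUC (O(n²), fine for illustration), and these helpers are illustrative rather than the paper's evaluation code:

```python
import numpy as np

def auc_score(y_true, scores):
    """AUC as P(score of a positive > score of a negative), ties counting 0.5.

    y_true: binary labels (1 = existing link, 0 = non-link), flattened.
    scores: predicted scores for the same entries.
    """
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def mse(A_true, A_pred):
    """Mean square error between ground-truth and predicted weighted snapshots."""
    return np.mean((A_true - A_pred) ** 2)

# a perfect ranking of two links over two non-links gives AUC = 1.0
y = np.array([1, 1, 0, 0])
s = np.array([0.9, 0.8, 0.2, 0.1])
```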
Experimental Evaluation (Cont.)
Performance Evaluation (Cont.)
Comparative methods: collapsed network based (WCT, ED, SVD); NMF based (SNMF-FC, GrNMF)
Prediction of unweighted networks (KAIST, BJ-Taxi; metric: avg. AUC): AM-NMF with Katz-refining (KR) best; AM-NMF without KR second best
Prediction of weighted networks (UCSB, NumFab; metric: avg. MSE): AM-NMF without KR best

In the evaluation, we use five conventional and state-of-the-art temporal link prediction methods as the comparative approaches. For the link prediction of unweighted networks, the AM-NMF method with Katz-refining achieves the best performance, and our method without Katz-refining performs second best on the two testing datasets. On the other hand, for the prediction of weighted networks, although the Katz-refining step does not seem to help, our method without it still achieves the best performance on the two testing datasets.
Conclusion
Generally formulate the dynamics prediction of network systems as the temporal link prediction problem
Propose a novel AM-NMF model
Introduce adaptive parameters to reduce the number of parameters to be adjusted
Consider the intrinsic correlation between a single snapshot and the dynamic network
Can be extended to a hybrid network embedding model that integrates other information, e.g. higher-order topology (motifs) and node attributes
Derive a solving strategy with ensured convergence

Finally, let me give a brief conclusion of this paper.
Conclusion (Cont.)
Future work
The non-linear characteristics of dynamic networks
The effect of the window size l
The sampling frequency of network snapshots
The challenging case with a varying node set
For weighted dynamic networks: the wide value range of edge weights; the sparsity of edge weights
Thank You Very Much! Q&A
NetAI 2018, Budapest, Hungary
Meng Qin (megnqin_az@foxmail.com)

This is the end of my presentation, and I'm happy to answer your questions.