1
Methodology & Current Results
Rui Wu
2
Overall
Goal & Motivation
Solutions
Parameters
Conclusion
3
Goal & Motivation
Goal: a web-based application that calculates complex network parameters.
Motivations: the dataset can be larger than the local machine's hard drive; the client device may have low computing power (e.g. a mobile phone).
Solutions: a client-server structure, Apache Spark GraphX, and Apache Spark Matrix.
4
GraphX
Three programming languages to choose from: Java, Scala, and Python. Scala was chosen (latest version and speed).
GraphX: built on Apache Spark, with useful APIs, but it needs to be customized (matrix operations).
5
Graph Loader
Input: .net file (directed graph).
Steps: parse the file; create RDDs (immutable and spread across multiple machines); create vertices (id and name); create edges (start, end, and weight); create the graph.
This took four days… Scala is so strange, e.g. there is no “continue” in loops, and only the latest version has “break”. It dates from 2004; still young?
Screenshot of the .net file for this one.
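The loader steps above can be sketched outside Spark. What follows is a minimal plain-Python sketch, not the Scala/GraphX loader from the talk. It assumes the usual Pajek layout of a *Vertices section followed by an *Arcs section; the function name `parse_pajek_net` and the sample data are made up for illustration.

```python
import shlex


def parse_pajek_net(text):
    """Parse the *Vertices and *Arcs sections of a Pajek .net file (sketch)."""
    vertices = {}  # vertex id -> name
    edges = []     # (start, end, weight)
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if line.startswith("*"):              # section header such as "*Vertices 3"
            section = line.split()[0].lower()
            continue
        parts = shlex.split(line)              # keeps quoted names together
        if section == "*vertices":
            vid = int(parts[0])
            vertices[vid] = parts[1] if len(parts) > 1 else str(vid)
        elif section in ("*arcs", "*edges"):
            # Assumption: a missing weight defaults to 1.0. A full loader would
            # also add the reverse direction for undirected *Edges entries.
            weight = float(parts[2]) if len(parts) > 2 else 1.0
            edges.append((int(parts[0]), int(parts[1]), weight))
    return vertices, edges


# Hypothetical sample input, not taken from Nexusanon.net.
SAMPLE = """*Vertices 3
1 "A"
2 "B"
3 "C"
*Arcs
1 2 1.0
2 3 2.5
"""

vertices, edges = parse_pajek_net(SAMPLE)
```

In a GraphX version, these two collections would presumably become the vertex RDD and the edge RDD from which the graph is created.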
6
Example--Nexusanon.net
7
Degree Distribution & Average Degree
Not hard: use map and reduce to go through every node and collect its degree.
Apache Spark RDD: don’t use for or while loops.
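The map-and-reduce idea can be illustrated without a cluster. This is a plain-Python sketch under assumptions, not the talk's Spark code: the edge list is hypothetical, and the helper name `degree_distribution` is invented here. On a Spark RDD the same shape would typically be expressed with a map followed by a reduce-by-key.

```python
from collections import Counter
from functools import reduce


def degree_distribution(edges):
    """Return per-node degrees, (degree, number of nodes) pairs, and the average degree."""
    # Map step: every edge contributes one unit of degree to each endpoint.
    pairs = [(node, 1) for start, end in edges for node in (start, end)]

    # Reduce step: sum the contributions per node.
    def add(acc, pair):
        node, count = pair
        acc[node] = acc.get(node, 0) + count
        return acc

    degrees = reduce(add, pairs, {})
    distribution = sorted(Counter(degrees.values()).items())
    average = sum(degrees.values()) / len(degrees) if degrees else 0.0
    return degrees, distribution, average


# Hypothetical example graph.
example_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
degrees, distribution, average = degree_distribution(example_edges)
```

Nodes without any edge never appear in the edge list, so this sketch does not count degree-0 nodes; a complete version would start from the vertex set.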
8
Example--Nexusanon.net
Output format: (degree, number of nodes), written to files.
RESTful APIs --> JSON file; front-end visualization.
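One way the (degree, number of nodes) pairs could be shaped into JSON for a front end is sketched below in plain Python. The field names `degreeDistribution`, `degree`, and `nodes` are assumptions chosen for illustration; the talk does not specify the JSON layout.

```python
import json


def distribution_to_json(distribution):
    """Serialize (degree, number of nodes) pairs into a JSON string (hypothetical layout)."""
    payload = {
        "degreeDistribution": [
            {"degree": degree, "nodes": count} for degree, count in distribution
        ]
    }
    return json.dumps(payload)


# Hypothetical input pairs.
body = distribution_to_json([(1, 1), (2, 2), (3, 1)])
```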
9
Hop Distribution
A is the source: 1 hop: 2, 2 hops: 3…
Path lengths from A to the other nodes (I need to program this myself):
[[B,1],[C,1],[C,2],[D,2],[D,3],[D,4],[E,2],[E,3],[F,3],[F,4],[F,5]]
Two entries have 1 as the last element ([B,1],[C,1]), therefore 1 hop: 2.
Three entries have 2 as the last element ([C,2],[D,2],[E,2]), therefore 2 hops: 3.
...
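The counting step on this slide, grouping the [node, length] pairs by their last element, can be sketched in plain Python. The list is copied from the slide; the helper name `hop_distribution` is invented here, and computing the pairs themselves from the graph is not covered by this sketch.

```python
from collections import Counter

# [node, path length from A] pairs, as listed on the slide.
path_lengths = [
    ["B", 1], ["C", 1], ["C", 2], ["D", 2], ["D", 3], ["D", 4],
    ["E", 2], ["E", 3], ["F", 3], ["F", 4], ["F", 5],
]


def hop_distribution(pairs):
    """Count how many pairs have each hop length as their last element."""
    counts = Counter(length for _node, length in pairs)
    return dict(sorted(counts.items()))


hops = hop_distribution(path_lengths)
```

For the slide's list this reproduces the two values given there (2 entries at 1 hop, 3 entries at 2 hops) and continues the same count for the longer lengths.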
10
Centrality
Mistake: only betweenness centrality and harmonic centrality exist, as third-party implementations.
Degree centrality: easy to do, with steps similar to those introduced for the degree distribution.
Closeness centrality
Eigenvector centrality
11
Closeness Centrality
Shortest path from A to all other nodes.
Shortest-path example from Stack Overflow ( between-two-vertices)
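As an illustration of how shortest paths feed into closeness centrality, here is a plain-Python sketch using breadth-first search on an unweighted directed graph. It is not the Stack Overflow example referenced on the slide. The adjacency data is hypothetical, and the normalization (number of reachable nodes divided by the sum of their distances) is one common convention, assumed here.

```python
from collections import deque


def shortest_hops(adjacency, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness(adjacency, source):
    """Reachable-node count divided by the sum of shortest distances (assumed convention)."""
    dist = shortest_hops(adjacency, source)
    total = sum(d for node, d in dist.items() if node != source)
    reachable = len(dist) - 1
    return reachable / total if total else 0.0


# Hypothetical directed graph.
adjacency = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
distances_from_a = shortest_hops(adjacency, "A")
closeness_a = closeness(adjacency, "A")
```

A node with no outgoing paths, such as D in this example, gets a closeness of 0.0 under this convention.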
12
Eigenvector Centrality
Challenging: converting the graph into a matrix and the matrix operations both need to be programmed by myself.
I will try my best to finish this.
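The two steps named above, converting the graph into a matrix and then applying matrix operations, can be sketched with power iteration in plain Python. This is an illustrative sketch rather than the planned Spark implementation. It assumes an undirected graph so that the adjacency matrix is symmetric, uses a fixed iteration count instead of a convergence test, and the function names and example edges are made up.

```python
def to_matrix(n, edges):
    """Build a symmetric adjacency matrix for n nodes numbered 0..n-1."""
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = 1.0
        matrix[j][i] = 1.0  # assumption: edges are treated as undirected
    return matrix


def eigenvector_centrality(matrix, iterations=100):
    """Power iteration: repeatedly multiply by the matrix and renormalize."""
    n = len(matrix)
    x = [1.0] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in nxt) ** 0.5
        if norm == 0.0:  # graph without edges: nothing to iterate on
            return x
        x = [v / norm for v in nxt]
    return x


# Hypothetical graph: a triangle 0-1-2 with node 3 attached to node 2.
scores = eigenvector_centrality(to_matrix(4, [(0, 1), (0, 2), (1, 2), (2, 3)]))
```

In this example node 2 touches every other node and ends up with the highest score, while the pendant node 3 gets the lowest.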
13
Interesting Question
None of this is very hard, so why is there no GraphX application like Pajek?
Personal guess: cluster setup is not hard but annoying; everything runs through the terminal and scripts.
Solutions: a web-based application plus a good README file.
14
Web-based application
Why: it guarantees good performance even when the client side performs poorly; RESTful APIs are easy to build and efficient; a good README file gives step-by-step server setup.
Challenge: uploading files. Small file: upload directly. Big file: provide a URL and the server downloads it by itself.
15
Conclusion
Solved: graph loader; degree distribution.
Unsolved: hop distribution; degree centrality; closeness centrality; eigenvector centrality; web-based application.
16
Thank you