Published by Barbra Campbell. Modified over 9 years ago.
1
[Figure: analytics pipeline in three stages: Preprocessing, Compute, Post Proc. Diagram labels: XML, Raw Data, ETL, Slice, Compute, Repeat, Subgraph, PageRank, Initial Graph, Analyze, Top Users]
2
GraphX
3
[Figure: Wikipedia pipeline: Raw Wikipedia XML -> Hyperlinks -> PageRank -> Top 20 Pages. Stage labels: Spark Preprocess, Compute, Spark Post., with HDFS between stages. Values shown: 605, 375]
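The Hyperlinks -> PageRank -> Top 20 Pages step can be illustrated with a minimal sketch in plain Python. This is not the GraphX API; the function names `pagerank` and `top_pages`, the damping factor of 0.85, the iteration count, and the toy link data are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterative PageRank over a dict mapping page -> list of linked pages."""
    pages = set(links) | {dst for dsts in links.values() for dst in dsts}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for src, dsts in links.items():
            if dsts:
                share = damping * rank[src] / len(dsts)
                for dst in dsts:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def top_pages(rank, k=20):
    """Return the k highest-ranked (page, score) pairs."""
    return sorted(rank.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical toy hyperlink data (not from the slides).
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
best = top_pages(ranks, k=2)
```

On real Wikipedia data the `links` dictionary would come out of the preprocessing stage, and the ranking would run distributed rather than in one process.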
4
Property Graph

Vertex Table
Id   Property (V)
3    (rxin, student)
7    (jgonzal, postdoc)
5    (franklin, professor)
2    (istoica, professor)

Edge Table
SrcId   DstId   Property (E)
3       7       Collaborator
5       3       Advisor
2       5       Colleague
5       7       PI

[Figure: the same graph drawn with vertices 3 (rxin, stu.), 7 (jgonzal, pst.doc.), 5 (franklin, prof.), 2 (istoica, prof.) and edges labeled Collab., Advisor, Colleague, PI]
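The vertex table and edge table on this slide can be sketched as two plain-Python collections, joined to produce one record per edge with both endpoint properties attached. This is an illustrative sketch, not GraphX code: the helper names `triplets` and `describe` are hypothetical, and the pairing of source and destination ids is a reading of the flattened slide text.

```python
# Vertex table: id -> (name, role), as read from the slide.
vertices = {
    3: ("rxin", "student"),
    7: ("jgonzal", "postdoc"),
    5: ("franklin", "professor"),
    2: ("istoica", "professor"),
}

# Edge table: (src_id, dst_id, relationship), as read from the slide.
edges = [
    (3, 7, "Collaborator"),
    (5, 3, "Advisor"),
    (2, 5, "Colleague"),
    (5, 7, "PI"),
]


def triplets(vertices, edges):
    """Join each edge with the properties of its two endpoint vertices."""
    return [(vertices[src], rel, vertices[dst]) for src, dst, rel in edges]


def describe(triplet):
    """Render one joined record as 'source -[relationship]-> destination'."""
    (src_name, _), rel, (dst_name, _) = triplet
    return f"{src_name} -[{rel}]-> {dst_name}"


lines = [describe(t) for t in triplets(vertices, edges)]
```

Keeping vertices and edges as separate tables is what lets the same data be viewed either as a graph or as ordinary rows.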
5
[Figure: Data-Parallel vs. Graph-Parallel views. Labels: Property Graph, Table, Row, Result]
6
[Figure: combined Wikipedia analytics pipeline. Labels:
Raw Wikipedia XML, Hyperlinks, PageRank, Top 20 Pages (Title, PR)
Text Table (Title, Body), Topic Model (LDA), Word Topics (Word, Topic)
Editor Graph, Community Detection, User Community (User, Com.)
Term-Doc Graph
Discussion Table (User, Disc.), Community Topic (Topic, Com.)]
7
[Figure: a property graph with vertices A to F stored as three RDDs split across Part. 1 and Part. 2 using a 2D vertex cut heuristic.
Vertex Table (RDD): A, B, C, D, E, F
Edge Table (RDD): A-B, A-C, C-D, B-C, A-E, A-F, E-F, E-D
Routing Table (RDD): lists, for each vertex, the partitions (1, 2, or both) that reference it]
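A minimal sketch of 2D edge placement and the resulting routing table, in plain Python. It assumes a simplified rule (source id picks the grid column, destination id picks the grid row); the names `partition_2d` and `build_routing_table`, the integer ids standing in for A to F, and the choice of four partitions are all assumptions, and any id scrambling a real implementation may apply before the modulus is left out.

```python
import math
from collections import defaultdict


def partition_2d(src, dst, num_parts):
    """Place edge (src, dst) on a ceil(sqrt(P)) x ceil(sqrt(P)) grid of partitions."""
    side = math.ceil(math.sqrt(num_parts))
    col = src % side  # the source vertex selects the column
    row = dst % side  # the destination vertex selects the row
    return (col * side + row) % num_parts


def build_routing_table(edges, num_parts):
    """Map each vertex to the set of edge partitions that reference it."""
    routing = defaultdict(set)
    for src, dst in edges:
        part = partition_2d(src, dst, num_parts)
        routing[src].add(part)
        routing[dst].add(part)
    return dict(routing)


# Hypothetical integer ids for the six vertices: A=0, B=1, C=2, D=3, E=4, F=5.
edges = [(0, 1), (0, 2), (2, 3), (1, 2), (0, 4), (0, 5), (4, 5), (4, 3)]
num_parts = 4
routing = build_routing_table(edges, num_parts)

# With a square grid, a vertex touches at most one column plus one row.
side = math.ceil(math.sqrt(num_parts))
max_copies = max(len(parts) for parts in routing.values())
```

The point of the grid is the bound on replication: when the partition count is a perfect square, no vertex is copied to more than `2 * side - 1` partitions, however many edges it has.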
8
Vertex Cut vs. Edge Cut
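The difference between the two cuts can be made concrete with a small plain-Python sketch. An edge cut assigns vertices to partitions and pays for every edge that spans two partitions; a vertex cut assigns edges to partitions and pays for every extra copy of a vertex. The function names, the star-shaped toy graph, and both partition assignments are assumptions chosen for illustration.

```python
def edge_cut_cost(edges, vertex_part):
    """Edge cut: count edges whose endpoints live on different partitions."""
    return sum(1 for src, dst in edges if vertex_part[src] != vertex_part[dst])


def vertex_cut_cost(edges, edge_part):
    """Vertex cut: count vertex copies beyond the first, over all vertices."""
    placements = {}
    for (src, dst), part in zip(edges, edge_part):
        placements.setdefault(src, set()).add(part)
        placements.setdefault(dst, set()).add(part)
    return sum(len(parts) - 1 for parts in placements.values())


# Hypothetical star graph: hub 0 linked to vertices 1..4, split over two partitions.
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
vertex_part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1}  # edge cut: vertices are placed
edge_part = [0, 0, 1, 1]                      # vertex cut: edges are placed

cut_edges = edge_cut_cost(edges, vertex_part)
extra_copies = vertex_cut_cost(edges, edge_part)
```

On this toy graph the edge cut severs two edges while the vertex cut only duplicates the hub once, which is the usual argument for vertex cuts on graphs with a few very high-degree vertices.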