James Abello, MSCS Director Computer Science Department Busch Campus

1 Graph Analytics Primer
James Abello, MSCS Director, Computer Science Department, Busch Campus

2 What is a Combinatorial Graph or Network G = (V,E)?
A collection of vertices V together with a set E of “pairs” of elements from V, called edges. Ex: V = {a, 1, 3, b, z}, E = {{a,1}, {b,3}, {z,1}, {a,b}, {a,z}, {b,z}}. Pictorially? Graphs can have “weights”, “labels” or “time stamps” on the vertices and edges, or more complicated meta-information. Edges may have directions.
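
The example graph can be written down directly in code. The following is a minimal sketch in Python, assuming every vertex name is stored as a string; the helper `adjacency`, the weight values, and the directed edge set are illustrative assumptions, not part of the slides.

```python
# Vertices and edges from the example; every name is written as a string.
V = {"a", "1", "3", "b", "z"}
E = {frozenset(pair) for pair in [("a", "1"), ("b", "3"), ("z", "1"),
                                  ("a", "b"), ("a", "z"), ("b", "z")]}


def adjacency(vertices, edges):
    """Map each vertex to the set of its neighbors."""
    adj = {v: set() for v in vertices}
    for edge in edges:
        u, v = tuple(edge)
        if u not in adj or v not in adj:
            raise ValueError(f"edge {set(edge)} uses a vertex outside V")
        adj[u].add(v)
        adj[v].add(u)
    return adj


adj = adjacency(V, E)

# Hypothetical meta-information: these weight values are made up for illustration.
edge_weight = {frozenset(("a", "b")): 2.5, frozenset(("a", "z")): 1.0}

# Directed edges are ordered pairs, so a tuple replaces the frozenset.
directed_E = {("a", "b"), ("b", "a"), ("z", "1")}

print(sorted(adj["a"]))  # prints ['1', 'b', 'z']
```

Storing undirected edges as `frozenset` objects makes {a,b} and {b,a} the same edge, which matches the set notation used in the example.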

3 Examples
The Web, The Internet, Phone Calls, Maps, Co-Occurrence, Paragraphs in Books, Family Trees, Authors and Papers, Airports and Flights, Clicks on Web Sites, Friendship Networks, Social Media, Biological Networks, Event Collections, Diseases – Symptoms – Treatments, Images (2D, 3D), …

4 Graph Sources
News, Scientific Publications, Astronomical Observations, Pictures, Social Events, Biology, Physics, Mathematics, Social Sciences, Health Care Data, Medicine, …

5 A helpful metaphor
How to go from point a to point b Effectively and Efficiently

6 Typical Tasks
a. Create or define graphs of interest from your data sets.
b. Data access and graph formation.
c. Identify the questions you want to answer.
d. Compute graph statistics: |V|, |E|, connectivity, average degree, degree distribution, density, longest shortest path (diameter), landmarks, most central vertices, …
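
Several of the statistics in task d can be computed with a few lines of code. The sketch below is one possible way to do it in Python, assuming an undirected simple graph stored as adjacency sets; the function names and the dictionary keys are illustrative choices. It runs a breadth-first search from every vertex, which is only practical for small graphs.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Hop distances from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def graph_statistics(adj):
    """Basic statistics of an undirected simple graph given as adjacency sets."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge is counted twice
    degrees = [len(nbrs) for nbrs in adj.values()]
    distances = {v: bfs_distances(adj, v) for v in adj}
    connected = all(len(d) == n for d in distances.values())
    diameter = None  # left undefined for disconnected or empty graphs
    if connected and n:
        diameter = max(max(d.values()) for d in distances.values())
    return {
        "num_vertices": n,
        "num_edges": m,
        "average_degree": 2 * m / n if n else 0.0,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "degree_distribution": dict(Counter(degrees)),
        "connected": connected,
        "diameter": diameter,
    }


# The example graph from slide 2, with vertex names written as strings.
edges = [("a", "1"), ("b", "3"), ("z", "1"), ("a", "b"), ("a", "z"), ("b", "z")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

stats = graph_statistics(adj)
print(stats)
```

On the slide 2 example this reports 5 vertices, 6 edges, and a diameter of 3 (the distance between vertices 3 and 1).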

7 Typical Tasks (cont.)
e. Define a “similarity” between the vertices.
f. Partition the vertices according to the similarity measure of interest. (This has been called “Clustering”; the more fashionable name today is “Unsupervised Learning”.)
g. Interpret the clusters.

8 Typical Tasks (cont.)
h. (Feedback Loop) Incorporate this information back into your data and run modified algorithms of interest.
i. Summarize the findings, publish or incorporate them into processes of interest.
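
Tasks e and f can be illustrated with a small sketch. The slides do not name a similarity measure or a clustering method, so everything here is an assumption chosen for illustration: similarity is the Jaccard index of closed neighborhoods, and vertices are grouped by linking every pair whose similarity reaches a hypothetical `threshold` (single linkage via union-find).

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_similarity(adj, threshold):
    """Partition vertices by linking pairs with similarity >= threshold."""
    parent = {v: v for v in adj}

    def find(v):
        # Follow parent pointers to the representative, compressing the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Closed neighborhood: the vertex itself plus its neighbors.
    closed = {v: adj[v] | {v} for v in adj}
    for u, v in combinations(adj, 2):
        if jaccard(closed[u], closed[v]) >= threshold:
            parent[find(u)] = find(v)

    clusters = {}
    for v in adj:
        clusters.setdefault(find(v), set()).add(v)
    return list(clusters.values())


# Toy graph: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

clusters = cluster_by_similarity(adj, threshold=0.5)
print(sorted(sorted(c) for c in clusters))  # prints [['a', 'b', 'c'], ['d', 'e', 'f']]
```

Comparing all pairs is quadratic in the number of vertices, so this form only suits small graphs; interpreting the resulting clusters (task g) remains a manual step.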

9 Main Issues
a. Define typical scenarios.
b. How is the graph consumed by a user? Text? Visual interface? On demand? On a desktop? Special device?
c. What are the interactivity requirements?
d. How can we amplify a human user's understanding of the graph data? (Maps are good examples of success stories.)
e. How do we access the “satellite” information associated with the graph data?
f. Are the graph and its associated data public?

10 A system at work
An example of the current capabilities of existing graph manipulation systems. Go to the Atlas demo.

11 Atlas: Local Graph Exploration in a Global Context
Fred Hohman (@fredhohman), Georgia Tech; James Abello, Rutgers; Varun Bezzam, Georgia Tech; Polo Chau, Georgia Tech. IUI 2019.

12

13-15 Graph Sensemaking
Global View: Free Exploration. Local View: Targeted Exploration.

16-18 Graph Sensemaking
Important Structure. Important Nodes.

19-21 HCI (Human-Computer Interaction) and Data Mining
HCI: user-driven, iterative; interactive, visualization; thousands of nodes.
Data Mining: automatic; summarization, clustering, classification; millions of nodes.
Reference: The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing. Sahu, et al. VLDB, 2017.

22

23-26 Atlas: interactive graph exploration via scalable edge decomposition
bit.ly/atlas-iui
• separate graph into graph layers
• reveal peculiar subgraph
• visualize local + global structure

27

28-44 [Animation: edge decomposition by peeling. Edges carry numeric labels that are updated as edges are removed at peel = 1, peel = 2, and peel = 3; the process is then repeated on the remaining edges. The result separates the graph into graph layer 1, graph layer 2, and graph layer 3, with vertex clones marking vertices that appear in more than one layer.]
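
The peeling animation can be approximated in code. The slides do not state the algorithm in text, so the following Python sketch rests on an assumption about what the animation shows: peel the graph down to its densest k-core, assign the edges inside that core to layer k, remove them, and repeat on what is left. The function names are hypothetical, and the simple minimum-degree search is quadratic, unlike the scalable implementation the talk refers to.

```python
from collections import defaultdict


def core_numbers(adj):
    """Core number of each vertex, found by repeatedly removing a minimum-degree vertex."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])  # the peel value never decreases
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def edge_layers(edges):
    """Assign every edge of a simple graph to exactly one layer (assumed procedure)."""
    remaining = {frozenset(e) for e in edges}
    layers = {}
    while remaining:
        adj = defaultdict(set)
        for e in remaining:
            u, v = tuple(e)
            adj[u].add(v)
            adj[v].add(u)
        core = core_numbers(adj)
        k = max(core.values())
        top = {v for v, c in core.items() if c == k}
        # Edges with both endpoints in the densest core form layer k.
        layer = {e for e in remaining if e <= top}
        layers[k] = layer
        remaining -= layer
    return layers


def vertex_clones(layers):
    """Vertices that appear in more than one layer, with the layers they appear in."""
    seen = defaultdict(set)
    for k, layer in layers.items():
        for e in layer:
            for v in e:
                seen[v].add(k)
    return {v: ks for v, ks in seen.items() if len(ks) > 1}


# Toy graph: a triangle a-b-c with a pendant vertex d attached to a.
layers = edge_layers([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
clones = vertex_clones(layers)
print(sorted(layers))  # prints [1, 2]
print(clones)          # prints {'a': {1, 2}}
```

In the toy graph the triangle becomes layer 2 and the pendant edge becomes layer 1; vertex a belongs to both, which is the situation the slides call a vertex clone. Each round scans the remaining edges once, which is in the spirit of the O(#edges × #layers) bound quoted later in the talk.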

45 Columns: Graph, Vertices, Edges, Time (s), Layers
Google+ 24K 39K ~0 10
arXiv astro-ph 19K 198K 47
Amazon 335K 925K 6
US Patents 3.8M 17M 11 41
Wikipedia (German) 3.2M 82M 225 320
Orkut 3.1M 117M 92 91

46 Time complexity: O(#edges × #layers), with #layers << #edges (table as on slide 45).

47-48 Scalable K-Core Decomposition for Static Graphs Using a Dynamic Graph Data Structure
Alok Tripathy, Fred Hohman, Duen Horng (Polo) Chau, Oded Green. IEEE International Conference on Big Data. Seattle, WA, USA, 2018.
GPU + dynamic graph data structure → 4x to 8x speedup over ParK.

49 Demo: Understanding Word Embedding Graph
Nodes: 66K words from Wikipedia. Edges: 214K (connecting words with small distance).
Substructures labeled in the figure: families of birds; caeciliidae (worm-like amphibians); families of sea snails.
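
The rule "connect words with small distance" can be sketched as follows. The slide does not say which distance or cutoff was used, so cosine distance, the `max_distance` parameter, and the two-dimensional toy vectors below are all assumptions made for illustration.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 minus the cosine similarity of two nonzero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm


def embedding_graph(vectors, max_distance):
    """Connect every pair of words whose vectors are within max_distance."""
    return {
        frozenset((a, b))
        for a, b in combinations(vectors, 2)
        if cosine_distance(vectors[a], vectors[b]) <= max_distance
    }


# Made-up 2-D vectors; real word embeddings have many more dimensions.
vectors = {
    "sparrow": (1.0, 0.1),
    "finch": (0.9, 0.2),
    "snail": (0.1, 1.0),
}

word_edges = embedding_graph(vectors, max_distance=0.1)
print(word_edges)
```

With these toy vectors only "sparrow" and "finch" end up connected. Comparing all pairs is quadratic in the number of words, so a graph with 66K nodes would more plausibly be built with a nearest-neighbor index; that too is an assumption rather than something the slides state.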

50

51

52 User Study
Goal: use Atlas to spot interesting patterns, mimicking their own work.
Graph analysts: Researcher, Symantec; Researcher, NASA; Systems engineer, NASA. All PhDs + use graphs daily or weekly.
Graphs: Yelp Reviews Network; SEC Insider Trading Graph; GloVe Word Embed. Graph.
Procedure: Intro questionnaire → Atlas tutorial → Study → Exit questionnaire

53-61 User Study Findings
• 3D for overview, 2D for details
• 3D useful for intro to new data → get a “feel” for the graph
• Graph Ribbon + Layers view used more precisely
• Show nearest neighbors used frequently
• Identifying and linking meaningful graph substructures
• Vertex clones as traversal mechanism between layers
• Application to anomaly detection
“…analysis (using [both] vertex clones and layers) naturally reveals potentially anomalous substructures and vertices. This is highly useful from a cybersecurity perspective.”

62-65 Future Work
• Automatically suggest interesting layers
• Dynamic graph decomposition visualization
• Visual scalability (e.g., super-noding, edge bundling, graph motif)

66 Atlas: Local Graph Exploration in a Global Context
bit.ly/atlas-iui
Thanks! Fred Hohman (@fredhohman), James Abello, Varun Bezzam, Polo Chau.
Figure labels: families of birds; caeciliidae (worm-like amphibians); families of sea snails; families of land creatures.
We thank the anonymous reviewers for their constructive feedback.

