Graph and Tensor Mining for fun and profit

Presentation transcript:

Graph and Tensor Mining for fun and profit Luna Dong, Christos Faloutsos, Andrey Kan, Jun Ma, Subho Mukherjee

Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors P2.1: Basics (dfn, PARAFAC) P2.2: Embeddings & mining P2.3: Inference Conclusions KDD 2018 Dong+

‘Recipe’ Structure: Problem definition Short answer/solution LONG answer – details Conclusion/short-answer KDD 2018 Dong+

Tensors, e.g., Knowledge Graph What is ‘normal’? Forecast? Spot errors? … Acted_in dates directed produced born_in KDD 2018 Dong+

P2.2 - Problem: embedding? Map entities, and predicates, to low-dim space(s) … Acted_in dates directed produced born_in … KDD 2018 Dong+

P2.2: Problem: embedding Given entities & predicates, find mappings entity emb. … Acted_in dates directed produced born_in pred. emb. KDD 2018 Dong+

P2.2: Problem: embedding Given entities & predicates, find mappings … locations … Acted_in dates directed produced born_in artists relations+ directed Acted_in dates Artistic-pred. KDD 2018 Dong+

Short answer S1. Two potential relationships among sub (h), pred (r), and obj (t)? Addition: h + r =?= t Multiplication: h ⚬ r =?= t S2. Two loss function options Closed-world assumption Open-world assumption KDD 2018 Dong+

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Why embeddings (even for graphs)? KDD 2018 Dong+

Why embeddings (even for graphs)? Reason #1: easier to visualize / manipulate ? KDD 2018 Dong+

Why embeddings (even for graphs)? Reason #2: wild success of word2vec ‘queen’ = ‘king’ – ‘man’ + ‘woman’ Latent variable 2 (‘leadership’) Latent variable 1 (‘collaborative’) KDD 2018 Dong+

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Familiar embedding: SVD u1 u2 v1 v2 KDD 2018 Dong+

Familiar embedding: SVD u1 u2 v1 v2 KDD 2018 Dong+

SVD as embedding A = U Λ Vᵀ - example: CS x x = MD retrieval inf. lung brain data CS x x = MD KDD 2018 Dong+

SVD as embedding A = U Λ Vᵀ - example: CS-concept MD-concept CS x x = retrieval CS-concept inf. lung MD-concept brain data CS x x = MD KDD 2018 Dong+

SVD as embedding A = U Λ Vᵀ - example: CS-concept MD-concept CS x x = retrieval CS-concept inf. lung MD-concept brain data CS x x = MD KDD 2018 Dong+

SVD as embedding CS x x = MD D₂: embedding of document D2 retrieval inf. lung brain data T₄: embedding of term T4 CS x x = MD KDD 2018 Dong+
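
To make the A = U Λ Vᵀ picture concrete, here is a minimal numpy sketch on a toy document-term matrix; the matrix values and the rank-2 truncation are illustrative choices, not the tutorial's data.

```python
# Minimal sketch: SVD of a toy document-term matrix; rows of U*Λ serve as document
# embeddings and rows of V as term embeddings (all values are illustrative).
import numpy as np

# documents D1..D5 (3 "CS" docs, 2 "MD" docs) x terms [data, inf., retrieval, brain, lung]
A = np.array([[1., 1., 1., 0., 0.],
              [2., 2., 2., 0., 0.],
              [1., 1., 1., 0., 0.],
              [0., 0., 0., 1., 1.],
              [0., 0., 0., 2., 2.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U @ np.diag(s) @ Vt
k = 2                                              # keep the two strongest concepts
doc_emb  = U[:, :k] * s[:k]                        # document embeddings (one row per doc)
term_emb = Vt[:k, :].T                             # term embeddings (one row per term)
print(doc_emb[1])    # embedding of document D2 (a "CS" document)
print(term_emb[3])   # embedding of term T4 ('brain')
```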

Deep Graph Embeddings DeepWalk Node2Vec Metapath2Vec LINE UltimateWalk AutoEncoder Struc2Vec GraphSAGE GCN … Skip-gram KDD 2018 Dong+

Skip-Gram Borrowed from work on language modeling Sample a set of paths with random walks from node vᵢ min −∑_{vⱼ ∈ N(vᵢ)} log P(vⱼ | vᵢ), where P(vⱼ | vᵢ) = exp(vᵢ · vⱼ) / ∑_{vₖ ∈ V} exp(vᵢ · vₖ) Solved with Hierarchical Softmax (DeepWalk), or Negative Sampling (Node2Vec) KDD 2018 Dong+
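
A minimal DeepWalk-style sketch of this recipe, assuming networkx and gensim are installed; the example graph, walk lengths, and embedding size are illustrative choices, not the settings of any specific paper.

```python
# Minimal DeepWalk-style sketch: random walks + skip-gram (assumes networkx, gensim).
import random
import networkx as nx
from gensim.models import Word2Vec

def random_walks(G, num_walks=10, walk_len=20):
    """Sample truncated random walks starting from every node."""
    walks = []
    for _ in range(num_walks):
        for start in G.nodes():
            walk = [start]
            while len(walk) < walk_len:
                nbrs = list(G.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(random.choice(nbrs))
            walks.append([str(v) for v in walk])      # Word2Vec expects string tokens
    return walks

G = nx.karate_club_graph()
walks = random_walks(G)
# sg=1 -> skip-gram; hs=1 -> hierarchical softmax (DeepWalk-style); use hs=0, negative=5
# for negative-sampling training instead (node2vec-style).
model = Word2Vec(walks, vector_size=64, window=5, sg=1, hs=1, min_count=1)
print(model.wv[str(0)])   # embedding of node 0
```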

Deep Graph Embeddings DeepWalk, Node2Vec: skip-gram; Metapath2Vec: heterogeneous graphs; LINE: 1st-order + 2nd-order proximity; UltimateWalk: closed form, unifies DeepWalk and Node2Vec; AutoEncoder: reconstruct W, similar to SVD; Struc2Vec: focuses on structural similarity; GraphSAGE: “inductive”, sample and aggregate; GCN: interesting! borrowed the idea from CNN … KDD 2018 Dong+

Embedding can help with: Reconstruction Fact checking (Node classification) (Link prediction) (Recommendation) … KDD 2018 Dong+

Reconstruction of (2,4) A: D₂ x T₄ = 0 ? CS x x = MD D₂: embedding of document D2 retrieval inf. lung brain data T₄: embedding of term T4 ? CS x x = MD KDD 2018 Dong+

Fact-Checking of (2,4) A: D₂ x T₄ = 0 ? CS x x = MD D₂: embedding of document D2 retrieval inf. lung brain data T₄: embedding of term T4 ? CS x x = MD KDD 2018 Dong+
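
Continuing the numpy sketch above (it reuses A, doc_emb, and term_emb from that snippet), reconstructing or fact-checking cell (2,4) is just an inner product of the two embeddings:

```python
# Fact-check cell (2,4): the inner product of the document and term embeddings
# approximates A[1, 3] (0-based indices for D2 and T4).
approx = doc_emb[1] @ term_emb[3]
print(round(approx, 3), A[1, 3])   # both ~0 here, so a claimed nonzero fact would be suspect
```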

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Familiar embedding: SVD u1 u2 v1 v2 KDD 2018 Dong+

PARAFAC: as embedding = + leaders musicians athletes verb subject object verb leaders musicians athletes KDD 2018 Dong+

PARAFAC: as embedding = + leaders musicians athletes i: ‘Merkel’ j: ‘Germany’ k: ‘leader’ KDD 2018 Dong+

PARAFAC: as embedding = + leaders musicians athletes i: ‘Merkel’ j: ‘Germany’ k: ‘leader’ KDD 2018 Dong+

PARAFAC: as embedding ‘Merkel’: i-th subject vector: (1,0,0) ‘Germany’: j-th object vector: (1,0,0) ‘is_leader’: k-th verb vector: (1,0,0) = + leaders musicians athletes i k j KDD 2018 Dong+

Reconstruction ‘Merkel’: i-th subject vector: s = (1, 0, 0) ‘Germany’: j-th object vector: o = (1, 0, 0) ‘is_leader’: k-th verb vector: v = (1, 0, 0) A: x_{i,j,k} = Σ_{h=1..3} s_{i,h} · o_{j,h} · v_{k,h} = + leaders musicians athletes KDD 2018 Dong+

Reconstruction ‘Merkel’: i-th subject vector: s = (1, 0, 0) ‘Germany’: j-th object vector: o = (1, 0, 0) ‘is_leader’: k-th verb vector: v = (1, 0, 0) A: x_{i,j,k} = Σ_{h=1..3} s_{i,h} · o_{j,h} · v_{k,h} Intuitively: s, v, o should have common ’concepts’ = + leaders musicians athletes KDD 2018 Dong+

Reconstruction ‘Merkel’: i-th subject vector: s = (1, 0, 0) ‘Beatles’: j’-th object vector: o = (0, 1, 0) ‘plays_bass’: k’-th verb vector: v = (0, 1, 0) A: x_{i,j,k} = Σ_{h=1..3} s_{i,h} · o_{j,h} · v_{k,h} Intuitively: s, v, o should have common ’concepts’ = + leaders musicians athletes KDD 2018 Dong+
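
A minimal numpy sketch of this reconstruction rule, using the toy one-hot factor rows from the slides (rank 3, concepts: leaders, musicians, athletes).

```python
# Minimal sketch: CP/PARAFAC reconstruction x_{i,j,k} = sum_h s_{i,h} * o_{j,h} * v_{k,h}.
import numpy as np

s_merkel     = np.array([1., 0., 0.])   # i-th row of the subject factor matrix
o_germany    = np.array([1., 0., 0.])   # j-th row of the object factor matrix
v_is_leader  = np.array([1., 0., 0.])   # k-th row of the verb factor matrix
o_beatles    = np.array([0., 1., 0.])   # j'-th object row
v_plays_bass = np.array([0., 1., 0.])   # k'-th verb row

def reconstruct(s, o, v):
    """Reconstructed tensor entry: sum over the shared 'concept' dimension h."""
    return float(np.sum(s * o * v))

print(reconstruct(s_merkel, o_germany, v_is_leader))    # 1.0 -> plausible triple
print(reconstruct(s_merkel, o_beatles, v_plays_bass))   # 0.0 -> implausible triple
```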

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

P2.2: Short answers S1. What is the relationship among sub (h), pred (r), and obj (t)? Addition: h + r =?= t Multiplication: h ⚬ r =?= t S2. What shall we optimize? Closed-world assumption Open-world assumption KDD 2018 Dong+

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 KDD 2018 Dong+

TransE ‘Merkel’: h = (1, 0, 0) ‘Germany’: t = (1, 1, 0) ‘is_leader’: r = (0, 1, 0) score(h, r, t) = -||h + r - t||1/2 = 0 ‘Merkel’: h = (1, 0, 0) ‘Beatles’: t’ = (0, 0, 1) ‘plays_bass’: r’ = (0, 0, 1) score(h, r’, t’) = -||h + r’ - t’||1/2 = -1 KDD 2018 Dong+
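
A minimal numpy sketch of the TransE score on these toy vectors; the L1 norm is used here (the L2 version works the same way).

```python
# Minimal sketch: TransE score(h, r, t) = -||h + r - t||.
import numpy as np

def transe_score(h, r, t, p=1):
    return -np.linalg.norm(h + r - t, ord=p)

h         = np.array([1., 0., 0.])   # 'Merkel'
r_leader  = np.array([0., 1., 0.])   # 'is_leader'
t_germany = np.array([1., 1., 0.])   # 'Germany'
r_bass    = np.array([0., 0., 1.])   # 'plays_bass'
t_beatles = np.array([0., 0., 1.])   # 'Beatles'

print(transe_score(h, r_leader, t_germany))   #  0.0 -> h + r lands exactly on t
print(transe_score(h, r_bass, t_beatles))     # -1.0 -> farther away, less plausible
```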

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 What if multiple objects apply?? directed directed KDD 2018 Dong+

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 TransH: project to relation-specific hyperplanes KDD 2018 Dong+

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 TransH: project to relation-specific hyperplanes TransR: translate to relation-specific space KDD 2018 Dong+

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 TransH: project to relation-specific hyperplanes TransR: translate to relation-specific space Many simplifications of TransH and TransR. STransE is reported to be the best in Dat Quoc Nguyen, “An overview of embedding models of entities and relationships for knowledge base completion.” KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t Too many parameters!! KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t Simplify RESCAL by using a diagonal matrix KDD 2018 Dong+

RESCAL vs DistMult RESCAL: ‘Merkel’: h = (1, 0)ᵀ ‘Germany’: t = (0, 1)ᵀ ‘is_leader’: W_r = [[1, 1], [1, 1]] score(h, r, t) = hᵀ W_r t = ∑ (h ⊗ t) ⊙ W_r = 1 DistMult: ‘Merkel’: h = (1, 0)ᵀ ‘Germany’: t = (1, 0)ᵀ ‘is_leader’: W_r = [[1, 0], [0, 1]] score(h, r, t) = hᵀ W_r t = ∑ (h ⊙ t) ⊙ diag(W_r) = 1 KDD 2018 Dong+
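
A minimal numpy sketch of the two scores on the toy example above; the note at the end shows why a diagonal relation matrix cannot express asymmetric relations.

```python
# Minimal sketch: RESCAL vs DistMult scores on the toy 2-D example.
import numpy as np

def rescal_score(h, W_r, t):
    """score(h, r, t) = h^T W_r t (full relation matrix)."""
    return float(h @ W_r @ t)

def distmult_score(h, r, t):
    """score(h, r, t) = h^T diag(r) t = sum(h * r * t) (diagonal relation)."""
    return float(np.sum(h * r * t))

h = np.array([1., 0.])
print(rescal_score(h, np.array([[1., 1.], [1., 1.]]), np.array([0., 1.])))   # 1.0
print(distmult_score(h, np.array([1., 1.]), np.array([1., 0.])))             # 1.0
# DistMult is symmetric in h and t: distmult_score(h, r, t) == distmult_score(t, r, h),
# so it cannot model asymmetric relations (cf. the next slide).
```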

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t Simplify RESCAL by using a diagonal matrix Cannot deal with asymmetric relations!! KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t Simplify RESCAL by using a diagonal matrix ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄) Extend DistMult by introducing complex-valued embeddings, so it can handle asymmetric relations KDD 2018 Dong+

ComplEx h = R(h) + i·I(h), t = R(t) + i·I(t), r = R(r) + i·I(r) h ⊙ t̄ = (R(h) + i·I(h)) ⊙ (R(t) − i·I(t)) = R(h)⊙R(t) + I(h)⊙I(t) + i·(I(h)⊙R(t) − R(h)⊙I(t)) Re{h ⊙ t̄ ⊙ r} = R(h)⊙R(t)⊙R(r) + I(h)⊙I(t)⊙R(r) + R(h)⊙I(t)⊙I(r) − I(h)⊙R(t)⊙I(r) KDD 2018 Dong+

ComplEx score(h, r, t) = ∑ Re(h ⊙ t̄ ⊙ r) = ∑ R(h)⊙R(t)⊙R(r) + ∑ I(h)⊙I(t)⊙R(r) + ∑ R(h)⊙I(t)⊙I(r) − ∑ I(h)⊙R(t)⊙I(r) ≠ score(t, r, h) (first term: the DistMult part; the imaginary-part terms give asymmetry) KDD 2018 Dong+
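
A minimal numpy sketch of the ComplEx score; the complex-valued embeddings below are arbitrary illustrative values, chosen only to show that swapping head and tail changes the score.

```python
# Minimal sketch: ComplEx score(h, r, t) = Re( sum_d h_d * r_d * conj(t_d) ).
import numpy as np

def complex_score(h, r, t):
    return float(np.real(np.sum(h * r * np.conj(t))))

h = np.array([1 + 1j, 0 + 0j])
r = np.array([0 + 1j, 1 + 0j])
t = np.array([1 + 0j, 1 + 1j])

print(complex_score(h, r, t))   # -1.0
print(complex_score(t, r, h))   #  1.0 -> unlike DistMult, the score is asymmetric
```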

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄) ConvE: use a convolutional NN to reduce parameters KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄) ConvE: use a convolutional NN to reduce parameters ComplEx and ConvE have state-of-the-art results (reduce parameters; certain flexibility) KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄) ConvE: use a convolutional NN to reduce parameters DistMult is lightweight and good in practice. KDD 2018 Dong+

S2. Loss Function Closed world assumption: square loss L = Σ_{h,t ∈ E, r ∈ R} (y_{h,r,t} − f(h,r,t))² KDD 2018 Dong+

S2. Loss Function Closed world assumption: square loss L = Σ_{h,t ∈ E, r ∈ R} (y_{h,r,t} − f(h,r,t))² Open world assumption: triplet loss L = Σ_{(h,r,t) ∈ T⁺} Σ_{(h’,r’,t’) ∈ T⁻} max(0, γ − f(h,r,t) + f(h’,r’,t’)) (a logistic loss is another option) KDD 2018 Dong+

S2. Loss Function Closed world assumption: square loss L = Σ_{h,t ∈ E, r ∈ R} (y_{h,r,t} − f(h,r,t))² Open world assumption: triplet loss L = Σ_{(h,r,t) ∈ T⁺} Σ_{(h’,r’,t’) ∈ T⁻} max(0, γ − f(h,r,t) + f(h’,r’,t’)) OWA works best (though not necessarily always) KDD 2018 Dong+
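
A minimal numpy sketch of the two loss options for some scoring function f; the scores, labels, and margin below are illustrative values, not results from the tutorial.

```python
# Minimal sketch: closed-world square loss vs open-world triplet (margin) loss.
import numpy as np

def closed_world_loss(scores, labels):
    """Sum of (y_{h,r,t} - f(h,r,t))^2 over all cells of the tensor."""
    return float(np.sum((labels - scores) ** 2))

def open_world_loss(pos_scores, neg_scores, margin=1.0):
    """Sum over positive/negative pairs of max(0, gamma - f(pos) + f(neg))."""
    diffs = margin - pos_scores[:, None] + neg_scores[None, :]
    return float(np.sum(np.maximum(0.0, diffs)))

f_pos = np.array([0.9, 0.7])         # scores of observed (positive) triples
f_neg = np.array([0.2, 0.4, 0.1])    # scores of sampled negative triples
print(closed_world_loss(np.array([0.9, 0.2]), np.array([1.0, 0.0])))   # 0.05
print(open_world_loss(f_pos, f_neg))                                   # 2.6 with these toy numbers
```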

Embedding Applications Learn embeddings from IMDb data and identify WikiData errors, using DistMult
Subject | Relation | Target | Reason
The Moises Padilla Story | writtenBy | César Ámigo Aguilar | Linkage error
Bajrangi Bhaijaan | | Yo Yo Honey Singh | Wrong relationship
Piste noire | | Jalil Naciri |
Enter the Ninja | musicComposedBy | Michael Lewis |
The Secret Life of Words | | Hal Hartley | Cannot confirm
… KDD 2018 Dong+

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Conclusion/Short answer S1. Two potential relationships among sub (h), pred (r), and obj (t)? Addition: h + r =?= t Multiplication: h ⚬ r =?= t S2. Two loss function options Closed-world assumption Open-world assumption KDD 2018 Dong+

Conclusion/Short answer S1. Two potential relationships among sub (h), pred (r), and obj (t)? Addition: h + r =?= t Multiplication: h ⚬ r =?= t S2. Two loss function options Closed-world assumption Open-world assumption State-of-the-art results Multiplication model Open-world assumption KDD 2018 Dong+