Graph and Tensor Mining for fun and profit

Presentation transcript:

Graph and Tensor Mining for fun and profit Luna Dong, Christos Faloutsos, Andrey Kan, Jun Ma, Subho Mukherjee

Roadmap Introduction – Motivation Part#1: Graphs Part#2: Tensors P2.1: Basics (dfn, PARAFAC) P2.2: Embeddings & mining P2.3: Inference Conclusions KDD 2018 Dong+

‘Recipe’ Structure: Problem definition Short answer/solution LONG answer – details Conclusion/short-answer KDD 2018 Dong+

Tensors, e.g., Knowledge Graph What is ‘normal’? Forecast? Spot errors? … Acted_in dates directed produced born_in KDD 2018 Dong+

P2.2 - Problem: embedding? Map entities, and predicates, to low-dim space(s) … Acted_in dates directed produced born_in … KDD 2018 Dong+

P2.2: Problem: embedding Given entities & predicates, find mappings entity emb. … Acted_in dates directed produced born_in pred. emb. KDD 2018 Dong+

P2.2: Problem: embedding Given entities & predicates, find mappings … locations … Acted_in dates directed produced born_in artists relations+ directed Acted_in dates Artistic-pred. KDD 2018 Dong+

Short answer S1. Two potential relationships among sub (h), pred (r), and obj (t)? Addition: h + r =?= t Multiplication: h ⚬ r =?= t S2. Two loss function options Closed-world assumption Open-world assumption KDD 2018 Dong+

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Why embeddings (even for graphs)? KDD 2018 Dong+

Why embeddings (even for graphs)? Reason #1: easier to visualize / manipulate ? KDD 2018 Dong+

Why embeddings (even for graphs)? Reason #2: wild success of word2vec ‘queen’ = ‘king’ – ‘man’ + ‘woman’ Latent variable 2 (‘leadership’) Latent variable 1 (‘collaborative’) KDD 2018 Dong+

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Familiar embedding: SVD u1 u2 v1 v2 KDD 2018 Dong+

Familiar embedding: SVD u1 u2 v1 v2 KDD 2018 Dong+

SVD as embedding A = U Λ Vᵀ - example: CS x x = MD retrieval inf. lung brain data CS x x = MD KDD 2018 Dong+

SVD as embedding A = U Λ Vᵀ - example: CS-concept MD-concept CS x x = retrieval CS-concept inf. lung MD-concept brain data CS x x = MD KDD 2018 Dong+

SVD as embedding A = U Λ Vᵀ - example: CS-concept MD-concept CS x x = retrieval CS-concept inf. lung MD-concept brain data CS x x = MD KDD 2018 Dong+

SVD as embedding CS x x = MD D₂: embedding of document D2 retrieval inf. lung brain data T₄: embedding of term T4 CS x x = MD KDD 2018 Dong+
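
To make the A = U Λ Vᵀ picture concrete, here is a minimal numpy sketch on a toy document-term matrix; the matrix values and the rank-2 truncation are illustrative choices, not the tutorial's data.

```python
# Minimal sketch: SVD of a toy document-term matrix; rows of U*Λ serve as document
# embeddings and rows of V as term embeddings (all values are illustrative).
import numpy as np

# documents D1..D5 (3 "CS" docs, 2 "MD" docs) x terms [data, inf., retrieval, brain, lung]
A = np.array([[1., 1., 1., 0., 0.],
              [2., 2., 2., 0., 0.],
              [1., 1., 1., 0., 0.],
              [0., 0., 0., 1., 1.],
              [0., 0., 0., 2., 2.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U @ np.diag(s) @ Vt
k = 2                                              # keep the two strongest concepts
doc_emb  = U[:, :k] * s[:k]                        # document embeddings (one row per doc)
term_emb = Vt[:k, :].T                             # term embeddings (one row per term)
print(doc_emb[1])    # embedding of document D2 (a "CS" document)
print(term_emb[3])   # embedding of term T4 ('brain')
```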

Deep Graph Embeddings DeepWalk Node2Vec Metapath2Vec LINE UltimateWalk AutoEncoder Struc2Vec GraphSAGE GCN … Skip-gram KDD 2018 Dong+

Skip-Gram Borrowed from work on language modeling Sample a set of paths with random walks from node vᵢ min −∑_{vⱼ ∈ N(vᵢ)} log P(vⱼ | vᵢ), where P(vⱼ | vᵢ) = exp(vᵢ · vⱼ) / ∑_{vₖ ∈ V} exp(vᵢ · vₖ) Solved with Hierarchical Softmax (DeepWalk), or Negative Sampling (Node2Vec) KDD 2018 Dong+
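
A minimal DeepWalk-style sketch of this recipe, assuming networkx and gensim are installed; the example graph, walk lengths, and embedding size are illustrative choices, not the settings of any specific paper.

```python
# Minimal DeepWalk-style sketch: random walks + skip-gram (assumes networkx, gensim).
import random
import networkx as nx
from gensim.models import Word2Vec

def random_walks(G, num_walks=10, walk_len=20):
    """Sample truncated random walks starting from every node."""
    walks = []
    for _ in range(num_walks):
        for start in G.nodes():
            walk = [start]
            while len(walk) < walk_len:
                nbrs = list(G.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(random.choice(nbrs))
            walks.append([str(v) for v in walk])      # Word2Vec expects string tokens
    return walks

G = nx.karate_club_graph()
walks = random_walks(G)
# sg=1 -> skip-gram; hs=1 -> hierarchical softmax (DeepWalk-style); use hs=0, negative=5
# for negative-sampling training instead (node2vec-style).
model = Word2Vec(walks, vector_size=64, window=5, sg=1, hs=1, min_count=1)
print(model.wv[str(0)])   # embedding of node 0
```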

Deep Graph Embeddings DeepWalk, Node2Vec: skip-gram; Metapath2Vec: heterogeneous graphs; LINE: 1st-order + 2nd-order proximity; UltimateWalk: closed form, unifies DeepWalk and Node2Vec; AutoEncoder: reconstruct W, similar to SVD; Struc2Vec: focuses on structural similarity; GraphSAGE: “inductive”, sample and aggregate; GCN: interesting! borrowed the idea from CNN … KDD 2018 Dong+

Embedding can help with: Reconstruction Fact checking (Node classification) (Link prediction) (Recommendation) … KDD 2018 Dong+

Reconstruction of (2,4) A: D₂ x T₄ = 0 ? CS x x = MD D₂: embedding of document D2 retrieval inf. lung brain data T₄: embedding of term T4 ? CS x x = MD KDD 2018 Dong+

Fact-Checking of (2,4) A: D₂ x T₄ = 0 ? CS x x = MD D₂: embedding of document D2 retrieval inf. lung brain data T₄: embedding of term T4 ? CS x x = MD KDD 2018 Dong+
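
Continuing the numpy sketch above (it reuses A, doc_emb, and term_emb from that snippet), reconstructing or fact-checking cell (2,4) is just an inner product of the two embeddings:

```python
# Fact-check cell (2,4): the inner product of the document and term embeddings
# approximates A[1, 3] (0-based indices for D2 and T4).
approx = doc_emb[1] @ term_emb[3]
print(round(approx, 3), A[1, 3])   # both ~0 here, so a claimed nonzero fact would be suspect
```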

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Familiar embedding: SVD u1 u2 v1 v2 KDD 2018 Dong+

PARAFAC: as embedding = + leaders musicians athletes verb subject object verb leaders musicians athletes KDD 2018 Dong+

PARAFAC: as embedding = + leaders musicians athletes i: ‘Merkel’ j: ‘Germany’ k: ‘leader’ KDD 2018 Dong+

PARAFAC: as embedding = + leaders musicians athletes i: ‘Merkel’ j: ‘Germany’ k: ‘leader’ KDD 2018 Dong+

PARAFAC: as embedding ‘Merkel’: i-th subject vector: (1,0,0) ‘Germany’: j-th object vector: (1,0,0) ‘is_leader’: k-th verb vector: (1,0,0) = + leaders musicians athletes i k j KDD 2018 Dong+

Reconstruction ‘Merkel’: i-th subject vector: s = (1, 0, 0) ‘Germany’: j-th object vector: o = (1, 0, 0) ‘is_leader’: k-th verb vector: v = (1, 0, 0) A: x_{i,j,k} = Σ_{h=1..3} s_{i,h} · o_{j,h} · v_{k,h} = + leaders musicians athletes KDD 2018 Dong+

Reconstruction ‘Merkel’: i-th subject vector: s = (1, 0, 0) ‘Germany’: j-th object vector: o = (1, 0, 0) ‘is_leader’: k-th verb vector: v = (1, 0, 0) A: x_{i,j,k} = Σ_{h=1..3} s_{i,h} · o_{j,h} · v_{k,h} Intuitively: s, v, o should have common ’concepts’ = + leaders musicians athletes KDD 2018 Dong+

Reconstruction ‘Merkel’: i-th subject vector: s = (1, 0, 0) ‘Beatles’: j’-th object vector: o = (0, 1, 0) ‘plays_bass’: k’-th verb vector: v = (0, 1, 0) A: x_{i,j,k} = Σ_{h=1..3} s_{i,h} · o_{j,h} · v_{k,h} Intuitively: s, v, o should have common ’concepts’ = + leaders musicians athletes KDD 2018 Dong+
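
A minimal numpy sketch of this reconstruction rule, using the toy one-hot factor rows from the slides (rank 3, concepts: leaders, musicians, athletes).

```python
# Minimal sketch: CP/PARAFAC reconstruction x_{i,j,k} = sum_h s_{i,h} * o_{j,h} * v_{k,h}.
import numpy as np

s_merkel     = np.array([1., 0., 0.])   # i-th row of the subject factor matrix
o_germany    = np.array([1., 0., 0.])   # j-th row of the object factor matrix
v_is_leader  = np.array([1., 0., 0.])   # k-th row of the verb factor matrix
o_beatles    = np.array([0., 1., 0.])   # j'-th object row
v_plays_bass = np.array([0., 1., 0.])   # k'-th verb row

def reconstruct(s, o, v):
    """Reconstructed tensor entry: sum over the shared 'concept' dimension h."""
    return float(np.sum(s * o * v))

print(reconstruct(s_merkel, o_germany, v_is_leader))    # 1.0 -> plausible triple
print(reconstruct(s_merkel, o_beatles, v_plays_bass))   # 0.0 -> implausible triple
```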

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

P2.2: Short answers S1. What is the relationship among sub (h), pred (r), and obj (t)? Addition: h + r =?= t Multiplication: h ⚬ r =?= t S2. What shall we optimize? Closed-world assumption Open-world assumption KDD 2018 Dong+

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 KDD 2018 Dong+

TransE ‘Merkel’: h = (1, 0, 0) ‘Germany’: t = (1, 1, 0) ‘is_leader’: r = (0, 1, 0) score(h, r, t) = -||h + r - t||1/2 = 0 ‘Merkel’: h = (1, 0, 0) ‘Beatles’: t’ = (0, 0, 1) ‘plays_bass’: r’ = (0, 0, 1) score(h, r’, t’) = -||h + r’ - t’||1/2 = -1 KDD 2018 Dong+
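
A minimal numpy sketch of the TransE score on these toy vectors; the L1 norm is used here (the L2 version works the same way).

```python
# Minimal sketch: TransE score(h, r, t) = -||h + r - t||.
import numpy as np

def transe_score(h, r, t, p=1):
    return -np.linalg.norm(h + r - t, ord=p)

h         = np.array([1., 0., 0.])   # 'Merkel'
r_leader  = np.array([0., 1., 0.])   # 'is_leader'
t_germany = np.array([1., 1., 0.])   # 'Germany'
r_bass    = np.array([0., 0., 1.])   # 'plays_bass'
t_beatles = np.array([0., 0., 1.])   # 'Beatles'

print(transe_score(h, r_leader, t_germany))   #  0.0 -> h + r lands exactly on t
print(transe_score(h, r_bass, t_beatles))     # -1.0 -> farther away, less plausible
```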

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 What if multiple objects apply?? directed directed KDD 2018 Dong+

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 TransH: project to relation-specific hyperplanes KDD 2018 Dong+

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 TransH: project to relation-specific hyperplanes TransR: translate to relation-specific space KDD 2018 Dong+

S1. Triple Scoring - Addition Addition: h + r =?= t TransE score(h,r,t) = -||h+r-t||1/2 TransH: project to relation-specific hyperplanes TransR: translate to relation-specific space Many simplifications of TransH and TransR. STransE is reported to be the best in Dat Quoc Nguyen, “An overview of embedding models of entities and relationships for knowledge base completion.” KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t Too many parameters!! KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t Simplify RESCAL by using a diagonal matrix KDD 2018 Dong+

RESCAL vs DistMult RESCAL: ‘Merkel’: h = (1, 0)ᵀ ‘Germany’: t = (0, 1)ᵀ ‘is_leader’: W_r = [[1, 1], [1, 1]] score(h, r, t) = hᵀ W_r t = ∑ (h ⊗ t) ⊙ W_r = 1 DistMult: ‘Merkel’: h = (1, 0)ᵀ ‘Germany’: t = (1, 0)ᵀ ‘is_leader’: W_r = [[1, 0], [0, 1]] score(h, r, t) = hᵀ W_r t = ∑ (h ⊙ t) ⊙ diag(W_r) = 1 KDD 2018 Dong+
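
A minimal numpy sketch of the two scores on the toy example above; the note at the end shows why a diagonal relation matrix cannot express asymmetric relations.

```python
# Minimal sketch: RESCAL vs DistMult scores on the toy 2-D example.
import numpy as np

def rescal_score(h, W_r, t):
    """score(h, r, t) = h^T W_r t (full relation matrix)."""
    return float(h @ W_r @ t)

def distmult_score(h, r, t):
    """score(h, r, t) = h^T diag(r) t = sum(h * r * t) (diagonal relation)."""
    return float(np.sum(h * r * t))

h = np.array([1., 0.])
print(rescal_score(h, np.array([[1., 1.], [1., 1.]]), np.array([0., 1.])))   # 1.0
print(distmult_score(h, np.array([1., 1.]), np.array([1., 0.])))             # 1.0
# DistMult is symmetric in h and t: distmult_score(h, r, t) == distmult_score(t, r, h),
# so it cannot model asymmetric relations (cf. the next slide).
```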

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t Simplify RESCAL by using a diagonal matrix Cannot deal with asymmetric relations!! KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t Simplify RESCAL by using a diagonal matrix ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄) Extend DistMult by introducing complex-valued embeddings, so it can handle asymmetric relations KDD 2018 Dong+

ComplEx h = R(h) + i·I(h), t = R(t) + i·I(t), r = R(r) + i·I(r) h ⊙ t̄ = (R(h) + i·I(h)) ⊙ (R(t) − i·I(t)) = R(h)⊙R(t) + I(h)⊙I(t) + i·(I(h)⊙R(t) − R(h)⊙I(t)) Re{h ⊙ t̄ ⊙ r} = R(h)⊙R(t)⊙R(r) + I(h)⊙I(t)⊙R(r) + R(h)⊙I(t)⊙I(r) − I(h)⊙R(t)⊙I(r) KDD 2018 Dong+

ComplEx score(h, r, t) = ∑ Re(h ⊙ t̄ ⊙ r) = ∑ R(h)⊙R(t)⊙R(r) + ∑ I(h)⊙I(t)⊙R(r) + ∑ R(h)⊙I(t)⊙I(r) − ∑ I(h)⊙R(t)⊙I(r) ≠ score(t, r, h) (first term: the DistMult part; the imaginary-part terms give asymmetry) KDD 2018 Dong+
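
A minimal numpy sketch of the ComplEx score; the complex-valued embeddings below are arbitrary illustrative values, chosen only to show that swapping head and tail changes the score.

```python
# Minimal sketch: ComplEx score(h, r, t) = Re( sum_d h_d * r_d * conj(t_d) ).
import numpy as np

def complex_score(h, r, t):
    return float(np.real(np.sum(h * r * np.conj(t))))

h = np.array([1 + 1j, 0 + 0j])
r = np.array([0 + 1j, 1 + 0j])
t = np.array([1 + 0j, 1 + 1j])

print(complex_score(h, r, t))   # -1.0
print(complex_score(t, r, h))   #  1.0 -> unlike DistMult, the score is asymmetric
```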

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄) ConvE: use a convolutional NN to reduce parameters KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄) ConvE: use a convolutional NN to reduce parameters ComplEx and ConvE have state-of-the-art results (reduce parameters; certain flexibility) KDD 2018 Dong+

S1. Triple Scoring - Multiply Multiplication: h ⚬ r =?= t RESCAL: score(h,r,t) = hᵀ W_r t DistMult: score(h,r,t) = hᵀ diag(r) t ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄) ConvE: use a convolutional NN to reduce parameters DistMult is lightweight and good in practice. KDD 2018 Dong+

S2. Loss Function Closed world assumption: square loss L = Σ_{h,t ∈ E, r ∈ R} (y_{h,r,t} − f(h,r,t))² KDD 2018 Dong+

S2. Loss Function Closed world assumption: square loss L = Σ_{h,t ∈ E, r ∈ R} (y_{h,r,t} − f(h,r,t))² Open world assumption: triplet loss L = Σ_{(h,r,t) ∈ T⁺} Σ_{(h’,r’,t’) ∈ T⁻} max(0, γ − f(h,r,t) + f(h’,r’,t’)) (a logistic loss is another option) KDD 2018 Dong+

S2. Loss Function Closed world assumption: square loss L = Σ_{h,t ∈ E, r ∈ R} (y_{h,r,t} − f(h,r,t))² Open world assumption: triplet loss L = Σ_{(h,r,t) ∈ T⁺} Σ_{(h’,r’,t’) ∈ T⁻} max(0, γ − f(h,r,t) + f(h’,r’,t’)) OWA works best (though not necessarily always) KDD 2018 Dong+
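
A minimal numpy sketch of the two loss options for some scoring function f; the scores, labels, and margin below are illustrative values, not results from the tutorial.

```python
# Minimal sketch: closed-world square loss vs open-world triplet (margin) loss.
import numpy as np

def closed_world_loss(scores, labels):
    """Sum of (y_{h,r,t} - f(h,r,t))^2 over all cells of the tensor."""
    return float(np.sum((labels - scores) ** 2))

def open_world_loss(pos_scores, neg_scores, margin=1.0):
    """Sum over positive/negative pairs of max(0, gamma - f(pos) + f(neg))."""
    diffs = margin - pos_scores[:, None] + neg_scores[None, :]
    return float(np.sum(np.maximum(0.0, diffs)))

f_pos = np.array([0.9, 0.7])         # scores of observed (positive) triples
f_neg = np.array([0.2, 0.4, 0.1])    # scores of sampled negative triples
print(closed_world_loss(np.array([0.9, 0.2]), np.array([1.0, 0.0])))   # 0.05
print(open_world_loss(f_pos, f_neg))                                   # 2.6 with these toy numbers
```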

Embedding Applications Learn embeddings from IMDb data and identify WikiData errors, using DistMult
Subject | Relation | Target | Reason
The Moises Padilla Story | writtenBy | César Ámigo Aguilar | Linkage error
Bajrangi Bhaijaan | | Yo Yo Honey Singh | Wrong relationship
Piste noire | | Jalil Naciri |
Enter the Ninja | musicComposedBy | Michael Lewis |
The Secret Life of Words | | Hal Hartley | Cannot confirm
… KDD 2018 Dong+

Long Answer Motivation Graph Embedding Tensor Embedding Knowledge Graph Embedding KDD 2018 Dong+

Conclusion/Short answer S1. Two potential relationships among sub (h), pred (r), and obj (t)? Addition: h + r =?= t Multiplication: h ⚬ r =?= t S2. Two loss function options Closed-world assumption Open-world assumption KDD 2018 Dong+

Conclusion/Short answer S1. Two potential relationships among sub (h), pred (r), and obj (t)? Addition: h + r =?= t Multiplication: h ⚬ r =?= t S2. Two loss function options Closed-world assumption Open-world assumption State-of-the-art results Multiplication model Open-world assumption KDD 2018 Dong+