1
Graph and Tensor Mining for fun and profit
Luna Dong, Christos Faloutsos, Andrey Kan, Jun Ma, Subho Mukherjee
2
Roadmap
Introduction – Motivation
Part #1: Graphs
Part #2: Tensors
P2.1: Basics (dfn, PARAFAC)
P2.2: Embeddings & mining
P2.3: Inference
Conclusions
3
‘Recipe’ Structure:
Problem definition
Short answer/solution
LONG answer – details
Conclusion/short answer
4
Tensors, e.g., Knowledge Graph
What is ‘normal’? Forecast? Spot errors? [Figure: a knowledge-graph tensor with predicates Acted_in, dates, directed, produced, born_in]
5
P2.2 - Problem: embedding?
Map entities and predicates to low-dimensional space(s). [Figure: knowledge graph with predicates Acted_in, dates, directed, produced, born_in]
6
P2.2: Problem: embedding. Given entities & predicates, find mappings into an entity-embedding space and a predicate-embedding space. [Figure: entities and predicates (Acted_in, dates, directed, produced, born_in) mapped to entity embeddings and predicate embeddings]
7
P2.2: Problem: embedding. Given entities & predicates, find mappings in which related items cluster together. [Figure: entity embedding with clusters of locations and artists; predicate embedding with a cluster of artistic predicates (directed, Acted_in, dates)]
8
Short answer
S1. Two potential relationships among subject (h), predicate (r), and object (t):
Addition: h + r =?= t
Multiplication: h ∘ r =?= t
S2. Two loss-function options:
Closed-world assumption
Open-world assumption
9
Long Answer
Motivation
Graph Embedding
Tensor Embedding
Knowledge Graph Embedding
10
Long Answer
Motivation
Graph Embedding
Tensor Embedding
Knowledge Graph Embedding
11
Why embeddings (even for graphs)?
12
Why embeddings (even for graphs)?
Reason #1: easier to visualize / manipulate.
13
Why embeddings (even for graphs)?
Reason #2: the wild success of word2vec: ‘queen’ = ‘king’ – ‘man’ + ‘woman’. [Figure: 2-D latent space with axes ‘collaborative’ (latent variable 1) and ‘leadership’ (latent variable 2)]
14
Long Answer
Motivation
Graph Embedding
Tensor Embedding
Knowledge Graph Embedding
15
Familiar embedding: SVD
[Figure: SVD of a matrix into left singular vectors u1, u2 and right singular vectors v1, v2]
16
Familiar embedding: SVD
[Figure: SVD of a matrix into left singular vectors u1, u2 and right singular vectors v1, v2]
17
SVD as embedding: A = U Λ Vᵀ
Example: a document-term matrix with terms data, inf., retrieval, brain, lung, and two document groups, CS and MD. [Figure: A = U × Λ × Vᵀ]
18
SVD as embedding: A = U Λ Vᵀ
Example: the two singular components correspond to a ‘CS concept’ and an ‘MD concept’. [Figure: A = U × Λ × Vᵀ, with terms data, inf., retrieval, brain, lung and document groups CS, MD]
19
SVD as embedding: A = U Λ Vᵀ
Example: the two singular components correspond to a ‘CS concept’ and an ‘MD concept’. [Figure: A = U × Λ × Vᵀ, with terms data, inf., retrieval, brain, lung and document groups CS, MD]
20
SVD as embedding: the i-th row of U is the embedding of document Di (e.g., D_2 for document D2), and the j-th row of V is the embedding of term Tj (e.g., T_4 for term T4). [Figure: A = U × Λ × Vᵀ, terms data, inf., retrieval, brain, lung; document groups CS, MD]
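A minimal sketch of SVD-as-embedding on a toy document-term matrix in the spirit of the CS/MD example above; the matrix values are illustrative assumptions, not the numbers from the slides.

```python
import numpy as np

# Toy document-term matrix: rows = documents (3 CS, 2 MD),
# columns = terms (data, inf., retrieval, brain, lung).
A = np.array([
    [1, 1, 1, 0, 0],   # CS documents mention data / inf. / retrieval
    [2, 2, 2, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],   # MD documents mention brain / lung
    [0, 0, 0, 2, 2],
], dtype=float)

# A = U * Lambda * V^T; keep the top-k singular triplets.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U, s, Vt = U[:, :k], s[:k], Vt[:k, :]

doc_emb  = U        # row i = embedding of document D_{i+1}
term_emb = Vt.T     # row j = embedding of term T_{j+1}

# Reconstruct a single entry, e.g., A(2,4): D_2 . diag(Lambda) . T_4
i, j = 1, 3         # 0-based indices for document D2 and term T4
approx = doc_emb[i] @ np.diag(s) @ term_emb[j]
print(f"A[{i},{j}] = {A[i, j]}, rank-{k} reconstruction = {approx:.3f}")
```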
21
Deep Graph Embeddings
Skip-gram, DeepWalk, Node2Vec, Metapath2Vec, LINE, UltimateWalk, AutoEncoder, Struc2Vec, GraphSAGE, GCN, …
22
Skip-Gram: borrowed from language modeling (word2vec).
Sample a set of paths by random walk from node v_i, then minimize
−Σ_{v_j ∈ N(v_i)} log P(v_j | v_i), where P(v_j | v_i) = exp(v_i · v_j) / Σ_{v_k ∈ V} exp(v_i · v_k).
Solved with hierarchical softmax (DeepWalk) or negative sampling (Node2Vec).
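A minimal DeepWalk/Node2Vec-style sketch, assuming a tiny adjacency-list graph and plain SGD: random walks generate (center, context) pairs, and the skip-gram objective above is optimized with negative sampling rather than the full softmax. The graph, dimensions, and hyperparameters are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy undirected graph as an adjacency list (two loosely-connected clusters).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
n, dim = len(adj), 8
emb = rng.normal(scale=0.1, size=(n, dim))   # node ("center") embeddings
ctx = rng.normal(scale=0.1, size=(n, dim))   # context ("output") embeddings

def random_walk(start, length=10):
    """Sample one random-walk path starting at `start`."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(int(rng.choice(adj[walk[-1]])))
    return walk

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr, window, n_neg = 0.05, 2, 3
for _ in range(200):                          # passes over the node set
    for start in adj:
        walk = random_walk(start)
        for i, u in enumerate(walk):          # u plays the role of v_i
            left, right = max(0, i - window), min(len(walk), i + window + 1)
            for v in walk[left:i] + walk[i + 1:right]:   # v_j in N(v_i)
                # Positive pair: push sigma(emb_u . ctx_v) toward 1.
                g = 1.0 - sigmoid(emb[u] @ ctx[v])
                du, dv = lr * g * ctx[v], lr * g * emb[u]
                emb[u] += du
                ctx[v] += dv
                # Negative samples: push sigma(emb_u . ctx_v') toward 0.
                for v_neg in rng.integers(0, n, size=n_neg):
                    g = -sigmoid(emb[u] @ ctx[v_neg])
                    du, dv = lr * g * ctx[v_neg], lr * g * emb[u]
                    emb[u] += du
                    ctx[v_neg] += dv

# Nodes in the same cluster should end up more similar than nodes across clusters.
cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print("same cluster:", cos(emb[0], emb[1]), " across clusters:", cos(emb[0], emb[5]))
```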
23
Deep Graph Embeddings
Skip-gram based: DeepWalk, Node2Vec
Metapath2Vec: heterogeneous graphs
LINE: 1st-order + 2nd-order proximity
UltimateWalk: closed form, unifies DeepWalk and Node2Vec
AutoEncoder: reconstruct W, similar to SVD
Struc2Vec: focuses on structural similarity
GraphSAGE: “inductive”, sample and aggregate
GCN: borrowed the idea from CNN
…
24
Embedding can help with:
Reconstruction, fact checking, (node classification), (link prediction), (recommendation), …
25
Reconstruction of entry (2,4)
A: D_2 × T_4 = 0? That is, reconstruct A(2,4) from the embedding D_2 of document D2 and the embedding T_4 of term T4 (their inner product, weighted by Λ). [Figure: A = U × Λ × Vᵀ]
26
Fact-checking of entry (2,4)
A: D_2 × T_4 = 0? Compare the value reconstructed from the embeddings against the recorded entry. [Figure: A = U × Λ × Vᵀ]
27
Long Answer
Motivation
Graph Embedding
Tensor Embedding
Knowledge Graph Embedding
28
Familiar embedding: SVD
[Figure: SVD of a matrix into left singular vectors u1, u2 and right singular vectors v1, v2]
29
PARAFAC as embedding
[Figure: a subject × object × verb tensor decomposed as the sum of three rank-1 components: ‘leaders’ + ‘musicians’ + ‘athletes’]
30
PARAFAC as embedding. Example entry: subject i = ‘Merkel’, object j = ‘Germany’, verb k = ‘leader’. [Figure: rank-1 components ‘leaders’ + ‘musicians’ + ‘athletes’]
31
PARAFAC as embedding. Example entry: subject i = ‘Merkel’, object j = ‘Germany’, verb k = ‘leader’. [Figure: rank-1 components ‘leaders’ + ‘musicians’ + ‘athletes’]
32
PARAFAC as embedding
‘Merkel’: i-th subject vector: (1, 0, 0)
‘Germany’: j-th object vector: (1, 0, 0)
‘is_leader’: k-th verb vector: (1, 0, 0)
[Figure: rank-1 components ‘leaders’ + ‘musicians’ + ‘athletes’, with indices i, j, k marked]
33
Reconstruction
‘Merkel’: i-th subject vector s = (1, 0, 0)
‘Germany’: j-th object vector o = (1, 0, 0)
‘is_leader’: k-th verb vector v = (1, 0, 0)
A: x_{i,j,k} = Σ_{h=1..3} s_{i,h} · o_{j,h} · v_{k,h}
[Figure: rank-1 components ‘leaders’ + ‘musicians’ + ‘athletes’]
34
Reconstruction
‘Merkel’: i-th subject vector s = (1, 0, 0)
‘Germany’: j-th object vector o = (1, 0, 0)
‘is_leader’: k-th verb vector v = (1, 0, 0)
A: x_{i,j,k} = Σ_{h=1..3} s_{i,h} · o_{j,h} · v_{k,h}
Intuitively: s, v, o should share common ‘concepts’.
[Figure: rank-1 components ‘leaders’ + ‘musicians’ + ‘athletes’]
35
Reconstruction
‘Merkel’: i-th subject vector s = (1, 0, 0)
‘Beatles’: j′-th object vector o = (0, 1, 0)
‘plays_bass’: k′-th verb vector v = (0, 1, 0)
A: x_{i,j′,k′} = Σ_{h=1..3} s_{i,h} · o_{j′,h} · v_{k′,h} = 0 (no shared concept, so the triple is implausible)
Intuitively: s, v, o should share common ‘concepts’.
[Figure: rank-1 components ‘leaders’ + ‘musicians’ + ‘athletes’]
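A worked version of the reconstruction formula x_{i,j,k} = Σ_h s_{i,h} · o_{j,h} · v_{k,h}, using the toy one-hot ‘concept’ vectors from the slides above; nothing beyond those illustrative values is assumed.

```python
import numpy as np

def reconstruct(s, o, v):
    """PARAFAC entry: x_{i,j,k} = sum_h s[h] * o[h] * v[h]."""
    return float(np.sum(np.asarray(s) * np.asarray(o) * np.asarray(v)))

merkel     = (1, 0, 0)   # subject vector: 'leaders' concept
germany    = (1, 0, 0)   # object vector:  'leaders' concept
is_leader  = (1, 0, 0)   # verb vector:    'leaders' concept
beatles    = (0, 1, 0)   # object vector:  'musicians' concept
plays_bass = (0, 1, 0)   # verb vector:    'musicians' concept

print(reconstruct(merkel, germany, is_leader))    # 1.0 -> plausible triple
print(reconstruct(merkel, beatles, plays_bass))   # 0.0 -> no shared concept
```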
36
Long Answer
Motivation
Graph Embedding
Tensor Embedding
Knowledge Graph Embedding
37
P2.2: Short answers
S1. What is the relationship among subject (h), predicate (r), and object (t)?
Addition: h + r =?= t
Multiplication: h ∘ r =?= t
S2. What shall we optimize?
Closed-world assumption
Open-world assumption
38
S1. Triple Scoring - Addition
Addition: h + r =?= t
TransE: score(h,r,t) = −||h + r − t|| (L1 or L2 norm)
39
TransE example
‘Merkel’: h = (1, 0, 0), ‘is_leader’: r = (0, 1, 0), ‘Germany’: t = (1, 1, 0):
score(h, r, t) = −||h + r − t|| = 0 (a plausible triple)
‘Merkel’: h = (1, 0, 0), ‘plays_bass’: r′ = (0, 0, 1), ‘Beatles’: t′ = (0, 0, 1):
score(h, r′, t′) = −||h + r′ − t′|| = −1 (an implausible triple)
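A minimal TransE scoring sketch that reproduces the worked example above with an L1 norm; the toy vectors are exactly the illustrative values from the slide.

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    """TransE: score(h, r, t) = -||h + r - t||  (L1 or L2 norm)."""
    return -np.linalg.norm(np.asarray(h) + np.asarray(r) - np.asarray(t), ord=norm)

merkel, is_leader, germany = (1, 0, 0), (0, 1, 0), (1, 1, 0)
plays_bass, beatles        = (0, 0, 1), (0, 0, 1)

print(transe_score(merkel, is_leader, germany))    #  0.0 -> plausible triple
print(transe_score(merkel, plays_bass, beatles))   # -1.0 -> implausible triple
```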
40
S1. Triple Scoring - Addition
Addition: h + r =?= t
TransE: score(h,r,t) = −||h + r − t||
What if multiple objects apply for the same head and relation (e.g., two ‘directed’ edges from one director)?
41
S1. Triple Scoring - Addition
Addition: h + r =?= t
TransE: score(h,r,t) = −||h + r − t||
TransH: project to relation-specific hyperplanes
42
S1. Triple Scoring - Addition
Addition: h + r =?= t
TransE: score(h,r,t) = −||h + r − t||
TransH: project to relation-specific hyperplanes
TransR: translate to relation-specific space
43
S1. Triple Scoring - Addition
Addition: h + r =?= t
TransE: score(h,r,t) = −||h + r − t||
TransH: project to relation-specific hyperplanes
TransR: translate to relation-specific space
Many simplifications of TransH and TransR exist; STransE is reported to be the best in Dat Quoc Nguyen, “An overview of embedding models of entities and relationships for knowledge base completion”.
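A minimal TransH sketch, assuming each relation has a unit hyperplane normal w_r: head and tail are projected onto the relation's hyperplane before the usual translation check, which is how TransH copes with one head having several valid tails. All vector values below are illustrative assumptions.

```python
import numpy as np

def project(v, w):
    """Project v onto the hyperplane with unit normal w: v - (w . v) w."""
    return v - (w @ v) * w

def transh_score(h, r, t, w, norm=1):
    """TransH: TransE-style score after projecting h and t onto r's hyperplane."""
    h_p, t_p = project(h, w), project(t, w)
    return -np.linalg.norm(h_p + r - t_p, ord=norm)

# One director, two movies: plain TransE would force the two tail vectors to
# coincide, but TransH only needs their *projections* (plus r) to line up.
w_directed = np.array([0.0, 0.0, 1.0])   # unit normal of the 'directed' hyperplane
r_directed = np.array([0.0, 1.0, 0.0])
director   = np.array([1.0, 0.0, 0.0])
movie_a    = np.array([1.0, 1.0, 0.5])   # the movies differ only along w_directed
movie_b    = np.array([1.0, 1.0, -0.7])

print(transh_score(director, r_directed, movie_a, w_directed))   # 0.0
print(transh_score(director, r_directed, movie_b, w_directed))   # 0.0
```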
44
S1. Triple Scoring - Multiply
Multiplication: h ∘ r =?= t
RESCAL: score(h,r,t) = hᵀ W_r t. Too many parameters! (a full d×d matrix per relation)
45
S1. Triple Scoring - Multiply
Multiplication: h ∘ r =?= t
RESCAL: score(h,r,t) = hᵀ W_r t
DistMult: score(h,r,t) = hᵀ diag(r) t (simplify RESCAL by using a diagonal matrix)
46
RESCAL vs. DistMult (worked example)
RESCAL: ‘Merkel’: h = (1, 0)ᵀ, ‘Germany’: t = (0, 1)ᵀ, ‘is_leader’: W_r a full 2×2 matrix; score(h,r,t) = hᵀ W_r t = Σ (h ⊗ t) ⊙ W_r = 1
DistMult: ‘Merkel’: h = (1, 0)ᵀ, ‘Germany’: t = (1, 0)ᵀ, ‘is_leader’: W_r diagonal; score(h,r,t) = hᵀ W_r t = Σ (h ⊙ t) ⊙ diag(W_r) = 1
[The W_r entries are given in the slide’s figure]
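A minimal sketch of the two multiplicative scores; the 2-D vectors and relation parameters are illustrative assumptions in the spirit of the worked example (the exact W_r entries come from the slide's figure and are not reproduced here).

```python
import numpy as np

def rescal_score(h, W_r, t):
    """RESCAL: score(h, r, t) = h^T W_r t  (one full d x d matrix per relation)."""
    return h @ W_r @ t

def distmult_score(h, r, t):
    """DistMult: score(h, r, t) = h^T diag(r) t = sum(h * r * t)."""
    return np.sum(h * r * t)

h = np.array([1.0, 0.0])          # 'Merkel'
t = np.array([0.0, 1.0])          # 'Germany' (RESCAL example)
W_r = np.array([[0.0, 1.0],       # 'is_leader' as a full matrix (illustrative values)
                [0.0, 0.0]])
print(rescal_score(h, W_r, t))    # 1.0

t2 = np.array([1.0, 0.0])         # 'Germany' (DistMult example)
r = np.array([1.0, 0.0])          # 'is_leader' as a diagonal (illustrative values)
print(distmult_score(h, r, t2))   # 1.0
# Note: distmult_score(h, r, t2) == distmult_score(t2, r, h): DistMult is symmetric.
```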
47
S1. Triple Scoring - Multiply
Multiplication: h ∘ r =?= t
RESCAL: score(h,r,t) = hᵀ W_r t
DistMult: score(h,r,t) = hᵀ diag(r) t (simplify RESCAL by using a diagonal matrix)
But it cannot deal with asymmetric relations!
48
S1. Triple Scoring - Multiply
Multiplication: h ∘ r =?= t
RESCAL: score(h,r,t) = hᵀ W_r t
DistMult: score(h,r,t) = hᵀ diag(r) t (simplify RESCAL by using a diagonal matrix)
ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄) (extend DistMult with complex-valued embeddings, so it can handle asymmetric relations)
49
ComplEx
h = R(h) + i·I(h),  t = R(t) + i·I(t),  r = R(r) + i·I(r)
h ⊙ t̄ = (R(h) + i·I(h)) ⊙ (R(t) − i·I(t)) = R(h)⊙R(t) + I(h)⊙I(t) + i·(I(h)⊙R(t) − R(h)⊙I(t))
Re{(h ⊙ t̄) ⊙ r} = R(h)⊙R(t)⊙R(r) + I(h)⊙I(t)⊙R(r) + R(h)⊙I(t)⊙I(r) − I(h)⊙R(t)⊙I(r)
50
ComplEx
score(h, r, t) = Σ Re{(h ⊙ t̄) ⊙ r}
= Σ R(h)⊙R(t)⊙R(r) + Σ I(h)⊙I(t)⊙R(r)   [DistMult-like, symmetric in h and t]
+ Σ R(h)⊙I(t)⊙I(r) − Σ I(h)⊙R(t)⊙I(r)   [asymmetric in h and t]
≠ score(t, r, h)
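A minimal ComplEx scoring sketch using numpy's complex arrays, so that score(h,r,t) = Σ Re(h ⊙ r ⊙ t̄); the embedding values are random illustrative placeholders, and the point is only that swapping head and tail changes the score.

```python
import numpy as np

def complex_score(h, r, t):
    """ComplEx: score(h, r, t) = sum Re(h * r * conj(t)), with complex embeddings."""
    return np.sum(np.real(h * r * np.conj(t)))

rng = np.random.default_rng(0)
dim = 4
h = rng.normal(size=dim) + 1j * rng.normal(size=dim)
r = rng.normal(size=dim) + 1j * rng.normal(size=dim)
t = rng.normal(size=dim) + 1j * rng.normal(size=dim)

print(complex_score(h, r, t))   # generally != complex_score(t, r, h)
print(complex_score(t, r, h))   # swapping head and tail changes the score
```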
51
S1. Triple Scoring - Multiply
Multiplication: h ∘ r =?= t
RESCAL: score(h,r,t) = hᵀ W_r t
DistMult: score(h,r,t) = hᵀ diag(r) t
ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄)
ConvE: use a convolutional NN to reduce parameters
52
S1. Triple Scoring - Multiply
Multiplication: h ∘ r =?= t
RESCAL: score(h,r,t) = hᵀ W_r t
DistMult: score(h,r,t) = hᵀ diag(r) t
ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄)
ConvE: use a convolutional NN to reduce parameters, with a certain flexibility
ComplEx and ConvE report state-of-the-art results.
53
S1. Triple Scoring - Multiply
Multiplication: h ∘ r =?= t
RESCAL: score(h,r,t) = hᵀ W_r t
DistMult: score(h,r,t) = hᵀ diag(r) t
ComplEx: score(h,r,t) = Re(hᵀ diag(r) t̄)
ConvE: use a convolutional NN to reduce parameters
DistMult is lightweight and good in practice.
54
S2. Loss Function
Closed-world assumption: squared loss
L = Σ_{h,t ∈ E, r ∈ R} (y_{h,r,t} − f(h,r,t))²
55
S2. Loss Function
Closed-world assumption: squared loss
L = Σ_{h,t ∈ E, r ∈ R} (y_{h,r,t} − f(h,r,t))²
Open-world assumption: triplet loss (over observed triples T⁺ and sampled negatives T⁻)
L = Σ_{T⁺} Σ_{T⁻} max(0, γ − f(h,r,t) + f(h′,r′,t′))
(A logistic loss is another option.)
56
S2. Loss Function
Closed-world assumption: squared loss
L = Σ_{h,t ∈ E, r ∈ R} (y_{h,r,t} − f(h,r,t))²
Open-world assumption: triplet loss, often reported to work best (though not necessarily true in every setting)
L = Σ_{T⁺} Σ_{T⁻} max(0, γ − f(h,r,t) + f(h′,r′,t′))
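A minimal sketch of the open-world (margin/triplet) loss above, pairing each observed triple with a corrupted negative; DistMult is used as the scoring function f, and corrupting the tail with a random entity is one common negative-sampling choice, not something prescribed by the slides. All data below are illustrative.

```python
import numpy as np

def distmult(h, r, t):
    return np.sum(h * r * t)

def margin_loss(pos_triples, neg_triples, ent, rel, gamma=1.0):
    """OWA triplet loss: sum over (pos, neg) pairs of max(0, gamma - f(pos) + f(neg))."""
    loss = 0.0
    for (h, r, t), (h2, r2, t2) in zip(pos_triples, neg_triples):
        f_pos = distmult(ent[h], rel[r], ent[t])
        f_neg = distmult(ent[h2], rel[r2], ent[t2])
        loss += max(0.0, gamma - f_pos + f_neg)
    return loss

rng = np.random.default_rng(0)
n_ent, n_rel, dim = 5, 2, 8
ent = rng.normal(size=(n_ent, dim))   # entity embeddings
rel = rng.normal(size=(n_rel, dim))   # relation embeddings

pos = [(0, 0, 1), (2, 1, 3)]          # observed (h, r, t) index triples
# Corrupt the tail with a random entity (a real trainer would avoid the true tail).
neg = [(h, r, int(rng.integers(n_ent))) for h, r, _ in pos]
print(margin_loss(pos, neg, ent, rel))
```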
57
Embedding Applications
Learn embeddings from IMDb data and identify WikiData errors, using DistMult:
Subject | Relation | Target | Reason
The Moises Padilla Story | writtenBy | César Ámigo Aguilar | Linkage error
Bajrangi Bhaijaan | | Yo Yo Honey Singh | Wrong relationship
Piste noire | | Jalil Naciri |
Enter the Ninja | musicComposedBy | Michael Lewis |
The Secret Life of Words | | Hal Hartley | Cannot confirm
…
58
Long Answer
Motivation
Graph Embedding
Tensor Embedding
Knowledge Graph Embedding
59
Conclusion/Short answer
S1. Two potential relationships among subject (h), predicate (r), and object (t):
Addition: h + r =?= t
Multiplication: h ∘ r =?= t
S2. Two loss-function options:
Closed-world assumption
Open-world assumption
60
Conclusion/Short answer
S1. Two potential relationships among subject (h), predicate (r), and object (t):
Addition: h + r =?= t
Multiplication: h ∘ r =?= t
S2. Two loss-function options:
Closed-world assumption
Open-world assumption
State-of-the-art results: multiplication models with the open-world assumption.