Link Prediction on Hacker Networks

Slides:



Advertisements
Similar presentations
Peer-to-Peer and Social Networks Power law graphs Small world graphs.
Advertisements

Link Prediction in Social Networks
COS 461 Fall 1997 Routing COS 461 Fall 1997 Typical Structure.
Emergence of Scaling in Random Networks Albert-Laszlo Barabsi & Reka Albert.
CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.
Analysis and Modeling of Social Networks Foudalis Ilias.
CSE 522 – Algorithmic and Economic Aspects of the Internet Instructors: Nicole Immorlica Mohammad Mahdian.
1 Evolution of Networks Notes from Lectures of J.Mendes CNR, Pisa, Italy, December 2007 Eva Jaho Advanced Networking Research Group National and Kapodistrian.
Networks. Graphs (undirected, unweighted) has a set of vertices V has a set of undirected, unweighted edges E graph G = (V, E), where.
LINK PREDICTION IN CO-AUTHORSHIP NETWORK Le Nhat Minh ( A N) Supervisor: Dongyuan Lu Aobo Tao Chen 1.
Using Structure Indices for Efficient Approximation of Network Properties Matthew J. Rattigan, Marc Maier, and David Jensen University of Massachusetts.
1 Complex systems Made of many non-identical elements connected by diverse interactions. NETWORK New York Times Slides: thanks to A-L Barabasi.
Web as Graph – Empirical Studies The Structure and Dynamics of Networks.
Peer-to-Peer and Grid Computing Exercise Session 3 (TUD Student Use Only) ‏
Graphs and Topology Yao Zhao. Background of Graph A graph is a pair G =(V,E) –Undirected graph and directed graph –Weighted graph and unweighted graph.
How is this going to make us 100K Applications of Graph Theory.
Algorithms for Data Mining and Querying with Graphs Investigators: Padhraic Smyth, Sharad Mehrotra University of California, Irvine Students: Joshua O’
Behavior Analytics Social Media Mining. 2 Measures and Metrics 2 Social Media Mining Behavior Analytics Examples of Behavior Analytics What motivates.
Peer-to-Peer and Social Networks Random Graphs. Random graphs E RDÖS -R ENYI MODEL One of several models … Presents a theory of how social webs are formed.
Λ14 Διαδικτυακά Κοινωνικά Δίκτυα και Μέσα Link Prediction.
Jerry Scripps N T O K M I N I N G E W R. Overview What is network mining? What is network mining? Motivation Motivation Preliminaries Preliminaries definitions.
Topic 13 Network Models Credits: C. Faloutsos and J. Leskovec Tutorial
WEB SCIENCE: ANALYZING THE WEB. Graph Terminology Graph ~ a structure of nodes/vertices connected by edges The edges may be directed or undirected Distance.
Online Social Networks and Media Absorbing Random Walks Link Prediction.
1 Applications of Relative Importance  Why is relative importance interesting? Web Social Networks Citation Graphs Biological Data  Graphs become too.
1 Computing with Social Networks on the Web (2008 slide deck) Jennifer Golbeck University of Maryland, College Park Jim Hendler Rensselaer Polytechnic.
Small World Social Networks With slides from Jon Kleinberg, David Liben-Nowell, and Daniel Bilar.
Small-world networks. What is it? Everyone talks about the small world phenomenon, but truly what is it? There are three landmark papers: Stanley Milgram.
Social Network Analysis (1) LING 575 Fei Xia 01/04/2011.
COLOR TEST COLOR TEST. Social Networks: Structure and Impact N ICOLE I MMORLICA, N ORTHWESTERN U.
The Link Prediction Problem for Social Networks David Libel-Nowell, MIT John Klienberg, Cornell Saswat Mishra sxm
Optimal Link Bombs are Uncoordinated Sibel Adali Tina Liu Malik Magdon-Ismail Rensselaer Polytechnic Institute.
3. SMALL WORLDS The Watts-Strogatz model. Watts-Strogatz, Nature 1998 Small world: the average shortest path length in a real network is small Six degrees.
Λ14 Διαδικτυακά Κοινωνικά Δίκτυα και Μέσα Link Prediction.
LINK PREDICTION IN CO-AUTHORSHIP NETWORK Le Nhat Minh ( A N) Supervisor: Dongyuan Lu 1.
Link Prediction Topics in Data Mining Fall 2015 Bruno Ribeiro
Small World Social Networks With slides from Jon Kleinberg, David Liben-Nowell, and Daniel Bilar.
Informatics tools in network science
Importance Measures on Nodes Lecture 2 Srinivasan Parthasarathy 1.
Response network emerging from simple perturbation Seung-Woo Son Complex System and Statistical Physics Lab., Dept. Physics, KAIST, Daejeon , Korea.
Algorithms and Computational Biology Lab, Department of Computer Science and & Information Engineering, National Taiwan University, Taiwan Network Biology.
Link Prediction Class Data Mining Technology for Business and Society
1 Relates to Lab 4. This module covers link state routing and the Open Shortest Path First (OSPF) routing protocol. Dynamic Routing Protocols II OSPF.
Lecture 23: Structure of Networks
Structures of Networks
Due: a start of class Oct 12
Lecture 1: Complex Networks
Topics In Social Computing (67810)
COMP 3270 Computer Networks
Tutorial 12 Biological networks.
Due: a start of class Oct 26
Empirical analysis of Chinese airport network as a complex weighted network Methodology Section Presented by Di Li.
Lesson 2-9 AP Computer Science Principles
Link Prediction Seminar Social Media Mining University UC3M
De-anonymizing the Internet Using Unreliable IDs
Peer-to-Peer and Social Networks
IDENTIFICATION OF DENSE SUBGRAPHS FROM MASSIVE SPARSE GRAPHS
18-WAN Technologies and Dynamic routing
Graph Analysis by Persistent Homology
Lecture 23: Structure of Networks
Network Science: A Short Introduction i3 Workshop
The Watts-Strogatz model
Why Social Graphs Are Different Communities Finding Triangles
Peer-to-Peer and Social Networks Fall 2017
Department of Computer Science University of York
Routing Protocols Charles Warren.
Peer-to-Peer and Social Networks
Lecture 23: Structure of Networks
My Web Page is My First Line of Defense!
Proximity in Graphs by Using Random Walks
Presentation transcript:

Link Prediction on Hacker Networks Samantha Lee

Given a network with links in one time step, can we predict the links in the next time step? Given a network of hackers joined by links if they attacked the same victim, can we predict if a hacker will be linked with another hacker in the next time step?

Why do you care about it? Application of network science Increase understanding of attacker behavior - if a hacker is linked to another hacker, they could be sharing information Identify attackers who will collaborate Take preventive / offensive action

The Data Zone-H http://www.zone-h.org/ cybercrime archive 10 years of data: 2005-2014 Total of 9,528,165 defacements Website defacement is an attack on a website that changes the visual appearance of the site or a webpage. Approx. 9 million defacements

466,226 defacements A: country of attacker B: Date defacement was entered into Zone-H C: hacker name D: victim website E: victim IP address F: victim operating system G: victim server/software H: reason for attack I: Type of vulnerability J: mirror site address with actual images K: type of defacement - mass or regular L: did the attacker re-deface the same site? M: Published or not. N: primary or secondary O: unique defacement ID P: Physical location of the server that was defaced.

Interesting Attributes A: country of attacker B: Date defacement was entered into Zone-H C: hacker name E: victim IP address H: reason for attack P: Physical location of the server that was defaced

2005 Network

2006 Network

Link Prediction - who is linked next? Given: Graph G=(E,V); edge set E; node set V; e=(u,v) ∈ E with t(e) timestamp of the edge Let G[t,t’] be the subgraph with edges t < t(e) < t’ Output: Edges not in G[t0, t0’] predicted to be in G[t1,t1’] where t0 < t0’ < t1 < t1’ Note: core nodes

G=(V,E) t1 to t1’ t0 to t0’ link prediction

3 kinds of features to say how “close” 2 nodes are: Network factors (independent of actual data) Domain specific: Country, Reason for attack, Victim ip addresses & location of victim servers Individual attributes: total number of defacements

More on Network factors Shortest path distance Common neighbors Jaccard's coefficient Preferential attachment Katz Hitting time Page rank Jaccard’s coefficient - measures the probability that both x and y have a feature f, for a randomly selected feature f that either x or y has Katz - directly sums over this collection of paths, exponentially damped by length to count short paths more heavily

To Predict Links: My Features Common neighbors Country of the hacker or hacker group - hackers from the same country have a higher probability of linking Reason for attack - common goal unites

Upcoming Work Use subsets of features to do link prediction. Break my data into two sets, use the first part to try link prediction, use the second part to see how it went.

Sources https://en.wikipedia.org/wiki/Website_defacement http://be.amazd.com/link-prediction/ The link-prediction problem for social networks. David Liben-Nowell, Jon Kleinberg http://www.siam.org/meetings/sdm06/workproceed/Link%20Analysis/12.pdf Cytoscape

Questions?