1 Graph-RAT Overview By Daniel McEnnis

2 What is Graph-RAT
- Relational Analysis Toolkit
- Database abstraction layer
- Evaluation platform
- Robustly evaluates the different ways of performing recommendation

3 Kinds of Analysis
- Recommendation Systems
- Relational Machine Learning
- Data Mining
- MIR document retrieval

4 Talk Outline
- Base Components
- Queries
- Algorithms
- Schedulers
- Graph-RAT Language
- Conclusion and Examples

5 Base Components
- Graphs
- Actors
- Links
- Properties
[Figure: example graphs of actors A to E; actor A carries the properties Name (John), Age (22), and Hobbies ([Vector] Hiking, Biking); one element is labelled "Library"]

6 Properties
- Variables of Graph-RAT
- Can be arbitrary Java types
- Can be attached to anything
- Unique ID string for each object
- Accessed only as sets, not as objects
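The property model on this slide could be sketched roughly as follows. This is an illustration only: `PropertySketch` and its methods are hypothetical names, not the Graph-RAT Property API. It assumes a property pairs an ID string with a Java type and exposes its values only as a read-only set.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch, not the Graph-RAT class: an ID string, a Java value type,
// and values that are only ever handed out as a set.
final class PropertySketch<T> {
    private final String id;
    private final Class<T> type;
    private final Set<T> values = new LinkedHashSet<>();

    PropertySketch(String id, Class<T> type) {
        this.id = id;
        this.type = type;
    }

    String getId() { return id; }

    Class<T> getType() { return type; }

    // Runtime type check so a raw-typed caller cannot add the wrong class.
    void add(T value) { values.add(type.cast(value)); }

    // Callers see a read-only set, never an individual value object.
    Set<T> getValues() { return Collections.unmodifiableSet(values); }
}
```

Storing the values in a set means adding "Hiking" twice leaves a single entry, which matches the "accessed only as sets" bullet.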

7 Data View
- Hyper-graph structure defined by the set of actors and links in a graph
- Accessible from the enclosing graph
- Can be cyclic
[Figure: example graphs of actors A to E]

8 Metadata View
- Not constructed by default
- Implicit graph described by modes and the relations between them
- Needed for relational machine learning
[Figure labels: User, Friend]

9 Query Language
- Constructs sets retrieved from a graph
- Functional structure
- Similar to SQL
- 4 types
  - Graph Queries
  - Actor Queries
  - Link Queries
  - Property Queries

10 Query Structure
- Cascading queries in a LISP-style syntax
- Each child query is of a different type
- Restrictions can be added at runtime

11 Query Examples
  LinkByActor(
    false,
    ActorByMode(false, "Target", ".*")
    ActorByMode(false, "Source", ".*")
    SetOperation.XOR)
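One way to read the nested query above is as function composition: two actor queries each produce a set, and the link query combines them with a set operation. The Java sketch below is a guess at those semantics, not Graph-RAT code. `QuerySketch`, `actorByMode`, and `linkByActorXor` are hypothetical names, and it assumes that the first actor set is tested against a link's source end, that the second is tested against its destination end, and that XOR keeps links where exactly one test passes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch of cascading queries; actors are plain ID strings and
// each link is a two-element array {source, destination}.
final class QuerySketch {

    // Actor query: IDs of all actors in the given mode whose ID matches the regex.
    static Set<String> actorByMode(Map<String, String> modeOfActor, String mode, String idPattern) {
        Pattern pattern = Pattern.compile(idPattern);
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, String> entry : modeOfActor.entrySet()) {
            if (entry.getValue().equals(mode) && pattern.matcher(entry.getKey()).matches()) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    // Link query with an assumed XOR reading: keep a link when exactly one of
    // "source is in the first set" and "destination is in the second set" holds.
    static List<String[]> linkByActorXor(List<String[]> links, Set<String> first, Set<String> second) {
        List<String[]> result = new ArrayList<>();
        for (String[] link : links) {
            boolean sourceMatches = first.contains(link[0]);
            boolean destinationMatches = second.contains(link[1]);
            if (sourceMatches ^ destinationMatches) {
                result.add(link);
            }
        }
        return result;
    }
}
```

The inner calls run first and the outer call consumes their results, which mirrors the cascading structure described on the previous slide.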

12 Query Comparisons
- Similar to the Jena interface
- Construction is similar to the JUNG system
- Implements all SQL queries that do not require temporary tables

13 0.4.3 Query
- Uses graph primitives instead of Queries
- Algorithms use hard-coded GraphByID

14 Algorithms
- Functions that execute over a given graph
- Metadata is a part of the algorithm
- Properties utilized or created are declared up front
- Excepting output algorithms, no side effects are permitted
  execute(Graph graph)
  IODescriptor getInput()
  IODescriptor getOutput()
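The three methods listed on this slide suggest an interface along the following lines. This is a minimal sketch under stated assumptions: `SketchGraph`, `SketchIODescriptor`, `SketchAlgorithm`, and `NameLengthAlgorithm` are hypothetical stand-ins, the real Graph-RAT types are likely richer, and the property names "Name" and "NameLength" are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a graph: actor ID -> (property ID -> value).
final class SketchGraph {
    final Map<String, Map<String, Object>> actors = new HashMap<>();
}

// Hypothetical descriptor naming one property an algorithm reads or writes.
final class SketchIODescriptor {
    final String propertyId;

    SketchIODescriptor(String propertyId) { this.propertyId = propertyId; }
}

// Shape suggested by the slide: one execute method plus declared input and output.
interface SketchAlgorithm {
    void execute(SketchGraph graph);

    SketchIODescriptor getInput();

    SketchIODescriptor getOutput();
}

// Toy algorithm: reads each actor's "Name" and writes its length as "NameLength".
// It touches nothing other than the property it declares as output.
final class NameLengthAlgorithm implements SketchAlgorithm {
    public void execute(SketchGraph graph) {
        for (Map<String, Object> properties : graph.actors.values()) {
            Object name = properties.get(getInput().propertyId);
            if (name instanceof String) {
                properties.put(getOutput().propertyId, ((String) name).length());
            }
        }
    }

    public SketchIODescriptor getInput() { return new SketchIODescriptor("Name"); }

    public SketchIODescriptor getOutput() { return new SketchIODescriptor("NameLength"); }
}
```

Declaring input and output up front lets a scheduler check, before running anything, that every property an algorithm needs is produced by an earlier step.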

15 Propositional Algorithms
- Take an aggregator function as a parameter
- Cross all ways of shifting data
  - Aggregate By Link
  - Aggregate By Link Property
  - Aggregate On Graph
  - Graph To Actor
  - Link To Graph
  - Graph To Graph

16 Aggregator Functions
- Map 1 or more elements to an equal or smaller number of elements
- Examples
  - Statistical Moments
  - Arithmetic Operations
  - Null Aggregation
  - Concatenation
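The "many elements in, equal or fewer out" contract can be illustrated with a small functional interface. This sketch is an assumption about the shape, not the Graph-RAT aggregator API: `AggregatorSketch` and `Aggregators` are hypothetical names, elements are modeled as `double[]` vectors, and the mean assumes a non-empty input of equal-length vectors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical aggregator contract: a list of vectors in, an equal or shorter list out.
interface AggregatorSketch {
    List<double[]> aggregate(List<double[]> input);
}

final class Aggregators {
    // First statistical moment: element-wise mean, many vectors -> one vector.
    static final AggregatorSketch MEAN = input -> {
        int length = input.get(0).length;
        double[] mean = new double[length];
        for (double[] vector : input) {
            for (int i = 0; i < length; i++) {
                mean[i] += vector[i] / input.size();
            }
        }
        return List.of(mean);
    };

    // Null aggregation: every element passes through unchanged.
    static final AggregatorSketch NULL_AGGREGATION = input -> new ArrayList<>(input);

    // Concatenation: many vectors -> one longer vector.
    static final AggregatorSketch CONCATENATION = input -> {
        int total = 0;
        for (double[] vector : input) {
            total += vector.length;
        }
        double[] joined = new double[total];
        int position = 0;
        for (double[] vector : input) {
            System.arraycopy(vector, 0, joined, position, vector.length);
            position += vector.length;
        }
        return List.of(joined);
    };
}
```

Because each aggregator has the same signature, a propositional algorithm can accept any of them as a parameter.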

17 Social Network Analysis Algorithms
- Prestige Algorithms
  - Degree
  - Betweenness
  - Closeness
  - Page Rank
  - HITS
- Graph Triples
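As an illustration of the simplest prestige measure listed here, degree prestige can be computed by counting incoming links per actor. The sketch below is generic, not Graph-RAT code. `DegreePrestigeSketch` is a hypothetical name and links are modeled as `{source, destination}` arrays.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: in-degree as a prestige score.
final class DegreePrestigeSketch {
    // Each link is {source, destination}; the result maps each destination
    // actor to its number of incoming links. Actors with none are absent.
    static Map<String, Integer> inDegree(List<String[]> links) {
        Map<String, Integer> degree = new HashMap<>();
        for (String[] link : links) {
            degree.merge(link[1], 1, Integer::sum);
        }
        return degree;
    }
}
```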

18 Classification Algorithms
- Machine Learning Primitives
- Uses Weka
- Separate algorithms for training and classifying

19 Clustering Algorithms
- Several graph-based algorithms
  - Weak Component Clustering
  - Strong Component Clustering
  - Edge Betweenness Clustering
  - Newman-Girvan Edge Betweenness
- Also has primitives calling Weka on vector data
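Weak component clustering groups actors that are connected once link direction is ignored. A common way to compute it is union-find, sketched below. This is a generic illustration, not the Graph-RAT implementation, and `WeakComponentSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: weakly connected components via union-find.
final class WeakComponentSketch {
    private final Map<String, String> parent = new HashMap<>();

    // Returns the representative of x's component, registering x if unseen.
    private String find(String x) {
        parent.putIfAbsent(x, x);
        String root = x;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        parent.put(x, root); // shortcut x directly to its root
        return root;
    }

    // Each link is {source, destination}; direction is ignored.
    // The result maps every actor to a label shared by its whole component.
    static Map<String, String> cluster(List<String> actors, List<String[]> links) {
        WeakComponentSketch sets = new WeakComponentSketch();
        for (String actor : actors) {
            sets.find(actor);
        }
        for (String[] link : links) {
            sets.parent.put(sets.find(link[0]), sets.find(link[1]));
        }
        Map<String, String> labels = new HashMap<>();
        for (String actor : actors) {
            labels.put(actor, sets.find(actor));
        }
        return labels;
    }
}
```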

20 Similarity Algorithms
- Comparisons between modes
- Types of Similarity
  - Similarity By Link
  - Similarity By Property
  - Graph Similarity
- Distance Functions
  - All Weka distance functions
  - KLDistance
  - Exponential Distance

21 Collaborative Filtering Algorithms
- Traditional recommendation algorithms
  - Item to Item
  - User to User
  - Associative Mining

22 Array-Based Algorithms
- Transform To Array
- Principal Component Analysis

23 Evaluation
- All forms of evaluating results
  - Set Based (precision and recall)
  - Weighted Set (Correlations)
  - Ordered Lists (Kendall Tau, Half Life)
- Cross-Validation algorithms
  - By Actor
  - By Link
  - By Graph
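Two of the metrics named on this slide have standard textbook definitions, which the following sketch implements: set-based precision and recall, and Kendall's tau (the tau-a variant, which does not correct for ties). `EvaluationSketch` is a hypothetical name and the code is not taken from Graph-RAT.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of set-based and ordered-list evaluation metrics.
final class EvaluationSketch {

    // Fraction of recommended items that are relevant.
    static double precision(Set<String> recommended, Set<String> relevant) {
        if (recommended.isEmpty()) return 0.0;
        Set<String> hits = new HashSet<>(recommended);
        hits.retainAll(relevant);
        return (double) hits.size() / recommended.size();
    }

    // Fraction of relevant items that were recommended.
    static double recall(Set<String> recommended, Set<String> relevant) {
        if (relevant.isEmpty()) return 0.0;
        Set<String> hits = new HashSet<>(relevant);
        hits.retainAll(recommended);
        return (double) hits.size() / relevant.size();
    }

    // Kendall tau-a: (concordant pairs - discordant pairs) / total pairs.
    // x[i] and y[i] are the two scores given to item i; ties count as neither.
    static double kendallTau(double[] x, double[] y) {
        int n = x.length;
        if (n < 2) return 0.0;
        int concordant = 0;
        int discordant = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double sign = (x[i] - x[j]) * (y[i] - y[j]);
                if (sign > 0) concordant++;
                else if (sign < 0) discordant++;
            }
        }
        return (concordant - discordant) / (n * (n - 1) / 2.0);
    }
}
```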

24 Data Acquisition
- Components for acquiring source data
- File Reader Types
  - Reading different file formats
- Web Crawling Types
  - LiveJournal or LastFM
- Connection Types
  - Links different sets together

25 Web Crawler
- Custom multi-threaded web crawler
- Dynamic parsers
- Properties are passed between crawls and parser executions
- Stop and filter conditions are parameterized

26 Existing Parsers
- Base HTML parsing
- XML Parsing (SAX)
- LiveJournal FOAF
- LastFM REST services
- Graph-RAT documents
- Yahoo search queries

27 Comparisons
- SQL
- LINQ
- Matlab
- Other graph packages
- Prolog?

28 Embedded Use
- Dynamic Loading
- AbstractFactory abstract superclass
- Example: retrieving links to YouTube videos from GData

29 Graph-RAT Language
- Base Graph-RAT:
  - Data Acquisition components executed
  - For each algorithm entry:
    - Graph Query selects a set of graphs
    - Algorithm is executed over each graph
- Cross-Validation Graph-RAT
  - Mode, relation, or graph chosen in advance
  - Data Acquisition components run once
  - Algorithm entries rerun for each fold
- Statistical Graph-RAT
  - List of cross-validation schedulers
  - Statistical metrics of which performed better
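The cross-validation scheduler described above (data acquired once, algorithm entries rerun per fold) could look roughly like the sketch below when folds are taken by actor. The names `CrossValidationSketch`, `folds`, and `run` are hypothetical, and round-robin fold assignment is an assumption since the slide does not say how folds are built.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical sketch of a cross-validation scheduler over actors.
final class CrossValidationSketch {

    // Round-robin split of the actors into k folds (an assumed strategy).
    static List<List<String>> folds(List<String> actors, int k) {
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < k; i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < actors.size(); i++) {
            result.get(i % k).add(actors.get(i));
        }
        return result;
    }

    // Reruns the algorithm entries once per fold. The callback receives the
    // training actors and the held-out actors; returns the number of reruns.
    static int run(List<String> actors, int k, BiConsumer<List<String>, List<String>> algorithmEntries) {
        List<List<String>> split = folds(actors, k);
        for (int held = 0; held < k; held++) {
            List<String> training = new ArrayList<>();
            for (int i = 0; i < k; i++) {
                if (i != held) training.addAll(split.get(i));
            }
            algorithmEntries.accept(training, split.get(held));
        }
        return k;
    }
}
```

The actor list is passed in already loaded, reflecting the point that data acquisition runs once while only the algorithm entries repeat.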

30 User To User Collaborative Filtering Example
- Aggregate By Link (Artist->User)
- Similarity By Link (User->User)
- Aggregate By Link (User->User)
- Property to Link (User->Artist)
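The four steps above can be approximated in plain Java to show the data flow. This sketch rests on several assumptions and is not the Graph-RAT pipeline: `UserToUserSketch` is a hypothetical name, each user's aggregated artists are modeled as a set, Jaccard overlap stands in for the similarity function, and the final scores map stands in for the derived User->Artist links.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of user-to-user collaborative filtering.
final class UserToUserSketch {

    // Stand-in for "Similarity By Link (User->User)": Jaccard overlap of artist sets.
    static double similarity(Set<String> a, Set<String> b) {
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
    }

    // artistsByUser plays the role of the first aggregation (Artist->User).
    // The loop aggregates over similar users, and the returned scores stand in
    // for new User->Artist links to artists the user has not yet listened to.
    static Map<String, Double> recommend(String user, Map<String, Set<String>> artistsByUser) {
        Set<String> own = artistsByUser.get(user);
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Set<String>> other : artistsByUser.entrySet()) {
            if (other.getKey().equals(user)) continue;
            double weight = similarity(own, other.getValue());
            if (weight == 0.0) continue;
            for (String artist : other.getValue()) {
                if (!own.contains(artist)) {
                    scores.merge(artist, weight, Double::sum);
                }
            }
        }
        return scores;
    }
}
```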

31 Setup Example

32

33 DataAquisition Crawl LastFM Proxy proxy.waikato.ac.nz …

34 Query Entry.*

35 Algorithm Entry … GraphTriples Relation Friends Destination TriplesVector …

36 Future Work
- Stabilization: 0.5.1 to beta
- Statistical testing on result sets
- Upgrading the GUI
- Memory performance upgrades
- Octave Integration

37 Questions?
- http://graph-rat.sourceforge.net
- Stable (beta) release is 0.4.3

