The Consistency and Conformance of Web Document Collection Based on Heterogeneous DAC Graph Marek Kopel and Aleksander Zgrzywa


1 The Consistency and Conformance of Web Document Collection Based on Heterogeneous DAC Graph
Marek Kopel and Aleksander Zgrzywa
www.iis.pwr.wroc.pl
www.zsi.pwr.wroc.pl

2 Outline
Background & Idea
Personal Web of Trust
User and Agent Trust
Local Document Ranking & Filtering
Example Scenario
Conclusions & Future Work

3 Relationships in WWW
directed graph - the most common model of a Web document collection
documents' hyperlinking relationship (edges) → PageRank, HITS
Tim Berners-Lee (in reference to the social aspect of Web 2.0): “I called this graph the Semantic Web, but maybe it should have been Giant Global Graph!”

4 Relationships in WWW (2)
There's more to a hyperlink than href:
HTML 4.01 attributes rel and rev, e.g. with the predefined link types for:
  –navigation in a document collection (start, prev, next, contents, index)
  –structure (chapter, section, subsection, appendix, glossary)
  –meta (copyright, help)
XHTML 2.0 – custom namespaces
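As a rough illustration of how such typed links might be harvested from a page, here is a minimal sketch using Python's standard html.parser module. The class name LinkRelationParser, the helper extract_link_relations, and the sample markup are hypothetical and not part of the presentation.

```python
from html.parser import HTMLParser


class LinkRelationParser(HTMLParser):
    """Collects (href, rel values, rev values) for every <a> and <link> element."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel and rev hold space-separated link types; compare them in lowercase.
        rel = (attr_map.get("rel") or "").lower().split()
        rev = (attr_map.get("rev") or "").lower().split()
        self.links.append((href, rel, rev))


def extract_link_relations(html_text):
    """Return a list of (href, rel_types, rev_types) tuples found in html_text."""
    parser = LinkRelationParser()
    parser.feed(html_text)
    return parser.links


# Hypothetical sample markup using HTML 4.01 link types named on the slide.
sample = """
<link rel="start" href="index.html">
<link rel="next" href="chapter2.html">
<a href="toc.html" rel="Contents">Table of contents</a>
<a href="chapter1.html" rev="next">Previous chapter</a>
<a href="plain.html">Link without a relation</a>
"""

for href, rel, rev in extract_link_relations(sample):
    print(href, rel, rev)
```

Each tuple could then become a typed edge between two document nodes, with the rel or rev value as the edge label.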

5 Relationships in WWW (3)
Popular relation ontologies:
FOAF
XFN microformat
  –friendship (contact, acquaintance, friend)
  –family (child, parent, sibling, spouse, kin)
  –professional (co-worker, colleague)
  –physical (met)
  –geographical (co-resident, neighbor)
  –romantic (muse, crush, date, sweetheart)
rel-tag microformat - folksonomies

6 Heterogeneous DAC Graph
DAC graph
  –nodes of three types: Document, Author, Concept
  –edges between nodes model the relationships
  –most of the relationships can be acquired directly from the Web data
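One possible in-memory representation of such a graph is sketched below. It assumes undirected edges carrying a numeric relationship value; the class name DACGraph, its methods, and every edge in the example are hypothetical illustrations, not data from the presentation.

```python
NODE_TYPES = ("document", "author", "concept")


class DACGraph:
    """Undirected, weighted graph whose nodes are documents, authors, or concepts."""

    def __init__(self):
        self.node_types = {}   # node id -> node type
        self.adjacency = {}    # node id -> {neighbor id: relationship value}

    def add_node(self, node_id, node_type):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type!r}")
        self.node_types[node_id] = node_type
        self.adjacency.setdefault(node_id, {})

    def add_relationship(self, first, second, value=1.0):
        for node_id in (first, second):
            if node_id not in self.node_types:
                raise KeyError(f"unknown node: {node_id!r}")
        # Assumption: relationships are stored symmetrically (undirected graph).
        self.adjacency[first][second] = value
        self.adjacency[second][first] = value

    def neighbors(self, node_id, node_type=None):
        """Return sorted neighbor ids, optionally restricted to one node type."""
        return sorted(
            other
            for other in self.adjacency.get(node_id, {})
            if node_type is None or self.node_types[other] == node_type
        )


graph = DACGraph()
for name in ("d1", "d2", "d3", "d4"):
    graph.add_node(name, "document")
for name in ("a1", "a2"):
    graph.add_node(name, "author")
for name in ("c1", "c2"):
    graph.add_node(name, "concept")

# Invented edges and values, for illustration only.
graph.add_relationship("d1", "a1")
graph.add_relationship("d2", "a1")
graph.add_relationship("d3", "a2")
graph.add_relationship("d1", "c1", 0.8)
graph.add_relationship("d2", "c1", 0.6)
graph.add_relationship("d3", "c2", 0.9)
graph.add_relationship("a1", "a2", 0.5)

print(graph.neighbors("a1", "document"))
```

Filtering neighbors by node type is what lets the document-concept and document-author views be read off the same structure.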

7 Consistency and Conformance
Consistency of a Web document collection
  –inner similarity concerning subject
    similarly tagged (Web 2.0)
      –authors assigned the same tags or categories
    same keywords (digital libraries)
Conformance of a Web document collection
  –document authors' relationship
  –authors with strong relationship → often coauthors (agree on some subjects)
  –citing and referencing
  –Web of Trust

8 Relationships in DAC Graph
[Figure: relationships among Document, Author, and Concept nodes]

9 Fragment of a DAC graph of a Web document collection
[Figure: nodes d1, d2, d3, d4, a1, a2, c1, c2]

10 Consistency Collection
document-concept graph
[Figure: nodes d1, d2, d3, d4, a1, a2, c1, c2]

11 Conformance Collection
document-author graph
[Figure: nodes a1, d1, d2, d3, d4, a2, c1, c2]

12 Deriving Relationships
[Figure: nodes d1, c1, a1, c2, c4, c3, c5, a2, a3]

13 Consistency and Conformance
Subgraphs are clustered
  –using only the relationships’ values
consistency collection graph
  –output: the biggest cluster’s document nodes → consistent subcollection
conformance collection graph
  –output: the biggest cluster’s document nodes → conformable subcollection
C – Web document collection
cons_subc(C) – consistent subcollection of C
conf_subc(C) – conformable subcollection of C
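The selection step on this slide can be sketched as follows. As a deliberately simple stand-in for the clustering algorithm, the sketch drops edges whose value does not exceed a threshold and treats connected components as clusters; this is an assumption for illustration, not the method used by the authors. The edge format (document, other node, value), the threshold parameter, and the sample values are likewise hypothetical; only the names cons_subc and conf_subc come from the slide.

```python
def largest_cluster_documents(edges, threshold=0.0):
    """Return the document nodes of the cluster that contains the most documents.

    edges: iterable of (document, other_node, value) triples, where other_node
    is a concept (consistency graph) or an author (conformance graph).
    """
    adjacency = {}
    for document, other, value in edges:
        doc_node = ("doc", document)
        adjacency.setdefault(doc_node, set())
        if value > threshold:  # keep only sufficiently strong relationships
            other_node = ("other", other)
            adjacency.setdefault(other_node, set()).add(doc_node)
            adjacency[doc_node].add(other_node)

    best, visited = set(), set()
    for start in adjacency:
        if start in visited:
            continue
        # Depth-first search collects one connected component (one cluster).
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        visited |= component
        documents = {name for kind, name in component if kind == "doc"}
        if len(documents) > len(best):
            best = documents
    return best


def cons_subc(doc_concept_edges, threshold=0.0):
    """Consistent subcollection: biggest cluster of the document-concept graph."""
    return largest_cluster_documents(doc_concept_edges, threshold)


def conf_subc(doc_author_edges, threshold=0.0):
    """Conformable subcollection: biggest cluster of the document-author graph."""
    return largest_cluster_documents(doc_author_edges, threshold)


# Invented relationship values, for illustration only.
doc_concept = [("d1", "c1", 0.9), ("d2", "c1", 0.8), ("d3", "c1", 0.7),
               ("d3", "c2", 0.2), ("d4", "c2", 0.9)]
doc_author = [("d1", "a1", 1.0), ("d2", "a1", 1.0), ("d3", "a1", 1.0),
              ("d4", "a2", 1.0)]

print(sorted(cons_subc(doc_concept, threshold=0.5)))
print(sorted(conf_subc(doc_author)))
```

Any clustering routine that returns sets of nodes could replace the component search without changing the two wrapper functions.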

14 Conclusions and Future Work
Relationships are asymmetric, so undirected → directed graph
Relationship deriving using: paths with one → n proxy nodes
Graph clustering:
  –MCL - Markov Cluster Algorithm (currently)
  –Other algorithms
  –Maximum clique technique
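The slide names the Markov Cluster Algorithm as the clustering step currently in use but gives no parameters. Below is a minimal pure-Python sketch of the textbook expansion and inflation loop, under stated assumptions: a symmetric weighted adjacency dictionary as input, self-loops of weight 1 added to every node, an inflation exponent of 2.0, and clusters read from rows with a non-zero diagonal. The function name markov_cluster and the sample graph are hypothetical, and this is not the authors' implementation.

```python
def markov_cluster(adjacency, inflation=2.0, max_iterations=100, tolerance=1e-6):
    """Cluster a graph given as {node: {neighbor: weight}}; returns a list of sets."""
    nodes = sorted(set(adjacency) | {nb for nbrs in adjacency.values() for nb in nbrs})
    n = len(nodes)
    if n == 0:
        return []
    index = {node: i for i, node in enumerate(nodes)}

    # Column j holds the flow leaving node j; self-loops are added (assumption).
    matrix = [[0.0] * n for _ in range(n)]
    for node, neighbors in adjacency.items():
        for neighbor, weight in neighbors.items():
            matrix[index[neighbor]][index[node]] = float(weight)
    for i in range(n):
        matrix[i][i] = max(matrix[i][i], 1.0)

    def normalize_columns(m):
        # Rescale every column so that it sums to 1 (column-stochastic matrix).
        for j in range(n):
            total = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= total
        return m

    matrix = normalize_columns(matrix)
    for _ in range(max_iterations):
        # Expansion: square the matrix (two-step random walks).
        expanded = [[sum(matrix[i][k] * matrix[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: raise entries to a power, then renormalize the columns.
        inflated = normalize_columns([[v ** inflation for v in row] for row in expanded])
        change = max(abs(inflated[i][j] - matrix[i][j])
                     for i in range(n) for j in range(n))
        matrix = inflated
        if change < tolerance:
            break

    # Rows with a non-zero diagonal act as attractors; each such row lists a cluster.
    clusters = []
    for i in range(n):
        if matrix[i][i] > tolerance:
            members = {nodes[j] for j in range(n) if matrix[i][j] > tolerance}
            if members not in clusters:
                clusters.append(members)
    return clusters


# Invented document-document relationship values: two triangles and a weak bridge.
doc_graph = {
    "d1": {"d2": 1.0, "d3": 1.0},
    "d2": {"d1": 1.0, "d3": 1.0},
    "d3": {"d1": 1.0, "d2": 1.0, "d4": 0.1},
    "d4": {"d3": 0.1, "d5": 1.0, "d6": 1.0},
    "d5": {"d4": 1.0, "d6": 1.0},
    "d6": {"d4": 1.0, "d5": 1.0},
}
for cluster in markov_cluster(doc_graph):
    print(sorted(cluster))
```

A higher inflation value generally yields finer clusters, so it is the main knob to tune before picking the biggest cluster.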

15 Q & A

