Data and Information Systems Laboratory University of Illinois Urbana-Champaign CCICADA 2012 Meeting March 30, 2012 Web Taxonomies Discovering the Structure.


1 Web Taxonomies: Discovering the Structure of Information
Tim Weninger, Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL

2 Information wants to be free
The World Wide Web is decentralized and messy.
  › (but it wants to be structured)
Taxonomies are used to describe the hierarchical structure of data
  › Almost always hand-crafted
Data is made (forced) to fit the taxonomy
Information wants to be free!

3 Information wants structure
Just like political science… in data science…
There is no such thing as digital anarchy
  › Government will always rise
Data democracy
  › Let the data decide its own form of government

4 Let’s discover a taxonomy of a Web site

5 Web Graph → Web Tree – is a really hard problem
How do we traverse the graph?
  › BFS
  › DFS
  › MST
  › With Replacement
  › Without Replacement
  › All links
  › Some links

6 Web Graph → Web Tree? – BFS
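
The BFS option named on this slide can be sketched as follows. This is a minimal illustration, not the talk's implementation: the `links` dictionary, the page paths, and the function name `bfs_tree` are all hypothetical. Each page is attached to the first page that reaches it in breadth-first order, which turns the link graph into a tree rooted at the home page.

```python
from collections import deque


def bfs_tree(links, root):
    """Return {page: parent} for a BFS spanning tree of a link graph.

    `links` maps each page to the pages it links to (an assumed format).
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # A page keeps the first parent that discovers it; later
            # links to the same page are dropped from the tree.
            if target not in parent:
                parent[target] = page
                queue.append(target)
    return parent


# Hypothetical site: every page links back to the home page, so the
# link graph has cycles that the tree must break.
links = {
    "/": ["/about", "/research"],
    "/about": ["/", "/people"],
    "/research": ["/people", "/papers", "/"],
    "/people": ["/"],
    "/papers": ["/research"],
}
tree = bfs_tree(links, "/")
```

Here "/people" is linked from both "/about" and "/research"; BFS assigns it to "/about" only because that page is visited first, which is one reason a plain traversal order may not give a meaningful taxonomy.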

7 Web Graph → Web Tree
Lists of links
  › WWW2011 work
Link paths?
Most probable user navigation
  › PageRank
We’re working on all of those; PageRank seems to work
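
The slide does not say how PageRank is used to build the tree, so the following is only one possible reading, labeled as an assumption: compute PageRank by power iteration, then grow the tree from the root, always expanding the highest-ranked page found so far. The function names, the `links` format, and the damping value of 0.85 are illustrative choices, not taken from the talk.

```python
import heapq


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over {page: [linked pages]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def pagerank_tree(links, root):
    """Assumed heuristic: expand pages in descending PageRank order."""
    rank = pagerank(links)
    parent = {root: None}
    frontier = [(-rank[root], root)]  # max-heap via negated rank
    while frontier:
        _, page = heapq.heappop(frontier)
        for target in links.get(page, []):
            if target not in parent:
                parent[target] = page
                heapq.heappush(frontier, (-rank[target], target))
    return parent


# Hypothetical link graph.
links = {
    "/": ["/about", "/research"],
    "/about": ["/", "/people"],
    "/research": ["/people", "/papers", "/"],
    "/people": ["/"],
    "/papers": ["/research"],
}
rank = pagerank(links)
tree = pagerank_tree(links, "/")
```

Because pages are only attached when reached from a page already in the tree, the result is always a tree rooted at `root`, whatever the rank values turn out to be.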

8 Some explorations – BM25 ranks text
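
For readers unfamiliar with BM25, here is a minimal sketch of the scoring function in a common variant (the non-negative IDF form, with typical defaults `k1=1.5` and `b=0.75`). The slide gives no detail on what text is ranked or which parameters are used, so the sample pages, the query, and the whitespace tokenization are assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Number of documents containing each term.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Length normalization: long documents are penalized.
            norm = tf[term] + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Hypothetical page texts.
pages = [
    "faculty research in data mining and information systems",
    "campus parking and visitor maps",
    "data mining course schedule",
]
scores = bm25_scores("data mining", pages)
```

The first and third pages each contain both query terms once, but the third is shorter, so length normalization gives it the higher score; the second page contains neither term and scores zero.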

9 Propagate information backwards – re-rank

10 Map taxonomies
Assumption
  › Two taxonomies from Web sites of similar organizational missions will be similar
Let’s do integration

11 Some early results

12 Brand new result: breakthrough this morning
Cue scary graphs

13 Questions?
Challenges?

