Published by Lillian Perkins. Modified over 9 years ago.
1
Data and Information Systems Laboratory, University of Illinois Urbana-Champaign
CCICADA 2012 Meeting, March 30, 2012
Web Taxonomies: Discovering the Structure of Information
Tim Weninger
Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL
2
Information wants to be free
The World Wide Web is decentralized and messy.
›(but it wants to be structured)
Taxonomies are used to describe the hierarchical structure of data
›Almost always hand-crafted
Data is made (forced) to fit the taxonomy
Information wants to be free!
3
Information wants structure
Just like in political science… in data science…
There is no such thing as digital anarchy
›Government will always rise
Data democracy
›Let the data decide its own form of government
4
Let’s discover a taxonomy of a Web site
5
Web Graph → Web Tree is a really hard problem
How do we traverse the graph?
›BFS
›DFS
›MST
›With Replacement
›Without Replacement
›All links
›Some links
6
Web Graph → Web Tree? – BFS
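The BFS option can be sketched roughly as follows. This is a minimal illustration, not the talk's implementation: the `links` dictionary, the page names, and the function name `bfs_tree` are all hypothetical, and the site is assumed to be already crawled into an adjacency list. Each page takes as its parent the first page from which BFS reaches it, so every page hangs at its shortest click distance from the root.

```python
from collections import deque


def bfs_tree(links, root):
    """Return a {page: parent} map for the BFS spanning tree rooted at `root`.

    `links` maps each page to the list of pages it links to (hypothetical data).
    The root's parent is None; unreachable pages are left out.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in parent:  # first discovery fixes the parent
                parent[target] = page
                queue.append(target)
    return parent


# Toy site graph (made up for illustration).
links = {
    "home": ["about", "research", "people"],
    "about": ["home", "people"],
    "research": ["home", "projects", "people"],
    "people": ["home"],
    "projects": ["research", "home"],
}

tree = bfs_tree(links, "home")
for page, par in tree.items():
    print(page, "<-", par)
```

Note that the result depends on the order in which links appear on each page: if two pages at the same depth link to the same target, whichever is dequeued first becomes the parent.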
7
Web Graph → Web Tree
Lists of links
›WWW2011 work
Link paths?
Most probable user navigation
›PageRank
We’re working on all of those – PageRank seems to work
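One way to read "most probable user navigation" is sketched below. The slides do not say how PageRank is turned into a tree, so the parent-selection rule here is an assumption: each page picks, among the linking pages that sit closer to the root, the one that sends it the most random-surfer traffic (rank divided by out-degree). The `links` data, the function names, and the damping value of 0.85 are illustrative choices.

```python
from collections import deque


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over an adjacency-list graph."""
    nodes = set(links) | {v for targets in links.values() for v in targets}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[u] / n
                for v in nodes:
                    new_rank[v] += share
        rank = new_rank
    return rank


def bfs_depths(links, root):
    """Shortest click distance from the root to every reachable page."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in links.get(u, []):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return depth


def pagerank_tree(links, root):
    """Assumed rule: parent = shallower in-neighbor with the largest rank/outdegree."""
    rank = pagerank(links)
    depth = bfs_depths(links, root)
    best = {}  # page -> (parent, traffic sent along that link)
    for u, targets in links.items():
        if u not in depth or not targets:
            continue
        flow = rank[u] / len(targets)
        for v in targets:
            if depth[u] < depth[v] and (v not in best or flow > best[v][1]):
                best[v] = (u, flow)
    return {v: u for v, (u, _) in best.items()}


# Toy site graph (made up for illustration).
links = {
    "home": ["about", "research", "people"],
    "about": ["home", "people", "contact"],
    "research": ["home", "projects", "people", "contact"],
    "people": ["home"],
    "projects": ["research", "home"],
    "contact": ["home"],
}

rank = pagerank(links)
tree = pagerank_tree(links, "home")
print(tree)
```

Restricting parents to shallower pages guarantees the result is a tree (no cycles); here "contact" has two candidate parents, and the rank-weighted rule decides between them.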
8
Some explorations – BM25 ranks text
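For reference, a minimal BM25 scorer is sketched below. The slide does not state which BM25 variant, parameters, or tokenizer were used, so everything here is an assumption: whitespace tokenization, `k1=1.5`, `b=0.75`, and the non-negative IDF form `ln(1 + (N - df + 0.5) / (df + 0.5))`. The example pages are made up.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
            # Length normalization: long pages are penalized relative to avg_len.
            norm = tf[term] + k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Hypothetical page texts.
pages = [
    "graduate admissions and graduate degree requirements",
    "faculty research in data mining",
    "undergraduate admissions office",
]
scores = bm25_scores("graduate admissions", pages)
for text, score in sorted(zip(pages, scores), key=lambda pair: -pair[1]):
    print(round(score, 3), text)
```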
9
Propagate information backwards – re-rank
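The slide gives no formula for the backward propagation, so the following is only one plausible reading, with every detail assumed: each page keeps its own text score and additionally receives a decayed share of its children's combined scores, so that a section page ranks well when the pages beneath it match the query. The `parent` map, the scores, and the `decay` value of 0.5 are hypothetical.

```python
def propagate_up(parent, scores, decay=0.5):
    """Combine each page's score with decayed scores from its subtree.

    `parent` is a {page: parent_or_None} tree; `scores` holds per-page
    relevance scores (missing pages count as 0.0).
    """
    children = {}
    for child, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(child)

    combined = {}

    def visit(page):
        total = scores.get(page, 0.0)
        for child in children.get(page, []):
            total += decay * visit(child)
        combined[page] = total
        return total

    for page, par in parent.items():
        if par is None:  # start from every root
            visit(page)
    return combined


# Hypothetical tree and text scores.
parent = {"home": None, "research": "home", "projects": "research", "people": "home"}
scores = {"projects": 2.0, "people": 1.0}

combined = propagate_up(parent, scores)
ranking = sorted(combined, key=lambda page: -combined[page])
print(combined)
```

With these inputs "research" has no matching text of its own but still receives half of the score of "projects", which is the effect a backward re-ranking step would be after.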
10
Map taxonomies
Assumption
›Two taxonomies from Web sites with similar organizational missions will be similar
Let’s do integration
11
Some early results
12
Brand new result: breakthrough this morning
Cue scary graphs
13
Questions? Challenges?