Published by Julie Lawrence. Modified over 9 years ago.
1
© December 1999 George Paliouras, All Rights Reserved
Learning Communities of Users on the Internet
George Paliouras, Christos Papatheodorou, Vangelis Karkaletsis, Constantine D. Spyropoulos
NCSR “Demokritos”
Email: paliourg@iit.demokritos.gr
WWW: http://www.iit.demokritos.gr/skel
2
Structure of the talk
Services on the Internet
Personalization of Internet services
Learning communities from usage data
Three case studies
–Information broker (filtering)
–Digital library (retrieval)
–Web-site (navigation)
Conclusions
3
WWW: the new face of the Net
Once upon a time, the Internet was a forum for exchanging information.
Then …
…came the Web.
The Web introduced new capabilities …
…and attracted many more people …
…increasing commercial interest …
…and turning the Net into a real forum …
4
Services on the Internet
Information providers are still the majority…
Commercial: CNN, Reuters, Times, Yahoo
Non-Commercial: CORDIS, NCSTRL, MLNET, Library
5
Information access
The Web has introduced new ways to access information.
We have looked at three different types:
–Information filtering (profiles)
–Information retrieval (queries)
–Navigation
[Figure labels: passive, active]
… covering the majority of today’s information services.
6
Personalized information access
Adaptation of the system to the user.
Social motivation:
–Better service for the citizen (reduction of the information overload).
Commercial motivation:
–Customer relationship management (targeted advertisement, customer retention, increased sales, etc.)
7
Personalized information access
“The Quantity of People Visiting Your Site Is Less Important Than the Quality of Their Experience”
Evan I. Schwartz, Webonomics, Broadway Books, 1997
8
Personalized information access
[Figure: sources, server, receivers]
9
User modeling
The process of constructing models that can be used to adapt the system to the user’s requirements.
Types of user requirement:
–Interests (e.g. sports and finance articles)
–Knowledge level (e.g. novice – expert)
–Preferences (e.g. appearance of GUI)
–etc.
10
User Models
User model (type A): [PERSONAL] User x -> sports, stock market
User model (type B): [PERSONAL] User x, Age 26, Male -> sports, stock market
User community: [GENERIC] Users {x,y,z} -> sports, stock market
User stereotype: [GENERIC] Users {x,y,z}, Age [20..30], Male -> sports, stock market
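The four kinds of model above can be written down as plain data structures. The following is a minimal sketch only; the class and field names (`UserModel`, `Community`, `age_range`, `is_stereotype`) are illustrative assumptions, not taken from the talk.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class UserModel:
    """Personal model. Type A has interests only; type B adds personal data."""
    user: str
    interests: FrozenSet[str]
    age: Optional[int] = None      # hypothetical field, set only for type B
    gender: Optional[str] = None   # hypothetical field, set only for type B


@dataclass(frozen=True)
class Community:
    """Generic model: a group of users with shared interests.

    When personal data (age range, gender) is attached, the group is a stereotype.
    """
    users: FrozenSet[str]
    interests: FrozenSet[str]
    age_range: Optional[Tuple[int, int]] = None
    gender: Optional[str] = None

    @property
    def is_stereotype(self) -> bool:
        return self.age_range is not None or self.gender is not None


interests = frozenset({"sports", "stock market"})
type_a = UserModel("x", interests)
type_b = UserModel("x", interests, age=26, gender="male")
community = Community(frozenset({"x", "y", "z"}), interests)
stereotype = Community(frozenset({"x", "y", "z"}), interests, (20, 30), "male")
```

The only difference between a community and a stereotype in this sketch is whether personal data is attached, which mirrors the distinction drawn on the slide.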
11
Machine Learning / Data Mining
Acquisition of models from usage data.
Types of learning
–Supervised learning: requires manually tagged examples.
–Unsupervised learning: clusters untagged examples, according to similarity.
12
Learning user models
Observation of the users interacting with the system.
[Figure: Users 1 to 5, user models, and user communities (Community 1, Community 2)]
13
Collaborative filtering
Memory-based “learning” (e.g. k-NN):
–Given a group of users…
–…and a new user…
–…find similar users.
Already in commercial use (e.g. Firefly, amazon.com)
Problem: It does not give any insight into the usage of the system.
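The memory-based idea can be sketched in a few lines. This is an illustration under stated assumptions: users are represented as sets of interests, similarity is Jaccard overlap, and the function names (`jaccard`, `nearest_users`, `recommend`) are hypothetical. It is not a description of how Firefly or amazon.com work.

```python
def jaccard(a, b):
    """Similarity of two interest sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_users(new_interests, known, k=2):
    """Return the k known users most similar to the new user (ties broken by name)."""
    ranked = sorted(known.items(),
                    key=lambda item: (-jaccard(new_interests, item[1]), item[0]))
    return [name for name, _ in ranked[:k]]


def recommend(new_interests, known, k=2):
    """Suggest what the nearest users like and the new user has not yet chosen."""
    suggestions = set()
    for name in nearest_users(new_interests, known, k):
        suggestions |= known[name]
    return sorted(suggestions - new_interests)


known_users = {
    "x": {"sports", "stock market"},
    "y": {"sports", "finance"},
    "z": {"politics"},
}
neighbours = nearest_users({"sports"}, known_users)   # ["x", "y"]
suggested = recommend({"sports"}, known_users)        # ["finance", "stock market"]
```

Nothing is learned ahead of time here: every request scans the stored users, which is why the approach yields recommendations but no explicit description of how the system is used.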
14
Clustering users into communities
Clustering methods:
–Conceptual clustering (COBWEB, ITERATE)
–Graph-based clustering (Cluster mining)
–Statistical clustering (Autoclass)
–Neural clustering (Self-organising Maps)
15
Conceptual clustering
COBWEB generates a hierarchy of concepts.
Each concept is a cluster of objects.
Our concepts are the communities.
Our objects are “user models”.
Similarity metric: category utility.
Each user belongs to only one community.
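Category utility rewards partitions in which attribute values are more predictable inside each cluster than in the data as a whole. The sketch below computes it in its usual form for nominal attributes; it scores a given partition and does not implement COBWEB's incremental tree building. The representation of a user model as a dictionary of attribute values, and the function names, are assumptions made for illustration.

```python
from collections import Counter


def expected_correct(objects):
    """Sum over attribute/value pairs of P(attribute = value) squared."""
    counts = Counter((attr, value) for obj in objects for attr, value in obj.items())
    return sum((count / len(objects)) ** 2 for count in counts.values())


def category_utility(clusters):
    """Category utility of a partition.

    clusters: list of clusters, each a non-empty list of objects;
    each object is a dict mapping attribute name to a nominal value.
    """
    objects = [obj for cluster in clusters for obj in cluster]
    baseline = expected_correct(objects)
    gain = sum(len(cluster) / len(objects) * (expected_correct(cluster) - baseline)
               for cluster in clusters)
    return gain / len(clusters)


sports_fan = {"sports": 1, "finance": 0}
investor = {"sports": 0, "finance": 1}

# Pure clusters make every attribute fully predictable.
separated = category_utility([[sports_fan, sports_fan], [investor, investor]])
# Mixed clusters are no more predictable than the whole data set.
mixed = category_utility([[sports_fan, investor], [sports_fan, investor]])
```

In this toy example the separated partition scores 0.5 and the mixed one scores 0, so a search guided by category utility would prefer the former.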
16
Meaningful communities
Question: What are the characteristics of a community?
Answer: Community characterization, measuring frequency increase.
Example: How frequently do users of the community read sports news, compared to the whole set of users?
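One way to quantify the frequency increase is to compare how often a feature occurs inside the community with how often it occurs among all users. The sketch below uses the difference of squared frequencies; that particular formula, the threshold value, and the function names are assumptions made for illustration, since the slide does not spell out the measure.

```python
def frequency(feature, users):
    """Fraction of users whose model contains the feature."""
    return sum(feature in user for user in users) / len(users)


def frequency_increase(feature, community, population):
    """Assumed measure: squared frequency in the community minus squared overall frequency."""
    return frequency(feature, community) ** 2 - frequency(feature, population) ** 2


def describe(community, population, threshold=0.5):
    """Features whose frequency increase exceeds the (hypothetical) threshold."""
    features = set().union(*population)
    return sorted(f for f in features
                  if frequency_increase(f, community, population) > threshold)


population = [{"sports"}, {"sports", "finance"}, {"finance"}, {"politics"}]
community = population[:2]

sports_gain = frequency_increase("sports", community, population)    # 1.0 - 0.25
finance_gain = frequency_increase("finance", community, population)  # 0.25 - 0.25
description = describe(community, population)                        # ["sports"]
```

Every member of this toy community reads sports news but only half of all users do, so "sports" characterizes the community while "finance" does not.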
17
Cluster mining
Searches for cliques in a graph of the following form:
[Figure: weighted graph with nodes hardware, software, mathematics of computing, computing milieux and computing methodologies; edge weights range from 0.014 to 0.27]
18
Cluster mining
Nodes: features in the user model.
Edge labels: frequency at which the two nodes appear together in the data.
Edge reduction: using a threshold.
Clique: commonly met pattern in the behavior of the users.
Each user may belong to several communities.
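The steps listed above (build a co-occurrence graph, drop weak edges, look for cliques) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: user models are assumed to be sets of features, the threshold value is arbitrary, and cliques are enumerated with the basic Bron-Kerbosch procedure.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(user_models, threshold):
    """Keep an edge between two features if they co-occur in at least
    `threshold` (a fraction) of the user models."""
    pair_counts = Counter()
    for model in user_models:
        for a, b in combinations(sorted(model), 2):
            pair_counts[(a, b)] += 1
    graph = {feature: set() for model in user_models for feature in model}
    for (a, b), count in pair_counts.items():
        if count / len(user_models) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with at least two nodes (Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            if len(clique) > 1:
                cliques.append(frozenset(clique))
            return
        for node in list(candidates):
            expand(clique | {node}, candidates & graph[node], excluded & graph[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(graph), set())
    return cliques


models = [
    {"hardware", "software"},
    {"hardware", "software", "milieux"},
    {"hardware", "software", "milieux"},
    {"theory", "maths"},
    {"theory", "maths"},
    {"hardware", "theory"},
]
patterns = set(maximal_cliques(cooccurrence_graph(models, threshold=0.3)))
```

With these toy models the weak hardware/theory edge is removed by the threshold and two patterns remain. A user whose model contains features from both patterns would fall into both communities, which is the sense in which communities can overlap.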
19
Case studies
Information broker (filtering)
Digital Library (retrieval)
Web-site (navigation)
[Figure labels: ACAI99, NCSTRL, ?]
20
Criteria for the communities
We evaluate the quality of community descriptions (behavioral patterns), by:
–Coverage: Proportion of characteristics appearing in the descriptions.
–Overlap: Extent of overlap between descriptions.
–Meaningfulness: Do the descriptions make sense? Are they interesting?
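The first two criteria can be computed directly from the descriptions. In the sketch below, coverage follows the definition on the slide, while overlap is taken to be the share of described features that occur in more than one description; that reading of "extent of overlap", and the function names, are assumptions. Meaningfulness is a human judgment and is not modeled.

```python
from collections import Counter


def coverage(descriptions, all_features):
    """Proportion of all features that appear in at least one description."""
    described = set().union(*descriptions)
    return len(described & set(all_features)) / len(all_features)


def overlap(descriptions):
    """Assumed measure: share of described features used by more than one description."""
    counts = Counter(feature for desc in descriptions for feature in desc)
    if not counts:
        return 0.0
    return sum(1 for count in counts.values() if count > 1) / len(counts)


all_features = ["Telecom", "Computers", "Networks", "Hardware", "Software",
                "Economics/Finance", "Sport", "Transport"]
descriptions = [
    {"Telecom", "Computers", "Networks"},
    {"Hardware", "Software"},
    {"Telecom", "Economics/Finance"},
]
cov = coverage(descriptions, all_features)   # 6 of 8 features are described
ovl = overlap(descriptions)                  # only "Telecom" is shared: 1 of 6
```

High coverage with low overlap suggests that the communities together account for most of the observed behavior while remaining distinct from one another.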
21
I: Profile-based filtering
User models: profiles of news categories for each user.
User communities: users with common news-reading interests.
Community descriptions: news categories for each community.
22
I: COBWEB
Community hierarchy (cluster sizes in parentheses)
A (1078)
  B (681)
    D (328): I (63), J (104), K (161)
    E (353): L (95), M (102), N (156)
  C (397)
    F (98): O (38), P (17), Q (43)
    G (181): R (36), S (96), T (49)
    H (118): U (28), V (62), W (28)
23
I: COBWEB
[Figure panels: Coverage, Overlap]
24
I: COBWEB
Community descriptions
D: (none listed)
E: Internet (0.55)
F: Economic ind. (0.73), Economics & Finance (0.68), Computers (0.6), Transport (0.53), Financial ind. (0.5)
G: Economic ind. (0.58), Economics & Finance (0.61)
H: Computers (0.53)
25
I: Cluster mining
Behavioral patterns
–Telecom, Computers, Internet, Industries, Economics/Finance
–Telecom, Computers, Networks
–Telecom, Economic ind., Economics/Finance
–Hardware, Software
–Financial ind., Economic ind., Economics/Finance
–Financial ind., Economic ind., Financial markets
–Sport, Entertainment electronics
26
I: Comparison
27
II: Query-based retrieval
User models: processed queries.
User communities: user queries with common keywords.
Community descriptions: characteristic keywords for each community.
Pre-processing:
–Lemmatization and synonyms (WordNet).
–Generalization to top ACM categories.
28
II: COBWEB
Community descriptions
–Computer Systems Organisation (1.0)
–Software (1.0)
–Hardware (1.0)
–Information Systems (1.0), Computing milieux (0.63), Computing methodologies (0.28)
–Information Systems (1.0)
–Computing methodologies (1.0), Hardware (1.0)
–Computing methodologies (1.0), Software (1.0)
–Computing methodologies (1.0)
29
II: Cluster mining
Behavioral patterns
–Hardware, Software, Computing Milieux, Computing Methodologies
–Hardware, Software, Computing Milieux, Maths of Computing
–Hardware, Computer Systems Organisation
–Theory of Computation, Maths of Computing
–Information Systems, Software, Computing Milieux, Computing Methodologies
–Information Systems, Software, Computing Milieux, Maths of Computing
30
III: Web-site navigation
User models: access sessions as sets of pages or sets of page transitions.
User communities: users with common navigation behavior.
Community descriptions: pages or page transitions for each community.
Pre-processing:
–Sessions from access logs (duration).
–Dimensionality reduction, by feature selection.
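The session pre-processing can be sketched as follows. The log format (user, timestamp in seconds, page id), the 30-minute inactivity timeout and the function names are all assumptions made for illustration; the slide states only that sessions are derived from access logs using duration.

```python
def sessions_from_log(entries, timeout=1800):
    """Split each user's requests into sessions.

    entries: iterable of (user, timestamp_seconds, page).
    A new session starts when a user has been idle for more than `timeout` seconds.
    """
    sessions = []
    last_seen = {}  # user -> (timestamp of last request, that user's current session)
    for user, timestamp, page in sorted(entries, key=lambda e: (e[0], e[1])):
        previous = last_seen.get(user)
        if previous is None or timestamp - previous[0] > timeout:
            current = []
            sessions.append(current)
        else:
            current = previous[1]
        current.append(page)
        last_seen[user] = (timestamp, current)
    return sessions


def transitions(session):
    """Represent a session as the set of page transitions it contains."""
    return set(zip(session, session[1:]))


log = [
    ("a", 0, 1), ("a", 60, 19), ("a", 120, 23),
    ("a", 5000, 1), ("a", 5030, 30),
    ("b", 10, 1), ("b", 20, 22),
]
sessions = sessions_from_log(log)
first_session_transitions = transitions(sessions[0])
```

Each session can then serve as a user model, either as a set of pages (`set(session)`) or as a set of transitions such as `(1, 19)`, written `1 > 19` in the community descriptions.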
31
III: COBWEB
Community descriptions
–24 > 25, 23 > 24, 1 > 24, 1 > 19, 19 > 23
–1 > 22, 22 > 20, 20 > 31, 31 > 27, 27 > 7, 19 > 23
–22 > 31, 1 > 22
–22 > 27, 1 > 22
–1 > 30
–1 > 30, 8 > 1, 1 > 8
–30 > 31, 1 > 30
32
III: Cluster mining
Behavioral patterns
–1>19, 19>23, 23>24, 24>25
–1>24, 24>25
–1>22, 22>31
–1>22, 22>20
–1>30, 30>31
–22>20, 20>31, 31>27
–22>20, 20>27
–1>8, 8>1
–1>9, 9>2
–20>31, 31>27, 27>7
–19>23, 23>14
–23>14, 27>7
–1>2, 2>11
–2>11, 11>12
–1>23, 23>24
33
Conclusions
Community construction gives insight into the usage of information services.
Unsupervised learning can do the job.
Characterization makes the results useful.
Substantial data engineering is needed for different types of information access.
34
A paradox
High commercial demand for a research product!
Solutions need to be simple and efficient!