Capturing User Contexts: Dynamic Profiling for Information Seeking Tasks Roman Y. Shtykh Waseda University, Japan.


1 Capturing User Contexts: Dynamic Profiling for Information Seeking Tasks Roman Y. Shtykh Waseda University, Japan

2 Information Need as a Driving Force of Human Information Behaviour
Recognition of one’s knowledge inadequacy to satisfy a particular goal (Case, 2002)
A “consciously identified gap” in one’s knowledge (Ingwersen and Jarvelin, 2005)
How can the system be user-centric and sufficiently satisfy the user’s information need without knowing it?

3 Context
Information need emerges in one’s individual context, and both context and information need evolve over time.
The information behaviours that occur to satisfy the information need, and that lead to the selection of an information object, take place in that same context.

4 Context of “fragmentary” nature

5 BESS (BEtter Search and Sharing)
Framework for collaborative information seeking and sharing. Uses uniform relevance feedback to infer user interests as they change over time, and uses the knowledge about these interests
– to better satisfy seeking intents by providing information that is likely to match the inferred user interests (PERSONALIZED SEEKING)
– to evaluate shared information based on the user’s interests (expertise) (PERSONALIZED SHARING)

6 Profile Structure l – layer, k – concept number

7 Concept Formation with H2S2D (High Similarity Sequence Data-Driven) Clustering
Online incremental clustering method for sequential relevance feedback data. Based on the peculiarities of a user’s seeking behavior (assumptions on the following slides).

8 Assumptions
When a user searches, he/she usually sees (clicks on, focuses attention on, etc.) several documents (links or other objects) until the most relevant one is found. Most of these documents are likely to be similar to one another to some extent and can give an idea of a particular user interest.
Even if some similar documents have not yet appeared in sequence, there are documents related to persistent user interests, and repeated searches on these interests are likely to occur. In these cases a user clicks either on links he/she found before or on links leading to documents highly similar to those found before.

9 Assumptions (1) Relevance Feedback
Feedback is sequentially incoming uniform data S = S_1 S_2 … S_n … containing subsequences of n or more (n > 1) highly similar items linked by a particular information need. Such subsequences can be considered potentially new semantic clusters (concepts).
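The subsequence assumption can be illustrated with a minimal Python sketch. It is not taken from the slides: the bag-of-words vectors, the cosine measure, the reading of the threshold `s_th` as a minimum similarity, and the names `cosine` and `split_feedback` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def split_feedback(items, s_th, min_len=2):
    """Split a feedback stream into runs of consecutive similar items.

    Returns (subsequences, isolated): index lists of runs with at least
    min_len items, and indices of items outside any such run.
    """
    vectors = [Counter(text.lower().split()) for text in items]
    runs = []
    for i, vec in enumerate(vectors):
        # Extend the current run if the item is similar to the previous one.
        if runs and cosine(vectors[runs[-1][-1]], vec) >= s_th:
            runs[-1].append(i)
        else:
            runs.append([i])
    subsequences = [run for run in runs if len(run) >= min_len]
    isolated = [i for run in runs if len(run) < min_len for i in run]
    return subsequences, isolated


feedback = [
    "firefox plugins browser",
    "firefox browser customize",
    "tokyo massage",
    "hotel malta travel",
    "travel malta ticket",
]
print(split_feedback(feedback, s_th=0.5))  # ([[0, 1], [3, 4]], [2])
```

Under these assumptions, the two runs are candidate concepts, while the isolated item is the kind of feedback that would wait in a candidate pool.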

10 Assumptions (2) Relevance Feedback
Items that do not arrive in high-similarity subsequences are still considered potentially related to user interests. Since they are of little immediate use for profiles, they are put into a candidate pool, to be retrieved and used for concept formation later, when a subsequence of similar feedback items is observed.

11 Assumptions: User Study. Assumption 1: subsequence percentage (12 users, two weeks)
[Figure: subsequence percentage for S_th = 0.05 and S_th = 0.1]

12 Assumptions: User Study. Assumption 2: percentage of re-accessed and all inter-similar documents
S_th = 0.05: 83%
S_th = 0.1: 74%
S_th = 0.2: 57%

13 H2S2D Algorithm (1)
Online, incremental, unsupervised.
Key features:
– the definition of a new cluster relies on the sequential characteristics of relevance feedback;
– assignment of an incoming data item is delayed if no sufficiently similar cluster exists, and is performed once such a cluster is created.
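The two key features can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the H2S2D implementation: the slides do not give the similarity measure, the cluster representation, or the exact rule for opening a cluster, so the cosine measure, the summed term vectors, the "two consecutive similar items open a cluster" rule, and the names `SequenceClusterer`, `add`, and `pool` are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SequenceClusterer:
    """Online clustering with a candidate pool and delayed assignment (sketch)."""

    def __init__(self, s_th):
        self.s_th = s_th      # similarity threshold (assumed meaning of S_th)
        self.clusters = []    # summed term vectors, one per cluster
        self.members = []     # item texts per cluster
        self.pool = []        # unassigned candidates as (text, vector)
        self._pending = None  # previous item, if it is still unassigned

    def _assign(self, cid, text, vec):
        self.clusters[cid].update(vec)
        self.members[cid].append(text)

    def add(self, text):
        """Process one feedback item; return its cluster id, or None if pooled."""
        vec = Counter(text.lower().split())
        # 1. Assign to the most similar existing cluster, if similar enough.
        best, best_sim = None, 0.0
        for cid, centroid in enumerate(self.clusters):
            sim = cosine(centroid, vec)
            if sim > best_sim:
                best, best_sim = cid, sim
        if best is not None and best_sim >= self.s_th:
            self._assign(best, text, vec)
            self._pending = None
            return best
        # 2. Two consecutive similar unassigned items open a new cluster.
        prev = self._pending
        if prev is not None and cosine(prev[1], vec) >= self.s_th:
            self.pool.remove(prev)
            self.clusters.append(Counter())
            self.members.append([])
            cid = len(self.clusters) - 1
            self._assign(cid, *prev)
            self._assign(cid, text, vec)
            # Delayed assignment: pull matching candidates out of the pool.
            for cand in list(self.pool):
                if cosine(self.clusters[cid], cand[1]) >= self.s_th:
                    self.pool.remove(cand)
                    self._assign(cid, *cand)
            self._pending = None
            return cid
        # 3. Otherwise keep the item as a candidate for later.
        entry = (text, vec)
        self.pool.append(entry)
        self._pending = entry
        return None


clusterer = SequenceClusterer(s_th=0.5)
for item in ["tokyo massage", "firefox plugins browser", "firefox browser customize",
             "massage tokyo shop", "tokyo massage spa"]:
    clusterer.add(item)
print(clusterer.members)
```

In this run the first item stays in the pool until a later pair of similar items creates a matching cluster, at which point it is assigned retroactively.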

14 H2S2D Algorithm Evaluation Results (Reuters collection)
               H2S2D                ECM
S_th   Items   Acc    P      F      Acc    P      F
0.05   500     0.78   0.56   0.62   0.80   0.50   0.56
       1000    0.83   0.50   0.58   0.81   0.47   0.49
       2000    0.86   0.52          0.82   0.50   0.52
       …
       7000    0.89   0.48   0.49   0.85   0.47   0.49
       8000    0.89   0.49   0.50   0.86   0.47   0.50
0.1    500     0.95   0.61   0.64   0.89   0.61   0.57
       1000    0.90   0.64   0.60   0.85   0.52   0.45
       2000    0.90   0.65   0.54   0.85   0.54   0.44
       …
       7000    0.95   0.56          0.85   0.58   0.39
       8000    0.95   0.57          0.86   0.60   0.38
0.2    500     0.99   0.95   0.94   0.84   0.76   0.30
       1000    0.91   0.93   0.89   0.85   0.73   0.30
       2000    0.91   0.82   0.79   0.85   0.71   0.26
       …
       7000    0.93   0.81   0.70   0.84   0.67   0.20
       8000    0.93   0.81   0.70   0.83   0.68   0.19

15 H2S2D Algorithm Evaluation Results (Reuters collection)
Number of clusters created after n items are processed
        S_th = 0.05     S_th = 0.1      S_th = 0.2
Items   H2S2D   ECM     H2S2D   ECM     H2S2D   ECM
500     5       4       8       10      5       48
1000    6       4       11      13      10      61
2000    9       4       13      14      11      74
3000    9       4       14      16      12      84
4000    11      5       20      21      20      93
5000    13      6       22              21      102
6000    14      6       24      23      22      113
7000    14      6       25              24      117
8000    14      6       25      27      24      123

16 H2S2D Algorithm Evaluation Results (Reuters collection) Ratio of candidate items to the number of processed items

17 Role and Position of User Profile in BESS

18 Interest-change-driven Profile Construction
Construction criteria: recency, frequency, persistency

19 Profile Structure l – layer, k – concept number

20 Profile Construction: Session Layer
The most recently created or updated concept from C_a = {C_a1, …, C_an} of user a. Criterion: recency.

21 Profile Construction: Short-term Layer
The m most frequently updated and used concepts, chosen from the r most recent (top) concepts in the concept recency list. Criteria: recency and frequency.

22 Profile Construction: Long-term Layer
Derived from the concepts of the short-term layer that were most frequently observed as components of the short-term layer. Criterion: persistency.
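The three layers can be sketched in Python as below. This is only an illustration of the recency, frequency, and persistency criteria: the slides name the parameters r and m but do not say how updates are counted or how large the long-term layer is, so the counting scheme, the parameter `n`, and the names `LayeredProfile` and `touch` are assumptions.

```python
from collections import Counter


class LayeredProfile:
    """Session, short-term and long-term layers over concept ids (sketch)."""

    def __init__(self, r=5, m=3, n=2):
        self.r = r                        # size of the recency window
        self.m = m                        # size of the short-term layer
        self.n = n                        # size of the long-term layer (assumed)
        self.recency = []                 # concept ids, most recent first
        self.frequency = Counter()        # how often each concept was updated/used
        self.short_term_hits = Counter()  # how often each concept was in the short-term layer

    def touch(self, concept_id):
        """Record that a concept was created, updated or used."""
        if concept_id in self.recency:
            self.recency.remove(concept_id)
        self.recency.insert(0, concept_id)
        self.frequency[concept_id] += 1
        # Persistency: count every appearance in the short-term layer.
        for cid in self.short_term():
            self.short_term_hits[cid] += 1

    def session(self):
        """Latest created or updated concept (recency)."""
        return self.recency[0] if self.recency else None

    def short_term(self):
        """The m most frequent concepts among the r most recent ones."""
        recent = self.recency[: self.r]
        # sorted() is stable, so ties keep their recency order.
        ranked = sorted(recent, key=lambda c: -self.frequency[c])
        return ranked[: self.m]

    def long_term(self):
        """Concepts seen most often in the short-term layer (persistency)."""
        return [c for c, _ in self.short_term_hits.most_common(self.n)]


profile = LayeredProfile(r=3, m=2, n=1)
for concept in ["firefox", "travel", "firefox", "news", "massage", "firefox", "news"]:
    profile.touch(concept)
print(profile.session(), profile.short_term(), profile.long_term())
# news ['firefox', 'news'] ['firefox']
```

Counting short-term membership on every update is one simple way to measure persistency; a time-windowed count would be an equally plausible choice.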

23 Profile Construction: An example (1). Short-term layer
5                  1              8         7              6           3        4
Firefox            real estate    software  Japanese news  conference  massage  travel
customize          square meters  computer  company        workshop    Tokyo    tour
plugins            area                     business       mobile               Malta
colored scrollbar                           politics       multimedia           ticket
browsers                                    finance        IR                   hotel

24 Profile Construction: An example (2). Long-term layer
5                  1              8         7              6           3        4
Firefox            real estate    software  Japanese news  conference  massage  travel
customize          square meters  computer  company        workshop    Tokyo    tour
plugins            area                     business       mobile               Malta
colored scrollbar                           politics       multimedia           ticket
browsers                                    finance        IR                   hotel

25 Conclusions (1)
* To implement user-centric services, knowledge about a user’s information need (IN) is needed.
* IN is something that cannot be captured (at least with today’s advances in the human sciences).
* An attempt can be made to obtain “fragmentary” context to further facilitate a user’s activities.

26 Conclusions (2)
* We proposed a multi-layered user modelling approach to dynamically organise and update a user’s contextual information according to its volatility and persistency characteristics.
* We proposed the High Similarity Sequence Data-Driven (H2S2D) clustering algorithm for concept construction. In spite of its relative simplicity, the method has demonstrated reasonably good clustering results in terms of accuracy and precision, and has proved suitable for fast real-time relevance feedback processing that keeps concepts up to date.

