1
Contextual Wisdom: Social Relations and Correlations for Multimedia Event Annotation. Amit Zunjarwad, Hari Sundaram and Lexing Xie
2
I don’t want to spend time annotating :( help! June 2, 2015 @NUS
3
Talk Outline: Observations. Events. Generalization: Sum of Partial Observations. Similarity, Co-Occurrence and Trust. Experiments: compare against SVM. Conclusions.
4
An Annotation Puzzle
5
Observing Flickr Data
6
Learnability: The pool statistics reveal a power law distribution. Less than 11% of the tags have more than 10 photos. There are not enough instances to learn most of the concepts! The global Flickr pool is interesting:
7
Learnability
8
Learnability: The pool statistics reveal a power law distribution. Less than 11% of the tags have more than 10 photos. There are not enough instances to learn most of the concepts! The global Flickr pool is interesting: most of the tags have over 100 instances, and photos reveal very high visual diversity. The power law is a fundamental property of online networks and cannot be wished away.
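The "fraction of tags with more than N photos" statistic can be computed directly from per-photo tag lists. The following is a minimal sketch on a hypothetical toy pool; the function name, the toy data and the threshold are illustrative assumptions, not the Flickr data from the talk.

```python
from collections import Counter


def fraction_of_tags_above(photo_tags, min_photos):
    """Fraction of distinct tags that are attached to more than `min_photos` photos."""
    # Count each tag at most once per photo.
    counts = Counter(tag for tags in photo_tags for tag in set(tags))
    if not counts:
        return 0.0
    frequent = sum(1 for n in counts.values() if n > min_photos)
    return frequent / len(counts)


# Hypothetical toy pool: one popular tag plus twelve tags used once each.
toy_pool = [["singapore", f"rare{i}"] for i in range(12)]
share = fraction_of_tags_above(toy_pool, min_photos=10)
print(f"{share:.3f} of tags have more than 10 photos")
```

In a long-tailed pool like this one, only the single popular tag clears the threshold, which is the learnability problem the slide describes.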
9
Scalability: Singapore People Walking Orchard rd. After MRT Experimenting Walking Day Outdoor..
10
The Role of context: The assumption of consensual semantics. Search for “yamagata”.
11
What if the answer didn’t completely lie in the pixels?
12
Events: What are they?
13
Defining Events: An event refers to a real-world occurrence, spread over space and time. Observations form event metadata [Westermann / Jain 2007]. Images / text / sounds describe events. Diagram labels: when, where, who, what, author, image.
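One way to picture an event and its observations is as a small record with one field per facet. This is only an illustrative sketch: the class name, field names and example values are assumptions based on the diagram labels, not the authors' data model.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical event record; each facet holds free-text tags or media ids."""
    author: str
    who: list = field(default_factory=list)
    where: list = field(default_factory=list)
    when: list = field(default_factory=list)    # text tags such as "summer"
    what: list = field(default_factory=list)
    images: list = field(default_factory=list)  # image identifiers

    def facets(self):
        """Return the facets as a dict keyed by facet name."""
        return {
            "who": self.who,
            "where": self.where,
            "when": self.when,
            "what": self.what,
            "image": self.images,
        }


trip = Event(author="alice", who=["bob"], where=["orchard rd"],
             when=["summer"], what=["walking"], images=["img_001.jpg"])
print(trip.facets()["when"])
```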
14
Context [Mani and Sundaram 2007]: Event context refers to the set of attributes that help in understanding the semantics: Images / Who / Where / When / What / Why / How. Context is always application dependent. In the ubiquitous computing community, location, identity and time are the main considerations.
15
Four Problems: Event archival: events involve people, places and artifacts. Exploit different forms of knowledge. (Global) Similarity: media, events, people. (Personal) Co-occurrence: what are the joint statistics of occurrence? (Social) Trust: whom to trust for effective annotation?
16
Similarity: Global, Systemic knowledge
17
Event similarity: A bottom-up approach. Edge, color and texture histograms for images. Rely on ConceptNet for text tags. Why ConceptNet and not WordNet? It expands on pure lexical terms to compound terms (“buy food”). It expands the number of relations from three to twenty. It contains practical knowledge: we can infer that a student is near a library.
18
A base similarity measure: ConceptNet provides three functions. GetContext(node): the neighborhood of the concept “book” includes “knowledge”, “library”. GetAnalogousConcepts(node): concepts that share incoming relations; analogous concepts for the concept “people” are “human”, “person”, “man”. FindPathsBetweenNodes(node1, node2): returns a set of paths. Our similarity measure is built using these functions.
19
Concept similarity: The similarity between two concepts (e, f) is defined from three terms: context, analogous and path based. We currently use a uniform weighting on all three as the composite measure.
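The transcript does not preserve the formula, so the following is only a sketch of what a uniformly weighted three-term measure could look like. The Jaccard overlap for the context and analogous terms, the inverse-path-length term, and all function and parameter names are assumptions for illustration; the inputs stand in for the outputs of the three ConceptNet functions.

```python
def jaccard(a, b):
    """Set overlap in [0, 1]; 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def concept_similarity(context_e, context_f, analogous_e, analogous_f,
                       path_lengths, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Composite similarity of two concepts under the assumptions stated above."""
    s_context = jaccard(context_e, context_f)        # shared neighborhood
    s_analogous = jaccard(analogous_e, analogous_f)  # shared analogous concepts
    # Shorter connecting paths give a larger score; no path gives 0.
    s_path = 1.0 / (1.0 + min(path_lengths)) if path_lengths else 0.0
    w_c, w_a, w_p = weights
    return w_c * s_context + w_a * s_analogous + w_p * s_path


score = concept_similarity(
    context_e={"knowledge", "library"}, context_f={"library", "student"},
    analogous_e={"novel"}, analogous_f={"novel"},
    path_lengths=[2, 3],
)
print(round(score, 4))
```

The default `weights` tuple encodes the uniform weighting mentioned on the slide; non-uniform weights can be passed in without changing the function.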
20
Computing similarity between sets: The distance between two concept sets A and B is a modified Hausdorff similarity.
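The slide does not spell out the modification. A common averaged variant of the Hausdorff measure, adapted to similarities, is sketched below as an assumption: for each element take its best match in the other set, average those best matches, and keep the weaker of the two directions. The function names and the toy exact-match similarity are illustrative.

```python
def directed_similarity(set_a, set_b, sim):
    """Average, over elements of set_a, of the best match found in set_b."""
    return sum(max(sim(a, b) for b in set_b) for a in set_a) / len(set_a)


def modified_hausdorff_similarity(set_a, set_b, sim):
    """Symmetric set similarity: the weaker of the two directed averages."""
    if not set_a or not set_b:
        return 0.0
    return min(directed_similarity(set_a, set_b, sim),
               directed_similarity(set_b, set_a, sim))


def exact_match(x, y):
    """Toy concept similarity: 1.0 for identical tags, else 0.0."""
    return 1.0 if x == y else 0.0


set_sim = modified_hausdorff_similarity(["fun", "beach"], ["fun"], exact_match)
print(set_sim)
```

Averaging instead of taking the single worst match makes the measure less sensitive to one outlying tag, which matters when tag sets are small and noisy.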
21
Facet similarity (4w): Similarity between facets is computed using a weighted sum of frequency and the concept similarity measure. Time distance is based on text tags, not actual time data, which allows for temporal descriptions such as “summer”, “holidays” etc. Only frequency is used for the “who” facet.
22
Image facet similarity: Color, texture and edge features are computed: a 166-bin HSV color histogram, a 71-bin edge histogram and 3 texture features. Euclidean distance is used on the composite feature vector. The distance between two events is then a weighted sum of distances across all event facets.
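The two steps on this slide can be sketched as follows: concatenate the three image descriptors into one 240-dimensional vector (166 + 71 + 3) and take the Euclidean distance, then combine per-facet distances with a weighted sum. Feature extraction is not shown; the function names, the uniform facet weights and the toy values are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def image_distance(features_a, features_b):
    """Each argument is (color_hist, edge_hist, texture); compare the concatenation."""
    composite_a = [x for part in features_a for x in part]
    composite_b = [x for part in features_b for x in part]
    return euclidean(composite_a, composite_b)


def event_distance(facet_distances, weights):
    """Weighted sum of per-facet distances (who, where, when, what, image)."""
    return sum(weights[name] * facet_distances[name] for name in facet_distances)


# Toy descriptors with the bin counts from the slide.
img_a = ([0.0] * 166, [0.0] * 71, [0.0] * 3)
img_b = ([0.0] * 165 + [3.0], [0.0] * 70 + [4.0], [0.0] * 3)
d_img = image_distance(img_a, img_b)

uniform = {name: 0.2 for name in ("who", "where", "when", "what", "image")}
d_evt = event_distance(
    {"who": 0.0, "where": 0.5, "when": 0.5, "what": 1.0, "image": 0.5}, uniform)
print(d_img, d_evt)
```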
23
The global similarity matrix M_s
24
Co-occurrence: Personal, statistical knowledge
25
Statistics are computed per person: The concept co-occurrences are just frequency counts. If (i = fun, j = new york), then the index (i, j) contains the number of occurrences of this tuple. Notes: each concept is given a globally unique index; co-occurrence matrices are locally compact; each user k has an associated co-occurrence matrix M_c^k.
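A per-user co-occurrence matrix of this kind can be kept as a sparse dictionary keyed by pairs of global concept indices, which stays compact because each user only touches a small part of the vocabulary. The class and function names below are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


class ConceptIndex:
    """Assigns each concept a globally unique integer index on first sight."""

    def __init__(self):
        self._ids = {}

    def id_of(self, concept):
        return self._ids.setdefault(concept, len(self._ids))


def cooccurrence_matrix(user_events, index):
    """Sparse symmetric counts: entry (i, j) is how often concepts i and j co-occur."""
    counts = defaultdict(int)
    for concepts in user_events:
        ids = sorted({index.id_of(c) for c in concepts})
        for i, j in combinations(ids, 2):
            counts[(i, j)] += 1
            counts[(j, i)] += 1
    return counts


index = ConceptIndex()
m_user = cooccurrence_matrix(
    [["fun", "new york"], ["fun", "new york", "friends"]], index)
i, j = index.id_of("fun"), index.id_of("new york")
print(m_user[(i, j)])
```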
26
Trust: People we like
27
Activity based trust: Narrow understanding of “trust”; the a priori value is important. Computing trust: compute event-event similarity; trust propagation; biased PageRank algorithm. Trust vectors are row normalized.
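Biased PageRank is standard PageRank with the uniform restart replaced by a prior that favors chosen nodes. The sketch below propagates trust over a user-to-user similarity matrix by power iteration; how the authors derive that matrix from event-event similarity is not given here, so the input matrix, the damping value and the names are assumptions.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1; an all-zero row becomes uniform."""
    normalized = []
    for row in matrix:
        total = sum(row)
        n = len(row)
        normalized.append([x / total for x in row] if total > 0 else [1.0 / n] * n)
    return normalized


def biased_pagerank(similarity, bias, damping=0.85, iterations=100):
    """Power iteration where the restart distribution is the normalized bias."""
    transition = row_normalize(similarity)
    n = len(transition)
    bias_total = sum(bias)
    prior = [b / bias_total for b in bias]
    trust = prior[:]
    for _ in range(iterations):
        trust = [
            (1 - damping) * prior[j]
            + damping * sum(trust[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return trust


# Three users with symmetric activity similarity; the prior favors user 0.
user_similarity = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
trust = biased_pagerank(user_similarity, bias=[1, 0, 0])
print([round(t, 3) for t in trust])
```

Because the transition matrix is row-stochastic and the prior sums to one, the resulting trust vector also sums to one.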
28
The recommendation algorithm
29
A review of what we know: The framework is event centric. We know: similarity (global), co-occurrence (personal), trust vectors (social). How to combine the three?
30
Details: 1. Compute the social network trust vector (t) for the current user. 2. Compute the trusted, global co-occurrence matrix, for all tuples. 3. Iterate. Diagram labels: who, where, what, when, image, event, query.
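Step 2 can be read as a trust-weighted sum of the per-user co-occurrence matrices. That reading is an assumption, since the transcript does not include the formula; the sketch below uses hypothetical user names and sparse dictionaries keyed by concept pairs.

```python
from collections import defaultdict


def trusted_cooccurrence(trust, user_matrices):
    """Combine per-user co-occurrence counts, weighting each user by trust."""
    combined = defaultdict(float)
    for user, weight in trust.items():
        for pair, count in user_matrices.get(user, {}).items():
            combined[pair] += weight * count
    return dict(combined)


# Hypothetical trust vector for the current user and two per-user matrices.
trust_vector = {"alice": 0.75, "bob": 0.25}
matrices = {
    "alice": {("fun", "new york"): 4},
    "bob": {("fun", "new york"): 8, ("fun", "beach"): 4},
}
m_trusted = trusted_cooccurrence(trust_vector, matrices)
print(m_trusted[("fun", "new york")])
```

With a uniform trust vector this reduces to a plain average over users, which corresponds to the global case in the experiments.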
31
Experiments
32
Details: Developed an event-based archival system. 8 graduate students; 58 events, 250 images, over two weeks. SVM as the baseline comparison. Two cases: uniform trust (global) and personal trust.
33
SVM training: Training is difficult because the pool is very small. Modified bagging strategy: train five symmetric classifiers and pick the one that maximizes the F-score.
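The selection step of this strategy can be sketched without any training code: given held-out labels and the predictions of five candidate classifiers, compute each F-score and keep the best. The predictions below are hypothetical stand-ins for the outputs of the five SVMs, and the function names are assumptions.

```python
def f_score(truth, predicted):
    """F1 score for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pick_best(truth, candidate_predictions):
    """Return (index, score) of the candidate with the highest F-score."""
    scores = [f_score(truth, p) for p in candidate_predictions]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


held_out = [1, 1, 0, 0]
candidates = [
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]
best_index, best_score = pick_best(held_out, candidates)
print(best_index, best_score)
```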
34
Uniform trust: Global case: 31 classifiers (who: 8, when: 6, where: 10, what: 7). Minimum number of images: 10. Tested on 50 images (why?).
Facets   SVM H   SVM M   SVM X   SVM U   CM (uniform) H   CM (uniform) M
Who      13      23      5       9       22               28
When     11      20      6       13      24               26
Where    12      19      3       16      23               27
What     13      21      8       8       31               19
Event    1012226 28
H: Hits. M: Misses. X: Unknown. U: Undecidable.
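Hit rates make the comparison easier to read. The sketch below assumes each flattened row splits into H, M, X, U for the SVM followed by H, M for the co-occurrence method (CM), a split under which every row totals the 50 test images. The Event row is left out because its digits do not split unambiguously.

```python
# (svm_hits, svm_misses, svm_unknown, svm_undecidable, cm_hits, cm_misses)
# Values are read from the slide table under the column split assumed above.
UNIFORM_TRUST_COUNTS = {
    "Who": (13, 23, 5, 9, 22, 28),
    "When": (11, 20, 6, 13, 24, 26),
    "Where": (12, 19, 3, 16, 23, 27),
    "What": (13, 21, 8, 8, 31, 19),
}


def hit_rates(counts):
    """Map each facet to (SVM hit rate, CM hit rate)."""
    rates = {}
    for facet, (s_h, s_m, s_x, s_u, c_h, c_m) in counts.items():
        svm_total = s_h + s_m + s_x + s_u
        cm_total = c_h + c_m
        rates[facet] = (s_h / svm_total, c_h / cm_total)
    return rates


rates = hit_rates(UNIFORM_TRUST_COUNTS)
for facet, (svm_rate, cm_rate) in rates.items():
    print(f"{facet}: SVM {svm_rate:.2f}, CM {cm_rate:.2f}")
```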
35
Personal Network: Trained classifiers per person. Very small pool. Minimum images: 5. 28 classifiers (who: 9, when: 4, where: 6, what: 9).
Facets   SVM H   SVM M   SVM X   SVM U   CM (network) H   CM (network) M
Who      458162 18367
When     51      96      73      30      167              83
Where    62      76      59      53      179              71
What     72      89      23      66      204              46
Events   0       0       250     0       153              97
H: Hits. M: Misses. X: Unknown. U: Undecidable.
36
Positive examples: SVM: ‘sky diving’. Social network based method: ‘fun’.
37
The Sum of Partial Observations: Beyond web 2.0 hype
38
Experiential fragments: Which media object summarizes “my trip to Singapore?”
39
A reconsideration of a traditional idea
40
The Creation of participatory knowledge
41
Conclusions
42
Summary: An event based annotation system. Media are event metadata. Issues: learnability, scalability, context. Employ three kinds of knowledge: Global (ConceptNet, image similarity), Personal (statistical co-occurrence), Social (trust). Recommendations: employ iterative schemes (HITS / PageRank). Results: outperform SVM in small pools.
43
Conclusions: Power law tag distribution: the data pool will remain small for most tags, which is a fundamental issue. Participatory knowledge is powerful; trust within context is an important issue. Future work: careful mathematical analysis of the coupling equations; event structure / relationships need to be incorporated; multi-source (email / Calendar / IM / blogs) integration.
44
Thanks! Esp. Dick Bulterman, Mohan