Ying Dai Faculty of software and information science,

Image/Video’s Semantic Assignment Using Bidirectional Associative Memories
Ying Dai Faculty of software and information science, Iwate Pref. University 2019/2/23

Outline
Background
Class-based image/key-frame representation
Semantic similarity
Visual similarity
Associative values with pre-defined classes
Calculation of SS, VS and associative values
Bidirectional associative memories
Image/key-frame retrieval
Experiments and analysis

Background
More and more people express themselves by sharing images and videos online.
It is still hard to manipulate, index, or search online images and videos.
The objectives of our research focus on:
understanding the semantics of images and videos by machine;
representing the semantics of images and videos in a form intuitive to humans.
With technological advances in digital imaging, networking, and data storage, more and more people communicate with one another and express themselves by sharing images, video, and other forms of media online. However, it is still hard to manipulate, index, filter, summarize, or search through this media because of the semantic gap between user and machine: for many queries, visual similarity does not correlate strongly with human similarity judgments. A machine retrieves feature-similar content, whereas a human retrieves semantically similar content first and considers color and structural similarity afterwards. Developing systems that can understand images and represent their content in a form intuitive to humans has therefore become one of the most motivating problems in the field of large-scale image/video retrieval.

Category similarity correlation of image/key-frame (1)
Images/key-frames are described in different domains, such as temporal, spatial, impression, nature vs. man-made, human vs. non-human, copyright, etc.
The classes into which a domain is divided can be inter-correlated or intra-correlated.
The correlation is measured by two indices: semantic similarity (SS) and visual similarity (VS).
Because the concepts of images and videos in many domains are imprecise, and the interpretation of "finding similar images/videos" is ambiguous and subjective at the level of human perception, we define semantic categories for images and key-frames, together with the tolerance degrees between them. Images are described in different domains. Within a given domain, concepts are divided into classes. A class may be associated with another class in the same domain; this relation is called intra-association. A class may also be associated with a class in a different domain; this relation is called inter-association.

Category similarity correlation of image/key-frame (2)
People judge image similarity first by semantic similarity (SS) and then by visual similarity (VS). The category similarity correlation therefore indexes these two factors:
SS index: generated from the knowledge of subjects
VS index: calculated from the visual features of the learned samples
Both indices lie in the range [0, 1].

Generation of SS correlation value
Class co-occurrence matrix: each entry is based on the number of images that are assigned to both class i and class j.
Semantic similarity degree between two classes
Semantic similarity correlation index
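The counting step above can be sketched as follows. This is a minimal illustration, not the author's formula: the normalization of counts into a similarity degree (co-occurrence count divided by the smaller of the two class counts) is an assumption, and the names `cooccurrence_matrix` and `semantic_similarity` are hypothetical.

```python
from itertools import combinations


def cooccurrence_matrix(labels, num_classes):
    """Count, for every class pair (i, j), the images assigned to both.

    labels: list of sets, one set of class indices per image.
    The diagonal entry c[i][i] is the number of images assigned to class i.
    """
    c = [[0] * num_classes for _ in range(num_classes)]
    for assigned in labels:
        for i in assigned:
            c[i][i] += 1
        for i, j in combinations(sorted(assigned), 2):
            c[i][j] += 1
            c[j][i] += 1
    return c


def semantic_similarity(c):
    """Normalize co-occurrence counts into [0, 1].

    Assumption: SS(i, j) = c[i][j] / min(c[i][i], c[j][j]).
    The normalization used in the original work is not given in the source.
    """
    n = len(c)
    ss = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = min(c[i][i], c[j][j])
            ss[i][j] = c[i][j] / denom if denom else 0.0
    return ss


# Toy annotations: four images, three classes (0, 1, 2).
labels = [{0, 1}, {0}, {1, 2}, {0, 1}]
counts = cooccurrence_matrix(labels, 3)
ss = semantic_similarity(counts)
```

With this normalization every class has similarity 1 with itself, and two classes that never share an image have similarity 0, which keeps the index in [0, 1].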

Generation of VS correlation value
Bidirectional associative memories (BAM, by B. Kosko)
Associate two patterns (X, Y) such that when one is encountered, the other can be recalled.
The associated pattern pairs are stored in a connection weight matrix.
[Figure: BAM network with X-layer units X1 ... Xm, a Y layer, and connection weights W11 ... Wmn]

Bidirectional associative memories
Input and output in the X layer and Y layer
The energy of the BAM decreases or remains the same after each unit update.
The BAM eventually converges to a local minimum that corresponds to a stored associated pattern pair.
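The storage and recall behavior described above can be illustrated with a small discrete BAM. This is a sketch of the textbook Kosko formulation with bipolar (+1/-1) patterns and Hebbian outer-product weights; it is not the system used in the presentation, and the toy patterns and function names are made up for illustration.

```python
def sign(value, previous):
    """Bipolar threshold; keep the previous state when the net input is 0."""
    if value > 0:
        return 1
    if value < 0:
        return -1
    return previous


def train_bam(pairs):
    """Hebbian weight matrix: W = sum over stored pairs of outer(x, y)."""
    m, n = len(pairs[0][0]), len(pairs[0][1])
    w = [[0] * n for _ in range(m)]
    for x, y in pairs:
        for a in range(m):
            for b in range(n):
                w[a][b] += x[a] * y[b]
    return w


def recall(w, x, max_iters=20):
    """Alternate X -> Y and Y -> X updates until the pair stops changing."""
    m, n = len(w), len(w[0])
    x = list(x)
    y = [1] * n  # arbitrary initial Y state
    for _ in range(max_iters):
        new_y = [sign(sum(x[a] * w[a][b] for a in range(m)), y[b])
                 for b in range(n)]
        new_x = [sign(sum(w[a][b] * new_y[b] for b in range(n)), x[a])
                 for a in range(m)]
        if new_x == x and new_y == y:
            break  # stable state reached
        x, y = new_x, new_y
    return x, y


# Two toy pattern pairs (X has 6 units, Y has 4 units).
x1, y1 = [1, -1, 1, -1, 1, -1], [1, 1, -1, -1]
x2, y2 = [1, 1, 1, -1, -1, -1], [1, -1, 1, -1]
w = train_bam([(x1, y1), (x2, y2)])

# A copy of x1 with its first unit flipped still settles on the pair (x1, y1).
noisy = [-1, -1, 1, -1, 1, -1]
recalled_x, recalled_y = recall(w, noisy)
```

The loop stops as soon as a full X-to-Y-to-X pass leaves both layers unchanged, which is the stable state the slide refers to as a local energy minimum.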

Stored pattern images of X layer (examples, nature vs. man-made domain)
Football (P1), Landscape (P2), Flower (P3), Tree (P4), Flower (painting) (P5), Food (P14), Building (P6), Restaurant (P8), Wood (P10), Paper (P11)

Stored patterns of Y layer
Encoding units
Particular unit outputs (recall values)
The stored patterns of the Y layer are vectors whose components represent the encoding units of the classes. Each vector has I components when I classes are pre-defined.

VS correlation index
Let a pattern image represent each class, with I classes pre-defined. For class i, represented by pattern image i, the VS index is defined with respect to each class j.

Some examples of VS correlation value
1 0.25 0.74 0.16 0.33 0.53 0.37 0.22 0.5 0.08 0.31 2 0.21 0.28 0.24 0.02 0.2 0.14 3 0.36 0.38 0.05 0.82 0.59 0.19 0.35 0.61 0.23 4 0.26 0.3 0.94 0.48 0.15 0.09 5 0.17 0.43 0.92 0.32 0.69 6 0.34 0.77 0.66 0.8 0.11 0.65 0.01 0.52 0.06 7 0.76 0.27 0.46 0.12 0.58 0.41 0.56 8 0.1 0.57 0.39 9 0.51 0.67 0.07 10 0.03 0.84 0.18 0.13 0.68 11 0.63 0.79 12 13 0.62 0.44 0.47 14 0.81 0.87 15 0.73 0.04 ……

Image single-class assignment
As in the calculation of VS correlation values, bidirectional associative memories are used to assign an image to a single class.
Assignment criteria (L classes being pre-defined)

Class-based image/key-frame representation
Each image/key-frame is represented by a vector of associative values with category i regarding domain k, where
Ws, Wv: weights adjusting the contributions of SS and VS in generating the associative values
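One way to read the role of Ws and Wv is as the coefficients of a weighted combination of the two indices. The sketch below assumes exactly that, and additionally assumes the weights sum to 1 so the result stays in [0, 1]; the actual formula did not survive in this transcript, so the function `associative_values` and its defaults are hypothetical.

```python
def associative_values(ss, vs, w_s=0.5, w_v=0.5):
    """Combine per-class SS and VS indices into associative values.

    Assumption: a convex combination w_s * SS + w_v * VS with
    w_s + w_v = 1. This is an illustrative form, not the source's formula.
    """
    if abs(w_s + w_v - 1.0) > 1e-9:
        raise ValueError("w_s and w_v are assumed to sum to 1")
    if len(ss) != len(vs):
        raise ValueError("ss and vs must cover the same classes")
    return [w_s * s + w_v * v for s, v in zip(ss, vs)]


# Hypothetical indices of one image against three pre-defined classes.
ss_index = [0.8, 0.2, 0.0]
vs_index = [0.6, 0.4, 0.1]
vector = associative_values(ss_index, vs_index, w_s=0.5, w_v=0.5)
```

Raising `w_s` relative to `w_v` makes the representation lean on subject knowledge; lowering it makes visual features dominate.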

Data structure for representing image/video
ID  type   associative values …   Path1   Path2   Path3
1   image  0.2  0.6  0.7          Path11
2   video  0.3  0.8               Path12  Path22  Path23
k: domain number
i: category number
Path1: location of the image/key-frame
Path2: location of the shot
Path3: location of the video
ID: image/key-frame number
Associative value: the value with category i regarding domain k
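The table above maps naturally onto a small record type. The following is a sketch under assumptions: the class name `MediaRecord`, its field names, and the (domain, category) keys attached to the sample values are all hypothetical, since the slide does not say which domain and category each value belongs to.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class MediaRecord:
    """One row of the table: an image or a video key-frame."""
    record_id: int                   # ID: image/key-frame number
    media_type: str                  # "image" or "video"
    # (domain k, category i) -> associative value in [0, 1]
    assoc: Dict[Tuple[int, int], float] = field(default_factory=dict)
    path1: Optional[str] = None      # location of the image/key-frame
    path2: Optional[str] = None      # location of the shot (video only)
    path3: Optional[str] = None      # location of the video (video only)


# The two sample rows; the (k, i) keys are placeholders for illustration.
records = [
    MediaRecord(1, "image", {(1, 1): 0.2, (1, 2): 0.6, (2, 1): 0.7},
                path1="Path11"),
    MediaRecord(2, "video", {(1, 1): 0.3, (1, 2): 0.8},
                path1="Path12", path2="Path22", path3="Path23"),
]
```

Keeping the shot and video locations optional lets still images and video key-frames share one table, as in the slide.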

Image/key-frame retrieval
For image categorization within a single domain
For image categorization across domains
For image categorization within a single domain, images are grouped into class i when their associative values with class i are larger than a clustering threshold. For image categorization across domains, the grouped images are the intersection of those that belong to class i in domain k and those that belong to class j in domain l.
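The two grouping rules can be sketched with plain set operations. The layout of `assoc` (image ID mapped to a dictionary keyed by (domain, category)), the function names, and the sample values are assumptions made for illustration.

```python
def retrieve_single_domain(assoc, domain, category, threshold):
    """IDs whose associative value with (domain, category) exceeds threshold."""
    return {
        image_id
        for image_id, values in assoc.items()
        if values.get((domain, category), 0.0) > threshold
    }


def retrieve_cross_domain(assoc, first, second, threshold):
    """Intersection of the groups for two (domain, category) pairs."""
    group_a = retrieve_single_domain(assoc, first[0], first[1], threshold)
    group_b = retrieve_single_domain(assoc, second[0], second[1], threshold)
    return group_a & group_b


# Hypothetical associative values: image ID -> {(domain k, category i): value}.
assoc = {
    "img1": {(1, 1): 0.9, (2, 3): 0.7},
    "img2": {(1, 1): 0.8, (2, 3): 0.2},
    "img3": {(1, 1): 0.1, (2, 3): 0.9},
}

single = retrieve_single_domain(assoc, 1, 1, threshold=0.5)
cross = retrieve_cross_domain(assoc, (1, 1), (2, 3), threshold=0.5)
```

Here `single` contains the two images strongly associated with category 1 of domain 1, and `cross` keeps only the image that also passes the threshold for category 3 of domain 2.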

Performance evaluation (1)
Influence of varying the number of defined classes
Precision-recall for the class "building"
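For reference, precision and recall for one class can be computed from the retrieved set and the ground-truth set using the standard definitions. The sketch below uses made-up IDs; it does not reproduce any result from the experiments.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: set of IDs returned by the system for the class.
    relevant: set of IDs that truly belong to the class.
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical retrieval result for the class "building".
retrieved = {"a", "b", "c", "d"}
relevant = {"a", "b", "e"}
p, r = precision_recall(retrieved, relevant)
```

Sweeping the clustering threshold and recording one (precision, recall) point per setting yields a precision-recall curve of the kind these slides compare.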

Influence of varying the resolution of the stored pattern images
Precision-recall for the class "building" when 30 classes were pre-defined

Influence of varying the weight values
Precision-recall for the class "building" when 30 classes were pre-defined

Conclusion & Future work
Representing the semantics of images by their associative values with pre-defined classes is effective for image retrieval.
Generating associative values by considering the two factors of SS and VS is effective.
Bidirectional associative memories are efficient in generating associative values.
Future work: evaluating the influence of increasing the number of defined classes on precision-recall.
