Ying Dai Faculty of software and information science,

Presentation transcript:

Image/Video’s Automatic Annotation Considering Semantics’ Tolerance Relation Ying Dai Faculty of software and information science, Iwate Pref. University 2019/5/2

Outline: background; class-based image/key-frame representation; semantic tolerance relation; semantic similarity; visual similarity; associative values with pre-defined classes; calculation of SS, VS and associative values; bidirectional associative memories; image/key-frame retrieval; experiments and analysis

Background
More and more people express themselves by sharing images and videos online, yet it is still hard to manipulate, index, or search online images and videos. Our research has two objectives: having machines understand the semantics of images and videos, and representing those semantics in a form intuitive to humans.
With technological advances in digital imaging, networking, and data storage, more and more people communicate with one another and express themselves by sharing images, video, and other forms of media online. However, it is still hard to manipulate, index, filter, summarize, or search through this media because of the semantic gap between user and machine: for many queries, visual similarity does not correlate strongly with human similarity judgments. A machine retrieves feature-similar content, whereas a human retrieves semantically similar content first and considers color and structural similarity second. Developing systems that can understand images and represent their content in a form intuitive to humans has therefore become one of the most motivating problems in the field of large-scale image/video retrieval.

Semantic tolerance relation of image/key-frame (1)
Images and key-frames are described in different domains, such as temporal, spatial, impression, nature vs. man-made, human vs. non-human, copyright, etc.
The classes into which a domain is divided can be inter-tolerated or intra-tolerated.
The tolerance degree is measured by two indices: semantic similarity (SS) and visual similarity (VS).
Because the concepts of image/video in many domains are imprecise, and the interpretation of what counts as a similar image/video is ambiguous and subjective at the level of human perception, we define semantic categories of images and key-frames together with the tolerance degrees between them. Images are described in different domains, and within a given domain the concepts are divided into classes. A class may be associated with another class in the same domain; this relation is called intra-association. A class may also be associated with a class in a different domain; this relation is called inter-association.

Semantic tolerance relation of image/key-frame (2)
The relation is based on the fact that people judge image similarity first by semantic similarity (SS) and then by visual similarity (VS).
The semantic tolerance relation index (STRI) of class i regarding dimension k to class j regarding dimension l combines these two factors:
SS index: generated from the knowledge of subjects
VS index: calculated from the visual features of the learned samples
All of these indices lie in the range [0,1].

Calculation of SS index
Class co-occurrence matrix: each entry records the number of images that are assigned to both class i and class j.
From this matrix, the semantic similarity degree between two classes is computed, and from that degree the semantic similarity tolerance index.
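The co-occurrence counting described above can be sketched as follows. The function names and the sample labels are hypothetical, and the normalization in `similarity_degree` (shared images divided by images in either class) is an assumption chosen only because it keeps the result in [0,1]; the slide's own formula is not reproduced in the transcript.

```python
from itertools import combinations


def cooccurrence_matrix(labels_per_image, num_classes):
    """C[i][j] = number of images assigned to both class i and class j.

    The diagonal C[i][i] counts the images assigned to class i.
    """
    C = [[0] * num_classes for _ in range(num_classes)]
    for labels in labels_per_image:
        unique = sorted(set(labels))
        for i in unique:
            C[i][i] += 1
        for i, j in combinations(unique, 2):
            C[i][j] += 1
            C[j][i] += 1
    return C


def similarity_degree(C, i, j):
    """ASSUMED normalization: shared images / images in either class."""
    either = C[i][i] + C[j][j] - C[i][j]
    return C[i][j] / either if either else 0.0


# Hypothetical annotations: each inner list holds the class ids of one image.
labels = [[0, 1], [0, 1], [0], [1, 2]]
C = cooccurrence_matrix(labels, num_classes=3)
print(C[0][1], similarity_degree(C, 0, 1))
```

Any other normalization that maps the counts into [0,1] could be substituted without changing the counting step.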

Calculation of VS index: bidirectional associative memories (BAM), by B. Kosko
A BAM associates two patterns (X, Y) such that when one is encountered, the other can be recalled. The associated pattern pairs are stored in a connection weight matrix.
[Figure: BAM network with an X layer (units X1 ... Xm), a Y layer, and connection weights W11 ... Wmn]

Calculation of VS index: bidirectional associative memories
Units' input and output in the X layer and Y layer
The energy of the BAM decreases or remains the same after each unit update, so the BAM eventually converges to a local minimum that corresponds to a stored associated pattern pair.
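As an illustration of the storage, recall, and energy behaviour described above, here is a minimal sketch of the classic discrete BAM with bipolar (+1/-1) patterns and outer-product weights. It is not necessarily the variant used in this work, which feeds pixel values into the X layer and reads graded recall values from the Y layer; the pattern pairs and function names are hypothetical.

```python
def sign(value, previous):
    """Bipolar threshold; keep the previous state when the net input is zero."""
    if value > 0:
        return 1
    if value < 0:
        return -1
    return previous


def train_bam(pairs):
    """Store pattern pairs (x, y) as the sum of outer products x^T y."""
    m, n = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * n for _ in range(m)]
    for x, y in pairs:
        for a in range(m):
            for b in range(n):
                W[a][b] += x[a] * y[b]
    return W


def energy(W, x, y):
    """BAM energy E = -x W y^T; it never increases during recall."""
    return -sum(x[a] * W[a][b] * y[b]
                for a in range(len(x)) for b in range(len(y)))


def recall(W, x, max_iters=20):
    """Bounce activations between the layers until the pair is stable."""
    m, n = len(W), len(W[0])
    x = list(x)
    y = [1] * n  # arbitrary initial Y state
    for _ in range(max_iters):
        new_y = [sign(sum(x[a] * W[a][b] for a in range(m)), y[b])
                 for b in range(n)]
        new_x = [sign(sum(W[a][b] * new_y[b] for b in range(n)), x[a])
                 for a in range(m)]
        if new_x == x and new_y == y:
            break
        x, y = new_x, new_y
    return x, y


# Two hypothetical associated pairs.
x1, y1 = [1, -1, 1, -1, 1, -1], [1, 1, -1]
x2, y2 = [1, 1, 1, -1, -1, -1], [-1, 1, 1]
W = train_bam([(x1, y1), (x2, y2)])

noisy = [1, -1, 1, -1, 1, 1]  # x1 with its last unit flipped
print(recall(W, noisy))
```

Presenting a corrupted version of a stored X pattern settles on the stored pair, which is the behaviour the energy argument predicts.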

Calculation of VS index: stored pattern images of the X layer
40 pattern images are stored for the nature vs. man-made domain.
Some pattern examples: food (P14), building (P6), restaurant (P8), landscape (P2), flower (P3)

Calculation of VS index: stored patterns of the Y layer
The stored patterns of the Y layer are the encoding units of the classes. If I classes are pre-defined, each encoding unit has I components: Yi = [0…010…0] is the encoding unit of class i, with the 1 in position i.
The units' outputs of the Y layer for pattern image i are the recall values of the inputted pattern image i; in particular, the output of unit i is the recall value of inputted pattern image i to class i.

Calculation of VS index: units' input and output in the Y layer for an input image
Input: weighted sum of the pixel values of the inputted image
Output: recall values of the inputted image
The VS index of class i to class j regarding dimension k is defined in terms of the recall value of input image n to class i.

Some examples of VS value
c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15
1 0.25 0.74 0.16 0.33 0.53 0.37 0.22 0.5 0.08 0.31 2 0.21 0.28 0.24 0.02 0.2 0.14 3 0.36 0.38 0.05 0.82 0.59 0.19 0.35 0.61 0.23 4 0.26 0.3 0.94 0.48 0.15 0.09 5 0.17 0.43 0.92 0.32 0.69 6 0.34 0.77 0.66 0.8 0.11 0.65 0.01 0.52 0.06 7 0.76 0.27 0.46 0.12 0.58 0.41 0.56 8 0.1 0.57 0.39 9 0.51 0.67 0.07 10 0.03 0.84 0.18 0.13 0.68 11 0.63 0.79 12 13 0.62 0.44 0.47 14 0.81 0.87 15 0.73 0.04 ……

Class-based image/key-frame representation
Each image/key-frame is represented by a vector of associative values, one for each category i regarding domain k.
Generating associative values with the defined classes:
Use the BAM to generate the units' outputs of the Y layer for an input image n.
Determine the class to which image n mostly belongs, and assign image n to that class m.
Ws, Wv: weights that adjust the contributions of SS and VS in generating the associative values

Class-based image/key-frame representation: generating associative values
Calculate the VS degree of image n to class i.
Generate the associative value, which is affected by the SS degree, the VS degree, and their weights.
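One way to picture how the two degrees and their weights could combine is sketched below. The linear form Ws * SS + Wv * VS is purely an assumption, since the slide's formula is not reproduced in the transcript; the function names, the default weights, and the sample values are likewise hypothetical.

```python
def dominant_class(recall_values):
    """Index m of the class with the largest Y-layer output for the image."""
    return max(range(len(recall_values)), key=lambda i: recall_values[i])


def associative_values(recall_values, ss, ws=0.5, wv=0.5):
    """ASSUMED combination: ws * SS(m, i) + wv * VS(n, i).

    recall_values[i] stands in for the VS degree of the image to class i,
    and ss[m][i] for the SS index between the dominant class m and class i.
    With ws + wv = 1 and inputs in [0, 1], the result stays in [0, 1].
    """
    m = dominant_class(recall_values)
    return [ws * ss[m][i] + wv * recall_values[i]
            for i in range(len(recall_values))]


# Hypothetical Y-layer outputs for one image and a hypothetical SS table.
recall_values = [0.2, 0.8, 0.4]
ss = [[1.0, 0.5, 0.0],
      [0.5, 1.0, 0.25],
      [0.0, 0.25, 1.0]]
print(associative_values(recall_values, ss))
```

Setting `ws=0` reduces the sketch to using the BAM outputs alone, which corresponds to the case in which the semantic tolerance relation is not considered.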

Data structure for representing image/video
type … Path1 Path2 Path3
1 image 0.2 0.6 0.7 Path11
2 video 0.3 0.8 Path12 Path22 Path23
k: domain's number
i: category's number
Path1: image/key-frames' location
Path2: shots' location
Path3: videos' location
ID: image/key-frames' number
Each numeric entry is the associative value with category i regarding domain k.
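The record layout above could be modelled roughly as follows. The class name, field names, and the (domain, category) keys in the example are hypothetical; only the kinds of fields (ID, type, associative values, three paths) come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class MediaRecord:
    """One image or key-frame with its associative values and locations."""
    id: int                      # image/key-frame number
    media_type: str              # "image" or "video"
    # (domain k, category i) -> associative value in [0, 1]
    assoc: Dict[Tuple[int, int], float] = field(default_factory=dict)
    path1: Optional[str] = None  # image/key-frame location
    path2: Optional[str] = None  # shot location (videos only)
    path3: Optional[str] = None  # video location (videos only)

    def value(self, domain: int, category: int) -> float:
        """Associative value with category i regarding domain k (0.0 if unset)."""
        return self.assoc.get((domain, category), 0.0)


# Hypothetical record: the keys below are illustrative, not from the slides.
record = MediaRecord(id=1, media_type="image",
                     assoc={(1, 1): 0.2, (1, 2): 0.6, (2, 1): 0.7},
                     path1="Path11")
print(record.value(1, 2))
```

A still image needs only `path1`, while a key-frame taken from a video also carries the locations of its shot and its video.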

Image/key-frame retrieval
For image categorization regarding a single domain, images are grouped into class i when their associative values with class i are larger than a clustering threshold.
For image categorization across domains, the grouped images are the intersection of the images that belong to class i in domain k and the images that belong to class j in domain l.
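The two grouping rules can be sketched with plain sets. The storage layout (a dictionary from image ID to associative values keyed by (domain, category)), the function names, and the sample values are assumptions made for illustration.

```python
def retrieve_single(records, domain, category, threshold):
    """IDs whose associative value with the class exceeds the threshold."""
    return {rid for rid, assoc in records.items()
            if assoc.get((domain, category), 0.0) > threshold}


def retrieve_cross(records, queries, threshold):
    """Intersection of the single-domain groups for each (domain, category)."""
    groups = [retrieve_single(records, k, i, threshold) for k, i in queries]
    return set.intersection(*groups) if groups else set()


# Hypothetical associative values: image id -> {(domain, category): value}.
records = {
    1: {(0, 2): 0.7, (1, 0): 0.9},
    2: {(0, 2): 0.6, (1, 0): 0.1},
    3: {(0, 2): 0.2, (1, 0): 0.8},
}
print(retrieve_single(records, 0, 2, threshold=0.5))
print(retrieve_cross(records, [(0, 2), (1, 0)], threshold=0.5))
```

A cross-domain query such as "interior with human" would pass one (domain, category) pair per domain to `retrieve_cross`.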

Examples of image retrieval
Some images of interior with human

Performance evaluation (1)
Precision-recall of the classes building and interior when 30 classes were pre-defined
“Using STRM” means that the SS index and VS index are considered in generating the associative values of classes. “Not using STRM” means that only the units' output values of the BAM are used in generating the associative values of classes.
[Figure: precision-recall curves; panels: interior, building]
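For reference, precision and recall for one class can be computed as in the sketch below, and a precision-recall curve is obtained by sweeping the clustering threshold. The function names, scores, and ground-truth set are hypothetical and are not the data behind the reported curves.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def pr_curve(scores, relevant, thresholds):
    """One (threshold, precision, recall) point per clustering threshold.

    scores maps an image id to its associative value with the class.
    """
    points = []
    for t in thresholds:
        retrieved = {rid for rid, s in scores.items() if s > t}
        points.append((t, *precision_recall(retrieved, relevant)))
    return points


# Hypothetical associative values for one class and its ground truth.
scores = {1: 0.9, 2: 0.7, 3: 0.4, 4: 0.2}
relevant = {1, 3}
for point in pr_curve(scores, relevant, thresholds=[0.8, 0.5, 0.3]):
    print(point)
```

Lowering the threshold retrieves more images, which typically raises recall at the cost of precision.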

Performance evaluation (2)
Precision-recall of the class interior with human

Performance evaluation (3)
Influence of varying the number of defined classes on the precision-recall of the class building

Conclusion & Future work
Representing images' semantics by the associative values of images with pre-defined classes is effective for image retrieval.
Bidirectional associative memories are efficient in generating associative values.
Precision-recall performance is improved by considering the two factors SS and VS in generating associative values.
Future work: evaluating the influence of increasing the number of defined classes on precision-recall.