Artificial Intelligence

1 Artificial Intelligence
Project 2: Cross-modal Generation between Image and Text Using Hypernetworks
Prepared by Kim, Byoung-Hee and Ko, Younggil
Presented by Heo, Min-Oh
Biointelligence Laboratory

2 Contents
Overview
Theme of the Project: MMG
Task Description
Data Set
Guide to Writing Reports
  Style, mandatory contents, optional contents
Submission Guide / Marking Scheme
Brief Guide to the MMG Tool
(C) 2009, SNU Biointelligence Laboratory

3 Overview
Goal
  Understand hypernetworks and AI more deeply
  Practice research and technical writing
Multimodal Memory Game (MMG)
  Simulation of human recall memory and cross-modal matching
  Image-to-text (I2T) and text-to-image (T2I) generation
Data Set
  (Screenshot, sentence) pairs from 'Lost' (American TV drama)

4 Ultimate Goal: Human Level AI
Human attributes: Creative, Adaptive, Many-Talented, Friendly / Social, Uncertain, Careless, Emotional, Non-logical
1 + 2 = 5 !   100 < 10 ?
(Revised.)
To reach 'Human-Level Intelligence', we need to imitate/reproduce various human attributes.

5 Molecular Self-assembly and Cognitive Associative Memory
DNA Hypernetworks: Self-assembly based cognitive memory
Molecular Structure (DNA), Molecular Recognition, Self Assembly
Byoung-Tak Zhang, Hypernetworks: A Molecular Evolutionary Architecture for Cognitive Learning and Memory, IEEE Computational Intelligence Magazine, 3(3): 49-63, August 2008.

6 Review of the Lecture on Hypernetworks

7 Theme of the Project: Multimodal Memory Game (MMG)
The game consists of one machine learner and two or more human learners in a digital cinema.
All the participants, including the machine, watch the movies.
After watching, the humans play the game by asking and answering questions about the movie scenes and dialogues.
There are two human players, called I2T and T2I.
  The task of player I2T (for image-to-text) is to generate a text given a movie cut (image).
  The task of player T2I (for text-to-image) is to generate an image given a text from the movie captions.
The machine learner learns by viewing: it watches the movies in the digital cinema.
The goal of the machine learner is to perform cross-modal translation, e.g., generating a sentence given an image from the movie, or vice versa.
The machine learner gets hints from the human learners, who play the game by asking questions and answering them in different modalities.

8 MMG : Text-to-Image Case

9 One of the Motivations of MMG: Video Search Based on Vision-Word Crossmodal Information
© 2009, SNU Biointelligence Lab, http://bi.snu.ac.kr/

10 Example: Text-to-Image Search
[Figure: Query, Extracted images, Extracted Video Patch, Selected images]

11 Example: Image-to-Text Search
[Figure: Query, Extracted Texts, Extracted Video Patch]

12 Another Motivation of MMG
From the point of view of cognitive science: imitating the recall (memory) of the human brain.
In the study of memory, recall is the act of retrieving a specific incident, fact, or other item from long-term memory.
Three types of recall
  Free recall: no clues are given to assist retrieval
  Serial recall: items are recalled in a particular order
  Cued recall: some clues are given to assist retrieval

13 Training Hypernetworks for MMG
Step A. Initialization by sampling (refer to the next slide)
Step B. For each (sentence, image) pair in the training set:
  Generation
    Image-to-text (I2T): generate a sentence based on the given image
    Text-to-image (T2I): generate an image based on the given text
  Evaluation of the generation ability as a recall memory: compare the generated results with the original
    Correctly matched hyperedge: increase its weight
    Incorrectly matched hyperedge: remove it and add a newly sampled hyperedge
Iteration: repeat Step B 'epoch' times
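
The training procedure on this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used by the course tool. The names `sample_edge`, `matches_image`, and `train` are hypothetical. The slide does not define "correctly matched" precisely, so the sketch assumes a hyperedge is correct when its pixels agree with the image and all of its words occur in the paired sentence; it also assumes words are sampled as a contiguous run.

```python
import random


def sample_edge(image, sentence, image_order, text_order, rng):
    """Sample one hyperedge from an (image, sentence) pair (assumed scheme)."""
    positions = rng.sample(range(len(image)), min(image_order, len(image)))
    n = min(text_order, len(sentence))
    start = rng.randrange(len(sentence) - n + 1)  # contiguous run: an assumption
    return {
        "pixels": {p: image[p] for p in positions},  # pixel index -> 0/1
        "words": tuple(sentence[start:start + n]),   # word IDs
        "weight": 1.0,                               # weights start at 1
    }


def matches_image(edge, image):
    """True when every pixel stored in the hyperedge agrees with the image."""
    return all(image[p] == v for p, v in edge["pixels"].items())


def train(pairs, epochs=1, sampling_count=10, image_order=30,
          text_order=3, rate=0.1, seed=0):
    rng = random.Random(seed)
    # Initialization: sample `sampling_count` hyperedges from every pair.
    edges = [sample_edge(img, sent, image_order, text_order, rng)
             for img, sent in pairs for _ in range(sampling_count)]
    for _ in range(epochs):
        for image, sentence in pairs:
            vocab = set(sentence)
            for i, edge in enumerate(edges):
                if not matches_image(edge, image):
                    continue  # this hyperedge is not activated by the image
                if set(edge["words"]) <= vocab:
                    edge["weight"] += rate  # correct: increase its weight
                else:
                    # incorrect: remove it and add a newly sampled hyperedge
                    edges[i] = sample_edge(image, sentence,
                                           image_order, text_order, rng)
    return edges
```

The sketch covers only the image-to-text direction; the text-to-image direction would swap the roles of pixels and words in the matching test.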

14 Initialization of Hypernetworks by Random Sampling from Dataset
[Figure: a training example x_i with binary variables x_i1 ... x_in spanning the image part and the text part (example words: can, you, we, help, to, house, the, ......). Hyperedges 1 to 3 are formed from randomly selected pixels and words; their sizes are set by the image order and the text order.]
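
Initialization by random sampling might look like the following Python sketch. It is an illustration under assumptions, not the tool's code: the `Hyperedge` class and the functions `sample_hyperedge` and `init_hypernetwork` are hypothetical names, the image is taken to be a flat list of 0/1 pixels, the sentence a list of word IDs, and the words of a hyperedge are assumed to be a contiguous run.

```python
import random
from dataclasses import dataclass


@dataclass
class Hyperedge:
    pixels: dict          # pixel index -> 0/1 value
    words: tuple          # word IDs taken from the sentence
    weight: float = 1.0   # initial weight


def sample_hyperedge(image, sentence, image_order=30, text_order=3, rng=random):
    """Pick `image_order` random pixels and `text_order` words from one pair."""
    k = min(image_order, len(image))
    positions = rng.sample(range(len(image)), k)
    n = min(text_order, len(sentence))
    start = rng.randrange(len(sentence) - n + 1)  # contiguous run: an assumption
    return Hyperedge(pixels={p: image[p] for p in positions},
                     words=tuple(sentence[start:start + n]))


def init_hypernetwork(pairs, sampling_count=10, image_order=30,
                      text_order=3, rng=random):
    """Sample `sampling_count` hyperedges from every (image, sentence) pair."""
    return [sample_hyperedge(img, sent, image_order, text_order, rng)
            for img, sent in pairs
            for _ in range(sampling_count)]
```

With the default sampling count of 10, a training set of 349 pairs would start with 3490 hyperedges under this scheme.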

15 Tasks for the Project
Build hypernetworks that perform T2I/I2T generation using the given dataset
Check the effects of the parameters for training hypernetworks
  Order of hyperedges
  Learning rate
  Sampling

16 Data Set
Dataset preparation
  349 pairs of image and sentence from 'Lost'
  Each sentence was translated to integer form based on the dictionary file (text.txt, dic.txt)
  "This is not even a date" -> "33,34,35,36,27,37"
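
The sentence-to-integer translation can be illustrated with a short Python sketch. The layout of dic.txt is not given on the slide, so the dictionary here is a hypothetical in-memory mapping built only from the example above; `EXAMPLE_DICT`, `encode`, and `decode` are illustrative names, and lowercasing the words is an assumption.

```python
# Word IDs taken from the slide's example; the real dic.txt holds the full vocabulary.
EXAMPLE_DICT = {"this": 33, "is": 34, "not": 35, "even": 36, "a": 27, "date": 37}


def encode(sentence, word_to_id):
    """Turn a sentence into a comma-separated string of word IDs."""
    return ",".join(str(word_to_id[w]) for w in sentence.lower().split())


def decode(encoded, word_to_id):
    """Turn a comma-separated string of word IDs back into words."""
    id_to_word = {i: w for w, i in word_to_id.items()}
    return " ".join(id_to_word[int(token)] for token in encoded.split(","))


print(encode("This is not even a date", EXAMPLE_DICT))  # 33,34,35,36,27,37
```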

17 Data Set (cont'd)
Each screenshot has been converted to an 80 by 60 black-and-white bitmap image.
One image per line in the data file (image.txt)
1,0,1,0,0,0,0,1,1,1,1,1,1,0,0,0,……
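
A line of image.txt could be read back into a pixel grid roughly as follows. This Python sketch rests on assumptions the slide does not confirm: that 80 is the width and 60 the height, that pixels are stored row by row, and that each line holds exactly 80 x 60 = 4800 values. The function names are hypothetical.

```python
WIDTH, HEIGHT = 80, 60  # assumed orientation: 80 columns, 60 rows


def parse_image_line(line, width=WIDTH, height=HEIGHT):
    """Split one comma-separated line of 0/1 values into a list of rows."""
    values = [int(v) for v in line.strip().split(",") if v.strip()]
    if len(values) != width * height:
        raise ValueError(f"expected {width * height} pixels, got {len(values)}")
    if any(v not in (0, 1) for v in values):
        raise ValueError("pixels must be 0 or 1")
    # Row-major layout is an assumption.
    return [values[r * width:(r + 1) * width] for r in range(height)]


def to_ascii(rows, on="#", off="."):
    """Render the grid as text for a quick visual check."""
    return "\n".join("".join(on if v else off for v in row) for row in rows)
```

Which of 0 and 1 stands for black is not stated in the slides, so the ASCII rendering may need its `on` and `off` characters swapped.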

18 Report Contents – Mandatory
System description
  Software used and running environment
Basic experiments
  Text2Image: try various 'epoch' values and check the (subjective or objective) quality of the generated images
  Image2Text: set 'epoch=1' and check the quality of the generated sentences while increasing 'sampling count'
Analysis & discussion
  Text2Image: analysis and discussion of the relation between 'epoch' and the resulting image set

19 Report Contents – Optional
Analysis of the effect of various parameters
Ideas/suggestions about ways of learning for cross-modal generation
Ideas/suggestions about applications of MMG

20 Report Style
English only, scientific journal style
How to Write a Paper in Scientific Journal Style and Format

Experimental process             Section of paper
What did I do in a nutshell?     Abstract
What is the problem?             Introduction
How did I solve the problem?     Materials and Methods
What did I find out?             Results
What does it mean?               Discussion
Who helped me out?               Acknowledgments (optional)
Whose work did I refer to?       Literature Cited
Extra information                Appendices (optional)

21 Submission Guide
Due date: December 2, 18:00
Submit both 'hardcopy' and 'e-mail'
  Hardcopy submission to the office ( )
  E-mail submission to ( )
    Subject: [AI Project2 Report] Student number, Name
Length: the report should be within 12 pages.
If you build a program by yourself, submit the source code with comments.
Objective: NOT accuracy or your programming skill, but your creativity and research ability.
Individual project! You have to do it by yourself.

22 Marking Scheme
40 points for experiment & analysis
  Extra 3 points per additional experiment
20 points for the report
6 points for overall organization
Late work: -10% per day (-8 points)
  Maximum 7 days

23 Demo – How to Start
Unzip the 2009_AI_Project2.zip file
  MultiModal Game program
  Generating program for Image and Sentence

24 MMG Program
Inside the Hypernetwork folder, there are four files.
  Execution file
  Configuration file
  Data file

25 MMG Program (Configuration & Run)
Set the parameters in the "configure.txt" file, then execute "Hypernetwork.exe".

26 Explanation of Parameters
Text order: number of words in one hyperedge (default: 3, suggested values: 2~3, integer)
Image order: number of pixels in one hyperedge (default: 30, suggested values: 10~50, integer)
Max epoch: number of learning iterations (integer)
Weight update rate: initially, weights are set to 1. During learning, weights are updated according to this value (+0.1 or -0.1)
Sampling count: number of hyperedges generated from one data pair (default: 10, suggested values: 5~30, integer)
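
The parameters above can be collected in a small Python structure to check an experiment plan before running it. This is only a sketch: `MMGParams` and its methods are hypothetical names, the layout of configure.txt is not shown in the slides so no file is read or written, and the hyperedge estimate assumes one sample set per pair for the 349 pairs of the data set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MMGParams:
    max_epoch: int                    # no default is given on the slide
    text_order: int = 3               # words per hyperedge
    image_order: int = 30             # pixels per hyperedge
    weight_update_rate: float = 0.1   # step applied to a hyperedge weight
    sampling_count: int = 10          # hyperedges sampled per data pair

    def out_of_range(self):
        """Names of parameters outside the ranges suggested on the slide."""
        suggested = {
            "text_order": (2, 3),
            "image_order": (10, 50),
            "sampling_count": (5, 30),
        }
        return [name for name, (low, high) in suggested.items()
                if not low <= getattr(self, name) <= high]

    def initial_hyperedges(self, n_pairs=349):
        """Estimated number of hyperedges after initialization (assumption)."""
        return n_pairs * self.sampling_count
```

For example, `MMGParams(max_epoch=1, image_order=80).out_of_range()` reports `image_order`, which is outside the suggested 10~50 range.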

27 Image Generation
The MMG program outputs a binary file, not an image file.
This step converts the binary file -> image file.
Write down the binary file name.
Finally, execute "ImageGenearation.exe"!

28 Sentence Generation
The text result is also not in sentence form; this program converts it.
Write down the InputFile name and the OutputFile name.
Finally, execute "SentenceGenearation.exe"!

29 FAQ
Parameters that severely affect the running time
  Epoch, sampling count
The current program does not allow making new training files
  The dictionary file is fixed. If you want new training files, make a dictionary file too.
If you have any questions about the program, visit the office (Tel )
  Youngkil, Ko

