Look Over Here: Attention-Directing Composition of Manga Elements. Ying Cao, Rynson W.H. Lau, Antoni B. Chan, City University of Hong Kong. SIGGRAPH 2014.

Outline: Introduction, Overview, Data Acquisition and Preprocessing, Probabilistic Graphical Model, Learning, Interactive Composition Synthesis, Evaluation and Results, Discussion

Introduction Goal: [Example manga page: panels labeled with shot types (Close-up, Long, Medium, Big Close-up) and motion states (Fast, Talk), and script balloons: 1. "Rabbit, I came here for gold," 2. "and I'm gonna get it!" 3. "I gotcha, you rabbit! I'll show you!", "You can't do this to me!", "Eureka! Gold at last!"]

Introduction This work focuses on the composition of manga elements, namely subjects and balloons. A manga artist guides the viewer's eyes through the page via subject and balloon placement. The path the artist intends to guide readers along through the artwork is the underlying artist's guiding path (AGP); the viewer's eye-gaze path through the page is the actual viewer attention.

Introduction We introduce a novel probabilistic graphical model for subject-balloon composition. Based on this model, we propose an approach for placing a set of subjects and their balloons on a page in response to high-level user specifications, and evaluate its effectiveness through a series of visual perception studies.

Overview [Pipeline: from annotated manga data and eye-tracking data, we learn a probabilistic graphical model linking the artist's guiding path, composition, and viewer attention; given an input storyboard, the system generates a layout and infers the resulting composition.]

Data Acquisition and Preprocessing To train our probabilistic model, we have collected a data set comprising 80 manga pages from three different series. [Annotations per page: shot type, motion state, balloons, and subjects; plus recorded eye movements of viewers.]

Probabilistic Graphical Model We propose a novel probabilistic graphical model that hierarchically connects the artist's guiding path, composition, and viewer attention in a probabilistic network, abstracting the artist's guiding path (AGP) as a latent variable.

Probabilistic Graphical Model Our proposed model consists of six components, representing different factors that influence the placement of elements on the page.

(1)-Model Components and Variables In our model, the page consists of a set of panels; each panel contains subjects, each of which has balloons.

(1)-Model Components and Variables Artist's Guiding Path (AGP): The underlying AGP f(t) and the actual AGP I(t) are represented as smooth splines over the page, with control points sampled uniformly along the curve length.
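
As a concrete illustration of sampling control points uniformly along the curve length, here is a minimal sketch (not the authors' code; the dense-polyline input representation and the function name are assumptions):

```python
import numpy as np

def uniform_control_points(curve, n_points):
    """Resample a dense 2-D polyline at n_points positions spaced
    uniformly along its arc length."""
    seg = np.diff(curve, axis=0)                      # segment vectors
    seg_len = np.hypot(seg[:, 0], seg[:, 1])          # segment lengths
    s = np.concatenate([[0.0], np.cumsum(seg_len)])   # cumulative arc length
    targets = np.linspace(0.0, s[-1], n_points)       # uniform arc-length positions
    # interpolate each coordinate independently against arc length
    x = np.interp(targets, s, curve[:, 0])
    y = np.interp(targets, s, curve[:, 1])
    return np.column_stack([x, y])
```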

(1)-Model Components and Variables Panel Properties and Local Composition Model: We consider both semantic (i.e., shot type and motion state) and geometric (i.e., rough shape) properties of the panels.

(1)-Model Components and Variables Panel Properties and Local Composition Model: We define the possible subject locations and sizes according to the local composition of the panel.

(1)-Model Components and Variables Subject Placement: The actual placement of a subject is a mixture of its local position and an associated point on the global AGP; the subject's location and size are represented by a single variable.

(1)-Model Components and Variables Balloon Placement: The placement of a balloon depends on its subject's configuration, its size, and the reading order, as well as an associated point on the AGP; the balloon's position and size are represented by a single variable.

(1)-Model Components and Variables Viewer Attention Transitions: For each panel, we define a set of binary variables, each indicating whether there is a viewer attention transition between a particular pair of elements.
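
The per-panel transition indicators can be sketched as a binary matrix built from an ordered scanpath (the element indexing and function name here are hypothetical illustrations, not the paper's notation):

```python
import numpy as np

def transition_matrix(scanpath, n_elements):
    """Binary viewer-transition indicators for one panel:
    entry [i, j] = 1 iff the scanpath moves from element i
    directly to element j."""
    T = np.zeros((n_elements, n_elements), dtype=int)
    for i, j in zip(scanpath[:-1], scanpath[1:]):
        T[i, j] = 1
    return T
```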

(1)-Model Components and Variables The complete model is obtained by putting the six model components together.

(2)- Probability Distributions Each random variable in our model is associated with a conditional probability distribution (CPD), which represents the probability of observing that variable given its parents. We next describe the CPDs used for each variable in our model.

(2)- Probability Distributions Artist's Guiding Path (f, I): The two coordinate components of the curve are modeled as two independent Gaussian processes with squared exponential covariance functions. The actual AGP I is a noisy version of the underlying AGP f, modeled as a multivariate Gaussian centered at f, where N(x; µ, Σ) denotes a multivariate Gaussian distribution over x with mean µ and covariance Σ.
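
A minimal sketch of this construction: each coordinate of the underlying AGP f is drawn from a Gaussian process with squared exponential covariance, and the actual AGP I is f plus Gaussian noise (the kernel hyperparameters and noise level here are illustrative assumptions, not the learned values):

```python
import numpy as np

def se_kernel(t1, t2, ell=0.2, sigma=1.0):
    """Squared exponential covariance: k(t, t') = sigma^2 exp(-(t-t')^2 / (2 ell^2))."""
    d = t1[:, None] - t2[None, :]
    return sigma**2 * np.exp(-0.5 * (d / ell) ** 2)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)                    # curve parameter
K = se_kernel(t, t) + 1e-8 * np.eye(len(t))      # jitter for numerical stability
# each coordinate of the underlying AGP f is an independent GP draw
fx = rng.multivariate_normal(np.zeros(len(t)), K)
fy = rng.multivariate_normal(np.zeros(len(t)), K)
# the actual AGP I is a noisy version of f
noise = 0.05
Ix = fx + rng.normal(0.0, noise, len(t))
Iy = fy + rng.normal(0.0, noise, len(t))
```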

Learning The goal of the offline learning stage is to estimate the parameters θ in the CPDs of all random variables in the probabilistic model from the training set D, using the expectation-maximization (EM) algorithm [Bishop 2006]. BISHOP, C. 2006. Pattern Recognition and Machine Learning. Springer.
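
The flavor of EM parameter estimation can be illustrated on a much smaller problem, a one-dimensional Gaussian mixture (a sketch only; the paper's model has many more variable types, and the quantile-based initialization is an assumption):

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50):
    """Minimal EM for a 1-D Gaussian mixture: alternate between
    computing responsibilities (E-step) and re-estimating the
    means, variances, and mixture weights (M-step)."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))  # spread initial means over the data
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities r[n, j] proportional to pi_j N(x_n | mu_j, var_j)
        d = x[:, None] - mu[None, :]
        log_p = -0.5 * (d**2 / var + np.log(2 * np.pi * var)) + np.log(pi)
        r = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from responsibility-weighted statistics
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi
```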

Interactive Composition Synthesis Generate a composition subject to user-specified semantics, in two stages: layout generation + composition synthesis. [Input: subjects and script (1. "Rabbit, I came here for gold," 2. "and I'm gonna get it!" 3. "I gotcha, you rabbit! I'll show you!"), shot type and motion state (Close-up, Fast), and an inter-subject constraint (Talk).]

(1)-Layout Generation

(2)-Composition via MAP Inference [Diagram: input elements & semantics, plus layout constraints, yield configurations of elements.]

(2)-Composition via MAP Inference Constraint-based Likelihood: {ρi} are weights controlling the importance of the different terms; our implementation uses ρ1 = ρ2 = 0.3, ρ3 = ρ4 =
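
The idea of combining weighted constraint terms and keeping the best-scoring configuration can be sketched as follows (the candidate set, the term functions, and all names are hypothetical stand-ins, not the paper's actual likelihood terms):

```python
def map_configuration(candidates, terms, weights):
    """Return the candidate configuration maximizing a weighted sum
    of constraint log-likelihood terms."""
    def score(c):
        return sum(w * t(c) for w, t in zip(weights, terms))
    return max(candidates, key=score)

# toy example: two quadratic terms pull the configuration toward (1, 1)
best = map_configuration(
    candidates=[(0, 0), (1, 1), (2, 2)],
    terms=[lambda c: -(c[0] - 1) ** 2, lambda c: -(c[1] - 1) ** 2],
    weights=[0.3, 0.3],
)
```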

Evaluation and Results

(1)-Comparison to Heuristic Method Visual Perception Study: The goal of the visual perception study is to investigate whether participants have a strong preference for our results over those produced by the heuristic method [Chun et al. 2006]. CHUN, B., RYU, D., HWANG, W., AND CHO, H. 2006. An automated procedure for word balloon placement in cinema comics. LNCS 4292, 576–585.

(1)-Comparison to Heuristic Method Eye-tracking experiment and analysis: We measure the consistency of both unordered and ordered eye fixations across different viewers, using the inlier percentage [Judd et al. 2009] and the root mean squared distance (RMSD). JUDD, T., EHINGER, K., DURAND, F., AND TORRALBA, A. 2009. Learning to predict where humans look. In ICCV'09. [Figure: inlier classification of one viewer's fixations against a saliency map, and RMSD between Viewer A and Viewer B.]
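
One plausible reading of the RMSD consistency measure between two viewers' ordered fixation sequences (a sketch only; the paper's exact matching procedure may differ):

```python
import numpy as np

def fixation_rmsd(fix_a, fix_b):
    """Root mean squared Euclidean distance between two equal-length,
    ordered fixation sequences (lists of (x, y) points)."""
    a = np.asarray(fix_a, dtype=float)
    b = np.asarray(fix_b, dtype=float)
    d = np.linalg.norm(a - b, axis=1)   # per-fixation distance
    return float(np.sqrt((d ** 2).mean()))
```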

(1)-Comparison to Heuristic Method Eye-tracking experiment and analysis: [Example compositions shown with eye-tracking data.]

(2)-Comparison to Manual Method [Charts: participant preference voting, and time to create one composition.]

(3)-Comparison to Existing Manga Pages

(4)-Recovering Artist's Guiding Path

(5)-Limitations Our work has two limitations. 1. It assumes that variations in the spatial location and scale of elements are the only factors driving viewer attention. 2. For panels with more than four subjects, our approach can fail to produce satisfying results automatically.

Discussion We have proposed a probabilistic graphical model for representing the dependency among the artist's guiding path, composition, and viewer attention. We show that compositions from our approach are more visually appealing and provide a smoother reading experience than those produced by a heuristic method. Our approach enables easy and quick creation of attention-directing compositions, and can be extended to other graphic design tasks.
