Look Over Here: Attention-Directing Composition of Manga Elements Ying Cao Rynson W.H. Lau Antoni B. Chan SIGGRAPH
Outline Introduction Overview Data Acquisition and Preprocessing Probabilistic Graphical Model Learning Interactive Composition Synthesis Evaluation and Results Discussion 2
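Introduction Goal [Figure: an input storyboard and the resulting composition. Script: "1. Rabbit, I came here for gold," "2. and I'm gonna get it!" "3. I gotcha, you rabbit! I'll show you!" "You can't do this to me!" "Eureka! Gold at last!" Panel labels: Close-up, Fast; Long, Medium; Close-up, Medium; Big Close-up, Medium. Inter-subject constraint: Talk.] 3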
Introduction The composition of manga elements, especially subjects and balloons. Manga artists guide the viewer’s eyes through the page via subject and balloon placement. The path that guides readers through the artwork is the underlying artist’s guiding path (AGP). The viewer’s eye-gaze path through the page is the actual viewer attention. 4
Introduction We introduce a novel probabilistic graphical model for subject-balloon composition. Based on this model, we propose an approach for placing a set of subjects and their balloons on a page in response to high-level user specification, and evaluate its effectiveness through a series of visual perception studies. 5
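Overview [Pipeline diagram. Learn: data (annotation, eye-tracking data) is used to learn the probabilistic graphical model (artist’s guiding path, composition, viewer attention). Infer: an input storyboard goes through layout generation, and the model infers the resulting composition from the input.] 6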
Data Acquisition and Preprocessing To train our probabilistic model, we collected a data set of 80 manga pages from three different series. [Figure: annotation of shot type, motion state, subjects and balloons; eye movements of viewers.] 7
Probabilistic Graphical Model We propose a novel probabilistic graphical model that hierarchically connects the artist’s guiding path, composition and viewer attention in a probabilistic network. The model abstracts the artist’s guiding path (AGP) as a latent variable. [Diagram: probabilistic graphical model linking artist’s guiding path, composition and viewer attention.] 8
Probabilistic Graphical Model Our proposed model consists of 6 components, representing different factors that influence the placement of elements on the page. 9
(1)-Model Components and Variables In our model, the page consists of a set of panels. Each panel has subjects, each of which has balloons. 10
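The page, panel, subject and balloon hierarchy described on this slide can be pictured as nested records. The following is a minimal sketch only: the class and field names are hypothetical and are not taken from the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Balloon:
    text: str
    reading_order: int  # hypothetical field: position of the balloon in the script


@dataclass
class Subject:
    name: str
    balloons: List[Balloon] = field(default_factory=list)


@dataclass
class Panel:
    shot_type: str     # e.g. "close-up", "long"
    motion_state: str  # e.g. "fast", "medium"
    subjects: List[Subject] = field(default_factory=list)


@dataclass
class Page:
    panels: List[Panel] = field(default_factory=list)

    def elements(self):
        """Yield every subject and balloon on the page, panel by panel."""
        for panel in self.panels:
            for subject in panel.subjects:
                yield subject
                yield from subject.balloons


# A page with one panel, one subject and one balloon.
page = Page(panels=[
    Panel("close-up", "fast", [
        Subject("rabbit", [Balloon("I came here for gold,", 1)]),
    ]),
])
```

Iterating over `page.elements()` visits each subject followed by its own balloons, which mirrors the ownership relation stated on the slide.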
(1)-Model Components and Variables Artist’s Guiding Path (AGP) The underlying AGP (f(t)) and the actual AGP (I(t)) are represented as smooth splines over the page. Control points are sampled uniformly along the curve length. [Figure: actual AGP and underlying AGP.] 11
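Sampling points uniformly along the curve length can be illustrated with a short sketch. It assumes the path is given as a polyline (a list of 2D points) rather than a spline, and the function name `resample_by_arc_length` is hypothetical.

```python
import math


def resample_by_arc_length(points, n):
    """Return n points spaced uniformly along the length of a polyline."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative length at each input vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]

    out = []
    seg = 0  # index of the segment that contains the current target length
    for i in range(n):
        target = total * i / (n - 1)
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        u = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + u * (x1 - x0), y0 + u * (y1 - y0)))
    return out


samples = resample_by_arc_length([(0, 0), (4, 0), (4, 4)], 5)
```

For the L-shaped path above, the total length is 8, so the five samples are 2 units apart along the path.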
(1)-Model Components and Variables Panel Properties and Local Composition Model We consider both semantic (i.e., shot type and motion state) and geometric (i.e., rough shape) properties of the panels. 12
(1)-Model Components and Variables Panel Properties and Local Composition Model We define a variable representing the possible subject locations and sizes according to the local composition in the panel. 13
(1)-Model Components and Variables Subject Placement The actual placement of a subject is a mixture of its local position and an associated point on the global AGP. Each subject’s location and size is a variable in the model. 14
(1)-Model Components and Variables Balloon Placement The placement of a balloon depends on its subject’s configuration, its size and reading order, as well as an associated point on the AGP. Each balloon’s position and size is a variable in the model. 15
(1)-Model Components and Variables Viewer Attention Transitions For each panel, we define a set of binary variables, each indicating whether there is a viewer attention transition between a pair of elements. 16
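As an illustration of binary transition variables, the sketch below derives them from one viewer's ordered sequence of viewed elements. It assumes the transitions are directed and are read off consecutive entries of the sequence. The function name and the element identifiers are hypothetical.

```python
def transition_indicators(elements, viewing_order):
    """Return a matrix u where u[i][j] == 1 if the viewer moved from
    elements[i] directly to elements[j], and 0 otherwise."""
    index = {element: i for i, element in enumerate(elements)}
    n = len(elements)
    u = [[0] * n for _ in range(n)]
    for a, b in zip(viewing_order, viewing_order[1:]):
        if a != b:  # consecutive fixations on the same element are not a transition
            u[index[a]][index[b]] = 1
    return u


# One subject (s1) with two balloons (b1, b2), viewed in that order.
u = transition_indicators(["s1", "b1", "b2"], ["s1", "b1", "b1", "b2"])
```

Here `u[0][1]` and `u[1][2]` are set, while every other entry stays 0.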
(1)-Model Components and Variables The complete model is obtained by putting the six model components together. 17
(2)- Probability Distributions Each random variable in our model is associated with a conditional probability distribution (CPD), which represents the probability of observing the variable given its parents. We next describe the CPDs used for each variable in our model. 18
(2)- Probability Distributions Artist’s Guiding Path (f, I). The two coordinate components of the curve are modeled as two independent Gaussian processes with squared exponential covariance functions. The actual AGP I is a noisy version of the underlying AGP f. Here N(x | µ, Σ) denotes a multivariate Gaussian distribution of x, with mean µ and covariance Σ. 19
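To make the Gaussian process description concrete, the sketch below draws one underlying path and a noisy version of it. It is an illustration under stated assumptions, not the authors' formulation: the processes are assumed zero-mean, the noise is assumed independent Gaussian, and all hyperparameter values (`signal_var`, `length_scale`, `noise_std`, `jitter`) and function names are hypothetical.

```python
import math
import random


def sq_exp_kernel(t1, t2, signal_var=1.0, length_scale=0.3):
    """Squared exponential covariance between curve parameters t1 and t2."""
    return signal_var * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def sample_gp(ts, rng, jitter=1e-6):
    """Draw one zero-mean GP sample at the parameter values ts."""
    cov = [[sq_exp_kernel(a, b) + (jitter if i == j else 0.0)
            for j, b in enumerate(ts)] for i, a in enumerate(ts)]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in ts]
    return [sum(low[i][j] * z[j] for j in range(i + 1)) for i in range(len(ts))]


def sample_agp(ts, noise_std=0.05, seed=0):
    """Sample an underlying path f and a noisy actual path I at ts."""
    rng = random.Random(seed)
    fx = sample_gp(ts, rng)  # x coordinate, one GP
    fy = sample_gp(ts, rng)  # y coordinate, an independent GP
    underlying = list(zip(fx, fy))
    actual = [(x + rng.gauss(0.0, noise_std), y + rng.gauss(0.0, noise_std))
              for x, y in underlying]
    return underlying, actual


ts = [i / 5 for i in range(6)]
underlying, actual = sample_agp(ts)
```

A larger `length_scale` gives smoother paths, because nearby points on the curve become more strongly correlated.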
(2)- Probability Distributions 20
(2)- Probability Distributions 21
(2)- Probability Distributions 22
Learning The goal of the offline learning stage is to estimate the parameters θ in the CPDs of all random variables in the probabilistic model from the training set D. The parameters are estimated with the expectation-maximization (EM) algorithm [Bishop 2006]. 23 BISHOP, C. 2006. Pattern Recognition and Machine Learning. Springer.
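The alternation between the E-step and the M-step can be seen in a toy setting. The sketch below runs EM on a two-component one-dimensional Gaussian mixture. It is not the model from this talk: the data, initialization and function names are hypothetical and only show the general shape of the algorithm.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, iters=50):
    """Fit a two-component 1D Gaussian mixture with EM."""
    mean = [min(data), max(data)]  # simple initialization at the extremes
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mean[k], var[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate parameters from the weighted data.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / len(data)
            mean[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - mean[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(spread, 1e-6)  # floor keeps the variance positive
    return weight, mean, var


weight, mean, var = em_two_gaussians([0.9, 1.0, 1.1, 4.9, 5.0, 5.1])
```

With two well-separated clusters, the estimated means settle near the cluster centres and the mixing weights near one half each.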
Interactive Composition Synthesis Generate a composition subject to user-specified semantics, in two steps: layout generation and composition synthesis. [Figure: input subjects and script ("1. Rabbit, I came here for gold," "2. and I'm gonna get it!" "3. I gotcha, you rabbit! I'll show you!"); shot type and motion state (e.g. Close-up, Fast); inter-subject constraint (e.g. Talk).] 24
(1)-Layout Generation 25
(2)-Composition via MAP Inference [Diagram: input elements and semantics, layout, and constraints are combined to infer the configurations of elements.] 26
(2)-Composition via MAP Inference Constraint-based Likelihood. -where {ρi} are weights controlling importance of different terms. -Our implementation uses ρ1 = ρ2 = 0.3, ρ3 = ρ4 =
(2)-Composition via MAP Inference Constraint-based Likelihood. 28
(2)-Composition via MAP Inference Constraint-based Likelihood. 29
Evaluation and Results 30
(1)-Comparison to Heuristic Method Visual Perception Study. The goal of the visual perception study is to investigate whether participants have a strong preference for our results over those produced by the heuristic method [Chun et al. 2006]. 31 CHUN, B., RYU, D., HWANG, W., AND CHO, H. 2006. An automated procedure for word balloon placement in cinema comics. LNCS 4292, 576–585.
(1)-Comparison to Heuristic Method Visual Perception Study. 32
(1)-Comparison to Heuristic Method Eye-tracking experiment and analysis. We measure the consistency of both unordered and ordered eye fixations across different viewers, using the inlier percentage [Judd et al. 2009] and the root mean squared distance (RMSD). [Figure: inliers (Viewer A, saliency map, Viewer B, classification); RMSD (Viewer A, Viewer B).] 33 JUDD, T., EHINGER, K., DURAND, F., AND TORRALBA, A. 2009. Learning to predict where humans look. In ICCV’09.
(1)-Comparison to Heuristic Method Eye-tracking experiment and analysis. The figure shows example compositions with eye-tracking data. 34
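Both consistency measures can be sketched for a pair of viewers. Two assumptions apply. First, the RMSD below compares fixation paths point by point, so the two paths must already have the same number of points. Second, the inlier measure is a simplified stand-in: instead of classifying fixations against a saliency map as in [Judd et al. 2009], it counts Viewer B's fixations that fall within a hypothetical `radius` of any fixation of Viewer A.

```python
import math


def rmsd(path_a, path_b):
    """Root mean squared distance between two equally long fixation paths."""
    if len(path_a) != len(path_b) or not path_a:
        raise ValueError("paths must be non-empty and of equal length")
    squared = [(ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(path_a, path_b)]
    return math.sqrt(sum(squared) / len(squared))


def inlier_percent(fix_a, fix_b, radius):
    """Percentage of viewer B's fixations lying within radius of any
    fixation of viewer A (a stand-in for saliency-map classification)."""
    if not fix_b:
        raise ValueError("viewer B has no fixations")
    hits = sum(
        1 for bx, by in fix_b
        if any(math.hypot(bx - ax, by - ay) <= radius for ax, ay in fix_a)
    )
    return 100.0 * hits / len(fix_b)


viewer_a = [(0, 0), (10, 0)]
viewer_b = [(3, 4), (10, 0)]
ordered_gap = rmsd(viewer_a, viewer_b)
loose = inlier_percent(viewer_a, viewer_b, radius=5)
strict = inlier_percent(viewer_a, viewer_b, radius=1)
```

A lower RMSD means the two viewers followed more similar paths in order, and a higher inlier percentage means they looked at more similar places regardless of order.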
(2)-Comparison to Manual Method [Figures: participant preference voting; time for one composition.] 35
(3)-Comparison to Existing Manga Pages 36
(4)-Recovering Artist’s Guiding Path 37
(5)-Limitations Our work has two limitations. 1. Our work assumes that variations in the spatial location and scale of elements are the only factors driving viewer attention. 2. For panels with more than four subjects, our approach can fail to produce satisfying results automatically. 38
Discussion We have proposed a probabilistic graphical model for representing the dependencies among the artist’s guiding path, composition and viewer attention. We show that compositions from our approach are more visually appealing and provide a smoother reading experience than those produced by a heuristic method. Our approach enables easy and quick creation of attention-directing compositions, and could be extended to other graphic design tasks. 39
References manga pic 40