Published by Russell Norton. Modified over 9 years ago.
2
Semantic representation of events in 3D animation
Minhua Eunice Ma and Paul Mc Kevitt
School of Computing and Intelligent Systems, Faculty of Informatics, University of Ulster, Northern Ireland
3
Seanchaí: an Intelligent MultiMedia storyteller
  Components of Seanchaí:
    Homer (story generation)
    CONFUCIUS (story interpretation & presentation)
  User input: text (stories, play/movie scripts)
  Exchanged between the components: natural language stories
  Output: multimodal presentation
4
CONFUCIUS: story interpretation & presentation
  Input (from the storywriter/playwright):
    Story in natural language
    Movie/drama script (entered through a tailored menu for script input)
  Output (to the user/story listener):
    3D animation
    Speech (dialogue)
    Non-speech audio
5
Architecture of CONFUCIUS
  Input:
    Natural language stories
    Script writer, Script parser
  Knowledge base:
    Language knowledge (lexicon, grammar, etc.)
    Visual knowledge (3D graphic library)
    Mapping between language knowledge and visual knowledge
    Prefabricated objects (built with 3D authoring tools)
  Processing:
    Natural Language Processing, producing semantic representations
    Animation generation
    Text To Speech
    Sound effects
    Synchronizing & fusion
  Output:
    3D world with audio in VRML
6
Semantic Representation Languages
  Sentence level semantics
    FOPC (First Order Predicate Calculus)
    Semantic networks
    Conceptual Dependency (CD) (Schank 1973): primitives and scripts
    Frame-based representations (Minsky 1975)
  Verb semantics
    event-logic truth conditions (Siskind 1995)
    x-schemas with f-structures (Bailey et al. 1997)
7
MultiModal semantic representation
  Multimodal semantics covers three modalities:
    Language modality
    Visual modality
    Non-speech audio modality
  Levels of representation:
    High-level multimodal semantic representation: XML-based/frame-based
    Intermediate level: media-independent representation
    Media-dependent representations: visual media-dependent representation, audio media-dependent representation
8
Knowledge base of CONFUCIUS
  Language knowledge
    Semantic knowledge - lexicons (e.g. WordNet)
    Syntactic knowledge - grammars
    Statistical models of language
    Associations between words
  Visual knowledge
    Object model (nouns)
    Functional information
    Internal coordinate axes (for spatial reasoning)
    Associations between objects
    Event model (event verbs, describes the motion of objects)
  World knowledge
  Spatial & qualitative reasoning knowledge
9
Categories of events
  Atomic entities
    Change physical location such as position and orientation, e.g. “bounce”, “turn”
    Change intrinsic attributes such as shape, size, color, and texture, e.g. “bend”, and even visibility, e.g. “disappear”, “fade” (in/out)
  Non-atomic entities
    Non-character events
      Two or more individual objects fuse together, e.g. “melt” (in)
      One object divides into two or more individual parts, e.g. “break” (into pieces)
      Change sub-components (their position, size, color), e.g. “blossom”
      Environment events (weather verbs), e.g. “snow”, “rain”
    Character events
      Action verbs
        Intransitive verbs
        Transitive verbs
      Non-action verbs (stative, emotion, possession, mental activities, cognition & perception)
      Idioms & metaphor verbs
10
Categories of action verbs
  Intransitive verbs
    Biped kinematics, e.g. “walk”, “swim”, & other motion models like “fly”
    Facial expressions, e.g. “laugh”, “anger”
    Lip movement, e.g. “speak”, “say”
  Transitive verbs
    single object, e.g. “throw”, “push”, “kick”
    multiple objects
      direct and indirect objects, e.g. “give”, “pass”, “show”
      indirect object & the tool used to perform the action, e.g. “cut”, “hammer”
    verbs that involve the speech modality
11
Basic predicate-arguments
  1) move(obj, xInc, yInc, zInc)
  2) moveTo(obj, loc)
  3) moveToward(obj, loc, displacement)
  4) rotate(obj, xAngle, yAngle, zAngle)
  5) faceTo(obj1, obj2)
  6) alignMiddle(obj1, obj2, axis)
  7) alignMax(obj1, obj2, axis)
  8) alignMin(obj1, obj2, axis)
  9) alignTouch(obj1, obj2, axis)
  10) touch(obj1, obj2, axis)
  11) scale(obj, rate)
  12) squash(obj, rate, axis)
  13) group(x, [y|_], newObj)
  14) ungroup(xyList, x, yList)
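To make a few of these predicates concrete, here is a minimal Python sketch of move, moveTo, rotate, scale and squash acting on a toy scene object. The `Obj` class, its fields, and the reading of squash as "scale along a single axis" are assumptions made for illustration; the slides give only the predicate signatures, and the system itself is described as implemented in Java and VRML.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical scene object: position, rotation angles (degrees), size."""
    name: str
    pos: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    rot: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    size: list = field(default_factory=lambda: [1.0, 1.0, 1.0])


def move(obj, x_inc, y_inc, z_inc):
    """move(obj, xInc, yInc, zInc): translate by a relative offset."""
    obj.pos = [p + d for p, d in zip(obj.pos, (x_inc, y_inc, z_inc))]


def move_to(obj, loc):
    """moveTo(obj, loc): place the object at an absolute location."""
    obj.pos = [float(c) for c in loc]


def rotate(obj, x_angle, y_angle, z_angle):
    """rotate(obj, xAngle, yAngle, zAngle): add to each angle, modulo 360."""
    obj.rot = [(r + a) % 360 for r, a in zip(obj.rot, (x_angle, y_angle, z_angle))]


def scale(obj, rate):
    """scale(obj, rate): uniform scaling on all three axes."""
    obj.size = [s * rate for s in obj.size]


def squash(obj, rate, axis):
    """squash(obj, rate, axis): assumed here to scale along one axis only."""
    obj.size["xyz".index(axis)] *= rate


ball = Obj("ball")
move(ball, 0, 20, 0)      # ball.pos is now [0.0, 20.0, 0.0]
squash(ball, 0.5, "y")    # ball.size is now [1.0, 0.5, 1.0]
```

The remaining predicates (moveToward, faceTo, the align family, group and ungroup) need information about other objects and are left out of this sketch.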
12
Hierarchical structure of predicates
  3rd level: touch()
  2nd level: moveToward(), alignMiddle(), alignTouch(), alignMax(), alignMin(), faceTo()
  Atomic level: move(), moveTo(), rotate(), scale(), squash()
13
[Figure: front view and top view of obj1 and obj2, before and after touch() is applied, with x, y and z axes; captions: "obj1 is in the front", "obj2 is on the top"]

touch(obj1, obj2, x):-
    alignMiddle(obj1, obj2, y),
    alignMiddle(obj1, obj2, z),
    alignTouch(obj1, obj2, x).

touch(obj1, obj2, y):-
    alignMiddle(obj1, obj2, z),
    alignMiddle(obj1, obj2, x),
    alignTouch(obj1, obj2, y).

touch(obj1, obj2, z):-
    alignMiddle(obj1, obj2, x),
    alignMiddle(obj1, obj2, y),
    alignTouch(obj1, obj2, z).
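The three touch rules share one pattern: centre obj1 on obj2 along the two other axes, then bring the two objects into contact along the named axis. A minimal Python sketch of that pattern follows. The axis-aligned `Box` type and the choice to place obj1 on the positive side of obj2 are assumptions; the slides do not spell out the geometry of alignMiddle or alignTouch.

```python
from dataclasses import dataclass

AXES = {"x": 0, "y": 1, "z": 2}


@dataclass
class Box:
    """Hypothetical axis-aligned bounding box: centre and half-extent per axis."""
    center: list
    half: list


def align_middle(obj1, obj2, axis):
    """Give obj1 the same centre coordinate as obj2 on the given axis."""
    i = AXES[axis]
    obj1.center[i] = obj2.center[i]


def align_touch(obj1, obj2, axis):
    """Assumed semantics: obj1's lower face meets obj2's upper face on the axis."""
    i = AXES[axis]
    obj1.center[i] = obj2.center[i] + obj2.half[i] + obj1.half[i]


def touch(obj1, obj2, axis):
    """Centre obj1 on the two other axes, then make contact along `axis`."""
    for other in AXES:
        if other != axis:
            align_middle(obj1, obj2, other)
    align_touch(obj1, obj2, axis)


cup = Box(center=[5, 5, 5], half=[1, 1, 1])
table = Box(center=[0, 0, 0], half=[4, 2, 4])
touch(cup, table, "y")   # cup.center is now [0, 3, 0]: centred on the table, resting on top
```

Writing touch as a loop over the other two axes collapses the three Prolog clauses into one function; the order in which the two alignMiddle steps run does not affect the result.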
14
Decompositional predicate-argument model -- an example: “call”

First Level

call(a):-
    type(a, Person),
    type(tel, Telephone),
    pickup(a, tel.receiver, a.leftEar),
    dial(a, tel.keypad),
    speak(a, tel.receiver),
    putdown(a, tel.receiver, tel.set).

Second Level

pickup(x, obj, dest):-
    type(x, Person),
    moveToward(x.leftHand, location(obj), location(obj)-location(x)-5),
    touch(x.leftHand, obj, axis),
    group(x.leftHand, obj, xHandObj),
    moveToward(xHandObj, dest, _).

putdown(x, obj, dest):-
    moveTo(x.leftHand, dest),
    ungroup(x, obj, x1),
    type(x1, Person).
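The decomposition of “call” can be read as a rewriting process: each predicate that has a definition is replaced by its body until only undefined predicates remain. The Python sketch below shows this with predicate names only. Dropping the arguments and the type() checks, and treating every predicate without a rule here (dial, speak, moveToward, and so on) as a leaf, are simplifying assumptions for illustration.

```python
# Rule bodies taken from the slides; arguments and type() checks are omitted.
RULES = {
    "call": ["pickup", "dial", "speak", "putdown"],
    "pickup": ["moveToward", "touch", "group", "moveToward"],
    "putdown": ["moveTo", "ungroup"],
    "touch": ["alignMiddle", "alignMiddle", "alignTouch"],
}


def expand(pred):
    """Recursively replace a predicate by its body; predicates without a rule are leaves."""
    if pred not in RULES:
        return [pred]
    steps = []
    for sub in RULES[pred]:
        steps.extend(expand(sub))
    return steps


print(expand("call"))
# ['moveToward', 'alignMiddle', 'alignMiddle', 'alignTouch', 'group',
#  'moveToward', 'dial', 'speak', 'moveTo', 'ungroup']
```

A fuller version would carry arguments through the expansion and bind variables such as xHandObj, which is what the Prolog-style notation on the slide expresses.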
15
Visual definition & word sense
  verb -> word sense -> visual definition entry (one-to-many mappings; diagram labels: polysemy, synonymy)
  word sense -- minimal complete unit of meaning in the language modality
  visual definition entry -- minimal complete unit of meaning in the visual modality
  Example: “close” (a door)
    1. a normal door (rotation on y axis)
    2. a sliding door (moving on x axis)
    3. a rolling shutter door (a combination of rotation on x axis and moving on y axis)
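The “close” example suggests a lookup in which one word sense maps to several visual definition entries, selected by the kind of object involved. A minimal Python sketch, assuming a table keyed by (verb, object kind) whose values list only the predicate and axis named on the slide; the table layout and function names are hypothetical.

```python
# One word sense of "close", three visual definition entries (from the slide).
VISUAL_DEFINITIONS = {
    ("close", "normal door"): [("rotate", "y")],
    ("close", "sliding door"): [("move", "x")],
    ("close", "rolling shutter door"): [("rotate", "x"), ("move", "y")],
}


def visual_definition(verb, object_kind):
    """Pick the visual definition entry for a verb applied to a kind of object."""
    return VISUAL_DEFINITIONS[(verb, object_kind)]


def entries_for(verb):
    """All visual definition entries of a verb: the one-to-many mapping."""
    return {kind: steps for (v, kind), steps in VISUAL_DEFINITIONS.items() if v == verb}


print(visual_definition("close", "rolling shutter door"))
# [('rotate', 'x'), ('move', 'y')]
print(len(entries_for("close")))
# 3
```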
16
Troponyms & verbs derived from adjectives/nouns
  Troponyms
    a troponym elaborates the manner of a base verb (Fellbaum 1998)
    examples: “trot”-“walk” (fast), “gulp”-“eat” (quickly)
    base verb + adverb: present the base verb + modify the manner (speed, the agent’s state, duration of the activity, iteration, etc.)
  Verbs derived from adjectives or nouns
    change objects’ properties (size, color, shape) or the world state
    verbs with affixes such as -en, -ify, or -ize, e.g. “lengthen”
    using predicates scale(), squash() or changing the corresponding property fields of the object in VRML
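The "base verb + adverb" treatment of troponyms can be sketched as a lookup that returns the base verb to animate together with a manner modifier. The table format and the `speed` key are assumptions for illustration; only the two verb pairs and their adverbs come from the slide.

```python
# Troponym -> (base verb, manner modifiers), following the slide's examples.
TROPONYMS = {
    "trot": ("walk", {"speed": "fast"}),
    "gulp": ("eat", {"speed": "quickly"}),
}


def resolve(verb):
    """Return (base verb, manner); a verb that is not a listed troponym maps to itself."""
    return TROPONYMS.get(verb, (verb, {}))


print(resolve("trot"))   # ('walk', {'speed': 'fast'})
print(resolve("walk"))   # ('walk', {})
```

With this split, only the base verb needs a visual definition; the manner would adjust parameters of the resulting animation, such as its duration.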
17
Representing active & passive voice
  active and passive voice
  converse verb pairs such as “give/take”, “buy/sell”, “lend/borrow”
  same activity from different points of view
  use of VRML Viewpoint node
18
Implementation: semantics -> VRML
Example: “A ball is bouncing”

(a) Visual definition of “bounce”

bounce(obj):-
    move(obj, 0, 20, 0),
    move(obj, 0, -20, 0).

(b) VRML code of a static ball

DEF ball Transform {
  translation 0 0 0
  children [
    Shape {
      appearance Appearance { material Material {} }
      geometry Sphere { radius 5 }
    }
  ]
}

(c) Output VRML code of a bouncing ball

DEF ball Transform {
  translation 0 0 0
  children [
    DEF ball-TIMER TimeSensor { loop TRUE cycleInterval 0.5 },
    DEF ball-POS-INTERP PositionInterpolator {
      key [0, 0.5, 1]
      keyValue [0 0 0, 0 20 0, 0 0 0]
    },
    Shape {
      appearance Appearance { material Material {} }
      geometry Sphere { radius 5 }
    }
  ]
  ROUTE ball-TIMER.fraction_changed TO ball-POS-INTERP.set_fraction
  ROUTE ball-POS-INTERP.value_changed TO ball.set_translation
}
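The step from (a) to (c) can be pictured as accumulating the relative move() offsets into absolute key positions and spreading the keys evenly over the animation cycle. The Python sketch below reproduces the key and keyValue fields of the bouncing ball under that assumption. The function names and the even key spacing are illustrative guesses, not the generation procedure of CONFUCIUS itself.

```python
def interpolator_keys(start, moves):
    """Turn relative move() offsets into evenly spaced keys and absolute positions."""
    if not moves:
        raise ValueError("at least one move is required")
    positions = [tuple(start)]
    for dx, dy, dz in moves:
        x, y, z = positions[-1]
        positions.append((x + dx, y + dy, z + dz))
    n = len(moves)
    keys = [i / n for i in range(n + 1)]
    return keys, positions


def position_interpolator(name, start, moves):
    """Format a VRML PositionInterpolator node as text."""
    keys, positions = interpolator_keys(start, moves)
    key_text = ", ".join(f"{k:g}" for k in keys)
    value_text = ", ".join(" ".join(f"{c:g}" for c in p) for p in positions)
    return (
        f"DEF {name}-POS-INTERP PositionInterpolator {{\n"
        f"  key [{key_text}]\n"
        f"  keyValue [{value_text}]\n"
        f"}}"
    )


# bounce(obj) :- move(obj, 0, 20, 0), move(obj, 0, -20, 0).
print(position_interpolator("ball", (0, 0, 0), [(0, 20, 0), (0, -20, 0)]))
# DEF ball-POS-INTERP PositionInterpolator {
#   key [0, 0.5, 1]
#   keyValue [0 0 0, 0 20 0, 0 0 0]
# }
```

The TimeSensor node and the two ROUTE statements are the same for any such animation apart from the object name, so they could be emitted from a fixed template.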
19
Relation to previous work
  Semantic decomposition
    previous decompositional methodologies (e.g. Schank’s CD analysis)
      basic predicates “move”, “go”, “change”
    pros and cons
      generative and interpretative facilities (Jackendoff, 1972)
      inadequate to capture the creative aspect of meaning
    comparison
      aimed at presentation purposes for visual modalities
      no emphasis on atomic predicates
20
Scripts

rob(person, place):-
    obtain(person, gun),
    go(person, place),
    holdUp(person, place),
    escape(person, place).

orderFood(person):-
    ATRANS(waiter, person, menu),
    MTRANS(menu, person),
    MBUILD(person, choice),
    TRANS(person, waiter, choice).

Extended predicate-argument representation

call(a):-
    pickup(a, tel.receiver, a.leftEar),
    dial(a, tel.keypad),
    speak(a, tel.receiver),
    putdown(a, tel.receiver, tel.set).

pickup(x, obj, dest):-
    moveToward(x.leftHand, location(obj), location(obj)-location(x)-5),
    touch(x.leftHand, obj, axis),
    group(x.leftHand, obj, xhandObj),
    moveToward(xhandObj, dest, _).

Event levels (from high level to low level)    Example verbs
  Routine events                               rob, cook, interview, eatOut
  Simple action verbs                          jump, lift, give, walk, push
  Primitive actions                            ATRANS, PTRANS, MOVE (Script); move, rotate (Extended predicate-argument representation)
21
Conclusion & future work
  Conclusion
    formalizes the meaning of action verbs
    implemented in Java & VRML
    reusable in other systems
  Future work
    inadequate vagueness problem in language visualisation (underspecification)
    temporal relations between sub-activities
    representing non-action verbs & adjectives
    using other modalities (e.g. speech/audio) to aid event representation