1
AsapRealizer 2.0: The Next Steps in Fluent Behavior Realization for ECAs. Herwin van Welbergen, Ramin Yaghoubzadeh, Stefan Kopp. Social Cognitive Systems Group
2
Behavior realization for embodied conversational agents (ECAs)
Speech, gesture, gaze, facial expression, ...
Some challenges:
- Multimodal synchrony, e.g. the phonological synchrony rule (McNeill 1995); important for meaning (Habets et al. 2011)
- Fluent behavior realization: incrementally constructed, low latency, interruptible
- Adaptability
3
BML
4
The bigger picture: Asap (Kopp et al., 2013) Standardization of components SAIBA BML
5
Fluent behavior in ECAs
Traditional turn-based interaction:
1. Fully analyze the user's contribution
2. Plan your own contribution
3. Execute your own contribution ballistically
Modern ECA systems steer away from this paradigm.
6
Fluent interaction, related work
7
Fluent interaction, AsapRealizer
8
AsapRealizer
A collaboration of HMI, University of Twente and the Social Cognitive Systems Group, Bielefeld University, built on >10 years of research on fluent behavior realization.
Basis:
- ACE (Kopp and Wachsmuth 2004): incremental utterance construction
- Elckerlyc (van Welbergen et al., 2010): interactional coordination
=> AsapRealizer 1.0 (van Welbergen et al., 2012): specifications for fluent behavior realization in BML and BMLA; implementation mainly in gesture
AsapRealizer 2.0 adds the implementation of fluent behavior realization capabilities for speech, gaze and facial expression.
9
Requirements for fluent behavior realization:
- Incremental plan construction
- Graceful interruption
- Adaptations of ongoing behavior (top-down and bottom-up)
- Adjustments to the changing environment
10
Comparison with other realizers: internal design AsapRealizer Other realizers
11
Incremental plan construction
Reduces latency and makes the ECA more reactive: start behavior early and fold part of the plan construction into the realization of previous increments. This has a biological basis (e.g. Levelt 1989, McNeill 1995).
The increments connect smoothly, with suprasegmental properties: rhythm and sentence intonation in speech, retraction skipping in gesture. But not always: whether or not smooth connections occur has communicative meaning, e.g. marking information boundaries (Kendon 1980).
12
Non-incremental specification of gesture sequences in SmartBody (Xu et al., 2014)
13
Incremental BML specifications
14
Incrementality: suprasegmental effects
For gesture, preparation and retraction phases need to be constructed on the fly. Subject to change during realization:
- the start position of the preparation
- the hand position at the start/end of the stroke
- the posture ground state at the end of the gesture
Similarly for gaze. Subject to change during realization:
- the position of the gaze target
- the position of e.g. the head and eyes at the start
- the gaze ground state at the end of the gaze
Implemented in a novel gaze system based on (Grillon and Thalmann, 2009), with some improvements for biological realism and adaptivity.
15
Incremental generation demo (using InproTK_iSS: Baumann & Schlangen 2012)
- Non-incremental: startup delay 1400 ms
- Naïve incremental: startup delay 480 ms
- AsapRealizer 2.0: startup delay 50 ms
Example utterance: "IVA 2014 is an interdisciplinary annual conference and the main leading scientific forum for presenting research on modelling, developing and evaluating intelligent virtual agents."
16
Incremental generation demo: AsapRealizer 2.0 vs. non-incremental
17
Graceful interruption: specification
18
Graceful interruption
More than simply stopping speech/gesture/gaze:
- Speech can be interrupted instantly or at word or phoneme boundaries
- For gesture/gaze/face, interruption is implemented by returning to a previously specified 'ground state'
- The ground state may be changed using BML 1.0 behaviors (gazeShift, postureShift)
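The word-or-phoneme-boundary case can be sketched as follows. This is an illustrative Python sketch, not AsapRealizer's actual (Java) API; the function name and data layout are assumptions.

```python
# Pick the time at which ongoing speech actually stops: either instantly,
# or at the next word boundary so speech is not cut off mid-word.
import bisect

def stop_time(word_boundaries, interrupt_at, mode="word"):
    """Return the time (in seconds) at which speech stops.

    word_boundaries: sorted word-end times of the ongoing utterance.
    mode: "instant" stops immediately; "word" waits for the next boundary.
    """
    if mode == "instant":
        return interrupt_at
    # First word boundary at or after the interrupt request.
    i = bisect.bisect_left(word_boundaries, interrupt_at)
    return word_boundaries[i] if i < len(word_boundaries) else word_boundaries[-1]

boundaries = [0.4, 0.9, 1.3, 1.8]                  # hypothetical word-end times
print(stop_time(boundaries, 1.0))                  # 1.3: next word boundary
print(stop_time(boundaries, 1.0, mode="instant"))  # 1.0: immediate stop
```

The same idea applies at phoneme granularity with a finer boundary list; for gesture/gaze/face the "stop" is instead a transition back to the ground state.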
19
Top-down adaptation: specification
20
Top-down adaptation: realization support
Requires flexible execution engines:
- Facial animation (Paul 2010): allows on-the-fly intensity changes of morphs, AUs, MPEG4, emotion
- Body animation: procedural animation that allows arbitrary parameters and mathematical formulas (van Welbergen et al., 2010). Parameter values may be changed on the fly. Captures animation specified in other procedural animation systems (e.g. Ahn et al., 2012) as a subset. Supports MURML (Kopp and Wachsmuth, 2004)
- TTS: InproTK_iSS, supports loudness, pitch and tempo
21
Top-down adaptation
22
Alignment to anticipated events: specification
23
Alignment to anticipated external events: Anticipators
An anticipator manages a TimePeg on the PegBoard. It uses perceptions of the world/interlocutor to continuously update the timing, and extrapolates perceptions to predict the timing of future events: the end of the interlocutor's turn, the next beat in music, the timing of user movement events in fitness exercises.
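The music-beat case can be sketched like this; the class and method names are invented for illustration and are not AsapRealizer's real interface:

```python
# An anticipator observes periodic events (here: musical beats) and
# extrapolates the time of the next one, which would then be written to
# a TimePeg that behaviors are synchronized against.
class BeatAnticipator:
    def __init__(self):
        self.observed = []

    def perceive(self, t):
        """Record an observed beat time (seconds)."""
        self.observed.append(t)

    def predict_next(self):
        """Extrapolate the next beat from the mean inter-beat interval."""
        if len(self.observed) < 2:
            return None  # not enough perceptions to extrapolate yet
        intervals = [b - a for a, b in zip(self.observed, self.observed[1:])]
        period = sum(intervals) / len(intervals)
        return self.observed[-1] + period

ant = BeatAnticipator()
for t in [0.0, 0.52, 1.01, 1.49]:   # noisy beat perceptions
    ant.perceive(t)
print(ant.predict_next())           # ≈ 1.99: predicted time of the next beat
```

Because the prediction is re-derived on every new perception, the associated TimePeg (and with it, all synchronized behavior) is continuously adjusted.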
24
Discussion
AsapRealizer's behavior realization capabilities are (mostly) beyond the state of the art. What is currently missing: a model of how a virtual human steered by AsapRealizer changes the environment.
AsapRealizer is eminently suitable for behavior realization in very dynamic contexts. Fluent behavior realization is just one aspect of a successful behavior realizer; other realizers may have an edge over AsapRealizer in other aspects, e.g. behavior surface realism. This may make them more suitable in contexts where less flexible behavior suffices. But fluent behavior contributes to communicative realism (Blascovich 2002) / interactional adequacy (Baumann & Schlangen 2013).
25
Thanks for your attention Questions? http://asap-project.org
26
Eastereggs
27
Ongoing work
AsapRealizer allows the realization of many aspects of fluent behavior. This opens up many exciting possibilities for behavior and intent planning, e.g. fluent turn-taking: using interruption, keeping a challenged turn, letting yourself be (gracefully) interrupted. Each can be realized in widely varying ways (speaking loudly, with high pitch, fast, ...). Which to select for a desired friendliness, dominance, effectiveness, ...?
28
Adaptation
BML blocks specify constraints on timing and shape. These are generally underspecified, so realizers have some realization freedom. AsapRealizer allows on-the-fly changes to the shape and timing of behavior while maintaining the time constraints specified in BML.
29
Constraint representation
30
Comparison with other realizers: specification of a gesture sequence
Example utterance: "Is there anything that, besides what he wants ... anything you want to work on ..."
SmartBody (Xu et al. 2014) vs. AsapRealizer
31
Preplanning/activation
Preplan a BML block for activation at a later time (or never). This allows behavior realization to start instantly, for example in contexts where only a few responses are valid: preplan while listening, then activate the 'proper' response near-instantly once you have the turn.
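The preplan-then-activate split can be sketched as follows; the class, its methods, and the BML strings are assumptions for illustration, not the realizer's real interface:

```python
# Plans for a few candidate responses are built while listening, so that
# activating the chosen one later is near-instant: the expensive scheduling
# work (TTS, animation planning) has already been done.
class Realizer:
    def __init__(self):
        self.preplanned = {}

    def preplan(self, block_id, bml):
        # Expensive: scheduling happens here, ahead of time.
        self.preplanned[block_id] = f"scheduled({bml})"

    def activate(self, block_id):
        # Cheap: the plan already exists; just start executing it.
        return self.preplanned.pop(block_id, None)

r = Realizer()
r.preplan("yes1", "<speech ...>Yes, certainly.</speech>")
r.preplan("no1", "<speech ...>I'm afraid not.</speech>")
print(r.activate("yes1"))  # the prebuilt 'yes' plan starts immediately
```

Blocks that are never activated are simply discarded, which matches the "(or never)" case above.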
32
Employing fillers (uhm)
Fillers keep the turn without having a plan at hand (yet). In some contexts this is preferred over waiting to speak until all information is available (Baumann and Schlangen 2013), but fillers may also communicate unintended information (hesitation, uncertainty). The behavior planner is therefore given control over whether or not they occur.
33
Employing fillers (example) The car goes around the corner and turns right.
34
The bigger picture: Asap Standardization of components SAIBA BML
35
BML
36
Realizing BML the traditional way
37
AsapRealizer
38
Constraint representation
39
BML Feedback
The Behavior Planner makes changes in the ongoing behavior plan. This requires information from the Realizer on the plan's progress (to decide whether some message can be considered 'arrived' at the user and act accordingly) and on its (predicted) timing (to decide whether a virtual human should interrupt itself or continue speaking when something urgent comes up, based, among other things, on a prediction of how long it will take to deliver the current content).
AsapRealizer implements BML 1.0 feedback:
- Progress feedback informs of the timing of delivered behavior
- Prediction feedback informs of the predicted timing of ongoing behavior
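The two feedback types can be pictured as simple records sent from realizer to planner. The field names below are assumptions chosen to mirror the description above, not the exact BML 1.0 message format:

```python
# Progress feedback reports sync points that were actually reached;
# prediction feedback reports the currently expected timing of what is
# still ongoing.
from dataclasses import dataclass

@dataclass
class ProgressFeedback:
    bml_id: str
    behavior_id: str
    sync_id: str        # e.g. "start", "stroke", "end"
    global_time: float  # when this sync point was actually reached

@dataclass
class PredictionFeedback:
    bml_id: str
    predicted_end: float  # current prediction of when the block finishes

fb = ProgressFeedback("bml1", "speech1", "end", 12.7)
pred = PredictionFeedback("bml2", 15.2)
# A planner could compare pred.predicted_end with the urgency of new
# content to choose between interrupting and finishing the utterance.
print(fb.sync_id, pred.predicted_end)
```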
40
Ongoing work: IU representation of the behavior plan
BML blocks incrementally construct and adapt a behavior plan, and BML feedback provides updates on the progress of the plan. Can we express the plan in an incremental unit (IU) structure in the Behavior Planner? Multimodal synchrony/time constraints and graceful interruption are challenging to express.
41
Ongoing work: IU representation of the behavior plan
42
The BML scheduling process Scheduling the multimodal behavior plan
43
The behavior plan
- BML specifies behaviors and the constraints between them
- The multimodal behavior plan is specified on the basis of, e.g., communicative intent, and is incrementally constructed from BML blocks
- The motor plan executes the behavior plan on the embodiment of the virtual human, using sound, joint rotations, FAPs, ...
Making the motor plan flexible: achieve interpersonal coordination and allow on-the-fly adjustments of ongoing behavior, but do not violate the specified constraints.
44
Flexible plan representation
The motor plan describes the low-level execution of motor behavior on the embodiment of the virtual human: joint rotations, MPEG4 FAP movement, sound, etc.
To be flexible, the plan representation must:
- maintain information about the relation between elements in the plan and the original BML expressions from which they originated
- be capable of expressing all the constraints
- be capable of being modified in such a way that the (remaining) constraints automatically stay satisfied
45
AsapRealizer's plan representation
Central: the PegBoard. Each sync point is assigned to a TimePeg on this board; sync points that are connected by an 'at' constraint share the same TimePeg. TimePegs can be moved, changing the time of the associated sync(s). TimePegs provide local timing (relative to the start of the BML block); each TimePeg is connected to a BMLBlockPeg, and BMLBlockPegs provide global timing.
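The sharing mechanism can be sketched in a few lines. This is an illustrative Python reimplementation of the idea, not AsapRealizer's code:

```python
# Sync points linked by an 'at' constraint share one TimePeg, so moving
# the peg moves all of them at once. A TimePeg's time is local to its
# BML block; the BMLBlockPeg anchors it in global time.
class BMLBlockPeg:
    def __init__(self, global_start):
        self.global_start = global_start

class TimePeg:
    def __init__(self, block_peg, local_time):
        self.block_peg = block_peg
        self.local_time = local_time  # relative to the block's start

    def global_time(self):
        # Global timing comes from the connected BMLBlockPeg.
        return self.block_peg.global_start + self.local_time

block = BMLBlockPeg(global_start=10.0)
peg = TimePeg(block, local_time=2.0)

# Two sync points constrained to occur 'at' the same time share this peg:
speech_s1 = gesture_stroke = peg
peg.local_time = 2.5            # moving the peg ...
print(speech_s1.global_time())  # ... moves both syncs: 12.5
```

Moving the BMLBlockPeg instead shifts every peg in the block, which is how whole-block retiming stays cheap.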
46
AsapRealizer's plan representation
An OffsetPeg links to another TimePeg and moves with it, retaining a time offset. All 'at' constraints can be expressed in TimePegs, OffsetPegs and BMLBlockPegs. 'After' and 'before' constraints require a specific pair of pegs; these are not implemented yet (but trivial to design and implement).
47
Example: solving BML to TimePegs As you can see on this painting,... <gesture id="point1" start="walk1:relax" type="POINT" target="painting1" stroke="speech1:s1+0.5"/>
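Under the peg representation described earlier in the deck, the two constraints in this gesture could resolve as in the following toy sketch. The classes and the concrete times are assumptions for illustration:

```python
# start="walk1:relax" is an exact 'at' constraint: point1's start shares
# walk1's relax peg. stroke="speech1:s1+0.5" is an offset constraint: an
# OffsetPeg tracks speech1's s1 sync at +0.5 s.
class TimePeg:
    def __init__(self, time):
        self.time = time
    def global_time(self):
        return self.time

class OffsetPeg:
    """Links to another peg and moves with it, retaining a fixed offset."""
    def __init__(self, target, offset):
        self.target, self.offset = target, offset
    def global_time(self):
        return self.target.global_time() + self.offset

walk1_relax = TimePeg(2.0)   # hypothetical resolved time of walk1's relax sync
speech1_s1 = TimePeg(3.0)    # hypothetical resolved time of speech1's s1 sync

point1_start = walk1_relax                  # start="walk1:relax": shared peg
point1_stroke = OffsetPeg(speech1_s1, 0.5)  # stroke="speech1:s1+0.5"

speech1_s1.time = 4.0               # the speech sync is rescheduled ...
print(point1_stroke.global_time())  # ... and the stroke follows: 4.5
```

The point of the representation is visible in the last two lines: rescheduling speech automatically keeps the gesture's timing constraint satisfied, with no re-solving step.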
48
Managing block timing
49
Architecture Separated parsing, scheduling, execution Central: the PegBoard Engines handle unimodal motor plans Sync points of PlanUnits are connected by TimePegs Anticipators may make time adjustments through TimePegs
50
Architecture
The scheduler communicates with a set of engines to set up the multimodal behavior plan. It knows for each BML behavior which Engine handles it. Engines have a standardized interface with functions to:
- add a BML behavior to the motor plan
- remove a BML behavior from the motor plan
- resolve unknown time constraints on a BML behavior given its known time constraints
- check which BML behaviors in the motor plan are currently invalid
Note that communication with Engines is in terms of BML behaviors.
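The standardized interface can be sketched as an abstract base class. The method names paraphrase the four functions listed above and are assumptions, not the actual Java interface:

```python
# Every engine (speech, gesture, gaze, face, ...) exposes the same four
# operations, so the scheduler can treat them uniformly in terms of BML
# behaviors.
from abc import ABC, abstractmethod

class Engine(ABC):
    @abstractmethod
    def add_behavior(self, bml_behavior): ...
    @abstractmethod
    def remove_behavior(self, behavior_id): ...
    @abstractmethod
    def resolve_sync_points(self, bml_behavior, known_constraints): ...
    @abstractmethod
    def invalid_behaviors(self): ...

class SpeechEngine(Engine):
    """Toy speech engine keeping its unimodal motor plan in a dict."""
    def __init__(self):
        self.plan = {}
    def add_behavior(self, b):
        self.plan[b["id"]] = b
    def remove_behavior(self, behavior_id):
        self.plan.pop(behavior_id, None)
    def resolve_sync_points(self, b, known):
        # A real engine would query TTS for word/phoneme timings here.
        return dict(known)
    def invalid_behaviors(self):
        return []  # nothing violates its constraints in this toy plan

eng = SpeechEngine()
eng.add_behavior({"id": "speech1", "text": "Hello"})
print(list(eng.plan))  # ['speech1']
```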
51
Managing adjustments of the plan
Interruption of a behavior will automatically shorten the block's end, satisfying block append constraints. Time adjustment of a sync point happens through its associated TimePeg and will also move syncs that are connected to it with 'at' constraints. This might invalidate other constraints; each Engine has functionality to check for this. Solutions: drop the behavior, drop the constraint, ... and communicate back to the behavior planner.
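The check-and-repair cycle might look like the following toy sketch, in which the "drop the behavior" solution is applied. The classes and the invalidation rule are invented for illustration:

```python
# After a time adjustment, each engine is asked which behaviors became
# invalid; those are dropped and the behavior planner is informed.
class Peg:
    def __init__(self, local_time):
        self.local_time = local_time

class ToyEngine:
    """Toy rule: a behavior becomes invalid if the peg moves past its deadline."""
    def __init__(self, peg):
        self.peg = peg
        self.deadlines = {"gesture1": 2.0, "gaze1": 5.0}
    def invalid_behaviors(self):
        return [b for b, d in self.deadlines.items() if self.peg.local_time > d]
    def remove_behavior(self, behavior_id):
        self.deadlines.pop(behavior_id, None)

def adjust_and_repair(engines, peg, new_time, notify_planner):
    peg.local_time = new_time  # moves all syncs sharing this peg
    dropped = []
    for engine in engines:
        for behavior_id in list(engine.invalid_behaviors()):
            engine.remove_behavior(behavior_id)
            dropped.append(behavior_id)
            notify_planner(f"dropped {behavior_id} after time adjustment")
    return dropped

peg = Peg(1.0)
engine = ToyEngine(peg)
print(adjust_and_repair([engine], peg, 3.0, print))  # drops 'gesture1' only
```

Reporting the drop back to the planner is what lets it replan, e.g. re-issue the behavior with relaxed constraints.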
52
ECA 'ground' state
Gaze: default gaze target. Face: default/current 'neutral' facial expression. Posture: current rest posture. Locomotion: position. These can be set through BML with e.g. gazeShift, faceShift, postureShift. Gaze/face/gesture behavior is layered on top of these ground states.
53
Comparison with other realizers

                                   EMBOT  Greta  SMB    ACE    Elck.  AsapR 1.0  AsapR 2.0
Incrementality                     2/10   1/10   2/10   4/10   4/10   5/10       9/10
Graceful interruption              0/4    0/4    1/4    0/4    1/4    2/4        4/4
Top-down adaptation                2/5    0/5    2/5    2/5    4/5    4/5        5/5
Bottom-up adaptation               0/3    0/3    0/3    2/3    0/3    1/3        2/3
Interacting with a changing world  0/7    0/7    6/7    2/7    3/7    3/7        3/7
Total                              4/29   1/29   11/29  10/29  12/29  15/29      23/29
54
Incremental generation: implementation
Incremental realization entails more than simply concatenating audio files and animations: suprasegmental effects must be implemented. InproTK_iSS (Baumann & Schlangen 2012); automatic gesture retraction skipping.