Collection and Analysis of Multimodal Interaction in Direction Giving Dialogues: Towards an Automatic Gesture Selection Mechanism for Metaverse Avatars. Takeo Tsukamoto, Yumi Muroya, Masashi Okamoto, Yukiko Nakano. Seikei University, Japan
Overview: Introduction; Research Goal and Questions; Approach; Data Collection Experiment; Analysis; Conclusion and Future Work
Introduction: Online 3D virtual worlds based on Metaverse applications are growing steadily in popularity. Ex.: Second Life (SL) ⇒ The communication methods are limited to: online chat with speech balloons; manual gesture generation.
Introduction (Cont.): Human face-to-face communication is largely dependent on non-verbal behaviors. Ex.: direction giving dialogues. Many spatial gestures are used to illustrate directions and the physical relationships of buildings and landmarks. How can we implement natural non-verbal behaviors in Metaverse applications?
Research Goal and Questions. Goal: Establish natural communication between avatars in Metaverse based on human face-to-face communication. Research Questions: Automation (gesture selection): How to automatically generate proper gestures? Comprehensibility (gesture display): How to intelligibly display gestures to the interlocutor?
Previous work: An automatic gesture selection mechanism for Japanese chat texts in Second Life [Tsukamoto, 2010]. /2 you keep going straight on this road, then you will be able to find a house having a round window on your left./
Proxemics: Proxemics is important for implementing comprehensible gestures in Metaverse. Previous work does not consider proxemics ⇒ In some cases an avatar’s gestures become unintelligible to the other avatars.
Approach: Conduct an experiment to collect human gestures in direction giving dialogues. Collect participants’ verbal and non-verbal data. Analyze the relationship between gestures and proxemics.
Data Collection Experiment. Direction Giver (DG): knows the way to any place on the campus of Seikei Univ. Direction Receiver (DR): knows nothing about the campus of Seikei Univ. Experimental Procedure: The DR asks the way to a specific building. The DG explains how to get to the building.
Experimental Instruction. Direction Receiver: Instructed to completely understand the way to the goal through a conversation with the DG. Direction Giver: Instructed to confirm that the DR understood the directions correctly after the explanation was finished.
Experimental Materials Each pair recorded a conversation for each goal place
Experimental Equipment: [Figure: equipment worn by a participant, labeled Head, Shoulder, Right arm, Abdominal, and Headset microphone] [Experimental video]
Collected Data Video Data Transcription of Utterances Motion Capture Data
Analysis Investigated DG’s gesture distribution with respect to proxemics Analyzed 30 dialogues collected from 10 pairs Analysis was focused on the movements of DG’s right arm during gesturing
Automatic Gesture Annotation: It is very time consuming to manually annotate nonverbal behaviors ⇒ Automatically annotated the gesture occurrence. More than 77% of the gestures are right arm gestures. Built a decision tree that identifies right arm gestures. Weka J48 was used for the decision tree learning. Extracted features: Movement of position (x, y, z); Rotation (x, y, z); Relative position of the right arm to the shoulder (x, y, z); Distance between the right arm and the shoulder. Binary judgment: Gesturing / Not gesturing.
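The extracted features listed on this slide can be sketched as a small function. This is a minimal illustration, not the authors' implementation: the function name, the tuple layout, and the reading of "movement of position" as frame-to-frame displacement are all assumptions.

```python
import math


def extract_features(prev_arm, arm, rotation, shoulder):
    """Build one 10-value feature vector for a motion capture frame.

    All arguments are hypothetical (x, y, z) tuples: the right arm position
    in the previous and current frame, the right arm rotation in the current
    frame, and the shoulder position in the current frame.
    """
    # Movement of position, assumed here to be frame-to-frame displacement.
    movement = tuple(c - p for c, p in zip(arm, prev_arm))
    # Relative position of the right arm to the shoulder.
    relative = tuple(a - s for a, s in zip(arm, shoulder))
    # Euclidean distance between the right arm and the shoulder.
    distance = math.sqrt(sum(r * r for r in relative))
    return movement + tuple(rotation) + relative + (distance,)


# Illustrative values only, not data from the experiment.
features = extract_features((0, 0, 0), (30, 40, 0), (10, 0, 5), (0, 0, 0))
print(len(features), features[-1])
```

Each frame's vector, paired with a Gesturing / Not gesturing label, would then be the kind of input a decision tree learner such as J48 takes.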
Automatic Gesture Annotation (Cont.): In 10-fold cross validation, the accuracy was 97.5%, which is accurate enough for automatic annotation. [Figure: Example of automatic annotation]
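The evaluation scheme named here, 10-fold cross validation, can be sketched as follows. This is only an outline under stated assumptions: a one-feature threshold classifier stands in for the J48 decision tree, all function names are hypothetical, and the data is synthetic.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_stump(samples, labels):
    """Stand-in learner (not J48): choose the threshold on one feature value
    that classifies the most training samples correctly."""
    best_threshold, best_correct = None, -1
    for t in sorted(set(samples)):
        correct = sum((x >= t) == y for x, y in zip(samples, labels))
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


def cross_validate(samples, labels, k=10, seed=0):
    """Return the mean accuracy over k held-out folds (requires n >= k)."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train = [i for i in range(len(samples)) if i not in held]
        threshold = train_stump([samples[i] for i in train],
                                [labels[i] for i in train])
        correct = sum((samples[i] >= threshold) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


# Synthetic toy data, not the study's data: one distance-like feature,
# labeled True (gesturing) when the value is at least 400.
samples = [float(d) for d in range(100, 700, 10)]
labels = [d >= 400 for d in samples]
print(cross_validate(samples, labels))
```

Every sample is held out exactly once, so the reported accuracy reflects only predictions on data the learner did not see during training.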
Gesture Display Space: Defined as the overlap among the DG’s front area, the DR’s front area, and the DR’s front field of vision. [Figure: the Direction Giver (DG) and Direction Receiver (DR) with their body direction vectors, the DR’s front field of vision, the gesture display space and its center, and the distances of the DG and the DR from the center]
Categories of Proxemics: Define 450 mm to 950 mm as the standard distance from the center of the gesture display space (human arm length is 60 cm to 80 cm, with a 15 cm margin added).
Category: Conditions
Normal (12/30): 450 mm ≦ Distance Both-center ≦ 950 mm
Close_to_DG (4/30): Distance DG-center ≦ 450 mm and 450 mm ≦ Distance DR-center ≦ 950 mm
Close_to_DR (8/30): Distance DR-center ≦ 450 mm and 450 mm ≦ Distance DG-center ≦ 950 mm
Close_to_Both (2/30): Distance Both-center ≦ 450 mm
Far_from_Both (4/30): 950 mm ≦ Distance DG-center or 950 mm ≦ Distance DR-center
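The category conditions can be expressed as a short classification function. This is a sketch under assumptions: the function and constant names are hypothetical, and because the slide uses ≦ on both sides of each boundary, the handling of distances of exactly 450 mm or 950 mm (assigned to the nearer band here) is a choice made for illustration.

```python
STANDARD_MIN_MM = 450  # lower bound of the standard distance on the slide
STANDARD_MAX_MM = 950  # upper bound of the standard distance on the slide


def distance_band(distance_mm):
    """Map a distance from the gesture display space center to a band.

    Assumption: exactly 450 mm counts as "close" and exactly 950 mm counts
    as "standard"; the slide leaves these boundary cases ambiguous.
    """
    if distance_mm <= STANDARD_MIN_MM:
        return "close"
    if distance_mm <= STANDARD_MAX_MM:
        return "standard"
    return "far"


def classify_proxemics(dg_mm, dr_mm):
    """Return the proxemics category for the DG and DR distances (in mm)."""
    dg, dr = distance_band(dg_mm), distance_band(dr_mm)
    if "far" in (dg, dr):  # either participant beyond the standard distance
        return "Far_from_Both"
    if dg == "close" and dr == "close":
        return "Close_to_Both"
    if dg == "close":
        return "Close_to_DG"
    if dr == "close":
        return "Close_to_DR"
    return "Normal"


print(classify_proxemics(700, 700))
print(classify_proxemics(300, 700))
```

Checking the far condition first follows the "or" in the Far_from_Both row: one participant being too far is enough, whatever the other's distance.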
Analysis: Relationship between Proxemics and Gesture Distribution. Analyzed the distribution of gestures by plotting the DG’s right arm position. [Figure: plots for Normal, Close_to_DG, Close_to_DR, and Close_to_Both] Compared with Normal: Close_to_DG is similar, Close_to_DR is wider, Close_to_Both is smaller.
Analysis: Relationship between Proxemics and Gesture Distribution (Cont.): Close_to_Both < Normal = Close_to_DG < Close_to_DR
Applying the Proxemics Model: Create avatar gestures based on our proxemics model to test whether the findings are applicable. [Figure: avatar gestures for Close_to_DG and Close_to_DR]
Conclusion: Conducted an experiment to collect human gestures in direction giving dialogues. Investigated the relationship between the proxemics and the gesture distribution. Proposed five types of proxemics characterized by the distance from the gesture display space. Found that the gesture distribution range differed depending on the proxemics of the participants.
Future Work: Establish a computational model for determining gesture direction. Examine the effectiveness of the model by testing whether users perceive the avatar’s gestures as appropriate and informative.
Thank you for your attention
Related work. [Breitfuss, 2008]: Built a system that automatically adds gestural behavior and eye gaze, based on linguistic and contextual information of the input text. [Tepper, 2004]: Proposed a method for generating novel iconic gestures. Used spatial information about the locations and shapes of landmarks to represent the concepts of words. From a set of parameters, iconic gestures are generated without relying on a lexicon of gesture shapes. [Bergmann, 2009]: Represented individual variation of gesture shapes using a Bayesian network. Built an extensive corpus of multimodal behaviors in direction-giving and landmark description tasks.