Building character animation for intelligent storytelling with the H-Anim standard. Minhua Eunice Ma and Paul Mc Kevitt, School of Computing and Intelligent Systems, Faculty of Informatics, University of Ulster.


Building character animation for intelligent storytelling with the H-Anim standard
Minhua Eunice Ma and Paul Mc Kevitt
School of Computing and Intelligent Systems, Faculty of Informatics, University of Ulster

EuroGraphics Ireland, 29 April 2003

Previous research
 MultiModal interactive storytelling: AesopWorld, KidsRoom, Larsen & Petersen's Interactive Storytelling, computer games
 Virtual humans & embodied agents: Jack (University of Pennsylvania), Improv (Perlin & Goldberg, 1996), BEAT (Cassell et al., 2000), SimHuman, Gandalf

Previous research (continued)
 Automatic text-to-graphics systems: WordsEye (Coyne & Sproat, 2001), 'Micons' and CD-based language animation (Narayanan et al., 1995), Spoken Image (Ó Nualláin & Smith, 1994) and its successor SONAS (Kelleher et al., 2000)
 Semantic representations: Schank's (1972) Conceptual Dependency (CD) Theory and scripts; Jackendoff's (1990) Lexical Conceptual Structure (LCS)

Objectives of CONFUCIUS
 To interpret natural language stories and movie/drama script input, and to extract conceptual semantics from the natural language
 To generate 3D animation and virtual worlds automatically from natural language
 To integrate 3D animation with speech and non-speech audio, forming an intelligent multimedia storytelling system for presenting multimodal stories

CONFUCIUS' context diagram
 Inputs: a story in natural language, or a movie/drama script entered by a storywriter/playwright via a tailored script-input menu
 Outputs to the user/story listener: 3D animation, speech (dialogue), and non-speech audio

Architecture of CONFUCIUS
Natural language stories and scripts from the script writer pass through the script parser and the Natural Language Processing module, which draws on language knowledge (lexicon, grammar, etc.) to produce semantic representations. These drive three generators: animation generation (using visual knowledge from a 3D graphics library of prefabricated objects and character models, built with 3D authoring tools), Text-To-Speech, and sound effects. A synchronising and fusion module combines their outputs into a 3D world with audio in VRML.

Knowledge base of CONFUCIUS
 Language knowledge
  - Semantic knowledge: lexicons (e.g. WordNet)
  - Syntactic knowledge: grammars
  - Statistical models of language; associations between words
 Visual knowledge
  - Object model (nouns): functional information, internal coordinate axes (for spatial reasoning), associations between objects
  - Event model (event verbs): describes the motion of objects/humans
 World knowledge
 Spatial & quantitative reasoning knowledge

Software & Standards
 Java: parsing intermediate representations, changing VRML code to add/modify animation, integrating modules
 3D graphics modelling
  - Authoring tools: Character Studio and Internet Character Animator (ICA) for humanoid characters; Microsoft Agent for the narrator; 3D Studio Max for props & stage
  - Modelling language & standard: VRML 97 for modelling the geometry of objects, props and the environment
  - Humanoid modelling: MPEG-4 Face and Body Animation (FBA); Humanoid Animation (H-Anim) specification
 Main problem to solve: defining standards for high-level behaviours of virtual humans
 Natural language processing tools: PC-PARSE (morphological and syntactic analysis); WordNet (lexicon, semantic inference)

Level 1 of Articulation of H-Anim
[Figure: joints and segments of LOA1]
Although CONFUCIUS adopts Level 1 of Articulation (LOA1) for its human character animation, its animation script engine adds ROUTEs dynamically, based on the H-Anim file's joint list and the animation keyframe list. As long as the animation keyframes conform to the joint definitions in the H-Anim file, CONFUCIUS' animation engine adapts to any level of articulation.
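The dynamic ROUTE generation described above can be sketched as follows. The joint names follow H-Anim LOA1 naming, while the function name, node-naming scheme and prefixes are illustrative assumptions, not CONFUCIUS' actual code:

```python
# Sketch: emit VRML ROUTE statements connecting one OrientationInterpolator
# per animated joint to the corresponding H-Anim Joint node. Assumes
# interpolators are named <prefix>_<joint>_rot and joints hum_<joint>
# (a hypothetical naming convention).

def make_routes(joint_names, timer="clock", prefix="walk"):
    """Build ROUTE lines for every joint that has a keyframe track."""
    routes = []
    for joint in joint_names:
        interp = f"{prefix}_{joint}_rot"
        # drive the interpolator from the shared TimeSensor
        routes.append(f"ROUTE {timer}.fraction_changed TO {interp}.set_fraction")
        # apply the interpolated rotation to the H-Anim joint
        routes.append(f"ROUTE {interp}.value_changed TO hum_{joint}.set_rotation")
    return "\n".join(routes)

# A few LOA1 joints; any joint list read from the H-Anim file works the same way.
loa1_joints = ["HumanoidRoot", "sacroiliac", "l_hip", "r_hip", "l_knee", "r_knee"]
vrml_routes = make_routes(loa1_joints)
```

Because the joint list is read from the humanoid file rather than hard-coded, the same generator serves LOA1 and richer articulations alike.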

Agents and avatars: how much autonomy?
Virtual humans span a spectrum of autonomy and intelligence, from autonomous characters/agents at the high end to interface agents and avatars at the low end, with characters in non-interactive storytelling in between.
 Autonomous characters/agents have higher requirements for sensing, memory, reasoning, planning, behaviour control, and even emotional state (a sense-control-action structure).
 Avatars are user-controlled and hence require fewer autonomous actions. However, basic naïve physics such as collision detection and reaction is still demanded when the user controls an avatar to hit a wall or grasp an object.
 A virtual character in non-interactive storytelling is somewhere between an agent and an avatar: most of its behaviours, emotions, and responses to the changing environment are described in the story input.

Semantic representations

Lexical Visual Semantic Representation
 Lexical Visual Semantic Representation (LVSR) is a necessary semantic representation between 3D model information and syntactic information, because 3D model differences, although crucial in distinguishing word meanings, are invisible to syntax.
 LVSR is based on Jackendoff's LCS, adapted to the task of language visualisation, and enhances LCS with Schank's scripts.
 Ontological categories of LVSR: OBJ, HUMAN, EVENT, STATE, PLACE, PATH, and PROPERTY
  - OBJ: props or places (e.g. buildings)
  - HUMAN: a human being, or any other articulated animated character (e.g. an animal), as long as its skeleton hierarchy is defined in the graphics library
  - EVENT: actions, movements and manners
  - STATE: static existence
  - PROPERTY: attributes of an OBJ/HUMAN

PATH & PLACE predicates
We analysed 62 common English prepositions and defined 7 PATH predicates and 11 PLACE predicates for interpreting the spatial movement events of OBJs/HUMANs.

Examples of LVSR & animation generation
 Manipulating the environment & spatial relations
  - Input sentence: John walked towards the house.
    LVSR: [EVENT walk ([HUMAN john],[PATH toward [OBJ house]])]
  - Input sentence: Nancy ran across the field.
    LVSR: [EVENT run ([HUMAN nancy],[PATH via [PLACE on [OBJ field]]])]
 Manipulating objects
  - Input sentence: John lifted his hat.
    LVSR: [EVENT go ([OBJ hat],[PATH from [PLACE on [OBJ john.head]]])]
          [EVENT lift ([HUMAN john],[OBJ hat])]
(Each example is followed by its output animation.)
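LVSR expressions like the ones above can be modelled as nested tagged terms. A minimal sketch in Python, where the helper names (`term`, `pretty`) are illustrative assumptions rather than CONFUCIUS' internal representation:

```python
# Sketch: LVSR expressions as nested tagged tuples, mirroring the
# bracketed notation on this slide, e.g.
#   [EVENT walk ([HUMAN john],[PATH toward [OBJ house]])]

def term(category, head, *args):
    """An LVSR term: an ontological category, a head word, and arguments."""
    return (category, head, list(args))

def pretty(t):
    """Render a term back into the slide's bracket notation."""
    category, head, args = t
    if not args:
        return f"[{category} {head}]"
    inner = ",".join(pretty(a) for a in args)
    # EVENT/STATE heads take parenthesised argument lists; others nest directly
    if category in ("EVENT", "STATE"):
        return f"[{category} {head} ({inner})]"
    return f"[{category} {head} {inner}]"

# "John walked towards the house."
walk = term("EVENT", "walk",
            term("HUMAN", "john"),
            term("PATH", "toward", term("OBJ", "house")))
```

Keeping the representation a plain recursive structure is what lets the animation generator walk it top-down, dispatching on the category at each level.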

Graphics library
 Objects/props: simple geometry files
 Characters: geometry & joint hierarchy files (H-Anim)
 Motions: animation library (keyframes), instantiated onto characters

Animation generator
1. Verb semantic analysis: use lexical entries in the Lexical Visual Semantics to analyse verb semantics, replace synonyms, and perform spatial reasoning, turning the syntax tree into LVSR (applying scripts where needed).
2. If the event predicate matches a basic human motion in the animation library, instantiate that motion; otherwise decompose the motion into basic motions first.
3. Environment placement: apply spatial information and place each OBJ/HUMAN into the specified environment.
4. The animation controller outputs a VRML file of the virtual story world.
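The match-or-decompose step in the pipeline above can be sketched as follows; the library contents and decomposition rules here are made-up examples, not CONFUCIUS' actual data:

```python
# Sketch of the animation generator's dispatch: an event predicate either
# matches a basic motion in the keyframe library directly, or is decomposed
# into a sequence of basic motions first.

BASIC_MOTIONS = {"walk", "run", "reach", "grasp", "raise_arm"}

# Hypothetical decomposition rules for verbs with no direct keyframe track.
DECOMPOSITION = {
    "lift": ["reach", "grasp", "raise_arm"],
    "fetch": ["walk", "reach", "grasp", "walk"],
}

def plan_motions(event_predicate):
    """Return the sequence of basic motions realising an event predicate."""
    if event_predicate in BASIC_MOTIONS:
        return [event_predicate]               # direct match: instantiate keyframes
    if event_predicate in DECOMPOSITION:
        return DECOMPOSITION[event_predicate]  # decompose, then instantiate each part
    raise KeyError(f"no visual definition for verb: {event_predicate}")
```

Each returned motion name would then be instantiated from the keyframe library onto the character's H-Anim joints.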

Collision detection
 Collision detection is a crucial issue for path planning, manoeuvring objects, reactive behaviour, and multiple characters' activities.
 VRML provides a built-in collision detection mechanism for the avatar (the user), but it does not apply to intersections between other characters/objects.
 Collision avoidance algorithms for humanoid bodies:
  - Coarse approximations (e.g. bounding boxes or spheres)
  - Polygon-level checks between humans and objects
  - Dynamic level-of-detail checking according to distance from the observer, the user's observation focus, whether the human is in a crowd, etc.
 CONFUCIUS' animation generator uses:
  - Bounding cylinders around the body segments for protagonists
  - A single bounding cylinder around the whole body for minor characters, characters in a crowd, and characters beyond the scope of attention
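A bounding-cylinder overlap test of the kind described above can be sketched as follows; the cylinders are assumed vertical (y-axis-aligned), and the `Cylinder` type is an illustrative assumption:

```python
from dataclasses import dataclass

# Sketch: overlap test for vertical bounding cylinders, the coarse check a
# CONFUCIUS-style generator can apply to whole bodies or body segments.

@dataclass
class Cylinder:
    x: float      # centre x
    z: float      # centre z
    y_min: float  # bottom
    y_max: float  # top
    radius: float

def cylinders_collide(a: Cylinder, b: Cylinder) -> bool:
    """True if two vertical cylinders intersect."""
    # vertical extents must overlap...
    if a.y_max < b.y_min or b.y_max < a.y_min:
        return False
    # ...and the horizontal circles must overlap
    dx, dz = a.x - b.x, a.z - b.z
    r = a.radius + b.radius
    return dx * dx + dz * dz <= r * r

body = Cylinder(0.0, 0.0, 0.0, 1.8, 0.3)        # whole-body bounding cylinder
obstacle = Cylinder(0.35, 0.0, 0.0, 3.0, 0.1)   # e.g. a nearby prop
```

The test reduces 3D intersection to a 1D interval check plus a 2D circle check, which is why whole-body cylinders are cheap enough for minor characters and crowds.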

Multiple characters' synchronisation & coordination
Multiple characters' activities:
 A character can start a task when another signals that the situation (pre-conditions) is ready
 Characters can communicate with one another
 Two or more characters can cooperate on a shared task
Multiple characters' synchronisation:
 Event-driven timing mechanism (VRML provides a utility for event routing, the ROUTE statement)
 Exact time-driven synchronisation
Example: "Nancy was walking along the street. John called her. Nancy stopped and saw John. John walked towards her. They exchanged greetings."
The end of the animation john_speech (calling Nancy) triggers: (1) stopping the animation nancy_walk; (2) starting the animation nancy_gazeWander (searching for who is calling); (3) starting the animation john_walk (walking towards Nancy).
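The event-driven triggering in this example can be sketched as a simple publish/subscribe scheme; the animation names follow the slide, while the `Scheduler` class and its methods are illustrative assumptions:

```python
# Sketch: event-driven synchronisation of multiple characters' animations.
# Finishing one animation fires triggers that start or stop others,
# mirroring VRML event routing at a higher level.

class Scheduler:
    def __init__(self):
        self.running = set()
        self.triggers = {}   # finished animation -> list of (action, target)

    def on_finish(self, anim, action, target):
        """Register a trigger to fire when `anim` finishes."""
        self.triggers.setdefault(anim, []).append((action, target))

    def start(self, anim):
        self.running.add(anim)

    def finish(self, anim):
        """Mark an animation finished and fire its registered triggers."""
        self.running.discard(anim)
        for action, target in self.triggers.get(anim, []):
            if action == "start":
                self.running.add(target)
            elif action == "stop":
                self.running.discard(target)

sched = Scheduler()
sched.start("nancy_walk")
sched.start("john_speech")
# when John finishes calling Nancy:
sched.on_finish("john_speech", "stop", "nancy_walk")
sched.on_finish("john_speech", "start", "nancy_gazeWander")
sched.on_finish("john_speech", "start", "john_walk")
sched.finish("john_speech")
```

After `finish("john_speech")`, only `nancy_gazeWander` and `john_walk` remain running, matching the trigger list on the slide.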

Relation to other work
 CONFUCIUS is a general-purpose humanoid character animation system.
 Compared with other related virtual human modelling systems, CONFUCIUS' character animation focuses on the language-to-humanoid-animation process rather than on human modelling and motion alone.
 It makes full use of existing 3D OBJ/HUMAN models, tools and programs, such as the H-Anim models Nancy (by C. Ballreich, © Name3D / Yglesias, Wallock, Divekar, Inc.) and Baxter (by C. Babski, © LIG/EPFL), animation keyframe files, and a BVH-to-H-Anim keyframe conversion script (by M. Lewis, The Ohio State University).
 It adopts current work in linguistics, such as LCS, and extends it to meet the demands of language visualisation.

Prospective applications
 Children's education
 Multimedia presentation
 Movie/drama production
 Computer games
 Virtual reality

Conclusion & future work
CONFUCIUS' humanoid character animation explores challenging problems in language visualisation and automatic animation production. It:
 formalises the meaning of action verbs and spatial prepositions
 maps language primitives to visual primitives
 provides a reusable common-sense knowledge base for other systems
Future work:
 Deformation for facial expressions
 Under-specified language input
 Action composition for simultaneous activities