Speech Interface to Virtual Reality Applications
Reporter: Chun-Feng Liao
Authors: Wauchope, K., S. Everett, D. Tate, T. Maney; M. Cernak, A. Sannier


References
This report discusses two implementations of a speech interface to virtual reality applications.
• M. Cernak, A. Sannier, "Command Speech Interface to Virtual Reality Applications," Technical Report, Virtual Reality Applications Center, Iowa State University of Science and Technology, June 2002.
• Wauchope, K., S. Everett, D. Tate, T. Maney, "Speech-Interactive Virtual Environments for Ship Familiarization," 2nd International EuroConference on Computer and IT Applications in the Maritime Industries (COMPIT '03), Hamburg, Germany, May 14-17, 2003, pp

Agenda
• Introduction
• Paper I
• Paper II
• Conclusion
• System design discussion

Introduction
• Both papers are newly published (2002, 2003).
• Both papers address the technical details of speech-VR integration.
• The second paper takes a more modern approach.
• Both use a similar architecture (which is also similar to ours).
Example: the authors chose a VRML + Java Speech API platform, encountered several difficult problems such as Java security constraints, and were forced to use the "browser as an application" model instead of "browser as an applet".

Paper I
• M. Cernak, A. Sannier, "Command Speech Interface to Virtual Reality Applications," Technical Report, Virtual Reality Applications Center, Iowa State University of Science and Technology, June 2002.

Purposes of this paper
• Describe an approach to controlling VR applications through a multimodal command speech interface (CSI) based on dialog modeling.
• The interface is used to improve the usability of VRAC's C6.
VRAC: Virtual Reality Applications Center. C6 is a virtual reality system developed by VRAC.

Multimodal Interaction
• U: MoleBio
• S: Yes
• U: (targets atom 512 with the mouse)
• U: Go there!
• S: OK (goes to atom number 512).
U: User, S: System.
Command addressing: used to trigger the system to start recording the user's voice for recognition.
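
The exchange above can be sketched as a small dialog state machine. This is only an illustration, not the CSLU-based implementation from the paper: it assumes "MoleBio" is the command-addressing keyword, that a mouse selection is remembered until the next spoken command, and that the system returns to idle after one command. The names `DialogState`, `on_pointer`, and `on_utterance` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

WAKE_WORD = "molebio"  # assumed command-addressing keyword, taken from the slide


@dataclass
class DialogState:
    """Tracks whether the system is listening and what the mouse last targeted."""
    listening: bool = False
    pointed_atom: Optional[int] = None


def on_pointer(state: DialogState, atom_id: int) -> None:
    """Record the atom the user targeted with the mouse."""
    state.pointed_atom = atom_id


def on_utterance(state: DialogState, text: str) -> str:
    """Return the system's reply to one recognized utterance."""
    words = text.lower().strip(" !.")
    if not state.listening:
        if words == WAKE_WORD:
            state.listening = True
            return "Yes"
        return ""  # speech not addressed to the system is ignored
    if words == "go there":
        if state.pointed_atom is None:
            return "Where?"
        state.listening = False  # assumption: one command per address
        return f"OK (goto atom number {state.pointed_atom})"
    return "Sorry?"
```

Feeding it "MoleBio", a pointer event for atom 512, and then "Go There !" reproduces the replies on the slide; the spoken "there" is resolved by the most recent mouse target.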

System Architecture
[Figure: dialog management and speech facilities; VR system]

System Architecture
• VR: VRAC's C6
• TTS: Festival
• SR: CSLU Toolkit
• Platform: Windows OS on PII 400

Three Main Components (1)
• Speech synthesis (TTS): Festival.

Three Main Components (2)
• CSLU Toolkit: dialog modeling, speech recognition, and natural language processing.
• The CSLU Toolkit was implemented in C and Tcl/Tk and developed at OGI (Oregon Graduate Institute).
CSLU: Center for Spoken Language Understanding.

Three Main Components (3)
• Communication bridge to the VR application.
• It integrates CSLU (speech) and C6 (VR).

How to Integrate CSLU and C6
• Initial attempt: CORBA
C6 supports CORBA.
The authors tried "Combat" as a Tcl extension acting as the CORBA client, but this failed.
They then tried "Tcl Blend": Tcl -> Java -> CORBA -> C6 (efficiency problems).
Result: use a TCP socket.
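
A TCP bridge of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the newline-terminated text command `GOTO ATOM 512`, the `OK` acknowledgement, and the function names are all hypothetical, and the "VR side" here is just a thread standing in for C6.

```python
import socket
import threading


def serve_one_command(server: socket.socket, handled: list) -> None:
    """Stand-in for the VR side: accept one connection, read one line, acknowledge."""
    conn, _ = server.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        handled.append(reader.readline().strip())
        conn.sendall(b"OK\n")


def send_command(host: str, port: int, command: str) -> str:
    """Stand-in for the speech side: send one command and wait for the reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall((command + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


def demo() -> tuple:
    """Run both ends on the loopback interface and return (reply, commands seen)."""
    handled: list = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=serve_one_command, args=(server, handled))
        worker.start()
        reply = send_command("127.0.0.1", port, "GOTO ATOM 512")
        worker.join(timeout=5)
    return reply, handled
```

A plain line-based protocol like this needs no middleware on either side, which is presumably why it was workable where the CORBA routes were not.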

Natural Language Processing
• Instead of using standard JSGF, the authors use a custom grammar and wrote a dedicated parser to evaluate it.
• The custom grammar is very similar to JSGF.
• We will not discuss the custom grammar in detail here.
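
To give a feel for what a JSGF-like command grammar expresses, here is a small sketch that expands a rule body using three standard JSGF constructs: `( )` grouping, `[ ]` optional parts, and `|` alternatives. It is not the authors' custom grammar or parser, which the report leaves undescribed; the rule text and function names are made up for illustration.

```python
import re
from itertools import product


def tokenize(rule: str) -> list:
    """Split a rule body into words and the symbols ( ) [ ] |."""
    return re.findall(r"[()\[\]|]|[^\s()\[\]|]+", rule)


def parse_alternatives(tokens: list, pos: int, closer: str):
    """Parse 'seq | seq | ...' up to the closing symbol; return (sentences, pos)."""
    sentences = []
    while True:
        seq, pos = parse_sequence(tokens, pos, closer)
        sentences.extend(seq)
        if pos < len(tokens) and tokens[pos] == "|":
            pos += 1  # another alternative follows
            continue
        return sentences, pos


def parse_sequence(tokens: list, pos: int, closer: str):
    """Parse consecutive items; return every word list they can produce."""
    results = [[]]
    while pos < len(tokens) and tokens[pos] not in ("|", closer):
        tok = tokens[pos]
        if tok == "(":
            options, pos = parse_alternatives(tokens, pos + 1, ")")
            pos += 1  # skip ")"
        elif tok == "[":
            options, pos = parse_alternatives(tokens, pos + 1, "]")
            options = options + [[]]  # an optional group may be absent
            pos += 1  # skip "]"
        else:
            options, pos = [[tok]], pos + 1
        results = [left + right for left, right in product(results, options)]
    return results, pos


def expand(rule: str) -> set:
    """Return the set of sentences accepted by a JSGF-style rule body."""
    sentences, _ = parse_alternatives(tokenize(rule), 0, "")
    return {" ".join(words) for words in sentences}
```

For example, `expand("(go | move) [over] there")` yields the four sentences "go there", "go over there", "move there", and "move over there". The sketch assumes well-formed input and does not handle rule references, weights, or repetition operators.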

CSI Test Environment
• A RAD (GUI) tool that helps developers quickly build the dialog flow.

Paper I Conclusion
• The major advantage of this system is quick deployment.
• The main problem is that the speech recognition accuracy (provided by CSLU) was poor.
• The US Navy also developed a speech interface to a VR system; interaction with VR will be improved following their method.

Future Work
• Change TTS and SR to IBM ViaVoice.
ViaVoice supports JSAPI (Java Speech API).
With Java it is easier to communicate with C6 via CORBA.

Paper II
• Wauchope, K., S. Everett, D. Tate, T. Maney, "Speech-Interactive Virtual Environments for Ship Familiarization," 2nd International EuroConference on Computer and IT Applications in the Maritime Industries (COMPIT '03), Hamburg, Germany, May 14-17, 2003, pp

Introduction
• This paper introduces two systems that help crew members newly aboard US Navy ships become familiar with their environment quickly.
User: Tell me where Room 101 is!

Motivation
• Architects of US Navy ships make heavy use of CAD tools to design ship models.
• A CAD file can be converted to a 3D model format with little effort.
• According to the authors' previous research, this virtual environment did shorten crews' learning time.

Systems introduced
• Two systems:
MSFT (Multimodal Ship Familiarization Tool)
ISFS (Interactive Ship Familiarization System)
• ISFS is a recent transition of MSFT.

System Architecture: MSFT
The components run as separate processes.

MSFT
• The VE viewer component and the speech interface run as two separate processes.
• Speech interface: an all-IBM solution: ViaVoice, IBM's SMAPI, and IBM's SRCL grammar.
Platform: PIII 500MHz

ISFS
• A recent transition of MSFT.
• Uses VRML as the 3D modeling language.
• Uses JSAPI as the interface to the speech engine.
ViaVoice fully supports JSAPI.
VRML supports Java as a scripting language.
• The rest of the structure is identical to the MSFT system.
Platform: Xeon 2.0GHz -> needs more computing power!

Why Choose a Standalone VRML Browser?
• Security limitations (details discussed later).
• VM limitations (details discussed later).
• It provides opportunities to customize the interface to the VRML browser.
In my personal experience, the system usually becomes unstable when the speech engine works with a VRML plug-in via EAI's Java interface.

Security Limitations
• The JRE imposes security limitations on Java applets.
• JSAPI was unable to establish a connection with the speech engine unless the security settings were explicitly reconfigured.

Limited VM
• Most VRML browsers' EAI implementations were built on ActiveX and thus only support Microsoft's old VM, which doesn't support most modern Java features.
Example: this may force us to use Java AWT instead of Swing, which provides a better GUI.

Providing GUI as VUI Fallback
• The GUI provides a fallback in case the speech recognizer has trouble accurately transcribing the user's voice.
• The GUI is adjusted dynamically to provide one-to-one correspondence with the VUI.
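
One way to keep a GUI in one-to-one correspondence with the voice commands is to drive both from a single command table, so the button list is regenerated whenever the set of active phrases changes. The sketch below illustrates that idea only; it is not how ISFS is implemented, and the class name, phrases, and replies are hypothetical.

```python
from typing import Callable, Dict, List


class CommandSet:
    """One command table shared by the voice (VUI) and button (GUI) front ends."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[[], str]] = {}

    def activate(self, phrase: str, action: Callable[[], str]) -> None:
        """Make a phrase available in the current dialog state."""
        self._actions[phrase.lower()] = action

    def deactivate(self, phrase: str) -> None:
        """Remove a phrase that is no longer valid in the current state."""
        self._actions.pop(phrase.lower(), None)

    def button_labels(self) -> List[str]:
        """The GUI shows exactly one button per active phrase."""
        return sorted(self._actions)

    def run(self, phrase: str) -> str:
        """Execute a command, whether it was spoken or clicked."""
        action = self._actions.get(phrase.lower().strip())
        return action() if action else "not available"


commands = CommandSet()
commands.activate("go to room 101", lambda: "moving to room 101")
commands.activate("show deck plan", lambda: "deck plan shown")
```

Because a recognized utterance and a button click both end up in `run`, a user whose speech is misrecognized can click the matching button and get the same behavior.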

Paper II Conclusion
• The speech interface is needed because the GUI and the VE viewer both rely on direct manipulation and keep the user's hands too busy.
• As HCI becomes increasingly multimodal, care must be taken to integrate the modalities in a natural manner.

Future Work
• VRML is closer to an object-oriented, tree-structured model.
• Such models are hard to represent in an RDBMS.
• We must find some way to store model data easily and efficiently.
Personal thought: use an XML database.
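
The appeal of XML here is that nested elements mirror a scene graph directly, with no join tables. The sketch below shows this with Python's standard `xml.etree.ElementTree`; it uses an in-memory tree rather than an actual XML database, and the element names (`Group`, `Room`) and the ship layout are invented for illustration.

```python
import xml.etree.ElementTree as ET


def build_scene() -> ET.Element:
    """Build a tiny, made-up ship hierarchy as nested XML elements."""
    ship = ET.Element("Group", name="Ship")
    deck = ET.SubElement(ship, "Group", name="Deck1")
    ET.SubElement(deck, "Room", name="Room101")
    ET.SubElement(deck, "Room", name="Room102")
    return ship


def path_to(node: ET.Element, name: str) -> list:
    """Return the chain of node names from this node down to the named node."""
    if node.get("name") == name:
        return [name]
    for child in node:
        tail = path_to(child, name)
        if tail:
            return [node.get("name")] + tail
    return []  # the name does not occur in this subtree


scene = build_scene()
```

With this tree, `path_to(scene, "Room101")` returns `["Ship", "Deck1", "Room101"]`, the kind of containment path needed to answer a question such as "Tell me where Room 101 is".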

Switchable! Discussions

Q & A