Non-Traditional Interfaces CSCI324, IACT403, IACT 931, MCS9324 Human Computer Interfaces.


1 Non-Traditional Interfaces CSCI324, IACT403, IACT 931, MCS9324 Human Computer Interfaces

2 Multi-Sensory Systems
- More than one sensory channel in interaction
  - e.g. sounds, text, hypertext, animation, video, gestures, vision
- Used in a range of applications:
  - particularly good for users with special needs, and virtual reality
- Will briefly cover
  - general terminology
  - speech
  - non-speech sounds
  - handwriting
  - text and hypertext
  - animation and video

3 Usable Senses
- The five senses (sight, sound, touch, taste, smell) are used by us every day
- Each is important on its own
- Together, they provide a fuller interaction with the natural world
- Computers rarely offer such a rich interaction

4 Usable Senses
- Can we use all the available senses?
  - Ideally, yes
  - Practically, no

5 Usable Senses
- We can use
  - Sight
  - Sound
  - Touch (sometimes, e.g. haptic feedback)
- We cannot (yet) use
  - Taste
  - Smell

6 Multi-modal Versus Multi-media
- Multi-modal systems
  - use more than one sense (or mode) of interaction
  - e.g. visual and aural senses: a text processor may speak the words as well as echoing them to the screen
- Multi-media systems
  - use a number of different media to communicate information

7 Multi-modal Versus Multi-media
- e.g. a computer-based teaching system:
  - may use video, animation, text and still images: different media all using the visual mode of interaction
  - may also use sounds, both speech and non-speech: two more media, now using a different mode
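The distinction can be sketched in a few lines of Python. This is only an illustration: the `MEDIUM_TO_MODE` table and the `describe` function are hypothetical names, and the table covers only the media mentioned in the teaching-system example.

```python
# Hypothetical table: which sensory mode each medium is delivered through.
MEDIUM_TO_MODE = {
    "video": "visual",
    "animation": "visual",
    "text": "visual",
    "still image": "visual",
    "speech": "aural",
    "non-speech sound": "aural",
}


def describe(media):
    """Count the distinct media and modes used by a system."""
    distinct_media = set(media)
    modes = {MEDIUM_TO_MODE[m] for m in distinct_media}
    return {
        "media": len(distinct_media),
        "modes": len(modes),
        "multimedia": len(distinct_media) > 1,  # more than one medium
        "multimodal": len(modes) > 1,           # more than one sense
    }


visual_only = describe(["video", "animation", "text", "still image"])
with_sound = describe(["video", "animation", "text", "still image",
                       "speech", "non-speech sound"])
```

Under these assumptions `visual_only` is multi-media but not multi-modal (four media, one mode), while `with_sound` is both (six media, two modes).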

8 Speech
- Human beings have a great and natural mastery of speech
  - makes it difficult to appreciate the complexities, but
  - it’s an easy medium for communication

9 Structure of Speech
- Phonemes
  - 40 of them in English: basic atomic units
  - 24 consonants, 16 vowels
  - Sound slightly different depending on the context they are in
  - Emphasis, stress, pause, pitch
- Pit
- Spit
- Tip
- Say each of these with your hand just in front of your mouth

10 Structure of Speech
- Phonemes
  - 40 of them in English: basic atomic units
  - 24 consonants, 16 vowels
  - Sound slightly different depending on the context they are in
  - Emphasis, stress, pause, pitch
- Pit: aspirated
- Spit: regular
- Tip: unreleased
- Say each of these with your hand just in front of your mouth

11 Structure of Speech
- Allophones
  - all the sounds in the language
  - [pʰ] and [p] are allophones of /p/
  - between 120 and 130 of them
  - these are formed into
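The pit / spit / tip exercise can be written down as a toy rule. The sketch below is a deliberate simplification: it looks at spelling rather than sound, handles only the first "p" in a word, and the function name `p_allophone` is hypothetical. The three labels are the ones used on the slide.

```python
def p_allophone(word):
    """Label the /p/ in a simple word such as 'pit', 'spit' or 'tip'.

    Toy rule based on spelling position only, not real phonology.
    """
    w = word.lower()
    i = w.find("p")
    if i == -1:
        raise ValueError("word contains no 'p'")
    if i == len(w) - 1:
        return "unreleased"   # word-final, as in "tip"
    if i > 0 and w[i - 1] == "s":
        return "regular"      # after "s", as in "spit"
    if i == 0:
        return "aspirated"    # word-initial, as in "pit"
    return "unknown"          # contexts this toy rule does not cover


labels = {w: p_allophone(w) for w in ("pit", "spit", "tip")}
```

The point is only that one phoneme maps to several allophones depending on its context; a real system would work on phonetic context rather than letters.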

12 Structure of Speech
- Morphemes
  - smallest unit of language that has meaning
  - "Town" is a free morpheme: it can be used alone or in combination
  - The word "unbelievable" has three morphemes: "un-" (negating), a bound morpheme; "-believe-", a free morpheme; and "-able"

13 Structure of Speech
- Other terminology:
- Prosody
  - alteration in tone and quality
  - variations in emphasis, stress, pauses and pitch impart more meaning to sentences

14 Structure of Speech
- Co-Articulation
  - the effect of context on the sound
  - a phoneme is altered by the prior phoneme due to the mechanics of the human vocal system
  - co-articulation transforms the phonemes into allophones
- Syntax - structure of sentences
- Semantics - meaning of sentences

15 Structure of Speech
- Different people speak differently:
  - accent, intonation, stress, idiom, volume and so on can all vary
- The syntax of semantically similar sentences may vary.
- Background noises can interfere.

16 Structure of Speech
- People often “ummm.....” and “errr.....”
- Words are not enough - semantics needed as well
  - requires intelligence to understand a sentence
  - context of the utterance often has to be known
  - also information about the subject and speaker
- example: even if “Errr.... I, um, don’t like this” is recognised, it is a fairly useless piece of information on its own

17 The Phonetic Typewriter
- Developed for Finnish (a phonetic language, written as it is said).
- Trained on one speaker, will generalise to others.

18 The Phonetic Typewriter
- A neural network is trained to cluster together similar sounds, which are then labelled with the corresponding character.
- When recognising speech, the sounds uttered are allocated to the closest corresponding output, and the character for that output is printed.
  - requires large dictionary of minor variations to correct general mechanism
  - noticeably poorer performance on speakers it has not been trained on
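The recognition step ("allocate each sound to the closest output and print its character") can be sketched as a nearest-prototype lookup. This is not the network itself: training is skipped, and the prototypes, their two-dimensional feature vectors and the names `PROTOTYPES`, `recognise` and `transcribe` are all invented for illustration.

```python
import math

# Hypothetical labelled outputs: each prototype stands for one cluster of
# similar sounds. The 2-D feature vectors are made up for this sketch.
PROTOTYPES = [
    ((0.9, 0.1), "a"),
    ((0.1, 0.9), "i"),
    ((0.5, 0.5), "k"),
]


def recognise(sound):
    """Return the character of the prototype closest to the sound."""
    _, label = min(PROTOTYPES, key=lambda p: math.dist(p[0], sound))
    return label


def transcribe(sounds):
    """Print-out for a sequence of sounds: one character per sound."""
    return "".join(recognise(s) for s in sounds)


text = transcribe([(0.45, 0.55), (0.8, 0.2), (0.2, 0.8)])
```

A speaker whose sounds fall far from every stored prototype will be matched to the wrong character, which is one way to picture the poorer performance on untrained speakers.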

19 The Phonetic Typewriter (cont’d)

20 Speech Recognition: currently useful?
- Single user, limited vocabulary systems widely available, e.g. computer dictation
- Open use, limited vocabulary systems can work satisfactorily, e.g. some voice activated telephone systems
- No general user, wide vocabulary systems are commercially successful yet, but some are bundled in larger packages
  - e.g. technology from “DragonDictate” in MS-OfficeXP
- Large potential, however
  - when users' hands are already occupied, e.g. driving, manufacturing
  - for users with physical disabilities
  - lightweight, mobile devices

21 Speech Synthesis
- Speech synthesis: the generation of speech
- Useful - natural and familiar way of receiving information
- Problems - similar to recognition: prosody particularly
- Additional problems
  - intrusive - needs headphones, or creates noise in the workplace
  - transient - harder to review and browse

22 Speech Synthesis
- Successful in certain constrained applications, usually when the user is particularly motivated to overcome the problems and has few alternatives
  - screen readers - read the textual display to the user; utilised by visually impaired people
  - warning signals - spoken information sometimes presented to pilots whose visual and haptic skills are already fully occupied

23 Non-Speech Sounds
- Boings, bangs, squeaks, clicks etc.
- commonly used in interfaces to provide warnings and alarms
- Evidence to show they are useful
  - fewer typing mistakes with key clicks
  - video games harder without sound
- Dual mode displays: information presented along two different sensory channels

24 Non-Speech Sounds
- Allows for redundant presentation of information
- Allows resolution of ambiguity in one mode through information in another
- Sound is especially good for transient information, and background status information
- Language/culture independent, unlike speech
- example: sound can be used as a redundant mode in modern GUIs (Windows, Mac); almost any user action (file selection, window active, disk insert, search error, copy complete, etc.) can have a different sound associated with it.

25 Auditory Icons
- Use natural sounds to represent different types of object or action
- Natural sounds have associated semantics which can be mapped onto similar meanings in the interaction
  - e.g. throwing something away: the sound of smashing glass

26 Auditory Icons
- Problem: not all things have associated meanings, e.g. copying
- application
  - Items and actions on the desktop have associated sounds
  - folders have a papery noise
  - moving files is accompanied by a dragging sound
  - copying - a problem: sound of a liquid being poured into a receptacle
    - the rising pitch indicates the progress of the copy?
    - big files have a louder sound than smaller ones?
- Additional information can also be presented:
  - muffled sounds if object is obscured or action is in the background
  - use of stereo allows positional information to be added
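The copying example amounts to mapping properties of the action onto properties of the sound. A minimal sketch follows, assuming a pitch range of 200 to 800 Hz and a 1 GB reference size for full volume. Both numbers, and the function name `copy_sound`, are invented here and do not come from any real system.

```python
def copy_sound(progress, file_size_bytes, obscured=False):
    """Hypothetical parameters for a 'pouring' sound during a file copy.

    progress        -- fraction of the copy completed, 0.0 to 1.0
    file_size_bytes -- size of the file being copied
    obscured        -- True if the copy runs in the background
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be between 0.0 and 1.0")
    low_hz, high_hz = 200.0, 800.0           # assumed pitch range
    pitch = low_hz + progress * (high_hz - low_hz)  # rises as the copy proceeds
    # Bigger files sound louder; clamp to [0.2, 1.0] so small files stay audible.
    volume = min(1.0, max(0.2, file_size_bytes / 1e9))
    return {"pitch_hz": pitch, "volume": volume, "muffled": obscured}


halfway = copy_sound(0.5, 500_000_000)
background = copy_sound(1.0, 1_000_000, obscured=True)
```

Stereo position could be added as a further field in the same way; it is left out to keep the sketch short.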

27

28 Earcons
- Synthetic sounds used to convey information
- Structured combinations of notes (motives) represent actions and objects
- Motives combined to provide rich information
  - compound earcons: multiple motives combined to make one more complicated earcon

29 Earcons
- family earcons
  - similar types of earcons represent similar classes of action or similar objects: the family of “errors” would contain syntax and operating system errors
  - cf. Wagnerian leitmotif
- Earcons easily grouped and refined due to compositional and hierarchical nature
- Harder to associate with the interface task since there is no natural mapping
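The compositional nature of earcons can be shown with plain lists of notes. In this sketch a motive is a short list of (note, duration) pairs, a compound earcon is a concatenation of motives, and a family earcon shares its opening motive with the other members of the family. The note choices and the names `MOTIVES`, `compound` and `family_member` are made up for illustration.

```python
# Hypothetical motives: (note name, duration in beats).
MOTIVES = {
    "create": [("C4", 0.5), ("E4", 0.5)],
    "delete": [("E4", 0.5), ("C4", 0.5)],
    "file":   [("G4", 1.0)],
    "error":  [("C3", 0.25), ("C3", 0.25)],
    "syntax": [("D3", 0.5)],
    "os":     [("F3", 0.5)],
}


def compound(*names):
    """Compound earcon: the named motives played one after another."""
    notes = []
    for name in names:
        notes.extend(MOTIVES[name])
    return notes


def family_member(family, member):
    """Family earcon: the shared family motive followed by the member's own."""
    return compound(family, member)


create_file = compound("create", "file")
syntax_error = family_member("error", "syntax")
os_error = family_member("error", "os")
```

Here `syntax_error` and `os_error` begin with the same two notes, so a listener can recognise "an error" before hearing which kind it is.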

30 Handwriting recognition
- Handwriting is another communication mechanism which we are used to
- Handwriting consists of complex strokes and spaces
- Captured by digitising tablet - strokes transformed to sequence of dots

31 Handwriting recognition
- Large tablets suitable for digitising maps and technical drawings
- Tablet PCs, e.g. Toshiba Portege M200
- Smaller devices incorporating thin screens to display the information, e.g. PDAs such as Palm Pilot [GRAFFITI® 2]
- Recognition problems
  - personal differences in letter formation
  - co-articulation effects
- Some success for systems trained on a few users, with separated letters
- Generic multi-user naturally-written text recognition systems … still some way off!

32 Text and Hypertext
- Text is a common form of output, and very useful in many situations
  - imposes a strict linear progression on the reader
- Hypertext structures blocks of text into a mesh or network that can be traversed in many different ways
  - allows a user to follow their own path through information
  - hypertext systems comprise a number of pages, and links that allow one page to be accessed from another
- example: technical manual for a photocopier
  - all the technical words linked to their definition in a glossary
  - links between similar photocopiers
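Pages and links form a directed graph, which is easy to sketch. The page names below loosely follow the photocopier manual example but are invented, as are the names `LINKS` and `find_path`. The function uses a breadth-first search to find one shortest chain of links from one page to another.

```python
from collections import deque

# Hypothetical hypertext: each page maps to the pages it links to.
LINKS = {
    "manual": ["toner", "paper jam"],
    "toner": ["glossary"],
    "paper jam": ["glossary", "similar model"],
    "glossary": [],
    "similar model": ["manual"],
}


def find_path(links, start, goal):
    """Return a shortest list of pages leading from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in links.get(path[-1], []):
            if nxt not in seen:      # never revisit a page
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


to_glossary = find_path(LINKS, "manual", "glossary")
back_again = find_path(LINKS, "glossary", "manual")
```

In this made-up manual the glossary has no outgoing links, so `back_again` is `None`: a reader who follows a definition has no link leading back.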

33 Hypermedia
- Hypermedia systems are hypertext systems that incorporate additional media, such as illustrations, photographs, video and sound
- Particularly useful for educational purposes
  - animation and graphics allow user to see things happen
  - hypertext structure allows users to explore at their own pace
- Problems:
  - “lost in hyperspace” - users unsure where in the web they are
    - maps of the hypertext are a partial solution
  - incomplete coverage of information
    - some routes through the hypertext miss critical chunks
  - difficult to print out and take away
    - printed documents require a linear structure
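The "incomplete coverage" problem can be made concrete with a small check: given the route a user actually took, which critical pages did they never see? The page names, the `CRITICAL` set and both function names are hypothetical.

```python
# Hypothetical hypertext: each page maps to the pages it links to.
LINKS = {
    "start": ["overview", "quick setup"],
    "overview": ["safety", "quick setup"],
    "quick setup": ["done"],
    "safety": ["done"],
    "done": [],
}

# Pages every reader is assumed to need.
CRITICAL = {"safety"}


def follows_links(links, route):
    """True if each step of the route uses an existing link."""
    return all(b in links.get(a, []) for a, b in zip(route, route[1:]))


def missed_critical(route, critical):
    """Critical pages that the route never visits, in sorted order."""
    return sorted(set(critical) - set(route))


short_route = ["start", "quick setup", "done"]
full_route = ["start", "overview", "safety", "done"]
```

Both routes are valid paths through the links, but `short_route` reaches the end without passing the safety page, which is the kind of gap a linear printed manual would not allow.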

34 Animation
- the addition of motion to images - they change and move in time
- examples:
  - clocks
    - Digital faces - seconds flick past
    - Analogue face - second hand sweeps round constantly
    - Salvador Dali clock - digits warp and melt into each other http://www.lares.dti.ne.jp/~yugo/storage/monocrafts_ver3/03/index.html
  - cursor
    - hourglass/watch/spinning disc indicates the system is busy
    - flashing cursor indicates typing position

35 Animation (cont’d)
- used to indicate temporally-varying information
- Useful in education and training: allows users to see things happening, as well as being interesting and entertaining images in their own right
- example: data visualisation
  - abrupt and smooth changes in multi-dimensional data visualised using animated, coloured surfaces
  - complex molecules and their interactions more easily understood when they are rotated and viewed on the screen

36 Utilising animation and video
- Animation and video are potentially powerful tools
  - notice the success of PC and console games
  - Now outstripping Hollywood feature films! $7.2 billion in 2004
- However, the standard approaches to interface design do not take into account the full possibilities of such media
- We will probably only start to reap the full benefit from this technology when we have much more experience.
- We also need to learn from the masters of this new art form: interface designers will need to acquire the skills of film makers and cartoonists as well as artists and writers.

37 Applications
- Users with special needs have specialised requirements which are often well-served by multimedia and/or multimodal systems
  - visual impairment: screen readers, SonicFinder
  - physical disability: speech input, gesture recognition, predictive systems (e.g. Reactive keyboard)
  - learning disabilities (e.g. dyslexia): speech input, output
- Virtual Reality
  - Multimedia multimodal interaction at its most extreme
  - headsets allow user to “see” the virtual world
  - gesture recognition achieved with DataGlove
  - eyegaze allows users to indicate direction with eyes alone

