Published by Gary Summers. Modified over 9 years ago.
1
1 Spoken Language Support for Software Development
Andrew Begel
Advisor: Susan L. Graham
Computer Science Division, EECS, University of California, Berkeley
2
2 Motivation
Programmers conventionally use a keyboard
– Long hours at the keyboard lead to a higher risk of RSI
Can a programmer code using speech?
Can a computer understand what the developer says?
while (counter < limit) { }
3
3 Programming by Voice: My Goals
1. Find out how developers speak code. Use this to develop a naturally verbalizable input form.
2. Build a development environment that supports verbal authoring, navigation, and modification. Extend conventional compiler analyses to support ambiguities generated by speech.
3. Learn how developers can use voice-based programming, and iterate on the design.
Spoken example: while counter is less than limit do...
4
4 Challenges
Speech is inherently ambiguous.
Programming tools were not designed for ambiguity.
Speech tools are poorly suited for programming tasks.
Programmers are not used to verbal software development.
5
5 Talk Outline
Introduction and Motivation
Programming by Voice
Program Analyses for Ambiguous Inputs
SPEech EDitor Programming Environment
SPEED User Study
Conclusion
6
6 How Do Programmers Speak Code?
10 programmers read Java code out loud (Begel ‘05)
– Graduate students in Computer Science
– Five knew Java, five did not
– Five were native English speakers, five were not
– Five were educated in the U.S.A., five were not
They read pre-written code into a tape recorder
– As if speaking to a sophomore-level CS undergrad who knows Java, but does not know the program
Most programmers spoke the same way
7
7 How Do Programmers Speak Code?
Spoken: for int i equals zero i less than ten i plus plus
Java: for (int i = 0; i < 10; i++ ) { ▌ }
13
13 Spoken Words Can Be Hard To Write Down
Spoken "2": 2, two, to, too
Spoken "print": print, Print
Spoken "drop stack process": drop stack process, drop stackprocess, dropstack process, dropstackprocess
How Do Programmers Speak Code?
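The "drop stack process" case can be made concrete with a small sketch. It is not taken from the talk: the class name `WordJoins` and its method are hypothetical, and the only assumption is that each gap between two spoken words is written either as a space or as nothing. Under that assumption, n words give 2^(n-1) candidate spellings, which is where the four variants on the slide come from.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists every way adjacent spoken words could be run together. */
final class WordJoins {

    // Each of the n-1 gaps between words is either kept as a space or closed up,
    // so the bits of `mask` select which gaps are joined.
    static List<String> spellings(String utterance) {
        String[] words = utterance.trim().split("\\s+");
        int gaps = words.length - 1;
        List<String> out = new ArrayList<>();
        for (int mask = 0; mask < (1 << gaps); mask++) {
            StringBuilder sb = new StringBuilder(words[0]);
            for (int g = 0; g < gaps; g++) {
                boolean joined = (mask & (1 << g)) != 0;
                if (!joined) {
                    sb.append(' ');
                }
                sb.append(words[g + 1]);
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the four written forms of the three-word utterance, one per line.
        for (String s : spellings("drop stack process")) {
            System.out.println(s);
        }
    }
}
```

The count doubles with every extra word, so a recognizer that emits plain word sequences leaves the editor with many identifier spellings to choose from.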
21
21 Many Ways to Say the Same Thing
bar[i]
– bar sub i, bar of i, i from bar
.
– period, dot
}
– right brace, close the if, end method
println
– print line, print lin, print l n
How Do Programmers Speak Code?
25
25 One Utterance May Mean Many Things
Spoken: object stack
– Object stack;
– object.stack
– object(stack)
– object().stack()
Spoken: array sub i plus plus
– array[i]++
– array[i++]
How Do Programmers Speak Code?
29
29 People Have Trouble Saying Some Things
System.out.println
– system out print line
– system dot out print line
– system dot out dot print line
(int)foo
– cast foo to integer
– int foo
– cast something to integer. that something is foo.
How Do Programmers Speak Code?
32
32 Sometimes They Describe the Code
And then there’s a class.
Set all the fields of that object to null.
All of these are just assignment operations.
How Do Programmers Speak Code?
33
33 Design Tradeoffs
Command Language: easy to analyze, but prescriptive
Natural Language: flexible, but ambiguous
Programming by Voice
34
34 Programming by Voice: Related Work
[Chart axes: Human-Centric vs. Computer-Centric; Multiple Tasks vs. Authoring Only]
Works plotted: Arnold ‘00, Snell ‘00, Price ‘00 ‘02, Desilets ‘01 ‘04, Gray ‘03, Begel ‘05
35
35 A More Natural Way to Code
Spoken: public class symbol implements serializable
Java: public class Symbol implements Serializable { ▌ }
36
36 A More Natural Way to Code
Spoken: static hash map hash table gets new hash map
Java: public class Symbol implements Serializable { static HashMap hashtbl = new HashMap(); ▌ }
37
37 A More Natural Way to Code
Spoken: end the class
Java: public class Symbol implements Serializable { static HashMap hashtbl = new HashMap(); } ▌
38
38 A More Natural Way to Code
Spoken: for int i equals zero i less than ten i plus plus
Java: for (int i = 0; i < 10; i++ ) { ▌ }
39
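39 Too Many Ambiguities
Spoken: for int i equals zero i less than ten i plus plus
Intended Java: for (int i = 0; i < 10; i++ ) { ▌ }
Recognized as: 4 int eye equals 0 aye less then ten i plus plus
Questions for the analysis: keyword (KW) or number (#)? Spelling of identifier (ID)? Keyword or identifier?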
40
40 Sometimes It’s Non-Obvious
Spoken: for times equals 8 file 2 load times equals one
Possible interpretations:
– for (times = 8; file(2, load); times == one) { ▌ }
– fore *= 8; file.tooLode.times = won ▌
– 4; times = ate(file).to(load).equals(1) ▌
41
41 Spoken Java
Semantically identical to Java
Syntactically easier to say than Java
– Methodology generalizable to any computer language
1. All punctuation has English equivalents: Open Brace, End For Loop
2. Most punctuation is optional
3. Provide verbalization for all abbreviations
4. Relaxed phrasing for better fit with English
– (int)foo: “cast foo to integer”
– foo = 6: “set foo to 6”
– foo[i]++: “increment the ith element of array foo”
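To illustrate rule 1, the following is a minimal sketch of replacing spoken phrases with Java symbols. It is an illustration only, not the Spoken Java implementation: the class `SpokenSymbols` and its phrase table are hypothetical, the phrases are the ones that appear in the slides' examples ("gets", "equals", "less than", "plus plus", "dot", "open brace", "right brace"), and the matching is a naive "prefer the two-word phrase" rule that ignores every ambiguity the talk is about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: replaces spoken phrases with the Java symbols they name. */
final class SpokenSymbols {

    // Toy phrase table assembled from examples in the slides; not a complete grammar.
    private static final Map<String, String> PHRASES = Map.of(
            "plus plus", "++",
            "less than", "<",
            "open brace", "{",
            "right brace", "}",
            "equals", "=",
            "gets", "=",
            "dot", ".",
            "zero", "0",
            "ten", "10");

    static List<String> translate(String utterance) {
        String[] words = utterance.trim().toLowerCase(Locale.ROOT).split("\\s+");
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < words.length) {
            // Prefer a two-word phrase such as "plus plus" over its single words.
            if (i + 1 < words.length) {
                String symbol = PHRASES.get(words[i] + " " + words[i + 1]);
                if (symbol != null) {
                    out.add(symbol);
                    i += 2;
                    continue;
                }
            }
            // Otherwise translate the single word, or pass it through unchanged.
            out.add(PHRASES.getOrDefault(words[i], words[i]));
            i++;
        }
        return out;
    }

    public static void main(String[] args) {
        String spoken = "for int i equals zero i less than ten i plus plus";
        System.out.println(String.join(" ", translate(spoken)));
    }
}
```

The result still lacks the parentheses and semicolons of the `for` header. That matches rule 2 (most punctuation is optional): restoring structure is left to the parser, not to word substitution.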
42
42 SPEED: Speech Editor
Build an editor that supports naturally verbalized programs
SPEED: SPEech EDitor
Based on IBM ViaVoice, Eclipse IDE, Harmonia
– Spoken Java language for composition
– Spoken command language for navigation, editing, template instantiation, refactorings, search
– Audible and visual feedback
Similar to JavaSpeak (Smith 2000)
43
43 Harmonia Analysis Framework
Framework to support interactive editors
– Language-based, programmer-oriented tools
Incremental analyses
– Lexing (Wagner ‘97), GLR Parsing (Wagner ‘97, Begel ‘04), Static Semantics (Garrison ‘87, Begel, Jamison)
C, Java, Titanium, Cool, Flex, Bison
– Also, languages where indentation and CRs are significant
Interactive Program Transformations (Boshernitsan)
CodeLink (Toomim et al. ‘04)
Shorthand Editing
45
45 Traditional Compiler Analyses
Stages: Lexical Analysis, Parsing, Semantic Analysis
Input: for (i = 0; i < 10; i++ ) { }
Programming languages are designed to be unambiguous
[Diagram labels: tokens FOR, I; parse tree For Loop, FOR, Assign Expr, I=0; semantics i, Local Var, int]
46
46 Ambiguity-Aware Analyses
Spoken input: for i equals zero...
Handles input stream, syntactic and semantic ambiguities
Stages: Lexical Analysis, Ambiguous Parsing, Semantic Ambiguity Resolution
[Diagram labels: first reading FOR, I, For Loop, Assign Expr, I=0, i, Local Var, int; second reading four eye, 4, EYE, FOUR, EYE, Assign Expr, =0, Local Var ?; both under Ambig Stmt]
47
47 Scan Input Stream Commercial Speech Recognizer Homophone Dictionary Lexical Analysis
48
48 Homophones Cause Ambiguities
Spoken: for i equals
– "for" may be written: for, 4, four, fore
– "i" may be written: i, eye, aye
– "equals" may be written: equals, =, ==
Concatenated words cause them too: foriequals, foreeye, ayeequals, foureyeequals
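A rough sketch of what a homophone dictionary implies for the lexer: every heard word expands to several spellings, and the candidate token sequences multiply. The class `HomophoneExpander` and its three-entry dictionary are hypothetical and use only the spellings listed on the slide. The talk does not say how SPEED stores these alternatives, so plain enumeration is used here purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: expands heard words into every homophone spelling. */
final class HomophoneExpander {

    // Toy dictionary built from the slide's example; a real one would be far larger.
    static final Map<String, List<String>> HOMOPHONES = Map.of(
            "for", List.of("for", "4", "four", "fore"),
            "i", List.of("i", "eye", "aye"),
            "equals", List.of("=", "==", "equals"));

    // Builds the Cartesian product of the alternatives for each heard word.
    static List<List<String>> expand(List<String> heard) {
        List<List<String>> results = new ArrayList<>();
        results.add(new ArrayList<>());
        for (String word : heard) {
            List<String> options = HOMOPHONES.getOrDefault(word, List.of(word));
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : results) {
                for (String option : options) {
                    List<String> extended = new ArrayList<>(prefix);
                    extended.add(option);
                    next.add(extended);
                }
            }
            results = next;
        }
        return results;
    }

    public static void main(String[] args) {
        List<List<String>> all = expand(List.of("for", "i", "equals"));
        // 4 spellings of "for" times 3 of "i" times 3 of "equals".
        System.out.println(all.size() + " candidate token sequences");
    }
}
```

Three words already yield 36 sequences, before concatenations are considered. Listing every sequence separately does not scale, which is one way to read the motivation for a parser that accepts ambiguous input directly.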
49
49 Ambiguity-Aware Analyses
Spoken input: for i equals zero...
Stages: Lexical Analysis, XGLR Ambiguous Parsing, Semantic Ambiguity Resolution
[Diagram labels: first reading FOR, I, For Loop, Assign Expr, I=0, i, Local Var, int; second reading four eye, 4, EYE, FOUR, EYE, Assign Expr, =0, Local Var ?; both under Ambig Stmt]
50-67
50-67 XGLR Parsing [Begel 04]
[Animated figure spanning slides 50 to 67: a step-by-step XGLR parse of the input tokens IF, X, <, FIFTY, FIVE. IF is lexed as a keyword (KW) and < as an operator (Op). FIFTY FIVE is lexically ambiguous between identifiers (ID) and numbers (#): FIFTY 5, 50 FIVE, 50 5, 55. The parser forks for each alternative, tries implicit punctuation such as "(" and "." between the words, and discards parses that fail. The final frame shows Expr and FuncCall nodes over the surviving parses.]
68
68 XGLR Summary
Generalization of traditional GLR algorithm
– Forks on structural and lexical ambiguity
– Preserves subtree sharing when parses have different yields
– Retains efficiency when parses get out of sync
Determine parse position w.r.t. ambiguous input
Blender: Combined lexer and parser generator for XGLR
69
69 GLR Parsing Genealogy
Tomita 1985; Farshi 1991; Rekers 1992; Wagner 1997; Visser 1997; van den Brand 2002; Johnstone, Scott 2002; Begel 2004
[Branch labels in the figure: Incremental, Scannerless, Input Stream Ambiguities]
71
71 Disambiguation Example
class Loader {
  public void load() {
    String filetoload = null;
    InputStream stream = getStream();
    ...
    ▌
  }
Spoken: file to load equals stream dot read string
Java: filetoload = stream.readString();
72
72 Many Interpretations file(2, lowed) file(to, load) file(to.lode) file(to(lode)) file(toload) (file, 2, load) file.to.lowed file.to(load) file.toload filetoload() filetoload
73
73 Incremental Semantics
What does this name mean?
What names are visible at this program point?
– Or, what can I say here?
Visibility Graph [Garrison 1987]
– Incrementally updated data structure for scopes, names and bindings
– Designed Visibility Graph algorithms for name propagation and incremental update
– Used for type checking, too
Doesn’t do this?
75
75 Program Context Can Help
class Loader {
  public void load() {
    String filetoload = null;
    InputStream stream = getStream();
    ...
    ▌
  }
class Loader scope: [ load, Method, () void ]
method load scope: [ filetoload, LocalVar, String ] [ stream, LocalVar, InputStream ] [ load, Method, () void ]
76
76 Semantic Disambiguation
Candidate interpretations: file(2, lowed) file(to, load) file(to.lode) file(to(lode)) file(toload) (file, 2, load) file.to.lowed file.to(load) file.toload filetoload() filetoload
class Loader scope: [ load, Method, () void ]
method load scope: [ filetoload, LocalVar, String ] [ stream, LocalVar, InputStream ] [ load, Method, () void ]
77
77 Semantic Disambiguation
Is “file” a visible variable name?
78
78 Semantic Disambiguation
Remaining: file(2, lowed) file(to, load) file(to.lode) file(to(lode)) file(toload) filetoload() filetoload
Is “file” a visible method name?
79
79 Semantic Disambiguation
Remaining: filetoload() filetoload
Is “filetoload” a visible method name?
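The pruning steps above can be sketched as a filter over candidate interpretations. This is a simplified illustration, not the Visibility Graph from the talk: the class `ScopeFilter`, its `Candidate` type, and the flat name-to-kind map are all hypothetical, and only the leading name of each candidate is checked. The scope contents and candidate texts are taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: keeps only interpretations whose leading name is visible with the right kind. */
final class ScopeFilter {

    enum Kind { METHOD, LOCAL_VAR }

    /** One reading of the utterance: its text, its leading name, and the kind that name must have. */
    static final class Candidate {
        final String text;
        final String headName;
        final Kind requiredKind;

        Candidate(String text, String headName, Kind requiredKind) {
            this.text = text;
            this.headName = headName;
            this.requiredKind = requiredKind;
        }
    }

    // Names visible at the cursor inside load(), flattened from the slide's two scopes.
    static Map<String, Kind> sampleScope() {
        return Map.of(
                "load", Kind.METHOD,
                "filetoload", Kind.LOCAL_VAR,
                "stream", Kind.LOCAL_VAR);
    }

    // A few of the slide's interpretations of "file to load".
    static List<Candidate> sampleCandidates() {
        return List.of(
                new Candidate("file(2, lowed)", "file", Kind.METHOD),
                new Candidate("file.to(load)", "file", Kind.LOCAL_VAR),
                new Candidate("filetoload()", "filetoload", Kind.METHOD),
                new Candidate("filetoload", "filetoload", Kind.LOCAL_VAR));
    }

    // Asks "is this name visible as this kind?" for each candidate and drops the failures.
    static List<Candidate> resolve(List<Candidate> candidates, Map<String, Kind> visible) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate c : candidates) {
            if (visible.get(c.headName) == c.requiredKind) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        for (Candidate c : resolve(sampleCandidates(), sampleScope())) {
            System.out.println(c.text);
        }
    }
}
```

With the sample scope, the readings that start with `file` fail because no such variable or method is visible, `filetoload()` fails because `filetoload` is not a method, and the local variable `filetoload` is the only reading left. A fuller version would also check argument names and types.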
80
80 Manual Disambiguation
Some ambiguities cannot (and should not) be automatically resolved:
– print(“line”) vs. println()
– if (pred1) then if (pred2) then foo() else bar()
If ambiguities remain, ask the user how to resolve them (e.g. [Mankoff 00]).
if foo() if bar() if foo() bar()
82
82 SPEED Editor
83
83 Speech Editing Model Code Template Insertion Toggle Microphone
84
84 Speech Editing Model Choose From Alternatives
85
85 Speech Editing Model
86
86 Speech Editing Model
87
87 Context-Sensitive Mouse Grid
88
88 What Can I Say/Type?
89
89 Cache Pad
91
91 Study: SPEED Usability
Goal: Understand how SPEED can be used by expert programmers
Hypothesis: SPEED is learnable and usable for standard programming tasks
1. Train 5 expert Java programmers on SPEED
2. Create and modify code
– Build a Linked List data structure with associated algorithms
3 programmers used a commercial speech recognizer
2 programmers used a human speech recognizer
92
92 Metrics
Number of Commands/Dictations
Features Used
– Code Templates, Dictation, Navigation, Editing, Fixing Mistakes
Quantity and Kinds of Mistakes
– Speech Recognition, SPEED, User
93
93 Results
Accuracy of commercial speech recognizers was horrible (25-50%).
Human SR was much better (10-20%).
Recognition delay was equal for both recognizers (0.5-0.75 sec).
94
94 Results
Commands were easy to learn and remember.
– Very few user mistakes
Most commands spoken for code templates and editing.
– GOMS analysis predicts speech will be slower until you can get a lot of text for each utterance
Speakers were apprehensive about speaking code instead of describing it via code templates.
95
95 Study Conclusion
SPEED is learnable in a short amount of time
Programming-by-voice is slower than typing
– Programmers would not want to use it until they had to
Programmers believed they would be efficient enough using SPEED to remain in software engineering jobs
97
97 Contributions
1. A study of programmers to understand and design a naturally verbalizable input for programming
2. An interactive editor designed for spoken interaction
3. The use of syntax and semantics of programming for disambiguation
– Enhanced lexical, syntactic, semantic analyses for support of verbal ambiguities
4. Evaluation of design and tools by studying programmers using voice for software development
98
98 SPEED Next Steps
Add more code templates
– Enable users to write their own
Add “Jump To ”
Find new ways to edit strings by voice
Integrate speech recognition with other IDE features
– GUI, code completion, debugger
99
99 Further SPEED Studies
Develop methodology for short-term voice recognition studies
Find out why programmers felt code dictation was weird
Evaluate more complex code editing operations by voice
Evaluate context-sensitive mouse grid usability
100
100 Future of Programming by Voice
1. Improved automation of semantic disambiguation
– Use ideas from NLP, Machine Learning (team styles)
2. Early pruning of ambiguities using analysis feedback
3. Higher-level linguistic programming tools
– Transformations, Paraphrasing
– Phonetic search, Audible feedback
4. Support more software engineering tasks by voice
– Debuggers, IDEs, Comments, Code reviews
5. Design spoken variants of other formal languages
– General (C, C#), Scripting (PL, OS), Design (HCI), Command (Robotics), Domain-specific languages (SQL)
101
101 Any Questions? Andrew Begel: abegel@cs.berkeley.edu