Download presentation
Presentation is loading. Please wait.
Published byJasmin Cameron Modified over 8 years ago
1
Ink Analysis Richard Anderson CSE 481b Winter 2007
2
Announcements VisualStudio Express Free, Hobbyist version of VS 2005 Presentations, Tuesday Jan 23 15 minute presentation + 3 minutes discussion PowerPoint slides Group order: A, B, C, D
3
Today’s lecture Handwriting Recognition Structure Recognition Classification Annotation JNT Note Format
4
Ink Analysis for Search Output Mapping of search results to source Reflect underlying structure Handle different types of search queries Raw text Boolean Typed queries (“481 as a course number”) Object queries (Course numbers) Environment (List, Prose, Mathematics,... )
5
Ink Analysis Pipeline Filter Structure Recognize Classify Annotate
6
Handwriting Recognition: Identify the following words
7
Recognition results
8
Recognizer Architecture 88868226357 4446157 23 92 31 5194720 711252879 13 53 18 79 28576 … … … 13 81 82143 17 5743 90 7 16 57 914415 Output Matrix dog68 clog57 dug51 doom42 divvy37 ooze35 cloy34 doxy29 client22 dozy13 Ink Segments Top 10 List d 00 a 00 b 00 c 00 o 09 a 73 l 07 t 5 g 68 t 8 b 6 o 12 g 57 t 12 TDNN a b d o g a b t t c l o g t Lexicon e a … … … … … Beam Search a b d e g h n o 4 5 3 90 12 4 14 7 Slide from Jay Pittman, Microsoft
9
Recognizer Training Collect large set of training data Samples of known inputs that can be used to set “weights” in reco engine Needed to build a recognizer Dictionary Language samples Commercial recognizers based on massive data sets
10
Tablet PC Recognition API Basic idea: Ink In, Text Out
11
Recognition Code I private Recognizers recognizers; private Recognizer recognizer; public Form1() { InitializeComponent(); this.inkCollector = new InkCollector(this.inkPanel.Handle); this.inkCollector.Enabled = true; this.recognizers = new Recognizers(); this.recognizer = recognizers.GetDefaultRecognizer(); }
12
Recognition Code II private void OnRecoClick(object sender, EventArgs e) { RecognizerContext recoContext = this.recognizer.CreateRecognizerContext(); recoContext.Factoid = GetFactoid(); recoContext.Strokes = this.inkCollector.Ink.Strokes; recoContext.EndInkInput(); RecognitionStatus recoStatus; RecognitionResult recoResult = recoContext.Recognize(out recoStatus); if (recoStatus != RecognitionStatus.NoError) return; string result = recoResult.TopString; RecognitionAlternate topAlt = recoResult.TopAlternate;
13
Factoids Bias the recognizer towards certain types of content DEFAULT CURRENCY NUMBER TELEPHONE EMAIL UPPERCHAR
14
Reading Journal Notes Journal Reader to import.JNT.JNT -> XML -> Custom Format Journal format gives an initial parsing You may want to undo this parsing and work with ink at the page level
15
JNT Format Journal Document List of Journal Pages Journal Page List of Content Content Journal Drawing, Journal Paragraph, other stuff
16
Journal Drawing Uninterpreted Ink Base64String if (childNode.Name.ToLower().Equals("inkobject")) { string base64Ink = childNode.InnerText; ink = new Ink(); ink.Load(Convert.FromBase64String(base64Ink)); }
17
Text Structure JournalParagraph List of JournalLines JournalLine List of JournalInkWords JournalInkWord Alternate List Uninterpreted Ink
18
Shape recognition Surprisingly challenging because of drawing artifacts Open figures Multiple strokes Imprecise corners Arrows
19
Structure recognition
20
Basic approach for structure recognition Grouping by rectangular region Heuristics for separating regions White space Separating lines
21
General approach to recognition/classification Extraction of features Objects become points in high dimensional space Construct mapping from features to classes
22
Clustering
23
Learning
24
Heuristic Programmatically determine classification based on features
25
Classification Identify different types of text Mathematics Prose Lists Brainstorming Code Domains Chemistry, Physics, Algorithms,
26
Classification
27
Annotation Identify annotation marks Highlighted text Circles Check marks Cross out
28
Annotation
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.