Presentation is loading. Please wait.

Presentation is loading. Please wait.

Ink Analysis Richard Anderson CSE 481b Winter 2007.

Similar presentations


Presentation on theme: "Ink Analysis Richard Anderson CSE 481b Winter 2007."— Presentation transcript:

1 Ink Analysis Richard Anderson CSE 481b Winter 2007

2 Announcements VisualStudio Express Free, Hobbyist version of VS 2005 Presentations, Tuesday Jan 23 15 minute presentation + 3 minutes discussion PowerPoint slides Group order: A, B, C, D

3 Today’s lecture Handwriting Recognition Structure Recognition Classification Annotation JNT Note Format

4 Ink Analysis for Search Output Mapping of search results to source Reflect underlying structure Handle different types of search queries Raw text Boolean Typed queries (“481 as a course number”) Object queries (Course numbers) Environment (List, Prose, Mathematics,... )

5 Ink Analysis Pipeline Filter Structure Recognize Classify Annotate

6 Handwriting Recognition: Identify the following words

7 Recognition results

8 Recognizer Architecture 88868226357 4446157 23 92 31 5194720 711252879 13 53 18 79 28576 … … … 13 81 82143 17 5743 90 7 16 57 914415 Output Matrix dog68 clog57 dug51 doom42 divvy37 ooze35 cloy34 doxy29 client22 dozy13 Ink Segments Top 10 List d 00 a 00 b 00 c 00 o 09 a 73 l 07 t 5 g 68 t 8 b 6 o 12 g 57 t 12 TDNN a b d o g a b t t c l o g t Lexicon e a … … … … … Beam Search a b d e g h n o 4 5 3 90 12 4 14 7 Slide from Jay Pittman, Microsoft

9 Recognizer Training Collect large set of training data Samples of known inputs that can be used to set “weights” in reco engine Needed to build a recognizer Dictionary Language samples Commercial recognizers based on massive data sets

10 Tablet PC Recognition API Basic idea: Ink In, Text Out

11 Recognition Code I private Recognizers recognizers; private Recognizer recognizer; public Form1() { InitializeComponent(); this.inkCollector = new InkCollector(this.inkPanel.Handle); this.inkCollector.Enabled = true; this.recognizers = new Recognizers(); this.recognizer = recognizers.GetDefaultRecognizer(); }

12 Recognition Code II private void OnRecoClick(object sender, EventArgs e) { RecognizerContext recoContext = this.recognizer.CreateRecognizerContext(); recoContext.Factoid = GetFactoid(); recoContext.Strokes = this.inkCollector.Ink.Strokes; recoContext.EndInkInput(); RecognitionStatus recoStatus; RecognitionResult recoResult = recoContext.Recognize(out recoStatus); if (recoStatus != RecognitionStatus.NoError) return; string result = recoResult.TopString; RecognitionAlternate topAlt = recoResult.TopAlternate;

13 Factoids Bias the recognizer towards certain types of content DEFAULT CURRENCY NUMBER TELEPHONE EMAIL UPPERCHAR

14 Reading Journal Notes Journal Reader to import.JNT.JNT -> XML -> Custom Format Journal format gives an initial parsing You may want to undo this parsing and work with ink at the page level

15 JNT Format Journal Document List of Journal Pages Journal Page List of Content Content Journal Drawing, Journal Paragraph, other stuff

16 Journal Drawing Uninterpreted Ink Base64String if (childNode.Name.ToLower().Equals("inkobject")) { string base64Ink = childNode.InnerText; ink = new Ink(); ink.Load(Convert.FromBase64String(base64Ink)); }

17 Text Structure JournalParagraph List of JournalLines JournalLine List of JournalInkWords JournalInkWord Alternate List Uninterpreted Ink

18 Shape recognition Surprisingly challenging because of drawing artifacts Open figures Multiple strokes Imprecise corners Arrows

19 Structure recognition

20 Basic approach for structure recognition Grouping by rectangular region Heuristics for separating regions White space Separating lines

21 General approach to recognition/classification Extraction of features Objects become points in high dimensional space Construct mapping from features to classes

22 Clustering

23 Learning

24 Heuristic Programmatically determine classification based on features

25 Classification Identify different types of text Mathematics Prose Lists Brainstorming Code Domains Chemistry, Physics, Algorithms,

26 Classification

27 Annotation Identify annotation marks Highlighted text Circles Check marks Cross out

28 Annotation


Download ppt "Ink Analysis Richard Anderson CSE 481b Winter 2007."

Similar presentations


Ads by Google