Download presentation
Presentation is loading. Please wait.
1
Motivation Where is my W-2 Form?
2
Video-based Tracking Camera view of the desk Camera Overhead video camera
3
Example 40 minutes, 1024x768 @ 15 fps
4
System Overview User Input Video
5
System Overview User Internal Representation Video Analysis Vision Engine Desk TT+1 Input Video
6
Query Interface System Overview Query (where is my W-2 form?) Internal Representation Input Video User Video Analysis Desk TT+1
7
System Overview Query (where is my W-2 form?) Internal Representation Input Video User Video Analysis Answer Query Interface Desk TT+1
8
System Overview User Internal Representation Video Analysis Input Video Vision Engine Desk TT+1
9
Vision Problem … …
10
Event … …
11
Vision Problem … Event … … Desk
12
Vision Problem Desk … … Event … … Desk
13
Vision Problem … … tut-article.pdf sanders01.pdf objectspaces.pdfkidd94.pdf lowe04sift.pdf Event … … Desk
14
Vision Problem … … Event Scene Graph (DAG) … … tut-article.pdf sanders01.pdf objectspaces.pdfkidd94.pdf lowe04sift.pdf Desk
15
Assumptions Simplifying –Corresponding electronic copy exists
16
Assumptions Simplifying –Corresponding electronic copy exists –3 event types: move/entry/exit –One document at a time –Only topmost document can move –No duplicate copies of same document
17
Assumptions Simplifying –Corresponding electronic copy exists –3 event types: move/entry/exit –One document at a time –Only topmost document can move –No duplicate copies of same document Other –Desk need not be initially empty
18
Algorithm Overview Input Frames … … Event Detection Event Interpretation Document Recognition Scene Graph Update
19
Algorithm Overview Input Frames … … Event Detection Event Interpretation Document Recognition beforeafter Scene Graph Update
20
Algorithm Overview Input Frames … … Event Detection Event Interpretation “A document moved from (x 1,y 1 ) to (x 2,y 2 )” Document Recognition beforeafter Scene Graph Update
21
Algorithm Overview Input Frames … … Event Detection Event Interpretation “A document moved from (x 1,y 1 ) to (x 2,y 2 )” Document Recognition beforeafter File1.pdf File2.pdf File3.pdf Scene Graph Update
22
Algorithm Overview Input Frames … … Event Detection Event Interpretation “A document moved from (x 1,y 1 ) to (x 2,y 2 )” Document Recognition beforeafter File1.pdf File2.pdf File3.pdf Scene Graph Update
23
Algorithm Overview Input Frames … … Event Detection Event Interpretation “A document moved from (x 1,y 1 ) to (x 2,y 2 )” Document Recognition beforeafter File1.pdf File2.pdf File3.pdf Scene Graph Update
24
Event Detection … …
25
time Frame Difference … …
26
Event Detection time Threshold Event Frames time … … Frame Difference
27
Event Detection beforeafter Event Frames … …
28
Algorithm Overview Input Frames … … Event Detection Event Interpretation “A document moved from (x 1,y 1 ) to (x 2,y 2 )” Document Recognition beforeafter File1.pdf File2.pdf File3.pdf Scene Graph Update
29
Event Interpretation Move beforeafter
30
Event Interpretation Move Entry beforeafter
31
Event Interpretation Move Entry Exit beforeafter
32
Event Interpretation Move Entry Exit Motion: (x,y,θ) beforeafter
33
Event Interpretation Move Entry Exit 1. Move vs. Entry/Exit beforeafter
34
Event Interpretation Move Entry Exit 2. Entry vs. Exit beforeafter
35
Event Interpretation Use SIFT [Lowe 99] –Scale Invariant Feature Transform –Distinctive feature descriptor –Reliable object recognition
36
Move vs. Entry/Exit before after
37
Move vs. Entry/Exit before after
38
Move vs. Entry/Exit before after
39
Move vs. Entry/Exit before after
40
Move vs. Entry/Exit before after
41
Move vs. Entry/Exit before after
42
Move vs. Entry/Exit before after
43
Move vs. Entry/Exit before after
44
Move vs. Entry/Exit before after
45
Move vs. Entry/Exit before after
46
Move vs. Entry/Exit before after
47
Move vs. Entry/Exit before after Motion: (x,y,θ)
48
Entry vs. Exit beforeafter Example 1 (entry)
49
Entry vs. Exit beforeafter Example 1 (entry)
50
Entry vs. Exit beforeafter Example 1 (entry)
51
Entry vs. Exit beforeafter … File1.pdf File2.pdfFile3.pdfFile4.pdfFile5.pdfFile6.pdf …
52
Entry vs. Exit beforeafter Example 2 (entry)
53
Entry vs. Exit beforeafter … File1.pdf File2.pdfFile3.pdfFile4.pdfFile5.pdfFile6.pdf …
54
Entry vs. Exit beforeafter
55
Entry vs. Exit beforeafter ?
56
Entry vs. Exit beforeafter ?
57
Entry vs. Exit beforeafter ……
58
Entry vs. Exit beforeafter …… Amount of change time
59
Algorithm Overview Input Frames … … Event Detection Event Interpretation “A document moved from (x 1,y 1 ) to (x 2,y 2 )” Document Recognition beforeafter File1.pdf File2.pdf File3.pdf Scene Graph Update
60
Document Recognition … File1.pdf File2.pdfFile3.pdfFile4.pdfFile5.pdfFile6.pdf Match against PDF image database …
61
Document Recognition … File1.pdf File2.pdfFile3.pdfFile4.pdfFile5.pdfFile6.pdf Match against PDF image database Performance –Can differentiate between ~100 (or more) documents –~200x300 pixels per document for reliable match …
62
Algorithm Overview Input Frames … … Event Detection Event Interpretation “A document moved from (x 1,y 1 ) to (x 2,y 2 )” Document Recognition beforeafter File1.pdf File2.pdf File3.pdf Scene Graph Update
63
before after Motion: (x,y,θ) Desk
64
Scene Graph Update before after Motion: (x,y,θ) Desk
65
Scene Graph Update before after Motion: (x,y,θ) Desk
66
Scene Graph Update before after Motion: (x,y,θ) Desk ?
67
Scene Graph Update before after Motion: (x,y,θ) Desk
68
Scene Graph Update Desk …
69
Scene Graph Update Desk …
70
Photo Sorting Example
71
Current Directions Handle more realistic desktops Speed up processing time Other useful functionalities –Written annotation –Version management –Bookmark –Multi-user queries
72
For More Information Our publications –Jiwon Kim, Steven M. Seitz, Maneesh Agrawala. The Office of the Past. IEEE Workshop on Real-time Vision for HCI, 2004. –Jiwon Kim, Steven M. Seitz, Maneesh Agrawala. Video-based Document Tracking: Unifying Your Physical and Electronic Desktops. To appear in Proceedings of UIST, 2004. Other related work –David G. Lowe. Distinctive image features from scale invariant keypoints. International Journal of Computer Vision, 2004. –Pierre Wellner. Interacting with paper on the DigitalDesk. Communications of the ACM, 36(7):86.97,1993. –D. Rus and P. deSantis. The self-organizing desk. In Proceedings of International Joint Conference on Artificial Intelligence, 1997.
73
For More Information Our publications –Jiwon Kim, Steven M. Seitz, Maneesh Agrawala. The Office of the Past. IEEE Workshop on Real-time Vision for HCI, 2004. –Jiwon Kim, Steven M. Seitz, Maneesh Agrawala. Video-based Document Tracking: Unifying Your Physical and Electronic Desktops. To appear in Proceedings of UIST, 2004. Other related work –David G. Lowe. Object recognition from local scale-invariant features. International Conference on Computer Vision, 1999. –Pierre Wellner. Interacting with paper on the DigitalDesk. Communications of the ACM, 36(7):86.97,1993. –D. Rus and P. deSantis. The self-organizing desk. In Proceedings of International Joint Conference on Artificial Intelligence, 1997.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.