1
OverLay: Practical Mobile Augmented Reality
Puneet Jain (Duke University/UIUC), Justin Manweiler (IBM Research), Romit Roy Choudhury (UIUC)
Hello everyone, good morning! It is my great pleasure to speak here at XXX about my work. My name is Puneet Jain. Thank you for inviting me and for attending my talk. I look forward to meeting many of you later today.
2
Idea: Mobile Augmented Reality
- Allow tagging of arbitrary indoor objects
- Others should be able to retrieve the tags
Example tags: "Last year's tax statements", "Faulty Monitor", "Wish Mom Birthday", "Return CDs"
Mobile AR refers to one's ability to scan the surroundings through a smartphone's camera and see, on the phone's screen, virtual information associated with an object. The information could be anything from text annotations and web URLs to audio or video. This vision is certainly not new and has existed for a while. Many designers, science fiction writers, and researchers have imagined applications that could exist if it were realized as a holistic system.
3
Introduction
Going forward, I would like to set expectations for Mobile AR. I have a video demonstration of an AR system that helps in understanding our objective better.
4
Why not a solved problem?
Need to understand today's approaches: Vision and Sensing
Both are necessary, but neither is sufficient
The obvious question is: why? To answer that, we need to look at the current generation of approaches. Mobile AR is currently done in two ways: vision (image) based AR or sensing based AR. Both approaches are necessary, but neither is sufficient.
5
Accurate Algorithms are Slow
Vision: Feature Extraction, Feature Matching
Note that accuracy matters most in Mobile AR. Unlike Google search, where any match similar to a given image is OK, mobile AR requires an exact match. Also, no two similar-looking things are the same: one exit sign is different from another exit sign in the same building, since they can indicate different things.
6
Matching latency too high for real-time
Vision: Offloading + GPU on Cloud
For a 100-image DB:
- Extraction ≈ 29 ms
- Network ≈ 302 ms
- Matching ≈ 1 s
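The three numbers on this slide can be added up into a rough per-frame budget. The sketch below assumes the stages run strictly one after another for each frame (the slide does not say whether they are pipelined); the dictionary and function names are made up for illustration.

```python
# Per-frame latency figures from the slide (100-image database, GPU on cloud).
# Assumption: extraction, network transfer, and matching happen back to back.
STAGE_LATENCY_MS = {
    "extraction": 29,    # feature extraction
    "network": 302,      # network transfer
    "matching": 1000,    # matching against 100 images (about 1 s)
}


def end_to_end_latency_ms(stages):
    """Sum the per-stage latencies, assuming no overlap between stages."""
    return sum(stages.values())


def frames_per_second(latency_ms):
    """Throughput if each frame must finish before the next one starts."""
    return 1000.0 / latency_ms


total_ms = end_to_end_latency_ms(STAGE_LATENCY_MS)
matching_share = STAGE_LATENCY_MS["matching"] / total_ms

print(f"total latency: {total_ms} ms")
print(f"throughput: {frames_per_second(total_ms):.2f} frames/s")
print(f"matching share of total: {matching_share:.0%}")
```

Under that assumption a frame takes roughly 1.3 s end to end, and matching accounts for about three quarters of it, which is why the rest of the talk concentrates on matching.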
7
Sensing: Not possible indoors
Example: Brunelleschi's dome
- Requires User Location
- Requires Precise Orientation
- Requires Object Location
Talk about how new objects would be added and how inaccuracies in sensing quickly derail this.
8
Vision: Accurate/Slow. Sensing: Quick/Inaccurate.
Indoor Location
- Can accelerate Vision
- Prerequisite for Sensing
But indoor localization is not always available
Clearly there are tradeoffs between accuracy and latency, between sensing and computer vision, and between offloading and not offloading. That is the primary agenda of today's talk.
9
Location-free AR
Natural pauses, turns, and walks indicate spatial relationships between tags
[Figure: a walk past tags A, B, C, and D, with segments labeled 7 seconds, 5 seconds, and 10 seconds and turns labeled 80° and 110°]
Sensors can help in building such geometric layouts
Geometry, instead of location, can be used to reduce the computational burden on vision
Let's look at how people would use this in a museum scenario. A user walks through the museum and looks at paintings on the way. A few natural usage patterns emerge here, possibly indicating the separation between tags.
10
Primary Challenge: Matching Latency
Temporal Relationships Rotational Relationships
11
Temporal Relationships
[Figure: tags A to E along a walk; T=0, saw A; T=7, saw B; T=15, saw C; T=21, saw E]
Temporal separations can be captured on the cloud:
$T_{AB} \le 7 + E_{T_{AB}}$
$T_{AB} \ge 7 - E_{T_{AB}}$
$T_{AC} \le 15 + E_{T_{AC}}$
$T_{AC} \ge 15 - E_{T_{AC}}$
$E_{T_{AB}}, E_{T_{AC}}, T_{AB}, T_{AC} \ge 0$
When the phone is moving toward C, C can be prioritized for matching. Similarly, when the phone turns away from D, D can be removed from the candidate set.
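The observed separations that feed these constraints (7 s between A and B, 15 s between A and C) can be derived from a log of sightings. Here is a minimal sketch, assuming the cloud receives a list of (tag, timestamp) pairs per walk; the function name and data layout are hypothetical, and the timestamps are the ones in the figure.

```python
from itertools import combinations


def temporal_separations(sightings):
    """Return {(earlier_tag, later_tag): seconds} for every pair of sightings.

    `sightings` is a list of (tag, timestamp_seconds) tuples from one walk.
    """
    ordered = sorted(sightings, key=lambda s: s[1])
    return {
        (tag_a, tag_b): t_b - t_a
        for (tag_a, t_a), (tag_b, t_b) in combinations(ordered, 2)
    }


# Sightings from the figure: T=0 saw A, T=7 saw B, T=15 saw C, T=21 saw E.
walk = [("A", 0), ("B", 7), ("C", 15), ("E", 21)]
separations = temporal_separations(walk)

for pair, seconds in separations.items():
    print(pair, seconds)
```

Each entry is one observation of a pairwise separation; the entries for (A, B) and (A, C) are the constants 7 and 15 that appear in the inequalities.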
12
Solving for Typical Time
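The transcript does not preserve how the typical time is solved for, so the following is only a sketch under stated assumptions: every observed separation t_i gets its own slack E_i with |T - t_i| <= E_i and E_i >= 0, and the objective is to minimize the total slack. For that particular linear program, a median of the observations is an optimal T, so no solver is needed. The function name and the extra observations (9 and 6) are hypothetical; only the 7 comes from the slides.

```python
from statistics import median


def typical_time(observed_gaps):
    """Return (T, slacks) minimizing sum(E_i) subject to |T - t_i| <= E_i.

    A median of the observations minimizes the sum of absolute deviations,
    which is exactly the optimum of this one-variable linear program.
    """
    t = median(observed_gaps)
    slacks = [abs(t - gap) for gap in observed_gaps]
    return t, slacks


# One real observation (7 s between A and B) plus two made-up repeat walks.
t_ab, slacks_ab = typical_time([7, 9, 6])
print("typical T_AB:", t_ab)
print("per-observation slack:", slacks_ab)
```

The architecture slide lists a linear program under sensory geometry, which suggests the real system solves for many tag pairs at once; this example only shows the one-pair case.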
13
Using Temporal Relationships
[Figure: tags A to E; A was viewed at T = T_A, and the current time is T = T_CURRENT]
Time elapsed since A was viewed: $T_{CURRENT} - T_A$
if $(T_{CURRENT} - T_A) + E_{T_{AB}} > T_{AB}$: shortlist B
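The shortlisting rule on this slide can be written as a small filter. This is a sketch, assuming the typical times and their error terms are already known for every tag relative to the last recognized tag A; the function name, the dictionary layout, and the error values of 2 s are hypothetical, while the typical times 7, 15, and 21 are taken from the earlier figure.

```python
def shortlist_by_time(t_current, t_anchor, typical, error):
    """Return tags that could have been reached since the anchor tag was seen.

    A tag is shortlisted when (t_current - t_anchor) + error[tag] > typical[tag],
    which is the condition shown on the slide.
    """
    elapsed = t_current - t_anchor
    return [tag for tag, t_typ in typical.items() if elapsed + error[tag] > t_typ]


# Typical times from A (seconds), and hypothetical error terms.
typical_from_a = {"B": 7, "C": 15, "E": 21}
error_from_a = {"B": 2, "C": 2, "E": 2}

early = shortlist_by_time(6, 0, typical_from_a, error_from_a)
later = shortlist_by_time(14, 0, typical_from_a, error_from_a)
print("6 s after A:", early)
print("14 s after A:", later)
```

As written, the rule only adds tags as time passes; it never drops a tag the user has probably already walked past. Whether the full system also applies an upper bound is not stated on the slide.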
14
Rotational Relationships
Gyroscope captures angular changes
[Figure: tags A to E, with turns labeled 20° anti-clockwise, 110° anti-clockwise, and 90° clockwise]
$R_B - R_A \le 20° + E_{R_{BA}}$
$R_B - R_A \ge 20° - E_{R_{BA}}$
$R_C - R_A \le 130° + E_{R_{CA}}$
$R_C - R_A \ge 130° - E_{R_{CA}}$
$E_{R_{BA}}, E_{R_{CA}}, R_A, R_B, R_C \ge 0$
When the phone is moving toward C, C can be prioritized for matching. Similarly, when the phone turns away from D, D can be removed from the candidate set.
15
Using Rotational Relationships
[Figure: tags A, B, D, and E at orientations R_A, R_B, R_D, and R_E; the phone is at R_CURRENT, with an error band of E_{R_B}/2 around R_B]
$R_{CURRENT} = R_A + \text{Gyro}$
B's rotational distance $= R_B - R_{CURRENT} + E_{R_B}/2$
- Pick tags closer in rotational distance
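The ranking step above can be sketched as follows. Two things here are assumptions rather than something the slide states: the difference R_B - R_CURRENT is taken as an absolute angle wrapped to 0..180 degrees, and half of the tag's error term is added as on the slide. The tag orientations (20° and 130° relative to A) come from the previous slide; the error values and function names are hypothetical.

```python
def angular_gap(a_deg, b_deg):
    """Smallest absolute angle between two headings, in degrees (0 to 180)."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def rank_by_rotation(r_anchor, gyro_delta, tag_angle, tag_error, k=None):
    """Rank tags by rotational distance from the phone's current heading.

    Current heading = anchor heading + integrated gyroscope change.
    Distance for a tag = angular gap to the tag + half its error term.
    """
    r_current = (r_anchor + gyro_delta) % 360.0
    distance = {
        tag: angular_gap(angle, r_current) + tag_error[tag] / 2.0
        for tag, angle in tag_angle.items()
    }
    ranked = sorted(distance, key=distance.get)
    return ranked if k is None else ranked[:k]


# Orientations relative to A (R_A = 0), with hypothetical 10-degree errors.
angles = {"B": 20.0, "C": 130.0}
errors = {"B": 10.0, "C": 10.0}

print(rank_by_rotation(0.0, 25.0, angles, errors))    # phone turned 25 degrees
print(rank_by_rotation(0.0, 120.0, angles, errors))   # phone turned 120 degrees
```

Passing `k` keeps only the closest few tags, which is one way the candidate set handed to the vision pipeline could be kept small.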
16
OverLay: Converged Architecture
[Architecture diagram. Labels recoverable from the slide:]
- Annotate: (image, "Botanist"), SURF, Annotation DB
- Retrieve: frame; Blur? Hand Motion? Frame Diff?; Network; Select Candidates; GPU Optimized Pipeline (SURF, Match, Refine) over the selected candidates; result "Botanist"
- Spatial reasoning: Sensory Geometry (time, orientation; Macro-trajectory; Linear Program) and Visual Geometry (Micro-trajectory)
- Learning: Update modules (frames, sensors)
- One region of the diagram is marked "This talk"
17
Evaluation
- Client: Android App on a Samsung Galaxy S4
- Server: GPU on Cloud (12 cores, 16 GB RAM, 6 GB NVIDIA GPU)
- 11 Volunteers
- 100+ Tags
- 4200 Frame Uploads
18
System Variants
- Approximate (Quick Computer Vision): matching using approximate schemes, e.g., KDTree
- Conservative (Slow Computer Vision): matching using brute-force schemes
- OverLay: Conservative + Optimizations
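To make the difference between the Conservative and OverLay variants concrete, here is a toy sketch of brute-force matching with an optional candidate set. It is not the system's pipeline: the 2-D points stand in for real feature descriptors, the database contents are invented, and a single nearest-descriptor vote replaces full image matching. It only shows that restricting the tags to search cuts the number of comparisons without changing the brute-force rule itself.

```python
import math


def nearest_tag(query, database, candidates=None):
    """Brute-force search over every descriptor of every tag considered.

    `database` maps tag -> list of descriptor tuples. If `candidates` is
    given, only those tags are searched. Returns (best_tag, comparisons).
    """
    tags = database if candidates is None else {t: database[t] for t in candidates}
    best_tag, best_dist, comparisons = None, math.inf, 0
    for tag, descriptors in tags.items():
        for descriptor in descriptors:
            comparisons += 1
            dist = math.dist(query, descriptor)
            if dist < best_dist:
                best_tag, best_dist = tag, dist
    return best_tag, comparisons


# Toy database: three tags, two descriptors each (all values are made up).
toy_db = {
    "A": [(0.0, 0.0), (0.0, 1.0)],
    "B": [(5.0, 5.0), (5.0, 6.0)],
    "C": [(9.0, 0.0), (9.0, 1.0)],
}
query = (5.2, 5.1)

print(nearest_tag(query, toy_db))                         # search everything
print(nearest_tag(query, toy_db, candidates=["B", "C"]))  # shortlisted tags only
```

An approximate scheme such as a KD-tree would instead reduce comparisons by skipping parts of the descriptor space, at the risk of missing the true nearest neighbor.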
19
Latency
Optimizations lead to a 4-fold improvement
20
Accuracy: Precision
OverLay ≈ Bruteforce
21
Accuracy: Recall
Approximate < OverLay < Bruteforce
22
Conclusion
- Vision and Sensing based ARs
- Geometric layouts: Accelerated Vision
- OverLay: Practical Mobile AR
23
Thank you
synrg.csl.illinois.edu/projects/MobileAR
Puneet Jain (Duke University/UIUC), Justin Manweiler (IBM Research), Romit Roy Choudhury (UIUC)
24
3D-OBJECTS
25
Handling 3D Objects: Learning
Tagged from a particular angle
Retrieved from a different angle
26
Accuracy: After Learning
Recall > Bruteforce and Precision ≈ Bruteforce