1
Real-Time 3D Model Acquisition
Szymon Rusinkiewicz (Princeton University), Olaf Hall-Holt and Marc Levoy (Stanford University)
2
3D Scanning
3D scanning is becoming widely used, both in traditional areas such as model building for movies and in newer areas such as art history and archeology.
3
Possible Research Goals
Low noise; guaranteed high accuracy; high speed; low cost; automatic operation; no holes. There are many research goals for 3D scanning. In this paper we focus on one that has rarely been addressed explicitly: coming up with a hole-free model.
4
3D Model Acquisition Pipeline
3D Scanner. To understand the problem, let’s look at the entire 3D model acquisition pipeline. We start with the 3D scanner, which typically returns a range image: the 3D shape as seen from a single point of view.
5
3D Model Acquisition Pipeline
3D Scanner, View Planning. To capture the entire object, we need to move the scanner around the object (or move the object relative to the scanner). This requires figuring out where to put the scanner.
6
3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment. Multiple scans need to be aligned.
7
3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment, Merging. The scans are then merged into a single model.
8
3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment, Done?, Merging. Now, the user needs to plan further scans and determine whether the entire object has been covered.
9
3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment, Done?, Merging, Display. To do this, the user needs some sort of feedback, so the final piece of the pipeline is some sort of rendering.
10
3D Model Acquisition Difficulties
Much (often most) time spent on “last 20%”. Pipeline not optimized for hole-filling. Not sufficient just to speed up scanner – must design pipeline for fast feedback. With most versions of the previous pipeline, most of the effort goes into hole filling. Each iteration of the pipeline is slow, because the user has to do a scan, align it, render, set up another scan, etc. The key point of this paper is that to speed up the pipeline and make it easier to use, it’s not enough to just focus on the scanner – it’s necessary to design the entire pipeline to provide feedback to the user as efficiently as possible.
11
Real-Time 3D Model Acquisition
[Video clip of hand moving object, then view of screen. 40 secs, no sound] Here is a new pipeline that *is* designed for fast feedback to the user. The user moves the object around by hand (notice the black glove so we don’t get fingers in the scan). You can always see the current state of the model on the screen, and can move the object to fill any holes.
12
Real-Time 3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment, Human, Done?, Merging, Display. Here’s how we design this new pipeline. We want to retain the human in the loop, doing view planning, since that’s what humans do best. However, we want the rest of the pipeline to be designed so as to use the human’s time most efficiently: we don’t want the user to be waiting around for scans to complete.
13
Real-Time 3D Model Acquisition Pipeline
3D Scanner, Alignment, View Planning, Merging, Done?, Display. Challenge: Real Time. So, the challenge is to design the rest of the pipeline to run in real time – to provide feedback to the user as quickly as possible. In the remainder of this talk I’ll focus on the different stages of this pipeline. These all use variations of techniques that have been proposed previously, but by picking particular algorithms that work well together in this real-time pipeline, we really appear to get something that’s more than the sum of its parts.
14
Real-Time 3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment, Done?, Merging, Display. Part I: Structured-Light Triangulation. As you saw in the video, the real-time system uses a range scanner based on structured light. The novel thing about it is that it has to work on objects that are moving during the scanning.
15
Real-Time 3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment, Done?, Merging, Display. Part II: Fast ICP. Next, we align the geometry from different views using a real-time variant of a classic algorithm called ICP.
16
Real-Time 3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment, Done?, Merging, Display. Part III: Voxel Grid. The merging and rendering stages are fairly simple: they are based on binning samples in a voxel grid.
17
Triangulation
Project laser stripe onto object [Diagram: object, laser, camera]. The first stage in the real-time pipeline is the range scanner. It’s based on the idea of triangulation. In the simplest case, this consists of projecting a stripe of light onto a scene, looking at it from an angle…
18
Triangulation
Depth from ray-plane triangulation [Diagram: object, laser, camera; image point (x,y)]. … and triangulating between a plane from the point of view of the light source and a ray from the point of view of the camera. In the simplest case of a laser triangulation scanner, this yields data from a single contour on the object at a time, and you can sweep the line across the surface to stack up a bunch of these contours and get a scan of an entire patch of surface.
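The ray-plane intersection just described can be sketched in a few lines of Python. This is only an illustration, not the scanner's code: the function name `triangulate` and its parameters are assumptions, and it presumes the camera ray and the stripe's light plane are already expressed in one calibrated coordinate frame.

```python
def triangulate(ray_origin, ray_dir, plane_point, plane_normal):
    """Intersect a camera ray with the light plane of one stripe.

    All arguments are 3-tuples in a shared (assumed, pre-calibrated) frame.
    Returns the 3D intersection point, or None if the ray is parallel to
    the plane or the intersection lies behind the camera.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    denom = dot(plane_normal, ray_dir)
    if abs(denom) < 1e-12:
        return None  # ray never meets the plane
    offset = tuple(p - o for p, o in zip(plane_point, ray_origin))
    t = dot(plane_normal, offset) / denom  # distance along the ray
    if t < 0:
        return None  # intersection is behind the camera
    return tuple(o + t * d for o, d in zip(ray_origin, ray_dir))
```

Each camera pixel supplies one ray, and the identified stripe supplies the plane, so one call yields one range sample.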
19
Triangulation
Faster acquisition: project multiple stripes. Correspondence problem: which stripe is which? For our real-time pipeline, though, we want to go faster – we need to get as much data at once as possible. The obvious way to do this is to project a bunch of stripes at once, but then we need to figure out which stripe is which.
20
Continuum of Triangulation Methods
[Diagram: continuum from single-stripe (slow, robust) through multi-stripe, multi-frame to single-frame (fast, fragile)] There’s a continuum of methods we can use to establish these correspondences. At one extreme are single-stripe systems. These never have the problem of ambiguity, so they produce very high-quality data, but they take a long time to do it. At the other extreme are systems that try to get all the data at once by projecting lots of stripes (or, in this case, dots). They are fast, since they get everything with just a single frame, but they tend to be a lot more fragile and get confused by discontinuities. In the middle are methods that use a few frames to get depth, by flashing stripes on and off over time, and conveying a code about which stripe is which through this pattern of on/off flashing.
21
Time-Coded Light Patterns
Assign each stripe a unique illumination code over time [Posdamer 82]. [Figure: stripe patterns, with axes Space and Time] Just as an example, here’s a very simple code based on binary numbers. If you look at a single position in space (i.e., a single pixel), you see a certain on/off pattern over these four frames. This conveys a code that tells you which stripe you are looking at, which gives you a plane in space with which you can then triangulate to find depths.
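The simple binary code can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `stripe_patterns` and `decode_stripe` are hypothetical, and treating the first frame as the most significant bit is a choice made here, not something the slide specifies.

```python
def stripe_patterns(num_frames):
    """Build one on/off row per frame for 2**num_frames stripes.

    Stripe s is lit in frame f when the corresponding bit of s is 1,
    taking the first frame as the most significant bit (an assumption).
    """
    num_stripes = 1 << num_frames
    return [
        [(s >> (num_frames - 1 - f)) & 1 for s in range(num_stripes)]
        for f in range(num_frames)
    ]


def decode_stripe(bits):
    """Turn the on/off sequence seen at one pixel into a stripe index."""
    index = 0
    for bit in bits:
        index = (index << 1) | (1 if bit else 0)
    return index
```

With four frames this distinguishes 16 stripes; a pixel that sees on, off, on, on is looking at stripe 11.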
22
Codes for Moving Scenes
Assign time codes to stripe boundaries. Perform frame-to-frame tracking of corresponding boundaries. Propagate illumination history [Hall-Holt & Rusinkiewicz, ICCV 2001]. [Figure: illumination history (WB),(BW),(WB) and the resulting code] In our system, though, the objects are moving from frame to frame, so if you just used the simple algorithm you might get half of one code and half of another, and come up with the wrong code. So, we need to do some tracking from frame to frame. In fact, we do this based on looking at the boundaries between stripes. The stripes are tracked from frame to frame, and at the end of the day the illumination on both sides of the boundary is what conveys the code.
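As a rough sketch of how a tracked boundary's illumination history could become a code, the snippet below packs the colors seen on the two sides of the boundary in each frame into an integer. The function name `boundary_code`, the two-bits-per-frame packing, and the W = 1 / B = 0 convention are all assumptions for illustration; the frame-to-frame tracking itself is not shown.

```python
def boundary_code(history):
    """Pack a boundary's illumination history into an integer code.

    `history` holds one two-character string per frame, e.g. "WB" for
    white on the left of the boundary and black on the right.
    Each frame contributes two bits (left, right), with W = 1 and B = 0.
    This packing is a hypothetical convention chosen for the sketch.
    """
    code = 0
    for left, right in history:
        code = (code << 2) | (int(left == "W") << 1) | int(right == "W")
    return code
```

For the history (WB),(BW),(WB) this gives the bit string 10 01 10. Looking that value up in a table of the projected codes would then identify the boundary, and hence the light plane to triangulate against.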
23
Designing a Code
Want many “features” to track: lots of black/white edges at each frame. Try to minimize ghosts – WW or BB “boundaries” that can’t be seen directly. There’s an extra little wrinkle, though. If you’re looking for the boundaries between stripes, sometimes you have a “boundary” between two stripes of the same color, which we’ll call a ghost. So, some stripes are easy to track, because you can see them in both frames, but in some cases you have to infer the presence of a stripe you can’t see directly. In order to make this at all feasible, we have to design a code that tries to minimize these ghosts and ensures, for example, that a ghost in one frame becomes visible in the next.
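The ghost constraint can be made concrete with a small checker. This is a sketch under assumptions: each stripe's code is written as a string of 0/1 characters, one per frame, the function names are hypothetical, and the check ignores wrap-around from the last frame back to the first.

```python
def ghost_frames(left_code, right_code):
    """Frames in which two adjacent stripes have the same color.

    In those frames the boundary between them is a ghost: there is no
    visible black/white edge to detect.
    """
    return [f for f, (a, b) in enumerate(zip(left_code, right_code)) if a == b]


def ghosts_always_reappear(stripe_codes):
    """True if no boundary is a ghost in two consecutive frames.

    `stripe_codes` lists the per-stripe codes in left-to-right order.
    Wrap-around between the last and first frame is not checked.
    """
    for left, right in zip(stripe_codes, stripe_codes[1:]):
        ghosts = ghost_frames(left, right)
        if any(later - earlier == 1 for earlier, later in zip(ghosts, ghosts[1:])):
            return False
    return True
```

A candidate stripe ordering can be rejected as soon as this check fails, which is one way to see why choosing the codes becomes a search problem.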
24
Designing a Code [Hall-Holt & Rusinkiewicz, ICCV 2001]
[Figure: graph whose nodes are the 4-bit codes 0000 1101 1010 0111 1111 0010 0101 1000 1011 0110 0001 1100 0100 1001 1110 0011] [Either omit this slide or just display it for a couple of seconds, as comic relief] Designing this code turns into a graph problem – since I’m sure you’ve all solved it in your heads by now, I’ll just move on and won’t waste your time with it…
25
Implementation
Pipeline: DLP projector illuminates scene at 60 Hz; synchronized NTSC camera captures video; pipeline returns range images at 60 Hz. Stages: Project Code, Capture Images, Find Boundaries, Match Boundaries, Decode, Compute Range. So here’s how the range scanner looks. We project stripes at 60 Hz, capture video frames at the same rate, and process them to get range images at 60 Hz.
26
Real-Time 3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment, Done?, Merging, Display. Part II: Fast ICP. Next, we align the geometry from different views using a real-time variant of a classic algorithm called ICP.
27
Aligning 3D Data
This range scanner can be used for any moving objects. For rigid objects, range images can be aligned to each other as the object moves. So far, I haven’t assumed that we’re looking at rigid objects. For this real-time model acquisition pipeline, though, we assume that the range images are different views of the same rigid object, so we can align them to each other.
28
Aligning 3D Data
ICP (Iterative Closest Points): for each point on one scan, minimize distance to closest point on other scan… We do this using a well-known algorithm called ICP. The idea is that for each point on the upper (green) scan, we find the closest point on the lower (red) scan, and minimize the distances between these pairs of points. This brings the scans into closer alignment, …
29
Aligning 3D Data
… and iterate to find alignment. Iterative Closest Points (ICP) [Besl & McKay 92]. … and we can iterate this process until it converges.
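The basic loop (match each point to its closest point, solve for the rigid motion that minimizes the pair distances, apply it, repeat) can be sketched in 2D with only the standard library. This is a minimal point-to-point illustration, not the real-time variant used in the pipeline: the function names are hypothetical, the closest-point search is brute force, and the iteration count is fixed rather than tested for convergence.

```python
import math


def closest_point(p, points):
    """Brute-force nearest neighbor of p among points."""
    return min(points, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation taking src[i] to dst[i]."""
    n = len(src)
    csx, csy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    cdx, cdy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    cos_sum = sin_sum = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - csx, py - csy
        bx, by = qx - cdx, qy - cdy
        cos_sum += ax * bx + ay * by
        sin_sum += ax * by - ay * bx
    theta = math.atan2(sin_sum, cos_sum)
    c, s = math.cos(theta), math.sin(theta)
    # Translation carries the rotated source centroid onto the target centroid.
    return theta, cdx - (c * csx - s * csy), cdy - (s * csx + c * csy)


def transform(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(moving, fixed, iterations=20):
    """Align `moving` to `fixed` by repeated closest-point matching."""
    current = list(moving)
    for _ in range(iterations):
        matches = [closest_point(p, fixed) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = transform(current, theta, tx, ty)
    return current
```

When the starting misalignment is small compared with the point spacing, the closest-point matches are already correct and the loop settles almost immediately; with a poor starting guess the same loop can settle on a wrong alignment.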
30
ICP in the Real-Time Pipeline
Potential problem with ICP: local minima. In this pipeline, scans are close together, so ICP is very likely to converge to the correct (global) minimum. Basic ICP algorithm too slow (~ seconds); speedups: point-to-plane minimization, projection-based matching. With these tweaks, running time ~ milliseconds [Rusinkiewicz & Levoy, 3DIM 2001]. There are a couple of things that we need to consider when applying the basic ICP algorithm to our pipeline. First, a traditional problem with ICP is that it only converges to a local minimum, not necessarily the global minimum. In our pipeline, though, this turns out not to be a problem because we get range images at 60 Hz, so they tend to be very close to each other. The other issue is speed: the basic algorithm as I’ve described it takes on the order of seconds to converge. There are a couple of tweaks we can make, though: one changes the function that is minimized and the other replaces the closest-point finding with an approximation. With these tweaks, the algorithm becomes fast enough to fit in our pipeline.
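To illustrate the first tweak, the sketch below compares the two error functions on the same matched pairs. It is only an assumption-laden illustration: the function names are hypothetical, each pair carries a unit normal at the target point, and neither the minimization itself nor projection-based matching is shown.

```python
def point_to_point_error(pairs):
    """Sum of squared distances between matched points.

    Each pair is (source_point, target_point, target_unit_normal);
    the normal is ignored here.
    """
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q, _normal in pairs
    )


def point_to_plane_error(pairs):
    """Sum of squared distances from each source point to the tangent
    plane at its matched target point (the plane through the target
    point perpendicular to the given unit normal)."""
    total = 0.0
    for p, q, normal in pairs:
        signed_distance = sum((a - b) * n for a, b, n in zip(p, q, normal))
        total += signed_distance ** 2
    return total
```

A source point that has slid sideways along a flat region is penalized by the point-to-point error but not by the point-to-plane error, which is why the latter lets flat regions slide into place over fewer iterations.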
31
Real-Time 3D Model Acquisition Pipeline
3D Scanner, View Planning, Alignment, Done?, Merging, Display. Part III: Voxel Grid. The merging and rendering stages are fairly simple: they are based on binning samples in a voxel grid.
32
Merging and Rendering
Goal: visualize the model well enough to be able to see holes. Cannot display all the scanned data – accumulates linearly with time. Standard high-quality merging methods: processing time ~ 1 minute per scan. The third piece of the real-time pipeline is merging and rendering. This is necessary, as we saw, to provide feedback to the user. We can’t use the “standard” methods (typically based on combining volumetric signed distance functions or on Voronoi diagrams on the 3D point cloud) because they take too long. So, instead, we adopt a much simpler algorithm.
33
Merging and Rendering The idea is that we start with a range image, …
34
Merging and Rendering … quantize the samples to a grid in 3D, …
35
Merging and Rendering … and compute a surface normal at each voxel.
36
Merging and Rendering
We can then incorporate the samples corresponding to a new range image with the data we already have in the grid, …
37
Merging and Rendering
Point rendering, using accumulated normals for lighting. … and come up with a merged result. We then do splat rendering from this combined grid, using the averaged normals for lighting.
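The grid-based merging just described can be sketched as a dictionary of voxels that accumulate normals. This is a minimal illustration under assumptions: the class name `VoxelGrid`, its methods, and the choice to emit one splat at each voxel center are hypothetical, and samples are assumed to arrive already aligned and with per-sample normals.

```python
import math
from collections import defaultdict


class VoxelGrid:
    """Sparse voxel grid that bins aligned range samples."""

    def __init__(self, voxel_size):
        self.voxel_size = voxel_size
        # Voxel index -> running sum of the normals that fell into it.
        self.normal_sums = defaultdict(lambda: [0.0, 0.0, 0.0])

    def add_range_image(self, samples):
        """Merge one range image, given as (position, normal) pairs."""
        for position, normal in samples:
            key = tuple(math.floor(c / self.voxel_size) for c in position)
            accumulated = self.normal_sums[key]
            for axis in range(3):
                accumulated[axis] += normal[axis]

    def splats(self):
        """Return {voxel index: (voxel center, averaged unit normal)}."""
        result = {}
        for key, accumulated in self.normal_sums.items():
            length = math.sqrt(sum(c * c for c in accumulated))
            if length == 0.0:
                continue  # normals cancelled out; nothing sensible to shade
            center = tuple((k + 0.5) * self.voxel_size for k in key)
            result[key] = (center, tuple(c / length for c in accumulated))
        return result
```

Because each voxel stores only an accumulated normal, the amount of data to draw is bounded by the number of occupied voxels rather than growing with every new range image.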
38
Example: Photograph
[Photograph of the figurine, labeled 18 cm] Here’s another example of the whole pipeline in action. This is a little turtle figurine, about 18 cm long.
39
Result [Video clip] Here’s what the user sees while scanning the turtle. Notice how easy it is to see and fill holes in this bumpy back.
40
Postprocessing
Real-time display: quality/speed tradeoff; goal: let user evaluate coverage, fill holes. Offline postprocessing for high-quality models: global registration; high-quality merging (e.g., using VRIP [Curless 96]). Of course, the quality of the real-time pipeline doesn’t really compare with other scanners – there’s simply not enough CPU to be able to run high-quality algorithms. On the other hand, it doesn’t need to – it only needs to be good enough to let the user see holes in the data. Later on, once the user is sure that all the data has been captured, we can run a high-quality postprocess. This can take as long as it needs to, since the user is no longer in the loop.
41
Postprocessed Model So, here’s the turtle after postprocessing. In this case we ran global registration, and produced a merged model using VRIP.
42
Recapturing Alignment
43
Summary
3D model acquisition pipeline optimized for obtaining complete, hole-free models. Use human’s time most efficiently. Pieces of pipeline selected for real-time use: structured-light scanner for moving objects; fast ICP variant; simple grid-based merging, point rendering.
44
Limitations
Prototype noisier than commercial systems: could be made equivalent with careful engineering; ultimate limitations on quality: focus, texture. Scan-to-scan ICP is not perfect, which leads to alignment drift: due to noise, miscalibration, degenerate geometry; reduced, but not eliminated, by “anchor scans”; possibly combine ICP with separate trackers.
45
Future Work
Faster scanning: better stripe boundary tracking; multiple cameras, projectors; high-speed cameras, projectors. Application in different contexts: cart- or shoulder-mounted for digitizing rooms; infrared for imperceptibility.
46
Acknowledgments
Collaborators: Li-Wei He, James Davis, Lucas Pereira, Sean Anderson. Sponsors: Sony, Intel, Interval.