Real-time Acquisition and Rendering of Large 3D Models Szymon Rusinkiewicz
Computer Graphics Pipeline Shape (from 3D scanning), motion, and lighting and reflectance all feed into rendering Human time = expensive Sensors = cheap Computer graphics increasingly relies on measurements of the real world
3D Scanning Applications Computer graphics Product inspection Robot navigation As-built floorplans Product design Archaeology Clothes fitting Art history
The Digital Michelangelo Project Push state of the art in range scanning and demonstrate applications in art and art history Working in the museum Scanning geometry Scanning color
Traditional Range Scanning Pipeline High-quality, robust pipeline for producing 3D models: Scan object with laser triangulation scanner: many views from different angles Align pieces into single coordinate frame: initial manual alignment, refined with ICP Merge overlapping regions: compute “average” surface using VRIP [Curless & Levoy 96] Display resulting model One of the results of the DMich project was a complete, high-quality pipeline
3D Scan of David: Statistics Good news: we were able to create a large, high-resolution model of a complex object Bad news: took a lot of time; gantry was custom-designed, expensive, cumbersome; a lot of manual effort Over 5 meters tall 1/4 mm resolution 22 people 30 nights of scanning Efficiency max : min = 8 : 1 Needed view planning Weight of gantry: 800 kg Putting model together: 1000+ man-hours and counting
New 3D Scanning Pipeline Need for a fast, inexpensive, easy-to-use 3D scanning system Wave a (small, rigid) object by hand in front of the scanner Automatically align data as it is acquired Let user see partial model as it is being built – fill holes Key idea: focus on the whole pipeline instead of trying to optimize individual pieces Shorten the feedback loop with the user Real-Time Model Acquisition
Real-Time 3D Model Acquisition Prototype real-time model acquisition system 3D scanning of moving objects Fast alignment Real-time merging and display
Applications of Easy-to-Use 3D Model Acquisition Advertising More capabilities in Photoshop Movie sets Augmented reality User interfaces
3D Scanning Technologies Contact-based: touch probes Passive: shape from stereo, motion, shading Active: time-of-flight, defocus, photometric stereo, triangulation Many of these suitable for real-time use Triangulation systems are inexpensive, robust, and flexible Take advantage of trends in DLP projectors Real-time triangulation system using custom hardware from CMU, Sony
Laser Triangulation Project laser stripe onto object (diagram: laser, camera, object)
Laser Triangulation Depth from ray-plane triangulation (diagram: laser plane, camera ray through pixel (x,y), object)
Triangulation Faster acquisition: project multiple stripes Correspondence problem: which stripe is which?
Triangulation Design space ranges from single-stripe, multi-frame systems (slow, robust) to multi-stripe, single-frame systems (fast, fragile)
Time-Coded Light Patterns Assign each stripe a unique illumination code over time [Posdamer 82] (diagram: patterns laid out over time and space)
Gray-Code Patterns To minimize effects of quantization error: each point may be a boundary only once (diagram: patterns laid out over time and space)
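The Gray-code property the slide relies on (adjacent stripes differ in exactly one bit, so any point is a boundary in only one pattern) can be generated with a standard bit trick. A minimal sketch in Python; the stripe count and projector width are illustrative values, not ones from the talk:

```python
import numpy as np

def gray_code(i):
    """Binary-reflected Gray code: adjacent integers differ in exactly one bit,
    so each stripe boundary appears in only one of the projected patterns."""
    return i ^ (i >> 1)

def gray_code_patterns(num_stripes=1024, width=1024):
    """Return one binary (0/1) row per pattern; pattern k is bit k of the
    Gray code of each pixel column's stripe index."""
    num_bits = int(np.ceil(np.log2(num_stripes)))
    stripe = (np.arange(width) * num_stripes) // width   # stripe index per column
    codes = gray_code(stripe)
    return np.array([(codes >> k) & 1 for k in range(num_bits)])

patterns = gray_code_patterns()
print(patterns.shape)   # (10, 1024): 10 time-coded patterns
# Each column's 10 bits uniquely identify its stripe after decoding.
```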
Structured-Light Assumptions Structured-light systems make certain assumptions about the scene: Spatial continuity assumption: Assume scene is one object Project a grid, pattern of dots, etc. Temporal continuity assumption: Assume scene is static Assign stripes a code over time
Codes for Moving Scenes We make a different assumption: Object may move Velocity low enough to permit tracking “Spatio-temporal” continuity
Codes for Moving Scenes Code stripe boundaries instead of stripes Perform frame-to-frame tracking of corresponding boundaries Propagate illumination history [Hall-Holt & Rusinkiewicz, ICCV 2001] Illumination history = (WB),(BW),(WB)
New Scanning Pipeline Project Code → Capture Images → Find Boundaries → Match Boundaries → Decode → Compute Range
Designing a Code (pipeline stage: Project Code) Biggest problem is ghosts – WW or BB “boundaries” that cannot be seen directly
Designing a Code Design a code to make tracking possible: Do not allow two spatially adjacent ghosts Do not allow two temporally adjacent ghosts (see the sketch after the code-graph slides)
Designing a Code Graph (for 4 frames): nodes = stripes over time (the sixteen 4-bit codes 0000 through 1111), edges = boundaries over time (diagram: space vs. time)
Designing a Code Graph (for 4 frames): nodes = stripes (over time), edges = boundaries (over time), colored by whether the boundary is visible at even or odd times Seek a path with alternating colors: 55 edges in the graph; a maximal-length traversal has 110 boundaries (111 stripes)
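One possible reading of the constraints above, sketched below: treat each stripe as a 4-bit code (one bit per frame), call a boundary a ghost at frame t when the two codes agree in bit t, and keep only candidate boundaries that are never invisible in two consecutive frames. The edge test and the enumeration are my assumptions from the slides, not the exact graph construction of the ICCV 2001 paper:

```python
from itertools import combinations

FRAMES = 4  # four-frame code, as on the slide

def visible(a, b, t):
    """Boundary between stripe codes a and b is visible at frame t
    when the two codes differ in bit t (one stripe white, the other black)."""
    return ((a >> t) & 1) != ((b >> t) & 1)

def no_adjacent_ghosts(a, b):
    """Reject boundaries that are invisible (ghost) in two consecutive frames."""
    return all(visible(a, b, t) or visible(a, b, t + 1) for t in range(FRAMES - 1))

# Enumerate candidate boundaries between all pairs of 4-bit stripe codes.
edges = [(a, b) for a, b in combinations(range(2 ** FRAMES), 2)
         if no_adjacent_ghosts(a, b)]
print(len(edges), "candidate boundaries satisfy the temporal constraint")
```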
Image Capture (pipeline stage: Capture Images) Standard video camera: fields at 60 Hz Genlock camera to projector
Finding Boundaries (pipeline stage: Find Boundaries) Standard edge detection problem Current solution: find minima and maxima of intensity; the boundary lies between them
Matching Stripe Boundaries (pipeline stage: Match Boundaries) Even if the number of ghosts is minimized, matching is not easy
Matching Stripe Boundaries Resolve ambiguity by constraining maximum stripe velocity Could accommodate higher speeds by estimating velocities Could take advantage of methods in tracking literature (e.g., Kalman filters)
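A minimal sketch of matching under the maximum-velocity assumption: match each boundary detected in the current frame to the nearest tracked boundary from the previous frame along a scanline, and reject matches that move farther than a pixel threshold. The threshold, the 1D parameterization, and the greedy strategy are illustrative choices:

```python
import numpy as np

def match_boundaries(prev_x, curr_x, max_velocity_px=4.0):
    """Greedy nearest-neighbor matching of stripe-boundary positions along a
    scanline, constrained by a maximum per-frame displacement.
    Returns (prev_index, curr_index) pairs; unmatched boundaries start new
    tracks (or are ghosts)."""
    prev_x = np.asarray(prev_x, dtype=float)
    matches = []
    for j, x in enumerate(curr_x):
        if len(prev_x) == 0:
            continue
        i = int(np.argmin(np.abs(prev_x - x)))
        if abs(prev_x[i] - x) <= max_velocity_px:
            matches.append((i, j))
    return matches

print(match_boundaries([10.0, 30.0, 55.0], [11.5, 31.0, 70.0]))
# [(0, 0), (1, 1)] -- the boundary at 70 moved too far to be the one at 55.
```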
Decoding Boundaries (pipeline stage: Decode) Propagate illumination history Table lookup based on illumination history and position in the four-frame sequence Once a stripe has been tracked for at least four frames, it contributes useful data on every subsequent frame
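A sketch of the decode step as described: each tracked boundary accumulates its observed transitions over the four-frame sequence, and a precomputed table maps that illumination history to a projector stripe boundary. The tiny code assignment, the WB/BW/G symbols, and the alignment to the start of the sequence are placeholders; the real table comes from the boundary code design:

```python
# Build a lookup table from a (hypothetical) stripe code assignment: for each
# projector boundary, record its four-frame illumination history; 'G' marks a
# ghost frame (same color on both sides), assuming bit 1 = white, bit 0 = black.
def history(a, b, frames=4):
    syms = []
    for t in range(frames):
        la, lb = (a >> t) & 1, (b >> t) & 1
        syms.append('G' if la == lb else ('WB' if la else 'BW'))
    return tuple(syms)

# stripe_codes would come from the maximal alternating-color traversal of the
# code graph (111 stripes); this tiny assignment is a placeholder.
stripe_codes = [0b0000, 0b0101, 0b1111, 0b1010]
table = {history(a, b): i                       # history -> projector boundary index
         for i, (a, b) in enumerate(zip(stripe_codes, stripe_codes[1:]))}

observed = history(0b0101, 0b1111)              # what tracking recovered over 4 frames
print(table[observed])                          # -> 1: the boundary is identified
```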
Computing 3D Position (pipeline stage: Compute Range) Ray-plane intersection Requires calibration of: camera and projector intrinsics; relative position and orientation
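The range computation itself is standard ray-plane intersection. A self-contained sketch; the camera pose, pixel ray, and stripe plane below are illustrative stand-ins for the calibrated quantities:

```python
import numpy as np

def ray_plane_intersect(ray_origin, ray_dir, plane_point, plane_normal):
    """Intersect a camera ray with the light plane of a projected stripe boundary.
    Returns the 3D point, or None if the ray is parallel to the plane."""
    ray_dir = ray_dir / np.linalg.norm(ray_dir)
    denom = np.dot(plane_normal, ray_dir)
    if abs(denom) < 1e-9:
        return None
    t = np.dot(plane_normal, plane_point - ray_origin) / denom
    return ray_origin + t * ray_dir

# Camera at the origin looking down +Z; the ray through pixel (x, y) would come
# from the camera intrinsics, and the stripe plane from projector calibration.
p = ray_plane_intersect(np.zeros(3), np.array([0.1, 0.0, 1.0]),
                        plane_point=np.array([0.0, 0.0, 0.5]),
                        plane_normal=np.array([1.0, 0.0, -0.2]))
print(p)
```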
Results Video frames and detected stripe boundaries (legend: unknown, known, ghosts)
Results Single range image of a moving object, showing the importance of doing some sort of tracking (top and front views: Gray codes with no tracking vs. boundary codes with tracking)
Aligning 3D Data This range scanner can be used for any moving object For rigid objects, range images can be aligned to each other as the object moves
Aligning 3D Data If correct correspondences are known, it is possible to find correct relative rotation/translation
Aligning 3D Data How to find corresponding points? Previous systems based on user input, feature matching, surface signatures, etc.
Aligning 3D Data Alternative: assume closest points correspond to each other, compute the best transform…
Aligning 3D Data … and iterate to find alignment Iterated Closest Points (ICP) [Besl & McKay 92] Converges if starting position “close enough“
ICP Variants Classic ICP algorithm not real-time To improve speed: examine stages of ICP and evaluate proposed variants [Rusinkiewicz & Levoy, 3DIM 2001] Selecting source points (from one or both meshes) Matching to points in the other mesh Weighting the correspondences Rejecting certain (outlier) point pairs Assigning an error metric to the current transform Minimizing the error metric
ICP Variant – Point-to-Plane Error Metric Using point-to-plane distance instead of point-to-point lets flat regions slide along each other more easily [Chen & Medioni 91]
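The point-to-plane metric minimizes the sum of ((R p_i + t - q_i) · n_i)^2 over corresponding source points p_i, destination points q_i, and destination normals n_i; for small rotations it is commonly linearized into a 6-parameter least-squares problem. A sketch of that standard linearized solve, which may differ from the exact solver used in the thesis:

```python
import numpy as np

def point_to_plane_step(src, dst, normals):
    """One linearized point-to-plane step: solve for a small rotation
    r = (rx, ry, rz) and translation t minimizing
    sum(((src_i + r x src_i + t) - dst_i) . n_i)^2."""
    A = np.hstack([np.cross(src, normals), normals])     # N x 6 Jacobian
    b = -np.einsum('ij,ij->i', src - dst, normals)        # N residuals
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    rx, ry, rz, tx, ty, tz = x
    R = np.array([[1.0, -rz,  ry],
                  [ rz, 1.0, -rx],
                  [-ry,  rx, 1.0]])                       # small-angle rotation
    return R, np.array([tx, ty, tz])
```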
Finding Corresponding Points Finding closest point is most expensive stage of ICP Brute force search – O(n) Spatial data structure (e.g., k-d tree) – O(log n) Voxel grid – O(1), but large constant, slow preprocessing
Finding Corresponding Points For range images, simply project point [Blais 95] Constant-time, fast Does not require precomputing a spatial data structure
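A sketch of projection-based matching for an organized range image: transform the source point into the destination scan's camera frame, project it with a pinhole model, and take whatever point is stored at that pixel as the correspondence. The intrinsics fx, fy, cx, cy and the NaN convention for missing data are assumptions:

```python
import numpy as np

def project_correspondence(p_world, R, t, points_grid, fx, fy, cx, cy):
    """Find the corresponding point for p_world in an organized range image.
    R, t map world coordinates into the destination camera frame;
    points_grid is an (H, W, 3) array of 3D points (NaN where no data)."""
    p_cam = R @ p_world + t
    if p_cam[2] <= 0:
        return None                      # behind the camera
    u = int(round(fx * p_cam[0] / p_cam[2] + cx))
    v = int(round(fy * p_cam[1] / p_cam[2] + cy))
    H, W, _ = points_grid.shape
    if not (0 <= u < W and 0 <= v < H):
        return None                      # projects outside the range image
    q = points_grid[v, u]
    return None if np.any(np.isnan(q)) else q
```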
High-Speed ICP Algorithm ICP algorithm with projection-based correspondences, point-to-plane matching can align meshes in a few tens of ms. (cf. over 1 sec. with closest-point)
Anchor Scans Alignment of consecutive scans leads to accumulation of ICP errors Alternative: align all scans to an “anchor” scan, switching anchors only when overlap becomes low With anchor scans, restarting after a failed ICP becomes easier
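A sketch of the anchor-switching policy as described: keep aligning new scans to the current anchor and promote the newest scan to anchor once overlap drops below a threshold. The `align` and `overlap` helpers and the 0.3 threshold are hypothetical:

```python
def update_anchor(anchor, new_scan, align, overlap, min_overlap=0.3):
    """Align new_scan to the current anchor; switch anchors when overlap is low.
    `align(src, dst)` returns the ICP transform; `overlap(a, b)` returns the
    fraction of points in a with a nearby counterpart in b (both hypothetical)."""
    transform = align(new_scan, anchor)
    if overlap(new_scan, anchor) < min_overlap:
        anchor = new_scan        # new anchor; error no longer accumulates against the old one
    return anchor, transform
```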
Merging and Rendering Goal: visualize the model well enough to be able to see holes Cannot display all the scanned data – accumulates linearly with time Standard high-quality merging methods: processing time ~ 1 minute per scan
Merging and Rendering Real-time incremental merging and rendering: Quantize samples to a 3D grid Maintain average normal of all points at a grid cell Point (splat) rendering Can be made hierarchical to conserve memory
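A sketch of the incremental merge: quantize incoming points to a sparse voxel grid and keep a running average normal per occupied cell, which is what gets splatted each frame. The 2 mm cell size and the plain dictionary grid are illustrative; as noted above, the real structure can be made hierarchical:

```python
import numpy as np

class VoxelMerger:
    """Accumulate scanned points into a sparse voxel grid for real-time display."""
    def __init__(self, cell_size=0.002):            # 2 mm cells, illustrative
        self.cell_size = cell_size
        self.cells = {}                              # (i, j, k) -> [normal sum, count]

    def add_points(self, points, normals):
        points = np.asarray(points, dtype=float)
        keys = np.floor(points / self.cell_size).astype(int)
        for key, n in zip(map(tuple, keys), normals):
            entry = self.cells.setdefault(key, [np.zeros(3), 0])
            entry[0] += n
            entry[1] += 1

    def splats(self):
        """One splat per occupied cell: center, averaged normal, radius."""
        for (i, j, k), (n_sum, count) in self.cells.items():
            center = (np.array([i, j, k]) + 0.5) * self.cell_size
            normal = n_sum / max(np.linalg.norm(n_sum), 1e-12)
            yield center, normal, self.cell_size
```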
Photograph
Real-time Scanning Demo
Postprocessing Goal of real-time display is to let user evaluate coverage, fill holes Quality/speed tradeoff Offline postprocessing for high-quality models
Merged Result (photograph vs. aligned scans vs. merged model)
Future Work Technological improvements: use full resolution of projector; higher-resolution cameras; ideas from the design of single-stripe 3D scanners Pipeline improvements: better detection of failed alignment; better handling of object texture – combine with stereo?; global registration to eliminate drift; more sophisticated merging; improved user interaction during scanning
Future Work Faster scanning: better stripe boundary matching; multiple cameras, projectors; high-speed cameras Application in different contexts: small, hand-held; cart- or shoulder-mounted for digitizing rooms; infrared for imperceptibility
Rendering of Large Models Range scanners increasingly capable of producing very large models DMich models are 100 million to 1 billion samples Challenge: how to allow viewing in real time Fast startup, progressive loading Traditional answer: triangle meshes, simplification, hardware-accelerated rendering Impractical for such large models Alternative: revisit basic data structure QSplat [Rusinkiewicz & Levoy, SIGGRAPH 00]
QSplat Key observation: a single bounding sphere hierarchy can be used for Hierarchical frustum and backface culling Level of detail control Splat rendering [Westover 89] A bounding-sphere hierarchy, properly augmented, allows polygons and connectivity to be discarded entirely
QSplat Node Structure Position and radius: 13 bits Tree structure: 3 bits Normal: 14 bits Width of cone of normals: 2 bits Color (optional): 16 bits Total: 6 bytes
QSplat Node Structure Position and radius encoded relative to the parent node (center offset, radius ratio): hierarchical coding, vs. delta coding along a path for vertex positions
QSplat Node Structure Comparison of coding options for position and radius: uncompressed, delta coding [Deering 96], and hierarchical coding
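A simplified sketch of the hierarchical-coding idea: each child sphere's center offset and radius are stored relative to its parent, quantized to a few bits each, so quantization error shrinks along with the spheres as you descend the tree. The uniform 4-bit quantizers below are a stand-in, not QSplat's actual 13-bit packing:

```python
import numpy as np

def encode_child(parent_center, parent_radius, child_center, child_radius, bits=4):
    """Quantize a child sphere relative to its parent: offsets in units of the
    parent radius, radius as a ratio of the parent radius."""
    levels = (1 << bits) - 1
    offset = (child_center - parent_center) / parent_radius      # roughly in [-1, 1]
    q_offset = np.round((offset + 1) / 2 * levels).astype(int)
    q_ratio = int(round(child_radius / parent_radius * levels))
    return q_offset, q_ratio

def decode_child(parent_center, parent_radius, q_offset, q_ratio, bits=4):
    levels = (1 << bits) - 1
    offset = np.asarray(q_offset) / levels * 2 - 1
    return parent_center + offset * parent_radius, q_ratio / levels * parent_radius

c, r = decode_child(np.zeros(3), 1.0,
                    *encode_child(np.zeros(3), 1.0, np.array([0.3, -0.2, 0.1]), 0.55))
print(c, r)   # quantization error shrinks down the hierarchy along with the spheres
```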
QSplat Rendering Algorithm Traverse hierarchy recursively:
  if (node not visible)                 skip this branch   [hierarchical frustum / backface culling]
  else if (leaf node)                   draw a splat       [point rendering]
  else if (size on screen < threshold)  draw a splat       [level-of-detail control; threshold adjusted to maintain desired frame rate]
  else                                  traverse children
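The same traversal as runnable code: cull invisible nodes, draw a splat at a leaf or as soon as the node's projected size falls below the screen-space threshold (the quantity adjusted to hold the frame rate), and otherwise recurse. The `visible`, `screen_size`, and `draw_splat` callbacks stand in for the renderer's own tests:

```python
def traverse(node, threshold_px, visible, screen_size, draw_splat):
    """QSplat-style traversal: hierarchical culling plus level-of-detail control."""
    if not visible(node):                        # frustum / backface culling
        return
    if node.is_leaf or screen_size(node) < threshold_px:
        draw_splat(node)                         # render this bounding sphere as a splat
    else:
        for child in node.children:
            traverse(child, threshold_px, visible, screen_size, draw_splat)
```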
Demo – St. Matthew 3D scan of 2.7 meter statue at 0.25 mm 102,868,637 points File size: 644 MB Preprocessing time: 1 hour
Future Work Splats as primitive: unify rendering of meshes, volumes, point clouds; compatible with shading after rasterization; hybrid point/polygon systems High-level visibility / LOD frameworks: store different kinds of data at each node (alpha, BRDF, scattering function, etc.); potentially could be used to unify image-based rendering (IBR) techniques
Contributions Real-time 3D model acquisition system Video-rate 3D scanner for moving objects Analysis of ICP variants; real-time algorithm Real-time merging and rendering Allows user to see model and fill holes QSplat: interactive rendering of large 3D meshes Single data structure used for visibility culling, level-of-detail control, point rendering, compression Extension to network streaming [I3D 2001]
Acknowledgments Olaf Hall-Holt Lucas Pereira The Original DMich Gang: Dave Koller, Sean Anderson, James Davis, Kari Pulli, Matt Ginzton, Jon Shade DMich, the next generation: Gary King, Steve Marschner Graphics lab Advisor: Marc Levoy Committee: Pat Hanrahan, Leo Guibas, Mark Horowitz, Bernd Girod Family, friends Sponsors: NSF, Interval, Honda, Sony, Intel