Real-time Acquisition and Rendering of Large 3D Models
Szymon Rusinkiewicz
Computer Graphics Pipeline
  Human time = expensive
  Sensors = cheap
    – Computer graphics increasingly relies on measurements of the real world
  [Diagram labels: Rendering; Shape; Lighting and Reflectance; Motion; 3D Scanning]
3D Scanning Applications
  Computer graphics
  Product inspection
  Robot navigation
  As-built floorplans
  Product design
  Archaeology
  Clothes fitting
  Art history
The Digital Michelangelo Project
  Push state of the art in range scanning and demonstrate applications in art and art history
  [Photo captions: Working in the museum; Scanning geometry; Scanning color]
Traditional Range Scanning Pipeline
  High-quality, robust pipeline for producing 3D models:
    – Scan object with laser triangulation scanner: many views from different angles
    – Align pieces into single coordinate frame: initial manual alignment, refined with ICP
    – Merge overlapping regions: compute “average” surface using VRIP [Curless & Levoy 96]
    – Display resulting model
3D Scan of David: Statistics
  Over 5 meters tall
  1/4 mm resolution
  22 people
  30 nights of scanning
  Efficiency max : min = 8 : 1
    – Needed view planning
  Weight of gantry: 800 kg
  Putting model together: man-hours and counting
New 3D Scanning Pipeline
  Need for a fast, inexpensive, easy-to-use 3D scanning system
  Wave a (small, rigid) object by hand in front of the scanner
  Automatically align data as it is acquired
  Let user see partial model as it is being built (fill holes)
  Real-Time Model Acquisition
Real-Time 3D Model Acquisition
  Prototype real-time model acquisition system
    – 3D scanning of moving objects
    – Fast alignment
    – Real-time merging and display
Applications of Easy-to-Use 3D Model Acquisition
  Advertising
  More capabilities in Photoshop
  Movie sets
  Augmented reality
  User interfaces
3D Scanning Technologies
  Contact-based: touch probes
  Passive: shape from stereo, motion, shading
  Active: time-of-flight, defocus, photometric stereo, triangulation
  Triangulation systems are inexpensive, robust, and flexible
    – Take advantage of trends in DLP projectors
Laser Triangulation
  Project laser stripe onto object
  [Diagram labels: Object; Laser; Camera]
Laser Triangulation
  Depth from ray-plane triangulation
  [Diagram labels: Object; Laser; Camera; image point (x,y)]
Triangulation
  Faster acquisition: project multiple stripes
  Correspondence problem: which stripe is which?
Triangulation
  [Diagram: spectrum from “slow, robust” to “fast, fragile”; labels: Single-stripe; Multi-stripe, multi-frame; Single-frame]
Time-Coded Light Patterns
  Assign each stripe a unique illumination code over time [Posdamer 82]
  [Diagram axes: Space; Time]
Gray-Code Patterns
  To minimize effects of quantization error: each point may be a boundary only once
  [Diagram axes: Space; Time]
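A minimal sketch of how such patterns can be generated, assuming the standard reflected binary Gray code (the slide does not say which variant was projected). The function names `gray_code` and `gray_patterns` are illustrative only. With this code, adjacent stripes differ in exactly one frame, so each stripe boundary shows up as a black/white transition only once.

```python
def gray_code(i: int) -> int:
    """Return the reflected binary Gray code of stripe index i."""
    return i ^ (i >> 1)


def gray_patterns(num_frames: int) -> list[list[int]]:
    """Build the projected patterns: patterns[t][s] is 1 (white) or 0 (black)
    for stripe s in frame t. Frame 0 carries the most significant bit."""
    num_stripes = 1 << num_frames
    return [
        [(gray_code(s) >> (num_frames - 1 - t)) & 1 for s in range(num_stripes)]
        for t in range(num_frames)
    ]


if __name__ == "__main__":
    for row in gray_patterns(3):
        print("".join("W" if bit else "B" for bit in row))
    # BBBBWWWW
    # BBWWWWBB
    # BWWBBWWB
```

Reading the printed rows column by column gives each stripe's code over time; every pair of neighboring columns differs in a single row.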
Structured-Light Assumptions
  Structured-light systems make certain assumptions about the scene:
  Spatial continuity assumption:
    – Assume scene is one object
    – Project a grid, pattern of dots, etc.
  Temporal continuity assumption:
    – Assume scene is static
    – Assign stripes a code over time
Codes for Moving Scenes
  We make a different assumption:
    – Object may move
    – Velocity low enough to permit tracking
    – “Spatio-temporal” continuity
Codes for Moving Scenes
  Code stripe boundaries instead of stripes
  Perform frame-to-frame tracking of corresponding boundaries
    – Propagate illumination history
  [Diagram: illumination history = (WB),(BW),(WB) gives the code]
  [Hall-Holt & Rusinkiewicz, ICCV 2001]
New Scanning Pipeline
  Project Code → Capture Images → Find Boundaries → Match Boundaries → Decode → Compute Range
Designing a Code
  Biggest problem is ghosts: WW or BB “boundaries” that can’t be seen directly
Designing a Code
  Design a code to make tracking possible:
    – Do not allow two spatially adjacent ghosts
    – Do not allow two temporally adjacent ghosts
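The two constraints above can be checked mechanically for a candidate code. The sketch below is an illustration, not the authors' design procedure. It assumes each stripe is a `frames`-bit integer whose bit t is its color in frame t, that a boundary is a ghost in a frame when both neighboring stripes have the same color, and that the pattern sequence repeats, so the last frame is treated as adjacent to the first. All function names are hypothetical.

```python
def ghost_table(codes, frames=4):
    """ghost[b][t] is True when the boundary between stripes b and b+1
    is invisible (WW or BB) in frame t."""
    return [
        [(((a ^ b) >> t) & 1) == 0 for t in range(frames)]
        for a, b in zip(codes, codes[1:])
    ]


def no_spatially_adjacent_ghosts(ghost):
    """No two neighboring boundaries are ghosts in the same frame."""
    return not any(
        g0 and g1
        for row0, row1 in zip(ghost, ghost[1:])
        for g0, g1 in zip(row0, row1)
    )


def no_temporally_adjacent_ghosts(ghost):
    """No boundary is a ghost in two consecutive frames (assumed cyclic)."""
    return not any(
        row[t] and row[(t + 1) % len(row)]
        for row in ghost
        for t in range(len(row))
    )


def is_trackable(codes, frames=4):
    ghost = ghost_table(codes, frames)
    return no_spatially_adjacent_ghosts(ghost) and no_temporally_adjacent_ghosts(ghost)


if __name__ == "__main__":
    print(is_trackable([0b0000, 0b1010, 0b1111]))  # True
    print(is_trackable([0b0000, 0b0001]))          # False: ghost in frames 1, 2, 3
```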
Designing a Code
  Graph (for 4 frames):
    – Nodes: stripes (over time)
    – Edges: boundaries (over time)
  [Diagram axes: Space; Time]
Designing a Code
  Graph (for 4 frames):
    – Nodes: stripes (over time)
    – Edges: boundaries (over time)
  [Edge colors: boundary visible at even times; boundary visible at odd times]
  Path with alternating colors: 55 edges in graph; maximal-length traversal has 110 boundaries (111 stripes)
Image Capture
  Standard video camera: fields at 60 Hz
  Genlock camera to projector
Finding Boundaries
  Standard edge detection problem
  Current solution: find minima and maxima of intensity, boundary is between them
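A rough sketch of the min/max approach on a single scanline. The slide only says the boundary lies between a minimum and a maximum; placing it at the midpoint is an assumption made here for simplicity, and the function name `find_boundaries` is hypothetical. Real images would also need smoothing or a contrast threshold to suppress noise, which this sketch omits.

```python
def find_boundaries(scanline):
    """Return sub-pixel boundary positions along one row of intensities.

    A boundary is placed halfway between each adjacent pair of local
    extrema of opposite kind (min then max, or max then min)."""
    extrema = []  # list of (index, kind) with kind in {"min", "max"}
    for i in range(1, len(scanline) - 1):
        prev, cur, nxt = scanline[i - 1], scanline[i], scanline[i + 1]
        if cur > prev and cur >= nxt:
            extrema.append((i, "max"))
        elif cur < prev and cur <= nxt:
            extrema.append((i, "min"))

    boundaries = []
    for (i0, kind0), (i1, kind1) in zip(extrema, extrema[1:]):
        if kind0 != kind1:
            boundaries.append((i0 + i1) / 2)
    return boundaries


if __name__ == "__main__":
    row = [10, 5, 0, 5, 10, 5, 0, 5, 10]
    print(find_boundaries(row))  # [3.0, 5.0]
```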
Matching Stripe Boundaries
  Even if number of ghosts is minimized, matching is not easy
Matching Stripe Boundaries
  Resolve ambiguity by constraining maximum stripe velocity
  Could accommodate higher speeds by estimating velocities
  Could take advantage of methods in tracking literature (e.g., Kalman filters)
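One simple way to apply a maximum-velocity constraint is nearest-neighbor matching within a fixed window, sketched below for boundary positions along one scanline. This is an illustration under stated assumptions, not the system's actual matcher: it ignores ghosts, and `match_boundaries` and `max_shift` are hypothetical names.

```python
def match_boundaries(prev, curr, max_shift):
    """Match each boundary position in the current frame to the nearest
    boundary in the previous frame, if one lies within max_shift pixels.

    Returns a dict mapping index in curr -> index in prev, or None when
    no previous boundary is close enough (the boundary stays untracked)."""
    matches = {}
    for j, x in enumerate(curr):
        best = None  # (index in prev, distance)
        for i, px in enumerate(prev):
            d = abs(x - px)
            if d <= max_shift and (best is None or d < best[1]):
                best = (i, d)
        matches[j] = best[0] if best is not None else None
    return matches


if __name__ == "__main__":
    previous = [10.0, 20.0, 30.0]
    current = [11.5, 21.0, 45.0]
    print(match_boundaries(previous, current, max_shift=3))
    # {0: 0, 1: 1, 2: None}
```

The window must stay below half the stripe spacing for the nearest neighbor to be unambiguous, which is why faster motion would need velocity estimates.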
Decoding Boundaries
  Propagate illumination history
  Table lookup based on illumination history and position in four-frame sequence
    – Once a stripe has been tracked for at least four frames, it contributes useful data on every subsequent frame
Computing 3D Position
  Ray-plane intersection
  Requires calibration of:
    – Camera, projector intrinsics
    – Relative position and orientation
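A minimal sketch of the ray-plane intersection step. It assumes calibration has already produced, in one common coordinate frame, the camera ray through the pixel (`origin`, `direction`) and the projector plane for the decoded stripe boundary, written as normal · x = offset. These names and the plane parameterization are assumptions for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def ray_plane_intersection(origin, direction, normal, offset, eps=1e-12):
    """Intersect the ray origin + t * direction (t >= 0) with the plane
    {x : normal . x = offset}.

    Returns the 3D point, or None if the ray is parallel to the plane
    or the intersection lies behind the ray origin."""
    denom = dot(normal, direction)
    if abs(denom) < eps:
        return None
    t = (offset - dot(normal, origin)) / denom
    if t < 0:
        return None
    return tuple(o + t * d for o, d in zip(origin, direction))


if __name__ == "__main__":
    # Camera at the origin looking along (1, 0, 1); stripe plane x = 1.
    print(ray_plane_intersection((0, 0, 0), (1, 0, 1), (1, 0, 0), 1.0))
    # (1.0, 0.0, 1.0)
```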
Results
  [Figure labels: Video frames; Stripe boundaries (unknown, known, ghosts)]
Results
  Single range image of moving object
  [Figure panels: Gray codes, no tracking; Boundary codes and tracking; each shown in top view and front view]
Aligning 3D Data
  This range scanner can be used for any moving objects
  For rigid objects, range images can be aligned to each other as object moves
Aligning 3D Data
  If correct correspondences are known, it is possible to find correct relative rotation/translation
Aligning 3D Data
  How to find corresponding points?
  Previous systems based on user input, feature matching, surface signatures, etc.
Aligning 3D Data
  Alternative: assume closest points correspond to each other, compute the best transform…
Aligning 3D Data
  … and iterate to find alignment
    – Iterated Closest Points (ICP) [Besl & McKay 92]
  Converges if starting position “close enough”
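The closest-point loop can be sketched in a few lines. To stay dependency-free, the example below works in 2D, where the least-squares rigid transform has a closed form; the scanner itself aligns 3D data, so this is a simplification for illustration. It uses brute-force closest-point search and a point-to-point error, and the function names are hypothetical.

```python
import math


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src -> dst."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n
    cdy = sum(q[1] for q in dst) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - csx, py - csy
        bx, by = qx - cdx, qy - cdy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty


def apply_rigid_2d(theta, tx, ty, pts):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]


def icp_2d(src, dst, iterations=20):
    """Repeatedly pair each source point with its closest target point,
    solve for the best rigid transform, and apply it."""
    cur = list(src)
    for _ in range(iterations):
        matched = [
            min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in cur
        ]
        theta, tx, ty = best_rigid_2d(cur, matched)
        cur = apply_rigid_2d(theta, tx, ty, cur)
    return cur


if __name__ == "__main__":
    target = [(float(x), float(y)) for x in range(5) for y in range(5)]
    moved = apply_rigid_2d(math.radians(2.0), 0.1, -0.05, target)
    aligned = icp_2d(moved, target, iterations=5)
    err = max(math.dist(a, b) for a, b in zip(aligned, target))
    print(err < 1e-9)  # True
```

The example starts close to the answer on purpose: the initial offset is smaller than half the point spacing, so the first round of closest-point pairs is already correct. A larger offset can lead to wrong pairs and a wrong alignment, which is the “close enough” caveat.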
ICP Variants
  Classic ICP algorithm not real-time
  To improve speed: examine stages of ICP and evaluate proposed variants [Rusinkiewicz & Levoy, 3DIM 2001]
    1. Selecting source points (from one or both meshes)
    2. Matching to points in the other mesh
    3. Weighting the correspondences
    4. Rejecting certain (outlier) point pairs
    5. Assigning an error metric to the current transform
    6. Minimizing the error metric
ICP Variant – Point-to-Plane Error Metric
  Using point-to-plane distance instead of point-to-point lets flat regions slide along each other more easily [Chen & Medioni 91]
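For reference, the two error metrics are commonly written as follows. The notation is an assumption, not taken from the slides: p_i are source points, q_i their matched target points, n_i the unit surface normal at q_i, and (R, t) the rigid transform being estimated.

```latex
E_{\text{point}}(R, t) = \sum_i \lVert R\,p_i + t - q_i \rVert^2
\qquad
E_{\text{plane}}(R, t) = \sum_i \bigl( (R\,p_i + t - q_i) \cdot n_i \bigr)^2
```

Only the residual component along n_i is penalized in the point-to-plane form, so motion tangent to a flat region costs nothing.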
Finding Corresponding Points
  Finding closest point is most expensive stage of ICP
    – Brute force search: O(n)
    – Spatial data structure (e.g., k-d tree): O(log n)
    – Voxel grid: O(1), but large constant, slow preprocessing
Finding Corresponding Points
  For range images, simply project point [Blais 95]
    – Constant-time, fast
    – Does not require precomputing a spatial data structure
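A sketch of the projection idea, assuming a simple pinhole camera for the destination range image, with hypothetical parameters `focal`, `cx`, `cy`. The range image is stored as a 2D list whose entries are 3D points, or None where nothing was measured. The source point is assumed to be already expressed in the destination camera's frame.

```python
def project_lookup(point, range_image, focal, cx, cy):
    """Find a correspondence for a 3D point by projecting it into the
    destination range image and reading the sample stored at that pixel.

    Returns the stored 3D point, or None if the projection falls outside
    the image, behind the camera, or on an empty pixel."""
    x, y, z = point
    if z <= 0:
        return None
    u = int(round(focal * x / z + cx))  # column
    v = int(round(focal * y / z + cy))  # row
    if 0 <= v < len(range_image) and 0 <= u < len(range_image[0]):
        return range_image[v][u]
    return None


if __name__ == "__main__":
    image = [[None] * 3 for _ in range(3)]
    image[1][1] = (0.0, 0.0, 2.1)
    print(project_lookup((0.0, 0.0, 2.0), image, focal=1.0, cx=1.0, cy=1.0))
    # (0.0, 0.0, 2.1)
```

The lookup costs one projection and one array access per point, with no search structure to build.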
High-Speed ICP Algorithm
  ICP algorithm with projection-based correspondences, point-to-plane matching can align meshes in a few tens of ms. (cf. over 1 sec. with closest-point)
Anchor Scans
  Alignment of consecutive scans leads to accumulation of ICP errors
  Alternative: align all scans to an “anchor” scan, only switch anchor when overlap low
  Given anchor scans, restart after failed ICP becomes easier
Merging and Rendering
  Goal: visualize the model well enough to be able to see holes
  Cannot display all the scanned data: accumulates linearly with time
  Standard high-quality merging methods: processing time ~ 1 minute per scan
Merging and Rendering
  Real-time incremental merging and rendering:
    – Quantize samples to a 3D grid
    – Maintain average normal of all points at a grid cell
    – Point (splat) rendering
    – Can be made hierarchical to conserve memory
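The first two steps listed above can be sketched with a dictionary keyed by grid cell. This is a flat (non-hierarchical) illustration with hypothetical names. Emitting one splat at each cell center is an assumption; the slide does not say where within a cell the splat is drawn.

```python
import math
from collections import defaultdict


class VoxelGrid:
    """Incrementally merge oriented samples by quantizing them to a 3D grid."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        # cell index -> running sum of the normals that fell into the cell
        self.normal_sums = defaultdict(lambda: [0.0, 0.0, 0.0])

    def add(self, point, normal):
        key = tuple(math.floor(c / self.cell_size) for c in point)
        acc = self.normal_sums[key]
        for k in range(3):
            acc[k] += normal[k]

    def splats(self):
        """Yield (cell center, unit average normal) for each occupied cell."""
        for key, (nx, ny, nz) in self.normal_sums.items():
            length = math.sqrt(nx * nx + ny * ny + nz * nz) or 1.0
            center = tuple((k + 0.5) * self.cell_size for k in key)
            yield center, (nx / length, ny / length, nz / length)


if __name__ == "__main__":
    grid = VoxelGrid(cell_size=1.0)
    grid.add((0.2, 0.2, 0.2), (0.0, 0.0, 1.0))
    grid.add((0.8, 0.1, 0.3), (0.0, 1.0, 0.0))  # same cell as the first sample
    grid.add((2.5, 0.0, 0.0), (1.0, 0.0, 0.0))
    for center, normal in grid.splats():
        print(center, normal)
```

Memory grows with the number of occupied cells rather than with scanning time, so the display stays bounded as scans accumulate.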
Photograph
Real-time Scanning Demo
Postprocessing
  Goal of real-time display is to let user evaluate coverage, fill holes
    – Quality/speed tradeoff
  Offline postprocessing for high-quality models
Merged Result
  [Figure panels: Photograph; Aligned scans; Merged]
Future Work
  Technological improvements:
    – Use full resolution of projector
    – Higher-resolution cameras
    – Ideas from design of single-stripe 3D scanners
  Pipeline improvements:
    – Better detection of failed alignment
    – Better handling of object texture (combine with stereo?)
    – Global registration to eliminate drift
    – More sophisticated merging
    – Improve user interaction during scanning
Future Work
  Faster scanning
    – Better stripe boundary matching
    – Multiple cameras, projectors
    – High-speed cameras
  Application in different contexts
    – Small, hand-held
    – Cart- or shoulder-mounted for digitizing rooms
    – Infrared for imperceptibility
Rendering of Large Models
  Range scanners increasingly capable of producing very large models
    – DMich models are 100 million to 1 billion samples
  Challenge: how to allow viewing in real time
    – Fast startup, progressive loading
  Traditional answer: triangle meshes, simplification, hardware-accelerated rendering
    – Impractical for such large models
  Alternative: revisit basic data structure
  QSplat [Rusinkiewicz & Levoy, SIGGRAPH 00]
QSplat
  Key observation: a single bounding sphere hierarchy can be used for
    – Hierarchical frustum and backface culling
    – Level of detail control
    – Splat rendering [Westover 89]
QSplat Node Structure
  Field                      Size
  Position and Radius        13 bits
  Tree Structure             3 bits
  Normal                     14 bits
  Width of Cone of Normals   2 bits
  Color (Optional)           16 bits
  Total                      6 bytes
QSplat Node Structure
  Position and radius encoded relative to parent node
    – Hierarchical coding vs. delta coding along a path for vertex positions
  [Diagram labels: Center Offset; Radius Ratio]
QSplat Node Structure
  [Figure sequence comparing position encodings: Uncompressed; Delta Coding [Deering 96]; Hierarchical Coding]
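The field widths on the node-structure slides (13 + 3 + 14 + 2 + 16 bits) add up to 48 bits, or 6 bytes. The sketch below packs already-quantized field values into 6 bytes to make that arithmetic concrete. The field order within the bytes, the big-endian layout, and the names are assumptions for illustration; how each field is quantized is not covered here.

```python
# (name, width in bits), in the order the fields are listed on the slide
FIELDS = (
    ("pos_radius", 13),
    ("tree", 3),
    ("normal", 14),
    ("cone", 2),
    ("color", 16),
)
NODE_BYTES = sum(bits for _, bits in FIELDS) // 8  # 48 bits -> 6 bytes


def pack_node(**values):
    """Pack quantized field values into a 6-byte node record."""
    word = 0
    for name, bits in FIELDS:
        v = values[name]
        if not 0 <= v < (1 << bits):
            raise ValueError(f"{name}={v} does not fit in {bits} bits")
        word = (word << bits) | v
    return word.to_bytes(NODE_BYTES, "big")


def unpack_node(data):
    """Inverse of pack_node: recover the field values from 6 bytes."""
    word = int.from_bytes(data, "big")
    out = {}
    for name, bits in reversed(FIELDS):
        out[name] = word & ((1 << bits) - 1)
        word >>= bits
    return out


if __name__ == "__main__":
    node = dict(pos_radius=5000, tree=5, normal=12345, cone=2, color=0xF81F)
    raw = pack_node(**node)
    print(len(raw), unpack_node(raw) == node)  # 6 True
```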
QSplat Rendering Algorithm
  Traverse hierarchy recursively:
    if (node not visible)
        Skip this branch
    else if (leaf node)
        Draw a splat
    else if (size on screen < threshold)
        Draw a splat
    else
        Traverse children
  Annotations: hierarchical frustum / backface culling (visibility test); point rendering (splats); level of detail control (screen-size test); threshold adjusted to maintain desired frame rate
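A runnable sketch of the traversal above. The `Node` class and the caller-supplied `is_visible`, `screen_size`, and `draw_splat` callbacks are placeholders for illustration; actual culling tests, projected-size computation, and drawing are outside the scope of this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    center: Tuple[float, float, float]
    radius: float
    children: List["Node"] = field(default_factory=list)


def traverse(
    node: Node,
    is_visible: Callable[[Node], bool],
    screen_size: Callable[[Node], float],
    threshold: float,
    draw_splat: Callable[[Node], None],
) -> None:
    """Recursive traversal of a bounding sphere hierarchy."""
    if not is_visible(node):
        return  # skip this branch (frustum / backface culling)
    if not node.children or screen_size(node) < threshold:
        draw_splat(node)  # leaf, or small enough on screen
        return
    for child in node.children:
        traverse(child, is_visible, screen_size, threshold, draw_splat)


if __name__ == "__main__":
    leaf_a = Node((0.0, 0.0, 0.0), 1.0)
    leaf_b = Node((1.0, 0.0, 0.0), 1.0)
    mid = Node((0.5, 0.0, 0.0), 2.0, [leaf_a, leaf_b])
    hidden = Node((9.0, 0.0, 0.0), 2.0)
    root = Node((0.0, 0.0, 0.0), 4.0, [mid, hidden])

    drawn = []
    traverse(root, lambda n: n is not hidden, lambda n: n.radius, 1.5, drawn.append)
    print([n.radius for n in drawn])  # [1.0, 1.0]
```

Raising `threshold` makes the traversal stop higher in the tree and draw fewer, larger splats, which is the knob a renderer could turn to hold a target frame rate.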
Demo – St. Matthew
  3D scan of 2.7 meter statue at 0.25 mm
  102,868,637 points
  File size: 644 MB
  Preprocessing time: 1 hour
Future Work
  Splats as primitive
    – Unify rendering of meshes, volumes, point clouds
    – Compatible with shading after rasterization
    – Hybrid point/polygon systems
  High-level visibility / LOD frameworks
    – Store different kinds of data at each node: alpha, BRDF, scattering function, etc.
    – Potentially could be used to unify image-based-rendering (IBR) techniques
Contributions
  Real-time 3D model acquisition system
    – Video-rate 3D scanner for moving objects
    – Analysis of ICP variants; real-time algorithm
    – Real-time merging and rendering
    – Allows user to see model and fill holes
  QSplat: interactive rendering of large 3D meshes
    – Single data structure used for visibility culling, level-of-detail control, point rendering, compression
    – Extension to network streaming [I3D 2001]
Acknowledgments
  Olaf Hall-Holt
  Lucas Pereira
  The Original DMich Gang: Dave Koller, Sean Anderson, James Davis, Kari Pulli, Matt Ginzton, Jon Shade
  DMich, the next generation: Gary King, Steve Marschner
  Graphics lab
  Advisor: Marc Levoy
  Committee: Pat Hanrahan, Leo Guibas, Mark Horowitz, Bernd Girod
  Family, friends
  Sponsors: NSF, Interval, Honda, Sony, Intel