Image-based modeling and rendering

In computer graphics and computer vision, image-based modeling and rendering (IBMR) methods rely on a set of two-dimensional images of a scene to generate a three-dimensional model and then render novel views of that scene.

The traditional approach of computer graphics has been to create a geometric model in 3D and project it onto a two-dimensional image. Computer vision, conversely, is mostly focused on detecting, grouping, and extracting features (edges, faces, etc.) present in a given picture and then trying to interpret them as three-dimensional clues. Image-based modeling and rendering uses multiple two-dimensional images to generate novel two-dimensional images directly, skipping the manual modeling stage.

Light modeling

Instead of considering only the physical model of a solid, IBMR methods usually focus on light modeling. The fundamental concept behind IBMR is the plenoptic function, a parameterization of the light field: it describes every light ray contained in a given volume. The function has seven dimensions: a ray is defined by its position (x, y, z), its orientation (θ, φ), its wavelength λ, and the time t, giving P(x, y, z, θ, φ, λ, t). IBMR methods try to approximate the plenoptic function in order to render a novel set of two-dimensional images from another. Given the high dimensionality of this function, practical methods place constraints on the parameters to reduce the number of dimensions, typically to between two and four.
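A common reduction is the two-plane ("light slab") parameterization used by light-field rendering [9] and the lumigraph [16]: for a static scene (dropping t) under fixed illumination (dropping λ, per color channel), each ray is indexed by its intersections (u, v) and (s, t) with two parallel planes, leaving a four-dimensional function L(u, v, s, t). The Python sketch below illustrates this reduction; the plane depths and the sample coordinates are illustrative assumptions, not part of any particular system.

    # Minimal sketch: reducing the 7D plenoptic function to the 4D two-plane
    # ("light slab") light-field parameterization. The plane depths z_uv, z_st
    # and the sample ray below are illustrative assumptions.
    import numpy as np

    def ray_from_two_planes(u, v, s, t, z_uv=0.0, z_st=1.0):
        """Return (origin, unit direction) of the ray through (u, v) on the
        first plane and (s, t) on the second: the 4D index of L(u, v, s, t)."""
        origin = np.array([u, v, z_uv])
        target = np.array([s, t, z_st])
        direction = target - origin
        return origin, direction / np.linalg.norm(direction)

    # For a static scene under fixed lighting, querying L(u, v, s, t) amounts
    # to looking up the radiance along this ray in the captured images.
    origin, direction = ray_from_two_planes(0.2, -0.1, 0.3, 0.4)
    print(origin, direction)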

IBMR methods and algorithms

  • View morphing generates a transition between images
  • Panoramic imaging renders panoramas using image mosaics of individual still images
  • Lumigraph relies on a dense sampling of a scene
  • Space carving generates a 3D model based on a photo-consistency check (a minimal sketch of this check follows the list)
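The photo-consistency test at the heart of space carving can be sketched as follows: a voxel is retained only if all images that see it observe sufficiently similar colors at its projections, which holds for points on a roughly Lambertian scene surface. This is a minimal illustration; the threshold and the color-sampling stub are hypothetical, not part of the published algorithm.

    # Minimal sketch of the photo-consistency check used by space carving:
    # a voxel survives only if its projections into all views that see it
    # agree in color. The 0.05 threshold and the sample_colours stub are
    # illustrative assumptions.
    import numpy as np

    def photo_consistent(samples, threshold=0.05):
        """samples: (n_views, 3) array of RGB values observed at the voxel's
        projections. Consistent if the per-channel spread is small."""
        samples = np.asarray(samples, dtype=float)
        return bool(np.all(samples.std(axis=0) < threshold))

    def carve(voxels, sample_colours, threshold=0.05):
        """Keep only photo-consistent voxels.
        sample_colours(voxel) -> (n_views, 3) observed RGB values."""
        return [v for v in voxels
                if photo_consistent(sample_colours(v), threshold)]

    # A voxel seen as the same grey from three views is kept:
    print(carve([(0, 0, 0)], lambda v: [[0.5, 0.5, 0.5]] * 3))

In the full algorithm the voxel grid is swept plane by plane so that visibility can be updated incrementally as voxels are removed.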

Image-based modeling (IBM)

  • Projective methods: these techniques exploit projective properties of the scene to reconstruct geometric models directly from a set of photographs (Photo3D [2], PhotoModeler [3], PhotoBuilder [4]).
  • Tour into the picture: the simplest image-based modeling technique; it recovers, from a single picture, an extremely simplified scene model consisting of just a few texture-mapped polygons [5].
  • Façade: the Façade system uses a non-linear optimization algorithm to reconstruct 3D textured models of architectural elements from photographs [6].
  • Voxel coloring: the algorithm identifies a special set of invariant voxels that together form a spatial and photometric reconstruction of the scene, fully consistent with the input images; it copes with large changes in visibility and models intrinsic scene color and texture [7].
  • Multi-view geometry: a set of geometric relations between multiple views of a 3D scene, applied to recover 3D models from images [8]; a minimal triangulation sketch follows this list.
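To make the multi-view geometry item concrete, the sketch below performs linear (DLT) triangulation of a 3D point from its projections in two views, the basic operation by which image measurements become 3D structure. The toy camera matrices and the test point are illustrative assumptions.

    # Minimal sketch of linear (DLT) triangulation, a core multi-view-geometry
    # operation: recover a 3D point from its projections in two calibrated
    # views. The example cameras below are illustrative assumptions.
    import numpy as np

    def triangulate(P1, P2, x1, x2):
        """P1, P2: 3x4 camera projection matrices; x1, x2: 2D image points.
        Solves A X = 0 for the homogeneous 3D point X via SVD."""
        A = np.stack([
            x1[0] * P1[2] - P1[0],
            x1[1] * P1[2] - P1[1],
            x2[0] * P2[2] - P2[0],
            x2[1] * P2[2] - P2[1],
        ])
        X = np.linalg.svd(A)[2][-1]     # null vector of A
        return X[:3] / X[3]

    # Two toy cameras: identity pose, and a unit translation along x.
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
    X = np.array([0.3, -0.2, 4.0])                       # ground-truth point
    x1 = P1 @ np.append(X, 1.0); x1 = x1[:2] / x1[2]     # its projections
    x2 = P2 @ np.append(X, 1.0); x2 = x2[:2] / x2[2]
    print(triangulate(P1, P2, x1, x2))   # recovers approx. [0.3, -0.2, 4.0]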

Image-based rendering (IBR)

  • Light-field rendering: a method for generating new views from arbitrary camera positions, without depth information or feature matching, simply by combining and resampling the available images [9].
  • Plenoptic stitching: gives the viewer the ability to explore unobstructed environments of arbitrary size and shape; appropriate samples for most viewpoints are gathered by moving an omnidirectional video camera over a grid of paths spanning the environment [10].
  • Cylindrical panoramas: provide horizontal orientation independence when exploring an environment from a single point. Cylindrical panoramas can be created using specialized panoramic cameras [11, 12, 13].
  • Concentric mosaics: a generalization of cylindrical panoramas that allows the viewer to explore a circular region and experience horizontal parallax and lighting effects. Instead of using a single cylindrical image, slit cameras are rotated along planar concentric circles, and a series of concentric manifold mosaics is created by composing the slit images acquired by each camera along its circular path. Unlike the light field and the lumigraph, where cameras are placed on a two-dimensional grid, the concentric-mosaics representation reduces the amount of data by capturing a sequence of images along a circular path [14, 15].
  • Lumigraph: similar to light-field rendering, but applies approximate geometry to compensate for non-uniform sampling, in order to improve rendering performance [16].
  • Transfer methods: characterized by the use of a relatively small number of images together with geometric constraints, either recovered at some stage or known a priori, to reproject image pixels appropriately at a given virtual camera viewpoint (Laveau and Faugeras) [17, 18].
  • Relief texture mapping: to improve the rendering speed of 3D warping, the warping process is factored into a relatively simple pre-warping step and a traditional texture-mapping step [19].
  • Image-based objects: a compact image-based representation for 3D objects that can be rendered in occlusion-compatible order. An image-based object is constructed by acquiring multiple views of the object, then registering and resampling them from every center of projection onto the faces of a parallelepiped. The use of a parallelepiped allows the representation to be decomposed into parameterized planar regions for which a warper can be efficiently implemented [20].
  • Image-based visual hulls: based on efficiently computing and shading visual hulls from silhouette image data. The algorithm exploits epipolar geometry and incremental computation to achieve a constant rendering cost per rendered pixel [21].
  • 3D warping: with depth available for every point in one or more images, 3D warping techniques can render from any nearby point of view by projecting the pixels of the original image to their proper 3D locations and re-projecting them onto the new picture [22]; a minimal sketch follows this list.
  • Layered depth images: to deal with the disocclusion artifacts of 3D warping, a layered depth image stores not only what is visible in the input image but also what lies behind the visible surface: each pixel contains a list of depth and color values where the ray through the pixel intersects the scene [23].
  • View-dependent texture maps: view-dependent texture mapping renders novel views by warping and compositing several input images. A three-step variant further reduces the computational cost and provides smoother blending by employing visibility preprocessing, polygon-view maps, and projective texture mapping [24, 25].
  • Surface light field: a function that assigns a color to each ray originating on a surface. Surface light fields are well suited to constructing virtual images of shiny objects under complex lighting conditions [26].
  • Light field mapping: a representation and interactive visualization of surface light fields that partitions the radiance data over elementary surface primitives and approximates each partition by a small set of lower-dimensional discrete functions. The rendering algorithm decodes directly from this compact representation at interactive frame rates [27].
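As promised in the 3D warping item above, here is a minimal sketch: each pixel of a source view with known depth is unprojected to 3D and reprojected into a nearby novel view. The intrinsics K and the relative pose (R, t) below are illustrative assumptions, and occlusion handling (z-buffering or splatting) and hole filling are omitted.

    # Minimal sketch of forward 3D warping: pixels of a source image with
    # per-pixel depth are unprojected to 3D and reprojected into a nearby
    # novel view. K, R and t are illustrative assumptions; occlusion
    # handling and hole filling are omitted.
    import numpy as np

    def warp(depth, K, R, t):
        """depth: (H, W) depth map of the source view.
        Returns (H, W, 2) pixel coordinates of each source pixel in the
        novel view defined by rotation R and translation t."""
        H, W = depth.shape
        v, u = np.mgrid[0:H, 0:W]                      # pixel row/col grids
        pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
        pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)   # unproject
        proj = K @ (R @ pts + t.reshape(3, 1))                # reproject
        return (proj[:2] / proj[2]).T.reshape(H, W, 2)

    K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
    new_xy = warp(np.full((480, 640), 2.0), K, np.eye(3),
                  np.array([0.05, 0.0, 0.0]))
    print(new_xy[240, 320])   # centre pixel shifts ~12.5 px for this baseline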


For an exhaustive overview of the methods and algorithms available in this area, see the surveys by Oliveira [1] and by Shum and Kang [28].

References

[1] Oliveira, Manuel M. "Image-based modeling and rendering techniques: A survey." RITA 9.2 (2002): 37–66.

[2] Apollo Software. http://www.photo3d.com/eindex.html (July 2002).

[3] PhotoModeler. http://www.PhotoModeler.com (July 2002).

[4] Roberto Cipolla, Duncan Robertson and Edmond Boyer. PhotoBuilder – 3D Models of Architectural Scenes from Uncalibrated Images. Conference on Multimedia Computing and Systems, June 1999. pp. 25–31.

[5] Horry, Youichi, Ken Ichi Anjyo, and Kiyoshi Arai. Tour into the picture: Using a spidery mesh interface to make animation from a single image. Proceedings of SIGGRAPH 1997. pp. 225-232.

[6] Xiao, Jianxiong, et al. "Image-based façade modeling." ACM Transactions on Graphics 27.5 (2008).

[7] Seitz, Steven M., and Charles R. Dyer. "Photorealistic scene reconstruction by voxel coloring." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1997.

[8] Heyden, Anders, and Marc Pollefeys. "Multiple view geometry." Emerging Topics in Computer Vision (2005).

[9] Levoy, Marc, and Pat Hanrahan. "Light field rendering." Proceedings of the 23rd annual conference on Computer graphics and interactive techniques. ACM, 1996.

[10] Aliaga, Daniel G., and Ingrid Carlbom. "Plenoptic stitching: a scalable method for reconstructing 3D interactive walkthroughs." Proceedings of the 28th annual conference on Computer graphics and interactive techniques. ACM, 2001.

[11] Roundshot 220VR. http://www.roundshot.com/cameras220VR.html (July 2002).

[12] Globuscope Panoramic Camera. http://www.everent.com/globus/ (July 2002).

[13] Chen, Shenchang Eric. QuickTime VR – An Image-Based Approach to Virtual Environment Navigation. Proceedings of SIGGRAPH 1995. pp. 29–38.

[14] Peleg, Shmuel and Joshua Herman. Panoramic Mosaics by Manifold Projection. Proceedings of the Conference on Computer Vision and Pattern Recognition 1997. pp. 338-343.

[15] Shum, Heung-Yeung and Li-Wei He. Rendering with Concentric Mosaics. Proceedings of SIGGRAPH 1999, pp. 299-306.

[16] Gortler, Steven J., et al. "The lumigraph." Proceedings of the 23rd annual conference on Computer graphics and interactive techniques. ACM, 1996.

[17] S. Laveau and O. D. Faugeras. 3-d scene representation as a collection of images. In Twelfth International Conference on Pattern Recognition (ICPR’94), volume A, pages 689–691, Jerusalem, Israel, October 1994. IEEE Computer Society Press.

[18] O. Faugeras. Three-dimensional computer vision: A geometric viewpoint. MIT Press, Cambridge, Massachusetts, 1993.

[19] Oliveira, Manuel M., Gary Bishop and David McAllister. Relief Texture Mapping. Proceedings of SIGGRAPH 2000. pp. 359-368.

[20] Oliveira, Manuel M. and Gary Bishop. Image-Based Objects. Proceedings of 1999 ACM Symposium on Interactive 3D Graphics. pp. 191-198.

[21] Matusik, Wojciech, et al. "Image-based visual hulls." Proceedings of the 27th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., 2000.

[22] L. McMillan. An Image-Based Approach to Three-Dimensional Computer Graphics. Ph.D. dissertation, University of North Carolina at Chapel Hill, Technical Report TR97-013, 1997.

[23] J. Shade, S. Gortler, L.-W. He, and R. Szeliski. Layered depth images. In Computer Graphics (SIGGRAPH’98) Proceedings, pages 231–242, Orlando, July 1998. ACM SIGGRAPH.

[24] P. Debevec, Y. Yu, and G. Borshukov. Efficient view-dependent image-based rendering with projective texture-mapping. In Proc. 9th Eurographics Workshop on Rendering, pages 105–116, 1998.

[25] P. E. Debevec, C. J. Taylor, and J. Malik. Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach. Computer Graphics (SIGGRAPH’96), pages 11–20, August 1996.

[26] Wood, Daniel N., et al. "Surface light fields for 3D photography." Proceedings of the 27th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., 2000.

[27] Chen, Wei-Chao, et al. "Light field mapping: efficient representation and hardware rendering of surface light fields." ACM Transactions on Graphics (TOG) 21.3 (2002): 447-456.

[28] Shum, Harry, and Sing B. Kang. "Review of image-based rendering techniques." Visual Communications and Image Processing 2000. International Society for Optics and Photonics, 2000.
