Texture synthesis

Texture synthesis is the process of algorithmically constructing a large digital image from a small digital sample image by taking advantage of its structural content. It is an object of research in computer graphics and is used in many fields, including digital image editing, 3D computer graphics and the post-production of films.

Texture synthesis can be used to fill in holes in images (as in inpainting), to create large non-repetitive background images and to expand small pictures.[1]

Contrast with procedural textures

Procedural textures are a related technique that may synthesise textures from scratch with no source material. By contrast, texture synthesis refers to techniques in which some source image is matched or extended.

Textures

"Texture" is an ambiguous word and in the context of texture synthesis may have one of the following meanings:

  1. In common speech, the word "texture" is used as a synonym for "surface structure". Texture has been described by five different properties in the psychology of perception: coarseness, contrast, directionality, line-likeness and roughness.
  2. In 3D computer graphics, a texture is a digital image applied to the surface of a three-dimensional model by texture mapping to give the model a more realistic appearance. Often, the image is a photograph of a "real" texture, such as wood grain.
  3. In image processing, every digital image composed of repeated elements is called a "texture."
A mix of photographs and generated images, illustrating the texture spectrum

Textures can be arranged along a spectrum running from regular to stochastic, connected by a smooth transition.[2]

Goal

Texture synthesis algorithms are intended to create an output image that is perceived as the same texture as the sample, without being an exact copy and without visible seams or obvious repetition. Like most algorithms, texture synthesis should also be efficient in computation time and in memory use.

Methods

The following methods and algorithms have been researched or developed for texture synthesis:

Tiling

The simplest way to generate a large image from a sample image is to tile it: multiple copies of the sample are simply copied and pasted side by side. The result is rarely satisfactory. Except in rare cases, there will be seams between the tiles and the image will be highly repetitive.
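
As a minimal illustration, tiling is a one-liner in NumPy (a sketch, assuming the sample is an H×W×3 array; the function name is ours):

```python
import numpy as np

def tile_texture(sample: np.ndarray, reps_y: int, reps_x: int) -> np.ndarray:
    """Repeat the sample side by side; nothing is done to hide the seams."""
    # np.tile replicates the array along each axis; the trailing 1 leaves
    # the colour channels unchanged.
    return np.tile(sample, (reps_y, reps_x, 1))
```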

Stochastic texture synthesis

Stochastic texture synthesis methods produce an image by randomly choosing colour values for each pixel, influenced only by basic parameters like minimum brightness, average colour or maximum contrast. These algorithms perform well with stochastic textures only; otherwise they produce completely unsatisfactory results, as they ignore any kind of structure within the sample image.
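
A minimal sketch of the idea, assuming, as one plausible choice of "basic parameters", the per-channel mean and standard deviation of the sample:

```python
import numpy as np

def stochastic_synthesis(sample: np.ndarray, out_h: int, out_w: int,
                         seed: int = 0) -> np.ndarray:
    """Toy sketch: draw every output pixel independently from per-channel
    Gaussian statistics of the sample; all spatial structure is ignored."""
    rng = np.random.default_rng(seed)
    flat = sample.reshape(-1, sample.shape[-1]).astype(float)
    mean, std = flat.mean(axis=0), flat.std(axis=0)
    noise = rng.normal(mean, std, size=(out_h, out_w, sample.shape[-1]))
    return np.clip(noise, 0, 255).astype(np.uint8)
```

On a structured sample such as a brick wall, the output is just coloured noise with roughly the right brightness, which is exactly the failure mode described above.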

Single-purpose structured texture synthesis

Algorithms of this family use a fixed procedure to create an output image, i.e. they are limited to a single kind of structured texture. Thus, these algorithms can only be applied to structured textures, and only to textures with a very similar structure. For example, a single-purpose algorithm could produce high-quality texture images of stone walls; yet it is very unlikely to produce any viable output if given a sample image that shows pebbles.

Chaos mosaic

This method, proposed by the Microsoft group for internet graphics, is a refined version of tiling and performs the following three steps:

  1. The output image is filled completely by tiling. The result is a repetitive image with visible seams.
  2. Parts of the sample, of random size, are selected at random and pasted at random positions onto the output image. The result is a rather non-repetitive image with visible seams.
  3. The output image is filtered to smooth edges.

The result is an acceptable texture image, which is not too repetitive and does not contain too many artifacts. Still, this method is unsatisfactory because the smoothing in step 3 makes the output image look blurred.
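
A toy version of the three steps, assuming a NumPy sample and SciPy for the smoothing; the patch counts and sizes below are illustrative choices, not values from the original method:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def chaos_mosaic(sample, out_h, out_w, n_patches=200, sigma=1.0, seed=0):
    """Toy chaos mosaic: tile, overwrite with random patches, then smooth."""
    rng = np.random.default_rng(seed)
    h, w, _ = sample.shape
    # Step 1: fill the output by plain tiling (repetitive, visible seams).
    out = np.tile(sample, (out_h // h + 1, out_w // w + 1, 1))[:out_h, :out_w]
    # Step 2: paste randomly sized sample patches at random positions.
    for _ in range(n_patches):
        ph, pw = rng.integers(h // 8, h // 2), rng.integers(w // 8, w // 2)
        sy, sx = rng.integers(0, h - ph), rng.integers(0, w - pw)
        ty, tx = rng.integers(0, out_h - ph), rng.integers(0, out_w - pw)
        out[ty:ty + ph, tx:tx + pw] = sample[sy:sy + ph, sx:sx + pw]
    # Step 3: low-pass filter to soften seams; this smoothing is the
    # source of the blur criticized above.
    return gaussian_filter(out.astype(float), sigma=(sigma, sigma, 0))
```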

Pixel-based texture synthesis

These methods, using Markov random fields,[3] non-parametric sampling,[4] tree-structured vector quantization[5] and image analogies,[6] are some of the simplest and most successful general texture synthesis algorithms. They typically synthesize a texture in scan-line order by finding and copying the sample pixel whose local neighborhood is most similar to that of the partially synthesized texture. These methods are very useful for image completion. They can be constrained, as in image analogies, to perform many interesting tasks. They are typically accelerated with some form of approximate nearest-neighbor search, since exhaustive search for the best pixel is somewhat slow. The synthesis can also be performed in multiresolution, such as through use of a noncausal nonparametric multiscale Markov random field.[7]
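
The core loop is easy to state, if slow. The sketch below follows the spirit of this family of methods with several simplifications of our own (exhaustive search, a fixed causal neighborhood, and picking the single best match rather than sampling among the near-best):

```python
import numpy as np

def causal_neighborhood(img, i, j, half):
    # `half` full rows above pixel (i, j) plus the `half` pixels to its
    # left in the current row, flattened into one feature vector.
    above = img[i - half:i, j - half:j + half + 1]
    left = img[i, j - half:j]
    return np.concatenate([above.reshape(-1), left.reshape(-1)]).astype(float)

def pixel_synthesis(sample, out_h, out_w, half=2):
    """Toy scan-line synthesis: each pixel copies the sample pixel whose
    causal neighborhood best matches the output synthesized so far."""
    h, w, c = sample.shape
    # Bank of all valid sample neighborhoods and their centre pixels.
    feats, pix = [], []
    for i in range(half, h):
        for j in range(half, w - half):
            feats.append(causal_neighborhood(sample, i, j, half))
            pix.append(sample[i, j])
    feats, pix = np.stack(feats), np.stack(pix)
    # Initialize by tiling so border pixels have valid neighborhoods.
    out = np.tile(sample, (out_h // h + 1, out_w // w + 1, 1))[:out_h, :out_w]
    for i in range(half, out_h):
        for j in range(half, out_w - half):
            d = ((feats - causal_neighborhood(out, i, j, half)) ** 2).sum(axis=1)
            out[i, j] = pix[int(np.argmin(d))]
    return out
```

The inner argmin is the exhaustive search mentioned above; replacing it with an approximate nearest-neighbor index is the usual acceleration.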

Image quilting

Patch-based texture synthesis

Patch-based texture synthesis creates a new texture by copying and stitching together patches of the sample texture at various offsets, similar to the use of the clone tool to manually synthesize a texture. Image quilting[8] and graphcut textures[9] are the best-known patch-based texture synthesis algorithms. These algorithms tend to be more effective and faster than pixel-based texture synthesis methods.
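
A rough sketch of the patch-based idea follows. It chooses each block as the random candidate whose overlap with already-placed content has the lowest sum of squared differences; where real image quilting would then cut a minimum-error seam through the overlap, this sketch simply averages it, which slightly blurs the transition:

```python
import numpy as np

def quilt(sample, n_by, n_bx, block=32, overlap=8, n_cand=500, seed=0):
    """Toy patch-based synthesis: SSD-matched blocks, blended overlaps."""
    rng = np.random.default_rng(seed)
    h, w, c = sample.shape          # sample must be larger than `block`
    step = block - overlap
    out = np.zeros((step * n_by + overlap, step * n_bx + overlap, c))
    for by in range(n_by):
        for bx in range(n_bx):
            y, x = by * step, bx * step
            best, best_err = None, np.inf
            for _ in range(n_cand):
                sy, sx = rng.integers(0, h - block), rng.integers(0, w - block)
                cand = sample[sy:sy + block, sx:sx + block].astype(float)
                err = 0.0
                if bx > 0:  # mismatch cost over the left overlap strip
                    err += ((cand[:, :overlap]
                             - out[y:y + block, x:x + overlap]) ** 2).sum()
                if by > 0:  # mismatch cost over the top overlap strip
                    err += ((cand[:overlap]
                             - out[y:y + overlap, x:x + block]) ** 2).sum()
                if err < best_err:
                    best, best_err = cand, err
            # Paste, averaging with whatever is already filled in.
            region = out[y:y + block, x:x + block]
            filled = region.sum(axis=-1, keepdims=True) > 0
            out[y:y + block, x:x + block] = np.where(filled,
                                                     (region + best) / 2, best)
    return out.astype(np.uint8)
```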

Deep learning and neural network approaches

More recently, deep learning methods have been shown to be a powerful, fast and data-driven parametric approach to texture synthesis. The work of Leon Gatys[10] is a milestone: he and his co-authors showed that filters from a discriminatively trained deep neural network can be used as effective parametric image descriptors, leading to a novel texture synthesis method.
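
At the heart of the method is matching Gram matrices of deep features. A minimal sketch using PyTorch and a pretrained VGG-19 from recent torchvision (the layer indices correspond to relu1_1 through relu5_1; Gatys et al. used L-BFGS and input normalization, both simplified away here):

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

def gram(feat):
    """Gram matrix of a (1, C, H, W) feature map: channel co-occurrence
    statistics with all spatial arrangement averaged out."""
    _, c, h, w = feat.shape
    f = feat.view(c, h * w)
    return f @ f.t() / (c * h * w)

vgg = vgg19(weights="IMAGENET1K_V1").features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)
LAYERS = {1, 6, 11, 20, 29}  # relu1_1 ... relu5_1

def texture_stats(img):
    stats, x = [], img
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in LAYERS:
            stats.append(gram(x))
    return stats

def synthesize(sample, steps=250):
    """Optimize a noise image until its Gram matrices match the sample's.
    `sample` is assumed to be a (1, 3, H, W) tensor in [0, 1]."""
    targets = [t.detach() for t in texture_stats(sample)]
    x = torch.rand_like(sample, requires_grad=True)
    opt = torch.optim.Adam([x], lr=0.02)  # simplification: Adam, not L-BFGS
    for _ in range(steps):
        opt.zero_grad()
        loss = sum(F.mse_loss(g, t) for g, t in zip(texture_stats(x), targets))
        loss.backward()
        opt.step()
    return x.detach().clamp(0, 1)
```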

Another recent development is the use of generative models for texture synthesis. The Spatial GAN[11] method showed for the first time the use of fully unsupervised GANs for texture synthesis. In a subsequent work,[12] the method was extended further: PSGAN can learn both periodic and non-periodic images in an unsupervised way from single images or large datasets of images. In addition, flexible sampling in the noise space allows the creation of novel textures of potentially infinite output size, and smooth transitions between them. This makes PSGAN unique with respect to the types of images a texture synthesis method can create.
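
The "potentially infinite output size" property comes from the generator being fully convolutional over a spatial grid of noise. A toy PyTorch illustration of that mechanism (our own architecture, not the published PSGAN network):

```python
import torch
import torch.nn as nn

class SpatialGenerator(nn.Module):
    """Toy fully convolutional generator: the output size scales with the
    spatial extent of the input noise grid, so one trained network can
    emit textures of arbitrary size."""
    def __init__(self, z_ch=20):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_ch, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z):        # z: (N, z_ch, H, W) spatial noise grid
        return self.net(z)       # each layer doubles H and W -> (N, 3, 8H, 8W)

g = SpatialGenerator()
small = g(torch.randn(1, 20, 8, 8))    # a 64 x 64 texture
large = g(torch.randn(1, 20, 64, 64))  # a 512 x 512 texture, same weights
```

In PSGAN itself, smooth transitions between textures correspond to interpolating parts of the noise tensor; the toy network above only illustrates the size-scaling property.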

Implementations

Some texture synthesis implementations exist as plug-ins for the free image editor GIMP. Pixel-based and patch-based texture synthesis implementations have also been published, and deep generative texture synthesis with PSGAN has been implemented in Python with Lasagne + Theano.

Literature

Several of the earliest and most referenced papers in this field date from the 1990s, though there was also earlier work on the subject, including an algorithm with some similarities to the chaos mosaic approach.

The non-parametric sampling approach of Efros and Leung was the first that could easily synthesize most types of texture, and it has inspired hundreds of follow-on papers in computer graphics. Since then, the field of texture synthesis has rapidly expanded with the introduction of 3D graphics accelerator cards for personal computers. According to Efros, however, Scott Draves first published the patch-based version of this technique, along with GPL code, in 1993.

References

  1. "SIGGRAPH 2007 course on Example-based Texture Synthesis"
  2. "Near-regular Texture Analysis and Manipulation". Yanxi Liu, Wen-Chieh Lin, and James Hays. SIGGRAPH 2004
  3. "Texture synthesis via a noncausal nonparametric multiscale Markov random field." Paget and Longstaff, IEEE Trans. on Image Processing, 1998
  4. "Texture Synthesis by Non-parametric Sampling." Efros and Leung, ICCV, 1999
  5. "Fast Texture Synthesis using Tree-structured Vector Quantization" Wei and Levoy SIGGRAPH 2000
  6. "Image Analogies" Hertzmann et al. SIGGRAPH 2001.
  7. "Texture synthesis via a noncausal nonparametric multiscale Markov random field." Paget and Longstaff, IEEE Trans. on Image Processing, 1998
  8. "Image Quilting." Efros and Freeman. SIGGRAPH 2001
  9. "Graphcut Textures: Image and Video Synthesis Using Graph Cuts." Kwatra et al. SIGGRAPH 2003
  10. Gatys, Leon A.; Ecker, Alexander S.; Bethge, Matthias (2015-05-27). "Texture Synthesis Using Convolutional Neural Networks". arXiv: 1505.07376 [cs.CV].
  11. Jetchev, Nikolay; Bergmann, Urs; Vollgraf, Roland (2016-11-24). "Texture Synthesis with Spatial Generative Adversarial Networks". arXiv: 1611.08207 [cs.CV].
  12. Bergmann, Urs; Jetchev, Nikolay; Vollgraf, Roland (2017-05-18). "Learning Texture Manifolds with the Periodic Spatial GAN". arXiv: 1705.06566 [cs.CV].