Image scaling

An image scaled with nearest-neighbor scaling (left) and 2×SaI scaling (right)

In computer graphics and digital imaging, image scaling refers to the resizing of a digital image. In video technology, the magnification of digital material is known as upscaling or resolution enhancement.

When scaling a vector graphic image, the graphic primitives that make up the image can be scaled using geometric transformations with no loss of image quality. When scaling a raster graphics image, a new image with a higher or lower number of pixels must be generated. In the case of decreasing the pixel number (scaling down), this usually results in a visible quality loss. From the standpoint of digital signal processing, the scaling of raster graphics is a two-dimensional example of sample-rate conversion, the conversion of a discrete signal from a sampling rate (in this case, the local sampling rate) to another.

Mathematical

Image scaling can be interpreted as a form of image resampling or image reconstruction from the view of the Nyquist sampling theorem. According to the theorem, downsampling to a smaller image from a higher-resolution original can only be carried out after applying a suitable 2D anti-aliasing filter to prevent aliasing artifacts. The image is reduced to the information that can be carried by the smaller image.
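
The effect of skipping the anti-aliasing filter can be illustrated with a minimal one-dimensional sketch. The function names and the choice of a simple averaging (box) filter are assumptions made for illustration; a practical resampler would typically use a better low-pass filter.

```python
def decimate(signal, factor):
    """Keep every `factor`-th sample with no prefiltering (prone to aliasing)."""
    return signal[::factor]


def box_filter_then_decimate(signal, factor):
    """Average each run of `factor` samples, then keep one value per run.

    The averaging acts as a crude low-pass (anti-aliasing) filter.
    """
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, len(signal) - factor + 1, factor)
    ]


# One-pixel-wide black and white stripes: the finest detail the input can hold.
stripes = [0, 255] * 4

print(decimate(stripes, 2))                  # [0, 0, 0, 0]
print(box_filter_then_decimate(stripes, 2))  # [127.5, 127.5, 127.5, 127.5]
```

Without the filter, the stripes alias to solid black. With it, the output is the mid-gray that the smaller image can actually represent.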

In the case of upsampling, a reconstruction filter takes the place of the anti-aliasing filter.

A more sophisticated approach to upscaling treats the problem as an inverse problem, solving the question of generating a plausible image that, when scaled down, would look like the input image. A variety of techniques have been applied for this, including optimization techniques with regularization terms and the use of machine learning from examples.

Algorithms

An image size can be changed in several ways.

Nearest-neighbor interpolation

One of the simpler ways of increasing image size is nearest-neighbor interpolation, which sets every pixel in the output to the value of the nearest pixel in the input; for upscaling, this means multiple pixels of the same color will be present. This can preserve sharp details in pixel art but also introduce jaggedness in previously smooth images. 'Nearest' in nearest-neighbor does not have to be the mathematical nearest. One common implementation is to always round toward zero. Rounding this way produces fewer artifacts and is faster to calculate.[ citation needed ]
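
A minimal sketch of the round-toward-zero variant follows. It assumes the image is a list of rows of pixel values; the function name and this representation are illustrative and not taken from any particular library.

```python
def nearest_neighbor_resize(image, new_width, new_height):
    """Resize `image` (a list of rows) by copying the nearest source pixel.

    Source coordinates are found with integer division, which rounds
    toward zero for the non-negative indices used here.
    """
    old_height = len(image)
    old_width = len(image[0])
    result = []
    for y in range(new_height):
        src_y = y * old_height // new_height
        row = []
        for x in range(new_width):
            src_x = x * old_width // new_width
            row.append(image[src_y][src_x])
        result.append(row)
    return result


tiny = [[1, 2],
        [3, 4]]
for row in nearest_neighbor_resize(tiny, 4, 4):
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```

Each source pixel becomes a 2×2 block of identical values, which is the repeated-color effect described above.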

Bilinear and bicubic algorithms

Bilinear interpolation works by interpolating pixel color values, introducing a continuous transition into the output even where the original material has discrete transitions. Although this is desirable for continuous-tone images, this algorithm reduces contrast (sharp edges) in a way that may be undesirable for line art. Bicubic interpolation yields substantially better results, with an increase in computational cost.[ citation needed ]
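
Bilinear interpolation can be sketched as below for a single-channel image stored as a list of rows. The corner-aligned coordinate mapping (the first and last output pixels land exactly on the first and last input pixels) is one of several conventions and is an assumption of this sketch, as is the function name.

```python
def bilinear_resize(image, new_width, new_height):
    """Resize a grayscale image (list of rows) with bilinear interpolation."""
    old_h, old_w = len(image), len(image[0])
    out = []
    for y in range(new_height):
        # Map the output row to a fractional input row (corner-aligned).
        fy = y * (old_h - 1) / (new_height - 1) if new_height > 1 else 0.0
        y0 = int(fy)
        y1 = min(y0 + 1, old_h - 1)
        ty = fy - y0
        row = []
        for x in range(new_width):
            fx = x * (old_w - 1) / (new_width - 1) if new_width > 1 else 0.0
            x0 = int(fx)
            x1 = min(x0 + 1, old_w - 1)
            tx = fx - x0
            # Interpolate horizontally on the two rows, then vertically.
            top = (1 - tx) * image[y0][x0] + tx * image[y0][x1]
            bottom = (1 - tx) * image[y1][x0] + tx * image[y1][x1]
            row.append((1 - ty) * top + ty * bottom)
        out.append(row)
    return out


edge = [[0, 100]]
print(bilinear_resize(edge, 3, 1))  # [[0.0, 50.0, 100.0]]
```

The hard 0-to-100 step in the input gains an intermediate value of 50 in the output, which is the continuous transition (and loss of edge contrast) described above.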

Sinc and Lanczos resampling

Sinc resampling, in theory, provides the best possible reconstruction for a perfectly bandlimited signal. In practice, the assumptions behind sinc resampling are not completely met by real-world digital images. Lanczos resampling, an approximation to the sinc method, yields better results. Bicubic interpolation can be regarded as a computationally efficient approximation to Lanczos resampling.[ citation needed ]
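
For illustration, the Lanczos kernel is a sinc function windowed by the central lobe of a wider sinc, and it can be used to interpolate a one-dimensional signal as sketched below. The window size `a = 3`, the clamping of indices at the borders, and the normalization of the weights are common choices assumed here, not requirements of the method.

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos kernel: sinc(x) * sinc(x / a) for |x| < a, otherwise 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_resample_1d(samples, position, a=3):
    """Estimate the signal value at a fractional `position`.

    Uses the 2*a nearest samples; indices outside the signal are clamped
    to the border, and weights are normalized so they sum to 1.
    """
    center = math.floor(position)
    total = 0.0
    weight_sum = 0.0
    for i in range(center - a + 1, center + a + 1):
        weight = lanczos_kernel(position - i, a)
        clamped = min(max(i, 0), len(samples) - 1)
        total += weight * samples[clamped]
        weight_sum += weight
    return total / weight_sum


signal = [0, 1, 4, 9, 16, 25, 36]
print(lanczos_resample_1d(signal, 3.5))  # a value between 9 and 16
```

For a two-dimensional image, the same kernel is typically applied separably, first along rows and then along columns.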

Box sampling

One weakness of bilinear, bicubic, and related algorithms is that they sample a fixed number of pixels. When downscaling by more than a certain factor, such as by more than two for the bi-sampling algorithms, they sample non-adjacent pixels, which both discards data and produces rough results.[ citation needed ]

The trivial solution to this issue is box sampling, which is to consider the target pixel a box on the original image and sample all pixels inside the box. This ensures that all input pixels contribute to the output. The major weakness of this algorithm is that it is hard to optimize.[ citation needed ]

Mipmap

Another solution to the downscale problem of bi-sampling scaling is mipmaps. A mipmap is a prescaled set of downscaled copies. When downscaling, the nearest larger mipmap is used as the origin to ensure no scaling below the useful threshold of bilinear scaling. This algorithm is fast and easy to optimize. It is standard in many frameworks, such as OpenGL. The cost is using more image memory, exactly one-third more in the standard implementation.
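
The idea can be sketched as follows for a square, power-of-two grayscale image stored as a list of rows. The helper names are hypothetical, and the 2×2 averaging used to build each level is one simple choice among several.

```python
def halve(image):
    """Return a half-size copy by averaging each 2x2 block of pixels."""
    return [
        [(image[y][x] + image[y][x + 1]
          + image[y + 1][x] + image[y + 1][x + 1]) / 4
         for x in range(0, len(image[0]) - 1, 2)]
        for y in range(0, len(image) - 1, 2)
    ]


def build_mipmaps(image):
    """Return [full size, 1/2, 1/4, ...] down to a single pixel."""
    levels = [image]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(halve(levels[-1]))
    return levels


def pick_level(levels, target_width):
    """Pick the smallest level that is still at least `target_width` wide."""
    chosen = levels[0]
    for level in levels:
        if len(level[0]) >= target_width:
            chosen = level
    return chosen


base = [[(x + y) % 2 * 255 for x in range(8)] for y in range(8)]
levels = build_mipmaps(base)
print([len(level) for level in levels])  # [8, 4, 2, 1]

extra = sum(len(level) * len(level[0]) for level in levels[1:])
print(extra / (8 * 8))  # 0.328125
```

The extra memory is 1/4 + 1/16 + 1/64 + ... of the base image, a geometric series whose limit is one-third; a finite chain such as this one comes in slightly under that figure. To downscale to a width of 3, `pick_level(levels, 3)` returns the 4-pixel-wide level, so the final bilinear step never shrinks by more than a factor of two.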

Fourier-transform methods

Simple interpolation based on the Fourier transform pads the frequency domain with zero components (a smooth window-based approach would reduce the ringing). Besides the good conservation (or recovery) of details, notable are the ringing and the circular bleeding of content from the left border to the right border (and the other way around).
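
The zero-padding approach can be sketched in one dimension as below; for an image, the same step would be applied along rows and then along columns. The sketch uses a naive O(N²) discrete Fourier transform to stay self-contained (a real implementation would use an FFT), and the function names are illustrative.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def fourier_upsample(x, factor):
    """Upsample a real signal by zero-padding its spectrum."""
    n = len(x)
    m = n * factor
    spectrum = dft(x)
    padded = [0j] * m
    # Copy the non-negative frequencies, then the negative ones; every
    # bin in between stays zero.
    for k in range((n + 1) // 2):
        padded[k] = spectrum[k]
    for k in range(1, (n + 1) // 2):
        padded[m - k] = spectrum[n - k]
    if n % 2 == 0:
        # Split the Nyquist bin so the result stays real-valued.
        padded[n // 2] = spectrum[n // 2] / 2
        padded[m - n // 2] = spectrum[n // 2] / 2
    # Rescale because the inverse transform divides by m instead of n.
    return [value.real * factor for value in idft(padded)]


impulse = [0.0, 1.0, 0.0, 0.0]
upsampled = fourier_upsample(impulse, 2)
print([round(value, 3) for value in upsampled])
```

Every second output sample reproduces an input sample, while the new in-between samples overshoot below zero; that overshoot is the ringing. Because the transform treats the signal as periodic, it also wraps around the ends, which corresponds to the border-to-border bleeding described above.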

Edge-directed interpolation

Edge-directed interpolation algorithms aim to preserve edges in the image after scaling, unlike other algorithms, which can introduce staircase artifacts.

Examples of algorithms for this task include New Edge-Directed Interpolation (NEDI), [1] [2] Edge-Guided Image Interpolation (EGII), [3] Iterative Curvature-Based Interpolation (ICBI),[ citation needed ] and Directional Cubic Convolution Interpolation (DCCI). [4] A 2013 analysis found that DCCI had the best scores in PSNR and SSIM on a series of test images. [5]

hqx

For magnifying computer graphics with low resolution and/or few colors (usually from 2 to 256 colors), better results can be achieved by hqx or other pixel-art scaling algorithms. These produce sharp edges and maintain a high level of detail.

Vectorization

Vector extraction, or vectorization, offers another approach. Vectorization first creates a resolution-independent vector representation of the graphic to be scaled. Then the resolution-independent version is rendered as a raster image at the desired resolution. This technique is used by Adobe Illustrator's Live Trace feature and by Inkscape. [6] Scalable Vector Graphics are well suited to simple geometric images, while photographs do not fare well with vectorization due to their complexity.

Deep convolutional neural networks

This method uses machine learning for more detailed images, such as photographs and complex artwork. Programs that use this method include waifu2x, Imglarger and Neural Enhance.

AI-driven software such as the MyHeritage Photo Enhancer allows detail and sharpness to be added to historical photographs, where it is not present in the original.

Applications

General

Image scaling is used in, among other applications, web browsers, [7] image editors, image and file viewers, software magnifiers, digital zoom, the process of generating thumbnail images, and when outputting images through screens or printers.

Video

In video, upscaling is used to magnify PAL-resolution content, for example from a DVD player, for display on HDTV-ready output devices such as home theater systems. Upscaling is performed in real time, and the output signal is not saved.

Pixel-art scaling

As pixel-art graphics are usually low-resolution, they rely on careful placement of individual pixels, often with a limited palette of colors. This results in graphics that rely on stylized visual cues to define complex shapes with little resolution, down to individual pixels. This makes scaling pixel art a particularly difficult problem.

Specialized algorithms [8] were developed to handle pixel-art graphics, as the traditional scaling algorithms do not take perceptual cues into account.

Since a typical application is to improve the appearance of fourth-generation and earlier video games on arcade and console emulators, many are designed to run in real time for small input images at 60 frames per second.

On fast hardware, these algorithms are suitable for gaming and other real-time image processing. They provide sharp, crisp graphics while minimizing blur. Pixel-art scaling algorithms have been implemented in a wide range of emulators such as HqMAME and DOSBox, as well as 2D game engines and game engine recreations such as ScummVM. They gained recognition with gamers, for whom these technologies encouraged a revival of 1980s and 1990s gaming experiences.[ citation needed ]

Such filters are currently used in commercial emulators on Xbox Live, Virtual Console, and PSN to allow classic low-resolution games to be more visually appealing on modern HD displays. Recently released games that incorporate these filters include Sonic's Ultimate Genesis Collection, Castlevania: The Dracula X Chronicles, Castlevania: Symphony of the Night, and Akumajō Dracula X Chi no Rondo.

Real-time scaling

A number of companies have developed techniques to upscale video frames in real-time, such as when they are drawn on screen in a video game. Nvidia's deep learning super sampling (DLSS) uses deep learning to upsample lower-resolution images to a higher resolution for display on higher-resolution computer monitors. [9] AMD's FidelityFX Super Resolution 1.0 (FSR) does not employ machine learning, instead using traditional hand-written algorithms to achieve spatial upscaling on traditional shading units. FSR 2.0 utilises temporal upscaling, again with a hand-tuned algorithm. FSR standardized presets are not enforced, and some titles such as Dota 2 offer resolution sliders. [10] Other technologies include Intel XeSS and Nvidia Image Scaler (NIS). [11] [12]

References

  1. "Edge-Directed Interpolation". Retrieved 19 February 2016.
  2. Xin Li; Michael T. Orchard. "New Edge Directed Interpolation" (PDF). 2000 IEEE International Conference on Image Processing: 311. Archived from the original (PDF) on 14 February 2016.
  3. Zhang, D.; Xiaolin Wu (2006). "An Edge-Guided Image Interpolation Algorithm via Directional Filtering and Data Fusion". IEEE Transactions on Image Processing. 15 (8): 2226–38. Bibcode:2006ITIP...15.2226Z. doi:10.1109/TIP.2006.877407. PMID 16900678. S2CID 9760560.
  4. Dengwen Zhou; Xiaoliu Shen. "Image Zooming Using Directional Cubic Convolution Interpolation". Retrieved 13 September 2015.
  5. Shaode Yu; Rongmao Li; Rui Zhang; Mou An; Shibin Wu; Yaoqin Xie (2013). "Performance evaluation of edge-directed interpolation methods for noise-free images". arXiv:1303.6455 [cs.CV].
  6. Johannes Kopf and Dani Lischinski (2011). "Depixelizing Pixel Art". ACM Transactions on Graphics. 30 (4): 99:1–99:8. doi:10.1145/2010324.1964994. Archived from the original on 1 September 2015. Retrieved 24 October 2012.
  7. Analysis of image scaling algorithms used by popular web browsers
  8. "Pixel Scalers". Retrieved 19 February 2016.
  9. "NVIDIA DLSS: Your Questions, Answered". www.nvidia.com. Archived from the original on 5 October 2021. Retrieved 13 October 2021.
  10. "Valve's Dota 2 Adds AMD FidelityFX Super Resolution - Phoronix". www.phoronix.com. Archived from the original on 21 July 2021. Retrieved 13 October 2021.
  11. Gartenberg, Chaim (19 August 2021). "Intel shows off its answer to Nvidia's DLSS, coming to Arc GPUs in 2022". The Verge. Archived from the original on 19 August 2021. Retrieved 13 October 2021.
  12. "What Is Nvidia Image Scaling? Upscaling Tech, Explained". Digital Trends. 16 November 2021. Retrieved 3 December 2021.