- Image enlarged 3× with the nearest-neighbor interpolation
- Image enlarged by 3× with hq3x algorithm
Pixel art scaling algorithms are graphical filters that attempt to enhance the appearance of hand-drawn 2D pixel art graphics. These algorithms are a form of automatic image enhancement. Pixel art scaling algorithms employ methods significantly different than the common methods of image rescaling, which have the goal of preserving the appearance of images.
As pixel art graphics are commonly used at very low resolutions, they employ careful coloring of individual pixels. This results in graphics that rely on a high amount of stylized visual cues to define complex shapes. Several specialized algorithms [1] have been developed to handle re-scaling of such graphics.
These specialized algorithms can improve the appearance of pixel-art graphics, but in doing so they introduce changes. Such changes may be undesirable, especially if the goal is to faithfully reproduce the original appearance.
Since a typical application of this technology is improving the appearance of fourth-generation and earlier video games on arcade and console emulators, many pixel art scaling algorithms are designed to run in real-time for sufficiently small input images at 60-frames per second. This places constraints on the type of programming techniques that can be used for this sort of real-time processing.[ citation needed ] Many work only on specific scale factors. 2× is the most common scale factor, while and 3×, 4×, 5×, and 6× exist but are less used.
The Mullard SAA5050 Teletext character generator chip (1980) used a primitive pixel scaling algorithm to generate higher-resolution characters on the screen from a lower-resolution representation from its internal ROM. Internally, each character shape was defined on a 5 × 9 pixel grid, which was then interpolated by smoothing diagonals to give a 10 × 18 pixel character, with a characteristically angular shape, surrounded to the top and the left by two pixels of blank space. The algorithm only works on monochrome source data, and assumes the source pixels will be logically true or false depending on whether they are 'on' or 'off'. Pixels 'outside the grid pattern' are assumed to be off. [2] [3] [4]
The algorithm works as follows:
A B C --\ 1 2 D E F --/ 3 4 1 = B | (A & E & !B & !D) 2 = B | (C & E & !B & !F) 3 = E | (!A & !E & B & D) 4 = E | (!C & !E & B & F)
Note that this algorithm, like the Eagle algorithm below, has a flaw: If a pattern of 4 pixels in a hollow diamond shape appears, the hollow will be obliterated by the expansion. The SAA5050's internal character ROM carefully avoids ever using this pattern.
The degenerate case:
* * * *
becomes:
** **** **** **
Eric's Pixel Expansion (EPX) is an algorithm developed by Eric Johnston at LucasArts around 1992, when porting the SCUMM engine games from the IBM PC (which ran at 320 × 200 × 256 colors) to the early color Macintosh computers, which ran at more or less double that resolution. [5] The algorithm works as follows, expanding P into 4 new pixels based on P's surroundings:
1=P; 2=P; 3=P; 4=P; IF C==A => 1=A IF A==B => 2=B IF D==C => 3=C IF B==D => 4=D IF of A, B, C, D, three or more are identical: 1=2=3=4=P
Later implementations of this same algorithm (as AdvMAME2× and Scale2×, developed around 2001) are slightly more efficient but functionally identical:
1=P; 2=P; 3=P; 4=P; IF C==A AND C!=D AND A!=B => 1=A IF A==B AND A!=C AND B!=D => 2=B IF D==C AND D!=B AND C!=A => 3=C IF B==D AND B!=A AND D!=C => 4=D
AdvMAME2× is available in DOSBox via the scaler=advmame2x
dosbox.conf option.
The AdvMAME4×/Scale4× algorithm is just EPX applied twice to get 4× resolution.
The AdvMAME3×/Scale3× algorithm (available in DOSBox via the scaler=advmame3x
dosbox.conf option) can be thought of as a generalization of EPX to the 3× case. The corner pixels are calculated identically to EPX.
1=E; 2=E; 3=E; 4=E; 5=E; 6=E; 7=E; 8=E; 9=E; IF D==B AND D!=H AND B!=F => 1=D IF (D==B AND D!=H AND B!=F AND E!=C) OR (B==F AND B!=D AND F!=H AND E!=A) => 2=B IF B==F AND B!=D AND F!=H => 3=F IF (H==D AND H!=F AND D!=B AND E!=A) OR (D==B AND D!=H AND B!=F AND E!=G) => 4=D 5=E IF (B==F AND B!=D AND F!=H AND E!=I) OR (F==H AND F!=B AND H!=D AND E!=C) => 6=F IF H==D AND H!=F AND D!=B => 7=D IF (F==H AND F!=B AND H!=D AND E!=G) OR (H==D AND H!=F AND D!=B AND E!=I) => 8=H IF F==H AND F!=B AND H!=D => 9=F
There is also a variant improved over Scale3× called ScaleFX, developed by Sp00kyFox, and a version combined with Reverse-AA called ScaleFX-Hybrid. [6] [7] [8]
Eagle works as follows: for every in pixel, we will generate 4 out pixels. First, set all 4 to the color of the pixel we are currently scaling (as nearest-neighbor). Next look at the three pixels above, to the left, and diagonally above left: if all three are the same color as each other, set the top left pixel of our output square to that color in preference to the nearest-neighbor color. Work similarly for all four pixels, and then move to the next one. [9]
Assume an input matrix of 3 × 3 pixels where the centermost pixel is the pixel to be scaled, and an output matrix of 2 × 2 pixels (i.e., the scaled pixel)
first: |Then . . . --\ CC |S T U --\ 1 2 . C . --/ CC |V C W --/ 3 4 . . . |X Y Z | IF V==S==T => 1=S | IF T==U==W => 2=U | IF V==X==Y => 3=X | IF W==Z==Y => 4=Z
Thus if we have a single black pixel on a white background it will vanish. This is a bug in the Eagle algorithm but is solved by other algorithms such as EPX, 2xSaI, and HQ2x.
2×SaI, short for 2× Scale and Interpolation engine, was inspired by Eagle. It was designed by Derek Liauw Kie Fa, also known as Kreed, primarily for use in console and computer emulators, and it has remained fairly popular in this niche. Many of the most popular emulators, including ZSNES and VisualBoyAdvance, offer this scaling algorithm as a feature. Several slightly different versions of the scaling algorithm are available, and these are often referred to as Super 2×SaI and Super Eagle.
The 2xSaI family works on a 4 × 4 matrix of pixels where the pixel marked A below is scaled:
I E F J G A B K --\ W X H C D L --/ Y Z M N O P
For 16-bit pixels, they use pixel masks which change based on whether the 16-bit pixel format is 565 or 555. The constants colorMask
, lowPixelMask
, qColorMask
, qLowPixelMask
, redBlueMask
, and greenMask
are 16-bit masks. The lower 8 bits are identical in either pixel format.
Two interpolation functions are described:
INTERPOLATE(uint32 A, UINT32 B). -- linear midpoint of A and B if (A == B) return A; return ( ((A & colorMask) >> 1) + ((B & colorMask) >> 1) + (A & B & lowPixelMask) ); Q_INTERPOLATE(uint32 A, uint32 B, uint32 C, uint32 D) -- bilinear interpolation; A, B, C, and D's average x = ((A & qColorMask) >> 2) + ((B & qColorMask) >> 2) + ((C & qColorMask) >> 2) + ((D & qColorMask) >> 2); y = (A & qLowPixelMask) + (B & qLowPixelMask) + (C & qLowPixelMask) + (D & qLowPixelMask); y = (y >> 2) & qLowPixelMask; return x + y;
The algorithm checks A, B, C, and D for a diagonal match such that A==D
and B!=C
, or the other way around, or if they are both diagonals or if there is no diagonal match. Within these, it checks for three or four identical pixels. Based on these conditions, the algorithm decides whether to use one of A, B, C, or D, or an interpolation among only these four, for each output pixel. The 2xSaI arbitrary scaler can enlarge any image to any resolution and uses bilinear filtering to interpolate pixels.
Since Kreed released [10] the source code under the GNU General Public License, it is freely available to anyone wishing to utilize it in a project released under that license. Developers wishing to use it in a non-GPL project would be required to rewrite the algorithm without using any of Kreed's existing code.
It is available in DosBox via scaler=2xsai
option.
Maxim Stepin's hq2x, hq3x, and hq4x are for scale factors of 2:1, 3:1, and 4:1 respectively. Each work by comparing the color value of each pixel to those of its eight immediate neighbors, marking the neighbors as close or distant, and using a pre-generated lookup table to find the proper proportion of input pixels' values for each of the 4, 9 or 16 corresponding output pixels. The hq3x family will perfectly smooth any diagonal line whose slope is ±0.5, ±1, or ±2 and which is not anti-aliased in the input; one with any other slope will alternate between two slopes in the output. It will also smooth very tight curves. Unlike 2xSaI, it anti-aliases the output. [11] [8]
hqnx was initially created for the Super NES emulator ZSNES. The author of bsnes has released a space-efficient implementation of hq2x to the public domain. [12] A port to shaders, which has comparable quality to the early versions of xBR, is available. [13] Before the port, a shader called "scalehq" has often been confused for hqx. [14]
There are 6 filters in this family: xBR , xBRZ, xBR-Hybrid, Super xBR, xBR+3D and Super xBR+3D.
xBR ("scale by rules"), created by Hyllian, works much the same way as HQx (based on pattern recognition) and would generate the same result as HQx when given the above pattern. [15] However, it goes further than HQx by using a 2-stage set of interpolation rules, which better handle more complex patterns such as anti-aliased lines and curves. Scaled background textures keep the sharp characteristics of the original image, rather than becoming blurred like HQx (often ScaleHQ in practice) tends to do. The newest xBR versions are multi-pass and can preserve small details better. There is also a version of xBR combined with Reverse-AA shader called xBR-Hybrid. [16] xBR+3D is a version with a 3D mask that only filters 2D elements.
xBRZ by Zenju is a modified version of xBR. It is implemented from scratch as a CPU-based filter in C++ . [17] It uses the same basic idea as xBR's pattern recognition and interpolation but with a different rule set designed to preserve fine image details as small as a few pixels. This makes it useful for scaling the details in faces, and in particular eyes. xBRZ is optimized for multi-core CPUs and 64-bit architectures and shows 40–60% better performance than HQx even when running on a single CPU core only.[ citation needed ] It supports scaling images with an alpha channel, and scaling by integer factors from 2× up to 6×.
Super xBR [18] [19] is an algorithm developed by Hylian in 2015. It uses some combinations of known linear filters along with xBR edge detection rules in a non-linear way. It works in two passes and can only scale an image by two (or multiples of two by reapplying it and also has an anti-ringing filter). Super xBR+3D is a version with a 3D mask that only filters 2D elements. There is also a Super xBR version rewritten in C/C++. [20] [8]
RotSprite is a scaling and rotation algorithm for sprites developed by Xenowhirl. It produces far fewer artifacts than nearest-neighbor rotation algorithms, and like EPX, it does not introduce new colors into the image (unlike most interpolation systems). [21]
The algorithm first scales the image to 8 times its original size with a modified Scale2× algorithm which treats similar (rather than identical) pixels as matches. It then (optionally) calculates what rotation offset to use by favoring sampled points that are not boundary pixels. Next, the rotated image is created with a nearest-neighbor scaling and rotation algorithm that simultaneously shrinks the big image back to its original size and rotates the image. Finally, overlooked single-pixel details are (optionally) restored if the corresponding pixel in the source image is different and the destination pixel has three identical neighbors. [22]
Fast RotSprite is a fast rotation algorithm for pixel art developed by Oleg Mekekechko for the Pixel Studio app. It is based on RotSprite but has better performance with slight quality loss. It can process larger images in real-time. Instead of the 8× upscale, Fast RotSprite uses a single 3× upscale. Then it simply rotates all pixels with rounding coordinates. Finally, it performs 3× downscale without introducing new colors. As all operations on each step are independent, they can be done in parallel to greatly increase performance.
The Kopf–Lischinski algorithm is a novel way to extract resolution-independent vector graphics from pixel art described in the 2011 paper "Depixelizing Pixel Art". [23] A Python implementation is available. [24]
The algorithm has been ported to GPUs and optimized for real-time rendering. The source code is available for this variant. [25]
This section's tone or style may not reflect the encyclopedic tone used on Wikipedia.(May 2016) |
Edge-directed interpolation (EDI) describes upscaling techniques that use statistical sampling to ensure the quality of an image when scaling it up. [26] [27] There were several earlier methods that involved detecting edges to generate blending weights for linear interpolation or classifying pixels according to their neighbor conditions and using different otherwise isotropic interpolation schemes based on the classification. Any given interpolation approach boils down to weighted averages of neighboring pixels. The goal is to find optimal weights. Bilinear interpolation sets all the weights to be equal. Higher-order interpolation methods like bicubic or sinc interpolation consider a larger number of neighbors than just the adjacent ones.
NEDI (New Edge-Directed Interpolation) computes local covariances in the original image and uses them to adapt the interpolation at high resolution. It is the prototypic filter of this family. [28]
EDIUpsizer [29] is a resampling filter that resizes an image by a factor of two both horizontally and vertically using NEDI (new edge-directed interpolation). [28] EDIUpsizer also uses a few modifications to basic NEDI to prevent a lot of the artifacts that NEDI creates in detailed areas. These include condition number testing and adaptive window size, [30] as well as capping constraints. All modifications and constraints to NEDI are optional (can be turned on and off) and are user-configurable. This filter is rather slow.
FastEDIUpsizer is a slimmed-down version of EDIUpsizer that is slightly more tuned for speed. It uses a constant 8 × 8 window size, only performs NEDI on the luma plane, and only uses either bicubic or bilinear interpolation as the fallback interpolation method.
Another edge-directed interpolation filter. Works by minimizing a cost function involving every pixel in a scan line. It is slow.
EEDI2 resizes an image by 2× in the vertical direction by copying the existing image to 2⋅y(n) and interpolating the missing field. It is intended for edge-directed interpolation for deinterlacing (i.e. not made for resizing a normal image, but can do that as well). EEDI2 can be used with both TDeint and TIVTC, see the discussion link for more info on how to do this. [31]
The SuperRes [32] shaders use a different scaling method which can be used in combination with NEDI (or any other scaling algorithm). The method is explained in detail by its creator Shiandow in a Doom9 forum post in 2014. [33] This method often gives better results than just using NEDI, and rival those of NNEDI3. These are now also available as an MPDN renderscript.
NNEDI is a family of intra-field deinterlacers which can also be used to enlarge images by powers of two. When being used as a deinterlacer, it takes in a frame, throws away one field, and then interpolates the missing pixels using only information from the kept field. There are so far three major generations of NNEDI.
NNEDI, the original version, works with YUY2 and YV12 input. [34] NNEDI2 added RGB24 support and a special function nnedi2_rpow2
for upscaling. NNEDI3 enhances NNEDI2 with a predictor neural network. Both the size of the network and the neighborhood it examines can be tweaked for a speed-quality tradeoff: [35]
This is a quality vs speed option; however, differences are usually small between the number of neurons for a specific resize factor, however the performance difference between the count of neurons becomes larger as you quadruple the image size. If you are only planning on doubling the resolution then you won't see massive differences between 16 and 256 neurons. There is still a noticeable difference between the highest and lowest options, but not orders of magnitude different. [36]
In computer graphics, alpha compositing or alpha blending is the process of combining one image with a background to create the appearance of partial or full transparency. It is often useful to render picture elements (pixels) in separate passes or layers and then combine the resulting 2D images into a single, final image called the composite. Compositing is used extensively in film when combining computer-rendered image elements with live footage. Alpha blending is also used in 2D computer graphics to put rasterized foreground elements over a background.
GNU Image Manipulation Program, commonly known by its acronym GIMP, is a free and open-source raster graphics editor used for image manipulation (retouching) and image editing, free-form drawing, transcoding between different image file formats, and more specialized tasks. It is extensible by means of plugins, and scriptable. It is not designed to be used for drawing, though some artists and creators have used it in this way.
The Nyquist–Shannon sampling theorem is an essential principle for digital signal processing linking the frequency range of a signal and the sample rate required to avoid a type of distortion called aliasing. The theorem states that the sample rate must be at least twice the bandwidth of the signal to avoid aliasing. In practice, it is used to select band-limiting filters to keep aliasing below an acceptable amount when an analog signal is sampled or when sample rates are changed within a digital signal processing function.
In computer graphics, rasterisation or rasterization is the task of taking an image described in a vector graphics format (shapes) and converting it into a raster image. The rasterized image may then be displayed on a computer display, video display or printer, or stored in a bitmap file format. Rasterization may refer to the technique of drawing 3D models, or to the conversion of 2D rendering primitives, such as polygons and line segments, into a rasterized format.
Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers; second, the development of mathematics ; third, the demand for a wide range of applications in environment, agriculture, military, industry and medical science has increased.
In digital signal processing, spatial anti-aliasing is a technique for minimizing the distortion artifacts (aliasing) when representing a high-resolution image at a lower resolution. Anti-aliasing is used in digital photography, computer graphics, digital audio, and many other applications.
Texture mapping is a method for mapping a texture on a computer-generated graphic. "Texture" in this context can be high frequency detail, surface texture, or color.
The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. It was developed by John F. Canny in 1986. Canny also produced a computational theory of edge detection explaining why the technique works.
Noise reduction is the process of removing noise from a signal. Noise reduction techniques exist for audio and images. Noise reduction algorithms may distort the signal to some degree. Noise rejection is the ability of a circuit to isolate an undesired signal component from the desired signal component, as with common-mode rejection ratio.
In digital image processing and computer vision, image segmentation is the process of partitioning a digital image into multiple image segments, also known as image regions or image objects. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.
A Bayer filter mosaic is a color filter array (CFA) for arranging RGB color filters on a square grid of photosensors. Its particular arrangement of color filters is used in most single-chip digital image sensors used in digital cameras, and camcorders to create a color image. The filter pattern is half green, one quarter red and one quarter blue, hence is also called BGGR, RGBG, GRBG, or RGGB.
hqx is a set of 3 image upscaling algorithms developed by Maxim Stepin. The algorithms are hq2x, hq3x, and hq4x, which magnify by a factor of 2, 3, and 4 respectively. It was initially created in 2003 for the Super NES emulator ZSNES, and is used in emulators such as Nestopia, F. CEUXSnes9x., and Snes9x.
Template matching is a technique in digital image processing for finding small parts of an image which match a template image. It can be used for quality control in manufacturing, navigation of mobile robots, or edge detection in images.
In computer graphics and digital imaging, imagescaling refers to the resizing of a digital image. In video technology, the magnification of digital material is known as upscaling or resolution enhancement.
Demosaicing, also known as color reconstruction, is a digital image processing algorithm used to reconstruct a full color image from the incomplete color samples output from an image sensor overlaid with a color filter array (CFA) such as a Bayer filter. It is also known as CFA interpolation or debayering.
The term post-processing is used in the video and film industry for quality-improvement image processing methods used in video playback devices, such as stand-alone DVD-Video players; video playing software; and transcoding software. It is also commonly used in real-time 3D rendering to add additional effects.
In computer vision, speeded up robust features (SURF) is a patented local feature detector and descriptor. It can be used for tasks such as object recognition, image registration, classification, or 3D reconstruction. It is partly inspired by the scale-invariant feature transform (SIFT) descriptor. The standard version of SURF is several times faster than SIFT and claimed by its authors to be more robust against different image transformations than SIFT.
Non-local means is an algorithm in image processing for image denoising. Unlike "local mean" filters, which take the mean value of a group of pixels surrounding a target pixel to smooth the image, non-local means filtering takes a mean of all pixels in the image, weighted by how similar these pixels are to the target pixel. This results in much greater post-filtering clarity, and less loss of detail in the image compared with local mean algorithms.
In image processing, a kernel, convolution matrix, or mask is a small matrix used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by doing a convolution between the kernel and an image. Or more simply, when each pixel in the output image is a function of the nearby pixels in the input image, the kernel is that function.
This is a glossary of terms relating to computer graphics.
{{cite web}}
: CS1 maint: archived copy as title (link)