Han Xin code (汉信码 in Chinese, Chinese-sensible code) is a two-dimensional (2D) matrix barcode symbology invented in 2007 [1] by the Chinese company The Article Numbering Center of China [2] (中国物品编码中心 in Chinese) to break the monopoly of QR code. Like QR code, Han Xin code consists of black and white square modules arranged in a square grid on a white background. It has four finder patterns and other markers that allow camera-based readers to recognize it. Han Xin code uses Reed–Solomon error correction, which makes it possible to read damaged images. It is currently published as ISO/IEC 20830:2021. [3]
The main advantage (and design requirement) compared with QR code is the built-in ability to natively encode Chinese characters, where QR code natively encodes Japanese. In its largest version, 84 (189×189 modules), [4] Han Xin code can encode 7827 numeric characters, 4350 English text characters, 3261 bytes, or 1044–2174 Chinese characters (depending on the Unicode region). Han Xin code encodes the full ISO/IEC 646 Latin character set rather than the restricted set of Latin characters supported by QR code. This makes Han Xin code more suitable for encoding English text or GS1 Application Identifiers [5] data.
Additionally, Han Xin code can encode Unicode characters from other languages with a special Unicode mode, [3] : 5.4.12 which has built-in lossless compression for the UTF-8 character set and supports Extended Channel Interpretation. Han Xin code also has a special compaction mode for URI encoding, which reduces the size of barcodes that encode links to web pages.
During the 10th Five-Year Plan of China, the Chinese company The Article Numbering Center of China (中国物品编码中心 in Chinese) started research [6] on its own QR code replacement to remove the Japanese monopoly in 2D barcodes. In 2007, the new barcode standard, now known as Han Xin code, was published as GB/T 21049-2007 [1] under the name Chinese-sensible code.
In 2011, [7] the US-based Association for Automatic Identification and Mobility (AIM) released the ISS Han Xin Code symbology as an official encoding standard and published it in its own store. [8]
In 2015, a group of ISO/IEC JTC 1/SC 31 started work [9] on Han Xin code as an international standard and published it as ISO/IEC 20830:2021 [3] in 2021.
In 2022, the Chinese-sensible code standard was revised as GB/T 21049-2022 [10] and renamed Han Xin code to comply with the ISO standard.
A set of patents related to Han Xin code encoding and decoding is registered with the United States Patent and Trademark Office:
Han Xin code can be used in the same way as QR code. Currently Han Xin code is used mostly in China, [14] because it has a built-in ability to encode Chinese characters. However, most barcode printers [15] and barcode scanners [16] support Han Xin code. It can be scanned on iOS [17] and Android [18] mobile devices, and many barcode libraries [19] [20] support reading and writing it.
Main advantages of Han Xin code are:
Han Xin code represents data in black and white square modules, where a dark module is a binary one and a light module is a zero. Additionally, Han Xin code can be encoded in inverse colors, [3] : 4.1.2 but this option is disabled by default in many barcode readers. The modules are arranged in a square region whose size ranges from 23 × 23 modules (Version 1) to 189 × 189 modules (Version 84). Like QR code, Han Xin code does not have rectangular versions such as those of DataMatrix, which restricts its use in some cases. The side length of a symbol, in modules, grows by two per version and can be calculated as 2 × Version + 21.
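A minimal sketch of the size calculation, assuming the linear relation 2 × version + 21 implied by the sizes quoted in this article (23 modules for Version 1, 65 for Version 22, 189 for Version 84). The function name `symbol_size` is hypothetical.

```python
def symbol_size(version: int) -> int:
    """Side length in modules of a Han Xin code symbol for versions 1..84."""
    if not 1 <= version <= 84:
        raise ValueError("Han Xin code versions run from 1 to 84")
    # Assumed linear relation: each version adds two modules per side.
    return 2 * version + 21


if __name__ == "__main__":
    for v in (1, 22, 84):
        side = symbol_size(v)
        print(f"Version {v}: {side} x {side} modules")
```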
Han Xin code symbol is constructed from the following elements: [3] : 4.2
The Finder Pattern [3] : 4.2.3 consists of four Position Detection Patterns located at the four corners of the barcode. Each Position Detection Pattern is 7×7 modules in size and is built from 5 nested elements: a dark 7 × 7 square, a light 6 × 6 square, a dark 5 × 5 square, a light 4 × 4 square, and a dark 3 × 3 square.
The scanning ratio of each Position Detection Pattern is 1:1:1:1:3 or 3:1:1:1:1 (depending on the scanning direction). The orientation of the four patterns allows the location and orientation of the barcode to be detected unambiguously.
Every pattern has a Position Detection Pattern separator [3] : 4.2.4 with the Structural Information Region aligned to it.
The Alignment Patterns [3] : 4.2.5 are added to Han Xin code starting from Version 4 (Versions 1–3 do not have alignment patterns) and are used to determine cell positions precisely in distorted barcodes. Alignment Patterns in Han Xin code are split into:
The Alignment Pattern is made up of a dark line and an adjacent light line on its lower side, each one module wide. The Assistant Alignment Pattern, consisting of 5 light modules and 1 dark module, marks the edge of a region block with its dark module.
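The nested squares and the 1:1:1:1:3 ratio can be illustrated with a short sketch. It assumes that all five squares share one corner of the 7 × 7 area (here the bottom-right corner), which is what produces a 1:1:1:1:3 run along the last row. The orientation used in each corner of a real symbol is defined by the standard and is not modelled here; the function name is hypothetical.

```python
def position_detection_pattern() -> list[list[int]]:
    """Build a 7x7 pattern (1 = dark, 0 = light) from nested corner-anchored squares."""
    rows = []
    for r in range(7):
        row = []
        for c in range(7):
            # Layer 0 is the outer 7x7 square, layer 1 the 6x6 square, and so on.
            layer = min(r, c)
            dark = layer in (0, 2) or layer >= 4  # dark 7x7, dark 5x5, dark 3x3
            row.append(1 if dark else 0)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in position_detection_pattern():
        print("".join("#" if m else "." for m in row))
```

Scanning the last row from left to right gives dark, light, dark, light, then three dark modules, which matches the 1:1:1:1:3 ratio.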
Below you can see examples of Han Xin code with different Alignment pattern placement.
The Structural Information Region [3] : 4.2.7 is a one-module-wide region surrounding the four Position Detection Patterns. Han Xin code has two identical Structural Information arrays, each made of 34 data modules. Every Structural Information array is split into 17-module parts, which are placed around each Position Detection Pattern.
Structural Information Region encodes the following data: [3] : Annex E
Metadata bits 0–11 are split into three 4-bit tetrads (m2, m1, m0) and supplemented with four error correction tetrads (r3, r2, r1, r0).
Bits | Tetrads | Content |
---|---|---|
X0–X7 | m2, m1 | Version + 20 |
X8–X9 | m0 (first two bits) | Error correction level |
X10–X11 | m0 (last two bits) | Mask index |
X12–X27 | r3, r2, r1, r0 | Error correction codewords |
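The packing of the 12 metadata bits can be sketched as follows. This is only an illustration under stated assumptions: the fields are taken in the order Version + 20 (8 bits), error correction level (2 bits), mask index (2 bits), and each field is assumed to be written most significant bit first. The four error correction tetrads are not computed here, and the function name is hypothetical.

```python
def pack_structural_info(version: int, ec_level: int, mask: int) -> tuple[int, int, int]:
    """Return the metadata tetrads (m2, m1, m0) before error correction is added.

    ec_level and mask are the 2-bit encoded values (0..3).
    """
    if not (1 <= version <= 84 and 0 <= ec_level <= 3 and 0 <= mask <= 3):
        raise ValueError("argument out of range")
    # Assumed layout: 8 bits of (version + 20), then 2 bits level, then 2 bits mask.
    bits = ((version + 20) << 4) | (ec_level << 2) | mask
    return (bits >> 8) & 0xF, (bits >> 4) & 0xF, bits & 0xF


if __name__ == "__main__":
    print(pack_structural_info(1, 0, 0))
```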
To bring the ratio of dark to light modules in the symbol close to 1:1, a masking algorithm [3] : 5.8.4 is used. The mask sequence is applied to the Data Region with an XOR operation. The Finder Pattern, Alignment Patterns and Structural Information Regions are excluded from masking. The following table shows the mask pattern conditions and the pattern references that are stored in the Structural Information Region.
Condition of masking solution | Data mask pattern reference |
---|---|
No masking | 00 |
(i + j) mod 2 = 0 | 01 |
((i + j) mod 3 + (j mod 3)) mod 2 = 0 | 10 |
(i mod j + j mod i + i mod 3 + j mod 3) mod 2 = 0 | 11 |
i - Row index of the symbol.
j - Column index of the symbol.
Both i and j start from 1, so the top left corner module of the symbol is (1,1). When the masking condition is true, the resulting mask bit is 1.
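The mask conditions in the table can be written down directly. The sketch below is illustrative only: the names `mask_bit` and `apply_mask` are hypothetical, and the exclusion of function patterns from masking is left to the caller.

```python
def mask_bit(pattern: int, i: int, j: int) -> int:
    """Mask bit for row i and column j (both 1-based) under a 2-bit pattern reference."""
    if i < 1 or j < 1:
        raise ValueError("row and column indices start from 1")
    if pattern == 0b00:
        return 0  # no masking
    if pattern == 0b01:
        return int((i + j) % 2 == 0)
    if pattern == 0b10:
        return int(((i + j) % 3 + (j % 3)) % 2 == 0)
    if pattern == 0b11:
        return int((i % j + j % i + i % 3 + j % 3) % 2 == 0)
    raise ValueError("pattern reference must be in 0..3")


def apply_mask(module: int, pattern: int, i: int, j: int) -> int:
    """XOR a data-region module (0 or 1) with the mask bit at its position."""
    return module ^ mask_bit(pattern, i, j)


if __name__ == "__main__":
    for pattern in range(4):
        print(pattern, [mask_bit(pattern, 1, j) for j in range(1, 8)])
```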
Han Xin code uses Reed–Solomon error correction. Encoded data is represented as a byte (8-bit) array. The data array is divided into blocks, [3] : Annex B and a sequence of error correction codewords is generated for each block and appended to the end of that block. After this, all blocks are merged sequentially into a byte stream.
The polynomial arithmetic for Han Xin code uses the finite field generator polynomial x^8 + x^6 + x^5 + x + 1 (355 or 101100011b) [3] : 5.5 with initial root = 1.
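As a hedged illustration, the following sketch builds GF(256) tables for the polynomial 101100011b (0x163) and computes Reed–Solomon check bytes by polynomial division, with generator roots α^1 … α^n (reading "initial root = 1" as the first exponent). It is a generic textbook encoder, not the reference implementation from the standard, and the order in which check bytes are written into a symbol is not covered. All function names are hypothetical.

```python
PRIMITIVE_POLY = 0x163  # 101100011b = x^8 + x^6 + x^5 + x + 1

# Exponent and logarithm tables for GF(256) with generator element 2.
EXP = [0] * 510
LOG = [0] * 256
_value = 1
for _power in range(255):
    EXP[_power] = _value
    LOG[_value] = _power
    _value <<= 1
    if _value & 0x100:
        _value ^= PRIMITIVE_POLY
for _power in range(255, 510):
    EXP[_power] = EXP[_power - 255]


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def rs_generator(n_ecc: int) -> list[int]:
    """Generator polynomial with roots alpha^1..alpha^n_ecc, highest degree first."""
    g = [1]
    for i in range(1, n_ecc + 1):
        nxt = [0] * (len(g) + 1)
        for k, coef in enumerate(g):
            nxt[k] ^= coef                      # coef * x
            nxt[k + 1] ^= gf_mul(coef, EXP[i])  # coef * alpha^i
        g = nxt
    return g


def rs_encode(data: list[int], n_ecc: int) -> list[int]:
    """Return n_ecc check bytes: the remainder of data * x^n_ecc divided by the generator."""
    g = rs_generator(n_ecc)
    rem = list(data) + [0] * n_ecc
    for i in range(len(data)):
        lead = rem[i]
        if lead:
            for k in range(1, len(g)):
                rem[i + k] ^= gf_mul(g[k], lead)
    return rem[len(data):]


def poly_eval(poly: list[int], x: int) -> int:
    """Evaluate a polynomial (highest degree first) at x using Horner's rule."""
    y = 0
    for coef in poly:
        y = gf_mul(y, x) ^ coef
    return y
```

A valid codeword (data followed by its check bytes) evaluates to zero at every root of the generator polynomial, which gives a simple self-check for the encoder.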
The number of error correction codewords depends on the symbol version and error correction level and ranges from 16% to 60% of all codewords, which allows 8% to 30% damage to be corrected. [3] : 5.6.2
Error correction level | Recovery capacity % (approximation) | Encoding of error correction level |
---|---|---|
L1 | 8% | 00 |
L2 | 15% | 01 |
L3 | 23% | 10 |
L4 | 30% | 11 |
Han Xin code data is encoded as a byte array. The data byte array is split into error correction blocks, to which error correction codewords (bytes) are added. The error correction blocks are then joined into one codeword array: [3] : 5.8.3
(Encoded byte array) => (Error correction block 1) + ... + (Error correction block N) => (Codewords array)
As an example, consider Han Xin code version 5 with error correction level L4. It has 27 data codewords and 2 error correction blocks, whose sizes in (data codewords, error correction codewords) are (14, 20) and (13, 22):
(D1...D14, D15...D27) => (D1...D14, E1.1...1.20) + (D15...D27, E2.1...2.22) => (D1...D14, E1.1...1.20, D15...D27, E2.1...2.22) => (C1...C69)
D(x) - Data codewords.
E(b.x) - error correction codeword, where b is the block number and x is the position in the block.
C(x) - resulting codewords.
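The block assembly for the version 5, level L4 example can be sketched as below. The error correction routine is passed in as a parameter (`ecc_fn`, a hypothetical name) so that the sketch stays independent of any particular Reed–Solomon implementation; the example uses a placeholder that returns zero bytes.

```python
from typing import Callable, List, Sequence, Tuple


def assemble_codewords(
    data: Sequence[int],
    blocks: Sequence[Tuple[int, int]],
    ecc_fn: Callable[[List[int], int], List[int]],
) -> List[int]:
    """Append error correction codewords to each block and join the blocks in order.

    blocks holds (data codewords, error correction codewords) per block.
    """
    if sum(n_data for n_data, _ in blocks) != len(data):
        raise ValueError("block sizes do not match the number of data codewords")
    out: List[int] = []
    pos = 0
    for n_data, n_ecc in blocks:
        chunk = list(data[pos:pos + n_data])
        pos += n_data
        out.extend(chunk)
        out.extend(ecc_fn(chunk, n_ecc))
    return out


if __name__ == "__main__":
    # Placeholder error correction: zero bytes instead of real Reed-Solomon output.
    def dummy_ecc(chunk: List[int], n: int) -> List[int]:
        return [0] * n

    codewords = assemble_codewords(list(range(1, 28)), [(14, 20), (13, 22)], dummy_ecc)
    print(len(codewords))
```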
In the next step, the resulting codeword array C(x) is split into blocks of 13 bytes, and the codewords in the same position of each block are joined to form a new codeword array. The result is a byte array of the same size, interleaved with a stride of 13.
(C1...C13, C14...C26, Cn...Cn+12) => (C1, C14, Cn...C13, C26, Cn+12) => (CM1...CMn+12)
CM(x) - array of codewords (bytes) interleaved with a stride of 13.
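A minimal sketch of this rearrangement, assuming that a final block shorter than 13 codewords simply contributes fewer entries. The function name is hypothetical.

```python
from typing import List, Sequence


def interleave_by_13(codewords: Sequence[int]) -> List[int]:
    """Take the first codeword of every 13-byte block, then the second, and so on."""
    return [c for start in range(13) for c in codewords[start::13]]


if __name__ == "__main__":
    print(interleave_by_13(list(range(27)))[:6])
```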
After the operations above, the resulting codewords are placed into the data region row by row, from left to right and from top to bottom. As a result, horizontal line damage affects fewer codewords, while vertical line damage affects more codewords.
Han Xin code can encode 7827 numeric characters, 4350 English text characters, 3261 bytes, or 1044–2174 Chinese characters in its largest version, 84. [3] : Annex C Additionally, it supports special Unicode and industrial modes. All modes can be mixed to obtain the best compaction of the data. The following table shows the data capacity for different barcode versions and error correction levels.
Version | Size | Error correction level | Data codewords | Error correction codewords | Numeric | Text | Bytes | Chinese characters |
---|---|---|---|---|---|---|---|---|
1 | 23×23 | L1 | 21 | 4 | 45 | 26 | 18 | 6–12 |
1 | 23×23 | L4 | 9 | 16 | 15 | 10 | 6 | 2–4 |
... | | | | | | | | |
22 | 65×65 | L1 | 354 | 68 | 843 | 470 | 351 | 113–234 |
22 | 65×65 | L4 | 168 | 254 | 399 | 222 | 165 | 53–110 |
... | | | | | | | | |
84 | 189×189 | L1 | 3264 | 622 | 7827 | 4350 | 3261 | 1044–2174 |
84 | 189×189 | L4 | 1554 | 2332 | 3723 | 2070 | 1551 | 497–1034 |
All encoding modes can be split into the following groups: [3] : 5.3.1
Mode | Mode indicators | Bits per character |
---|---|---|
Numeric | 0001b | 3.3 (10 bits for three digits) |
Text | 0010b | 6 |
Binary Byte | 0011b | 8 |
Common Chinese Characters in Region One | 0100b | 12 |
Common Chinese Characters in Region Two | 0101b | 12 |
GB18030 2-byte Region | 0110b | 15 |
GB18030 4-byte Region | 0111b | 21 |
ECI | 1000b | Variable (multi-bytes mode) |
Unicode | 1001b | Adaptive (lossless compression) |
GS1 | 11100001b | Variable (Numeric + Text modes) |
URI | 11100010b | Variable (2–7 bits per character) |
The input data string in Numeric mode [3] : 5.4.4 is divided into groups of three digits (the last group can be shorter), and each group is encoded in 10 bits (0000000000b - 1111100111b). The mode data is prefixed with the mode indicator 0001b and ends with a mode terminator, which also indicates the number of digits in the last group.
Numeric characters in last group | Mode terminator |
---|---|
1 | 1111111101b |
2 | 1111111110b |
3 | 1111111111b |
As an example, the digit sequence 12700402 is encoded as follows:
Prefix => 0001b
127 => 0001111111
004 => 0000000100
02 => 0000000010
Terminator => 1111111110b
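The steps of this example can be reproduced with a short sketch. It only builds the Numeric mode bit string (indicator, 10-bit groups, terminator) and ignores padding and mixing with other modes. The function name is hypothetical.

```python
def encode_numeric(digits: str) -> str:
    """Return the Numeric mode bit string for a non-empty string of ASCII digits."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("Numeric mode accepts only the digits 0-9")
    groups = [digits[k:k + 3] for k in range(0, len(digits), 3)]
    bits = ["0001"]  # mode indicator
    bits += [format(int(group), "010b") for group in groups]
    # Terminators 1111111101b, 1111111110b, 1111111111b for 1, 2, 3 digits in the last group.
    bits.append(format(1020 + len(groups[-1]), "010b"))
    return "".join(bits)


if __name__ == "__main__":
    print(encode_numeric("12700402"))
```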
Text mode encodes characters from the ISO/IEC 646 character set. Each character is represented by 6 bits. [3] : 5.4.5 All characters are divided into two subsets: the Text1 sub-mode and the Text2 sub-mode. The value 111110b is used to switch between text sub-modes, and 111111b is the mode terminator. Text mode starts in the Text1 sub-mode. The first table below lists the Text1 sub-mode characters and the second lists the Text2 sub-mode characters.
Character | ASCII value | Encoding value | Character | ASCII value | Encoding value | Character | ASCII value | Encoding value |
---|---|---|---|---|---|---|---|---|
0 | 48 | 000000b | L | 76 | 010101b | g | 103 | 101010b |
1 | 49 | 000001b | M | 77 | 010110b | h | 104 | 101011b |
2 | 50 | 000010b | N | 78 | 010111b | i | 105 | 101100b |
3 | 51 | 000011b | O | 79 | 011000b | j | 106 | 101101b |
4 | 52 | 000100b | P | 80 | 011001b | k | 107 | 101110b |
5 | 53 | 000101b | Q | 81 | 011010b | l | 108 | 101111b |
6 | 54 | 000110b | R | 82 | 011011b | m | 109 | 110000b |
7 | 55 | 000111b | S | 83 | 011100b | n | 110 | 110001b |
8 | 56 | 001000b | T | 84 | 011101b | o | 111 | 110010b |
9 | 57 | 001001b | U | 85 | 011110b | p | 112 | 110011b |
A | 65 | 001010b | V | 86 | 011111b | q | 113 | 110100b |
B | 66 | 001011b | W | 87 | 100000b | r | 114 | 110101b |
C | 67 | 001100b | X | 88 | 100001b | s | 115 | 110110b |
D | 68 | 001101b | Y | 89 | 100010b | t | 116 | 110111b |
E | 69 | 001110b | Z | 90 | 100011b | u | 117 | 111000b |
F | 70 | 001111b | a | 97 | 100100b | v | 118 | 111001b |
G | 71 | 010000b | b | 98 | 100101b | w | 119 | 111010b |
H | 72 | 010001b | c | 99 | 100110b | x | 120 | 111011b |
I | 73 | 010010b | d | 100 | 100111b | y | 121 | 111100b |
J | 74 | 010011b | e | 101 | 101000b | z | 122 | 111101b |
K | 75 | 010100b | f | 102 | 101001b |
Character | ASCII value | Encoding value | Character | ASCII value | Encoding value | Character | ASCII value | Encoding value |
---|---|---|---|---|---|---|---|---|
NUL | 0 | 000000b | NAK | 21 | 010101b | . | 46 | 101010b |
SOH | 1 | 000001b | SYN | 22 | 010110b | / | 47 | 101011b |
STX | 2 | 000010b | ETB | 23 | 010111b | : | 58 | 101100b |
ETX | 3 | 000011b | CAN | 24 | 011000b | ; | 59 | 101101b |
EOT | 4 | 000100b | EM | 25 | 011001b | < | 60 | 101110b |
ENQ | 5 | 000101b | SUB | 26 | 011010b | = | 61 | 101111b |
ACK | 6 | 000110b | ESC | 27 | 011011b | > | 62 | 110000b |
BEL | 7 | 000111b | SP | 32 | 011100b | ? | 63 | 110001b |
BS | 8 | 001000b | ! | 33 | 011101b | @ | 64 | 110010b |
HT | 9 | 001001b | " | 34 | 011110b | [ | 91 | 110011b |
LF | 10 | 001010b | # | 35 | 011111b | \ | 92 | 110100b |
VT | 11 | 001011b | $ | 36 | 100000b | ] | 93 | 110101b |
FF | 12 | 001100b | % | 37 | 100001b | ^ | 94 | 110110b |
CR | 13 | 001101b | & | 38 | 100010b | _ | 95 | 110111b |
SO | 14 | 001110b | ' | 39 | 100011b | ` | 96 | 111000b |
SI | 15 | 001111b | ( | 40 | 100100b | { | 123 | 111001b |
DLE | 16 | 010000b | ) | 41 | 100101b | | | 124 | 111010b |
DC1 | 17 | 010001b | * | 42 | 100110b | } | 125 | 111011b |
DC2 | 18 | 010010b | + | 43 | 100111b | ~ | 126 | 111100b |
DC3 | 19 | 010011b | , | 44 | 101000b | DEL | 127 | 111101b |
DC4 | 20 | 010100b | - | 45 | 101001b |
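The two sub-mode tables follow regular ASCII ranges, so they can be generated rather than typed in. The sketch below builds both tables and encodes a string, switching sub-modes with the 6-bit value 111110b (62, the only value left unused by the tables and the terminator). It assumes the terminator 111111b is valid in either sub-mode; the names are hypothetical.

```python
import string

# Text1: digits, upper-case letters, lower-case letters (values 0..61).
TEXT1 = string.digits + string.ascii_uppercase + string.ascii_lowercase
# Text2: control characters 0-27, then the remaining punctuation ranges and DEL (values 0..61).
TEXT2 = "".join(
    chr(code)
    for lo, hi in ((0, 27), (32, 47), (58, 64), (91, 96), (123, 127))
    for code in range(lo, hi + 1)
)

SWITCH = "111110"      # switch between Text1 and Text2
TERMINATOR = "111111"  # end of Text mode


def encode_text(text: str) -> str:
    """Return the Text mode bit string, starting in the Text1 sub-mode."""
    bits = ["0010"]  # mode indicator
    submode = 1
    for ch in text:
        if ch in TEXT1:
            wanted, value = 1, TEXT1.index(ch)
        elif ch in TEXT2:
            wanted, value = 2, TEXT2.index(ch)
        else:
            raise ValueError(f"character {ch!r} is outside ISO/IEC 646")
        if wanted != submode:
            bits.append(SWITCH)
            submode = wanted
        bits.append(format(value, "06b"))
    bits.append(TERMINATOR)
    return "".join(bits)


if __name__ == "__main__":
    print(encode_text("a b"))
```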
Binary mode encodes an array of bytes with any values [0 – 255]. Binary mode [3] : 5.4.6 consists of the binary mode indicator 0011b, a 13-bit byte counter, and the data bytes, each written as an 8-bit sequence. No mode terminator is required.
The Chinese character modes are a set of 4 modes which encode Chinese characters from the GB 18030 code page.
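A minimal sketch of the Binary mode layout described above (indicator, 13-bit counter, 8 bits per byte). The function name is hypothetical, and the upper limit simply reflects what fits into a 13-bit counter.

```python
def encode_binary(data: bytes) -> str:
    """Return the Binary mode bit string: 0011b, 13-bit length, then 8 bits per byte."""
    if len(data) >= 1 << 13:
        raise ValueError("length does not fit into the 13-bit counter")
    header = "0011" + format(len(data), "013b")
    return header + "".join(format(byte, "08b") for byte in data)


if __name__ == "__main__":
    print(encode_binary(b"\x01\xff"))
```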
Mode | Mode indicator | Bits | Encoding characters count | Description |
---|---|---|---|---|
Common Chinese Characters in Region One mode [3] : 5.4.7 | 0100b | 12 | 4074 | Encodes characters from three GB 18030 regions: first byte in the range B0 to D7 with second byte in the range A1 to FE (3760 characters); first byte in the range A1 to A3 with second byte in the range A1 to FE (282 characters); and the range A8A1 to A8C0 (32 characters). |
Common Chinese Characters in Region Two mode [3] : 5.4.8 | 0101b | 12 | 3008 | Encodes characters from the GB 18030 region whose first byte is in the range D8 to F7 and whose second byte is in the range A1 to FE (3008 characters). |
GB18030 2-byte Region mode [3] : 5.4.9 | 0110b | 15 | 23940 | Encodes characters from the GB 18030 region whose first byte is in the range 81 to FE and whose second byte is in the range 40 to 7E or 80 to FE (23940 characters). |
GB18030 4-byte Region mode [3] : 5.4.10 | 0111b | 21 | 1587600 | Encodes characters from the GB 18030 region whose first byte is in the range 81 to FE, second byte in the range 30 to 39, third byte in the range 81 to FE, and fourth byte in the range 30 to 39 (1587600 characters). |
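The byte ranges in the table can be turned into a small classifier that reports which mode covers a given GB 18030 byte sequence. This is only a sketch of the range checks; it prefers the two 12-bit regions over the 15-bit 2-byte region when both apply, which is an assumption about what an encoder would choose, and it does not compute the encoded values. The function name is hypothetical.

```python
def chinese_mode(code: bytes) -> str:
    """Name the Chinese character mode whose byte ranges cover one GB 18030 character."""
    if len(code) == 2:
        b1, b2 = code
        if 0xA1 <= b2 <= 0xFE:
            if 0xB0 <= b1 <= 0xD7 or 0xA1 <= b1 <= 0xA3:
                return "Region One"
            if b1 == 0xA8 and b2 <= 0xC0:
                return "Region One"
            if 0xD8 <= b1 <= 0xF7:
                return "Region Two"
        if 0x81 <= b1 <= 0xFE and (0x40 <= b2 <= 0x7E or 0x80 <= b2 <= 0xFE):
            return "GB18030 2-byte"
    elif len(code) == 4:
        b1, b2, b3, b4 = code
        if (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
                and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
            return "GB18030 4-byte"
    raise ValueError("not a GB 18030 two-byte or four-byte sequence")


if __name__ == "__main__":
    # The character counts in the table follow from the sizes of the ranges.
    print(40 * 94 + 3 * 94 + 32, 32 * 94, 126 * (63 + 127), 126 * 10 * 126 * 10)
```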
Unicode mode [3] : 5.4.12 encodes the UTF-8 character set with built-in lossless compression. In Unicode mode, the input data is analysed with a self-adaptive algorithm. First, the input data is divided and combined into 1-, 2-, 3-, or 4-byte pattern pre-encoding sub-sequences; second, a run-length data compression algorithm is applied to encode each sub-sequence of the input data.
In short, Unicode mode looks for character sub-pages in which all characters of the same language (Cyrillic, Greek, French, German and so on) share the same prefix byte sequence, and encodes only the differences from that prefix.
Han Xin code GS1 mode [3] : 5.4.13 indicates that the represented data is defined by the GS1 General Specification. GS1 mode encodes data in Numeric and Text modes. Other modes may be used, but GS1 mode must be the first mode in the symbol and the decoded data must be returned with a GS1 flag. <FNC1> (if required) must be encoded as 1111101000b in Numeric mode (a Numeric mode group holds at most three digits, that is values up to 999, so 1111101000b, the value 1000, is treated as a special character). If an <FNC1> identifier must be inserted while the encoder is in any mode other than Numeric, that mode must be terminated and Numeric mode must be started. The GS1 mode indicator is 11100001b and the GS1 mode terminator is 11111111b.
The data in GS1 mode is split into GS1 Application Identifier chunks and then compacted with the best modes. As an example, the following data can be encoded:
(10)123456ABC<FNC1>(240)DATA
The data is encoded in the following way:
<11100001b> <Numeric 10123456> <Text ABC> <Numeric mode selector> <1111101000b> <Numeric 240> <Text DATA> <11111111b>
Han Xin code URI mode [3] : 5.4.14 encodes URI links compactly. The URI mode indicator is 11100010b and the URI mode terminator is 111b. URI mode can encode data in three character sets, URI-A, URI-B and URI-C, [3] : Annex M each with its own sub-mode terminator. URI mode can also encode %XX data in a special Percent-Encoding sub-mode, where each three-character %XX sequence is encoded in 8 bits.
Charset | Charset indicator |
---|---|
URI-A | 001b |
URI-B | 010b |
URI-C | 011b |
Percent-Encoding | 100b |
URI Mode Terminator | 111b |
Percent-Encoding sub-mode encodes %XX data as 8-bit sequences. The sub-mode does not require a terminator. To encode URI %XX data in this sub-mode, the sub-mode indicator (100b) is added first, followed by an 8-bit counter of the 8-bit sequences (counter = length of the %XX data / 3), followed by the data itself, where each sequence such as %FF, %ff, or %00 is added as the byte xFF or x00.
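A sketch of the Percent-Encoding sub-mode as described above: indicator, 8-bit counter, then one byte per %XX sequence. The limit of 255 sequences per run is an assumption derived from the 8-bit counter, and the function name is hypothetical.

```python
import re


def encode_percent_run(run: str) -> str:
    """Return the bit string for a run of %XX sequences in Percent-Encoding sub-mode."""
    if not re.fullmatch(r"(?:%[0-9A-Fa-f]{2})+", run):
        raise ValueError("expected one or more %XX sequences")
    values = [int(run[k + 1:k + 3], 16) for k in range(0, len(run), 3)]
    if len(values) > 255:
        raise ValueError("too many sequences for an 8-bit counter")
    counter = format(len(values), "08b")  # counter = length of the %XX data / 3
    return "100" + counter + "".join(format(v, "08b") for v in values)


if __name__ == "__main__":
    print(encode_percent_run("%FF%00"))
```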
URI-A character / URI fragment | Encoding value | Encoding bits | URI-B character / URI fragment | Encoding value | Encoding bits |
---|---|---|---|---|---|
a | 0 | 000000 | A | 0 | 000000 |
b | 1 | 000001 | B | 1 | 000001 |
c | 2 | 000010 | C | 2 | 000010 |
d | 3 | 000011 | D | 3 | 000011 |
e | 4 | 000100 | E | 4 | 000100 |
f | 5 | 000101 | F | 5 | 000101 |
g | 6 | 000110 | G | 6 | 000110 |
h | 7 | 000111 | H | 7 | 000111 |
i | 8 | 001000 | I | 8 | 001000 |
j | 9 | 001001 | J | 9 | 001001 |
k | 10 | 001010 | K | 10 | 001010 |
l | 11 | 001011 | L | 11 | 001011 |
m | 12 | 001100 | M | 12 | 001100 |
n | 13 | 001101 | N | 13 | 001101 |
o | 14 | 001110 | O | 14 | 001110 |
p | 15 | 001111 | P | 15 | 001111 |
q | 16 | 010000 | Q | 16 | 010000 |
r | 17 | 010001 | R | 17 | 010001 |
s | 18 | 010010 | S | 18 | 010010 |
t | 19 | 010011 | T | 19 | 010011 |
u | 20 | 010100 | U | 20 | 010100 |
v | 21 | 010101 | V | 21 | 010101 |
w | 22 | 010110 | W | 22 | 010110 |
x | 23 | 010111 | X | 23 | 010111 |
y | 24 | 011000 | Y | 24 | 011000 |
z | 25 | 011001 | Z | 25 | 011001 |
0 | 26 | 011010 | ! | 26 | 011010 |
1 | 27 | 011011 | * | 27 | 011011 |
2 | 28 | 011100 | ( | 28 | 011100 |
3 | 29 | 011101 | ) | 29 | 011101 |
4 | 30 | 011110 | , | 30 | 011110 |
5 | 31 | 011111 | { | 31 | 011111 |
6 | 32 | 100000 | } | 32 | 100000 |
7 | 33 | 100001 | | | 33 | 100001 |
8 | 34 | 100010 | \ | 34 | 100010 |
9 | 35 | 100011 | ^ | 35 | 100011 |
. | 36 | 100100 | [ | 36 | 100100 |
/ | 37 | 100101 | ] | 37 | 100101 |
- | 38 | 100110 | ' | 38 | 100110 |
_ | 39 | 100111 | < | 39 | 100111 |
~ | 40 | 101000 | > | 40 | 101000 |
: | 41 | 101001 | % | 41 | 101001 |
@ | 42 | 101010 | " | 42 | 101010 |
? | 43 | 101011 | ; | 43 | 101011 |
# | 44 | 101100 | .htm | 44 | 101100 |
= | 45 | 101101 | .html | 45 | 101101 |
+ | 46 | 101110 | .asp | 46 | 101110 |
$ | 47 | 101111 | .aspx | 47 | 101111 |
& | 48 | 110000 | .php | 48 | 110000 |
http:// | 49 | 110001 | .jsp | 49 | 110001 |
https:// | 50 | 110010 | gtin | 50 | 110010 |
ftp:// | 51 | 110011 | ser | 51 | 110011 |
mailto: | 52 | 110100 | bat | 52 | 110100 |
ldap:// | 53 | 110101 | exp | 53 | 110101 |
tel: | 54 | 110110 | search | 54 | 110110 |
urn: | 55 | 110111 | id | 55 | 110111 |
www. | 56 | 111000 | .jp | 56 | 111000 |
.com | 57 | 111001 | .it | 57 | 111001 |
.net | 58 | 111010 | .de | 58 | 111010 |
.gov | 59 | 111011 | .br | 59 | 111011 |
.org | 60 | 111100 | .fr | 60 | 111100 |
.cn | 61 | 111101 | gs1 | 61 | 111101 |
Jump to URI-B | 62 | 111110 | Jump to URI-A | 62 | 111110 |
Terminator of URI-A | 63 | 111111 | Terminator of URI-B | 63 | 111111 |
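To illustrate how the fragment entries shorten a link, the sketch below encodes a URI using only the URI-A character set. It assumes a greedy longest-match choice of table entries and the sequence: URI mode indicator, URI-A indicator, 6-bit values, URI-A terminator, URI mode terminator. A real encoder would also switch between URI-A, URI-B, URI-C and Percent-Encoding as needed; the names used here are hypothetical.

```python
# URI-A table in value order (0..61), as listed in the table above.
URI_A = list("abcdefghijklmnopqrstuvwxyz0123456789./-_~:@?#=+$&") + [
    "http://", "https://", "ftp://", "mailto:", "ldap://", "tel:", "urn:",
    "www.", ".com", ".net", ".gov", ".org", ".cn",
]
_BY_LENGTH = sorted(URI_A, key=len, reverse=True)


def encode_uri_a(uri: str) -> str:
    """Return the URI mode bit string for a URI that fits entirely in URI-A."""
    bits = ["11100010", "001"]  # URI mode indicator, URI-A charset indicator
    pos = 0
    while pos < len(uri):
        for token in _BY_LENGTH:  # longest fragments are tried first
            if uri.startswith(token, pos):
                bits.append(format(URI_A.index(token), "06b"))
                pos += len(token)
                break
        else:
            raise ValueError(f"{uri[pos]!r} is not in the URI-A character set")
    bits += ["111111", "111"]  # terminator of URI-A, URI mode terminator
    return "".join(bits)


if __name__ == "__main__":
    encoded = encode_uri_a("http://www.a.cn")
    print(len(encoded), "bits for 15 characters")
```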
Character / URI fragment | Encoding value | Encoding bits | Character / URI fragment | Encoding value | Encoding bits | Character / URI fragment | Encoding value | Encoding bits |
---|---|---|---|---|---|---|---|---|
a | 0 | 0000000 | R | 43 | 0101011 | ; | 86 | 1010110 |
b | 1 | 0000001 | S | 44 | 0101100 | / | 87 | 1010111 |
c | 2 | 0000010 | T | 45 | 0101101 | ? | 88 | 1011000 |
d | 3 | 0000011 | U | 46 | 0101110 | : | 89 | 1011001 |
e | 4 | 0000100 | V | 47 | 0101111 | @ | 90 | 1011010 |
f | 5 | 0000101 | W | 48 | 0110000 | & | 91 | 1011011 |
g | 6 | 0000110 | X | 49 | 0110001 | = | 92 | 1011100 |
h | 7 | 0000111 | Y | 50 | 0110010 | http:// | 93 | 1011101 |
i | 8 | 0001000 | Z | 51 | 0110011 | https:// | 94 | 1011110 |
j | 9 | 0001001 | 0 | 52 | 0110100 | ftp:// | 95 | 1011111 |
k | 10 | 0001010 | 1 | 53 | 0110101 | mailto: | 96 | 1100000 |
l | 11 | 0001011 | 2 | 54 | 0110110 | ldap:// | 97 | 1100001 |
m | 12 | 0001100 | 3 | 55 | 0110111 | tel: | 98 | 1100010 |
n | 13 | 0001101 | 4 | 56 | 0111000 | urn: | 99 | 1100011 |
o | 14 | 0001110 | 5 | 57 | 0111001 | www. | 100 | 1100100 |
p | 15 | 0001111 | 6 | 58 | 0111010 | .com | 101 | 1100101 |
q | 16 | 0010000 | 7 | 59 | 0111011 | .net | 102 | 1100110 |
r | 17 | 0010001 | 8 | 60 | 0111100 | .gov | 103 | 1100111 |
s | 18 | 0010010 | 9 | 61 | 0111101 | .org | 104 | 1101000 |
t | 19 | 0010011 | $ | 62 | 0111110 | .cn | 105 | 1101001 |
u | 20 | 0010100 | - | 63 | 0111111 | .htm | 106 | 1101010 |
v | 21 | 0010101 | _ | 64 | 1000000 | .html | 107 | 1101011 |
w | 22 | 0010110 | . | 65 | 1000001 | .asp | 108 | 1101100 |
x | 23 | 0010111 | + | 66 | 1000010 | .aspx | 109 | 1101101 |
y | 24 | 0011000 | ! | 67 | 1000011 | .php | 110 | 1101110 |
z | 25 | 0011001 | * | 68 | 1000100 | .jsp | 111 | 1101111 |
A | 26 | 0011010 | ( | 69 | 1000101 | gtin | 112 | 1110000 |
B | 27 | 0011011 | ) | 70 | 1000110 | ser | 113 | 1110001 |
C | 28 | 0011100 | , | 71 | 1000111 | bat | 114 | 1110010 |
D | 29 | 0011101 | { | 72 | 1001000 | exp | 115 | 1110011 |
E | 30 | 0011110 | } | 73 | 1001001 | search | 116 | 1110100 |
F | 31 | 0011111 | | | 74 | 1001010 | id | 117 | 1110101 |
G | 32 | 0100000 | \ | 75 | 1001011 | .jp | 118 | 1110110 |
H | 33 | 0100001 | ^ | 76 | 1001100 | .it | 119 | 1110111 |
I | 34 | 0100010 | ~ | 77 | 1001101 | .de | 120 | 1111000 |
J | 35 | 0100011 | [ | 78 | 1001110 | .br | 121 | 1111001 |
K | 36 | 0100100 | ] | 79 | 1001111 | .fr | 122 | 1111010 |
L | 37 | 0100101 | ' | 80 | 1010000 | gs1 | 123 | 1111011 |
M | 38 | 0100110 | < | 81 | 1010001 | search | 124 | 1111100 |
N | 39 | 0100111 | > | 82 | 1010010 | Jump to URI-A | 125 | 1111101 |
O | 40 | 0101000 | # | 83 | 1010011 | Jump to URI-B | 126 | 1111110 |
P | 41 | 0101001 | % | 84 | 1010100 | Terminator of URI-C | 127 | 1111111 |
Q | 42 | 0101010 | " | 85 | 1010101 |