Comparison of data-serialization formats

This is a comparison of data-serialization formats, the various ways to convert complex objects to sequences of bits. It does not include markup languages used exclusively as document file formats.

Overview

Name | Creator-maintainer | Based on | Standardized? | Specification | Binary? | Human-readable? | Supports references? [e] | Schema-IDL? | Standard APIs | Supports zero-copy operations
Apache Avro Apache Software Foundation No Apache Avro™ Specification YesPartial g Built-inC, C#, C++, Java, PHP, Python, Ruby
Apache Parquet Apache Software Foundation No Apache Parquet YesNoNoJava, Python, C++No
Apache Thrift Facebook (creator)
Apache (maintainer)
No Original whitepaper YesPartial c NoBuilt-inC++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages [1]
ASN.1 ISO, IEC, ITU-T YesISO/IEC 8824 / ITU-T X.680 (syntax) and ISO/IEC 8825 / ITU-T X.690 (encoding rules) series. X.680, X.681, and X.683 define syntax and semantics. BER, DER, PER, OER, or custom via ECN XER, JER, GSER, or custom via ECN Yes f Built-in OER
Bencode Bram Cohen (creator)
BitTorrent, Inc. (maintainer)
De facto as BEPPart of BitTorrent protocol specification Except numbers and delimiters, being ASCIINoNoNoNoNo
BSON MongoDB JSON No BSON Specification YesNoNoNoNoNo
Cap'n Proto Kenton VardaNo Cap'n Proto Encoding Spec YesPartial h NoYesNoYes
CBOR Carsten Bormann, P. Hoffman MessagePack [2] Yes RFC 8949 YesNoYes,
through tagging
CDDL NoNo
Comma-separated values (CSV)RFC author:
Yakov Shafranovich
A myriad of informal variants RFC 4180
(among others)
NoYesNoNoNoNo
Common Data Representation (CDR) Object Management Group Yes General Inter-ORB Protocol YesNoYesYesAda, C, C++, Java, Cobol, Lisp, Python, Ruby, Smalltalk
D-Bus Message Protocol freedesktop.org Yes D-Bus Specification YesNoNoPartial
(Signature strings)
Yes
Efficient XML Interchange (EXI) W3C XML, Efficient XMLYes Efficient XML Interchange (EXI) Format 1.0 Yes XML XPointer, XPath XML Schema DOM, SAX, StAX, XQuery, XPath
Extensible Data Notation (edn) Rich Hickey / Clojure community Clojure Yes Official edn spec NoYesNoNoClojure, Ruby, Go, C++, Javascript, Java, CLR, ObjC, Python [3] No
FlatBuffers GoogleNo Flatbuffers GitHub Yes Apache Arrow Partial
(internal to the buffer)
Yes C++, Java, C#, Go, Python, Rust, JavaScript, PHP, C, Dart, Lua, TypeScriptYes
Fast Infoset ISO, IEC, ITU-T XML YesITU-T X.891 and ISO/IEC 24824-1:2007YesNo XPointer, XPath XML schema DOM, SAX, XQuery, XPath
FHIR Health Level 7 REST basicsYes Fast Healthcare Interoperability Resources YesYesYesYesHapi for FHIR [4] JSON, XML, Turtle No
Ion Amazon JSON No The Amazon Ion Specification YesYesNo Ion schema C, C#, Go, Java, JavaScript, Python, Rust
Java serialization Oracle Corporation Yes Java Object Serialization YesNoYesNoYes
JSON Douglas Crockford JavaScript syntax Yes STD 90/RFC 8259
(ancillary:
RFC 6901,
RFC 6902), ECMA-404, ISO/IEC 21778:2017
No, but see BSON, Smile, UBJSON Yes JSON Pointer (RFC 6901), or alternately, JSONPath, JPath, JSPON, json:select(); and JSON-LD Partial
(JSON Schema Proposal, ASN.1 with JER, Kwalify Archived 2021-08-12 at the Wayback Machine , Rx, JSON-LD
Partial
(Clarinet, JSONQuery / RQL, JSONPath), JSON-LD
No
MessagePack Sadayuki Furuhashi JSON (loosely)No MessagePack format specification YesNoNoNoNoYes
Netstrings Dan Bernstein No netstrings.txt Except ASCII delimitersYesNoNoNoYes
OGDL Rolf Veen?No Specification Binary specification Yes Path specification Schema WD
OPC-UA Binary OPC Foundation No opcfoundation.org YesNoYesNoNo
OpenDDL Eric Lengyel C, PHP No OpenDDL.org NoYesYesNo OpenDDL library
PHP serialization format PHP GroupYesNoYesYesYesNoYes
Pickle (Python) Guido van Rossum Python De facto as PEPs PEP 3154 – Pickle protocol version 4 YesNoYes [5] NoYesNo
Property list NeXT (creator)
Apple (maintainer)
?Partial Public DTD for XML format Yes a Yes b No? Cocoa, CoreFoundation, OpenStep, GnuStep No
Protocol Buffers (protobuf) | Google | | No | Developer Guide: Encoding, proto2 specification, and proto3 specification | Yes | Yes [d] | No | Built-in | C++, Java, C#, Python, Go, Ruby, Objective-C, C, Dart, Perl, PHP, R, Rust, Scala, Swift, Julia, Erlang, D, Haskell, ActionScript, Delphi, Elixir, Elm, GopherJS, Haxe, JavaScript, Kotlin, Lua, Matlab, Mercury, OCaml, Prolog, Solidity, TypeScript, Vala, Visual Basic | No
S-expressions John McCarthy (original)
Ron Rivest (internet draft)
Lisp, Netstrings Largely de facto "S-Expressions" Archived 2013-10-07 at the Wayback Machine Internet Draft Yes, canonical representationYes, advanced transport representationNoNo
Smile Tatu Saloranta JSON No Smile Format Specification YesNoYesPartial
(JSON Schema Proposal, other JSON schemas/IDLs)
Partial
(via JSON APIs implemented with Smile backend, on Jackson, Python)
SOAP W3C XML Yes W3C Recommendations:
SOAP/1.1
SOAP/1.2
Partial
( Efficient XML Interchange , Binary XML , Fast Infoset , MTOM, XSD base64 data)
YesBuilt-in id/ref, XPointer, XPath WSDL, XML schema DOM, SAX, XQuery, XPath
Structured Data eXchange Formats Max Wildgrube Yes RFC 3072 YesNoNoNo
UBJSON The Buzz Media, LLC JSON, BSON No ubjson.org YesNoNoNoNo
eXternal Data Representation (XDR) Sun Microsystems (creator)
IETF (maintainer)
Yes STD 67/RFC 4506 YesNoYesYesYes
XML W3C SGML Yes W3C Recommendations:
1.0 (Fifth Edition)
1.1 (Second Edition)
Partial
( Efficient XML Interchange , Binary XML , Fast Infoset , XSD base64 data)
Yes XPointer, XPath XML schema, RELAX NG DOM, SAX, XQuery, XPath
XML-RPC Dave Winer [6] XML No XML-RPC Specification NoYesNoNoNoNo
YAML Clark Evans,
Ingy döt Net,
and Oren Ben-Kiki
C, Java, Perl, Python, Ruby, Email, HTML, MIME, URI, XML, SAX, SOAP, JSON [7] No Version 1.2 NoYesYesPartial
(Kwalify Archived 2021-08-12 at the Wayback Machine , Rx, built-in language type-defs)
NoNo
  a. The current default format is binary.
  b. The "classic" format is plain text, and an XML format is also supported.
  c. Theoretically possible due to abstraction, but no implementation is included.
  d. The primary format is binary, but text and JSON formats are available. [8] [9]
  e. Means that generic tools/libraries know how to encode, decode, and dereference a reference to another piece of data in the same document. A tool may require the IDL file, but no more. Excludes custom, non-standardized referencing techniques.
  f. ASN.1 has X.681 (Information Object System), X.682 (Constraints), and X.683 (Parameterization) that allow for the precise specification of open types where the types of values can be identified by integers, by OIDs, etc. OIDs are a standard format for globally unique identifiers, as well as a standard notation ("absolute reference") for referencing a component of a value. For example, PKIX uses such notation in RFC 5912. With such notation (constraints on parameterized types using information object sets), generic ASN.1 tools/libraries can automatically encode/decode/resolve references within a document.
  g. The primary format is binary, but a JSON encoder is available. [10]
  h. The primary format is binary, but a text format is available.
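
The note on the "Supports references?" column can be illustrated with Python's standard library. The following is a minimal sketch, assuming a toy document in which two keys share one list; the variable names are illustrative and not part of any format. Pickle, listed above as supporting references, restores the sharing, while a plain JSON round trip (with no JSON Pointer or similar mechanism applied) yields two independent copies.

```python
import json
import pickle

# Toy object graph (hypothetical data): two keys point at the same list.
shared = [1, 2, 3]
doc = {"first": shared, "second": shared}

# Pickle records that both values are the same object and restores the link.
via_pickle = pickle.loads(pickle.dumps(doc))

# Plain JSON writes the list out twice, so the link is lost on decoding.
via_json = json.loads(json.dumps(doc))

pickle_keeps_reference = via_pickle["first"] is via_pickle["second"]
json_keeps_reference = via_json["first"] is via_json["second"]
```

Both round trips return values equal to the original; only the identity relationship between the two lists differs.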

Syntax comparison of human-readable formats

Format | Null | Boolean true | Boolean false | Integer | Floating-point | String | Array | Associative array/Object
ASN.1
(XML Encoding Rules)
<foo /><foo>true</foo><foo>false</foo><foo>685230</foo><foo>6.8523015e+5</foo><foo>A to Z</foo>
<SeqOfUnrelatedDatatypes><isMarried>true</isMarried><hobby/><velocity>-42.1e7</velocity><bookname>A to Z</bookname><bookname>We said, "no".</bookname></SeqOfUnrelatedDatatypes>
An object (the key is a field name):
<person><isMarried>true</isMarried><hobby/><height>1.85</height><name>Bob Peterson</name></person>

A data mapping (the key is a data value):

<competition><measurement><name>John</name><height>3.14</height></measurement><measurement><name>Jane</name><height>2.718</height></measurement></competition>

a

CSV b null a
(or an empty element in the row) a
1 a
true a
0 a
false a
685230
-685230 a
6.8523015e+5 a A to Z
"We said, ""no""."
true,,-42.1e7,"A to Z"
42,1 A to Z,1,2,3
edn | nil | true | false | 685230, -685230 | 6.8523015e+5 | "A to Z", "A \"up to\" Z" | [true nil -42.1e7 "A to Z"] | {:kw 1, "42" true, "A to Z" [1 2 3]}
Ion

null
null.null
null.bool
null.int
null.float
null.decimal
null.timestamp
null.string
null.symbol
null.blob
null.clob
null.struct
null.list
null.sexp

true
false
685230
-685230
0xA74AE
0b10100111010010101110
6.8523015e5
"A to Z"

'''
A
to
Z
'''
[true,null,-42.1e7,"A to Z"]
{'42':true,'A to Z':[1,2,3]}
Netstrings c 0:, a
4:null, a
1:1, a
4:true, a
1:0, a
5:false, a
6:685230, a 9:6.8523e+5, a 6:A to Z,29:4:true,0:,7:-42.1e7,6:A to Z,,41:9:2:42,1:1,,25:6:A to Z,12:1:1,1:2,1:3,,,, a
JSON | null | true | false | 685230, -685230 | 6.8523015e+5 | "A to Z" | [true,null,-42.1e7,"A to Z"] | {"42":true,"A to Z":[1,2,3]}
OGDL [ verification needed ]null a true a false a 685230 a 6.8523015e+5 a "A to Z"
'A to Z'
NoSpaces
true null -42.1e7 "A to Z"

(true, null, -42.1e7, "A to Z")

42   true "A to Z"   1   2   3
42   true "A to Z", (1, 2, 3)
OpenDDL | ref {null} | bool {true} | bool {false} | int32 {685230}, int32 {0xA74AE}, int32 {0b10100111010010101110} | float {6.8523015e+5} | string {"A to Z"} | Homogeneous array: int32 {1, 2, 3, 4, 5}; heterogeneous array: array { bool {true} ref {null} float {-42.1e7} string {"A to Z"} } | dict { value (key = "42") {bool {true}} value (key = "A to Z") {int32 {1, 2, 3}} }
PHP serialization format | N; | b:1; | b:0; | i:685230; i:-685230; | d:685230.15; [d] d:INF; d:-INF; d:NAN; | s:6:"A to Z"; | a:4:{i:0;b:1;i:1;N;i:2;d:-421000000;i:3;s:6:"A to Z";} | Associative array: a:2:{i:42;b:1;s:6:"A to Z";a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}} Object: O:8:"stdClass":2:{s:4:"John";d:3.14;s:4:"Jane";d:2.718;} [d]
Pickle (Python) | N. | I01\n. | I00\n. | I685230\n. | F685230.15\n. | S'A to Z'\n. | (lI01\na(laF-421000000.0\naS'A to Z'\na. | (dI42\nI01\nsS'A to Z'\n(lI1\naI2\naI3\nas.
Property list (plain text format) [11] | | <*BY> | <*BN> | <*I685230> | <*R6.8523015e+5> | "A to Z" | ( <*BY>, <*R-42.1e7>, "A to Z" ) | { "42" = <*BY>; "A to Z" = ( <*I1>, <*I2>, <*I3> ); }
Property list (XML format) [12] | | <true /> | <false /> | <integer>685230</integer> | <real>6.8523015e+5</real> | <string>A to Z</string> | <array><true/><real>-42.1e7</real><string>A to Z</string></array> | <dict><key>42</key><true/><key>A to Z</key><array><integer>1</integer><integer>2</integer><integer>3</integer></array></dict>
Protocol Buffers | | true | false | 685230, -685230 | 20.0855369 | "A to Z", "sdfff2 \000\001\002\377\376\375", "q\tqq<>q2&\001\377" | field1: "value1" field1: "value2" field1: "value3" or anotherfield { foo: 123 bar: 456 } anotherfield { foo: 222 bar: 333 } | thing1: "blahblah" thing2: 18923743 thing3: -44 thing4 { submessage_field1: "foo" submessage_field2: false } enumeratedThing: SomeEnumeratedValue thing5: 123.456 [extensionFieldFoo]: "etc" [extensionFieldThatIsAnEnum]: EnumValue
S-expressions NIL
nil
T
#t f
true
NIL
#f f
false
685230
6.8523015e+5
abc
"abc"
#616263#
3:abc
{MzphYmM=}
|YWJj|
(T NIL -42.1e7 "A to Z")((42 T) ("A to Z" (1 2 3)))
YAML ~
null
Null
NULL [13]
y
Y
yes
Yes
YES
on
On
ON
true
True
TRUE [14]
n
N
no
No
NO
off
Off
OFF
false
False
FALSE [14]
685230
+685_230
-685230
02472256
0x_0A_74_AE
0b1010_0111_0100_1010_1110
190:20:30 [15]
6.8523015e+5
685.230_15e+03
685_230.15
190:20:30.15
.inf
-.inf
.Inf
.INF
.NaN
.nan
.NAN [16]
A to Z
"A to Z"
'A to Z'
[y, ~, -42.1e7, "A to Z"]
- y
-
- -42.1e7
- A to Z
{"John":3.14, "Jane":2.718}
42: y
A to Z: [1, 2, 3]
XML [e] and SOAP | <null /> [a] | true | false | 685230 | 6.8523015e+5 | A to Z | <item>true</item><item xsi:nil="true"/><item>-42.1e7</item><item>A to Z</item> | <map><entry key="42">true</entry><entry key="A to Z"><item val="1"/><item val="2"/><item val="3"/></entry></map>
XML-RPC | | <value><boolean>1</boolean></value> | <value><boolean>0</boolean></value> | <value><int>685230</int></value> | <value><double>6.8523015e+5</double></value> | <value><string>A to Z</string></value> | <value><array><data><value><boolean>1</boolean></value><value><double>-42.1e7</double></value><value><string>A to Z</string></value></data></array></value> | <value><struct><member><name>42</name><value><boolean>1</boolean></value></member><member><name>A to Z</name><value><array><data><value><int>1</int></value><value><int>2</int></value><value><int>3</int></value></data></array></value></member></struct></value>
  a. Omitted XML elements are commonly decoded by XML data binding tools as NULLs. Shown here is another possible encoding; XML schema does not define an encoding for this datatype.
  b. The RFC CSV specification only deals with delimiters, newlines, and quote characters; it does not directly deal with serializing programming data structures.
  c. The netstrings specification only deals with nested byte strings; anything else is outside the scope of the specification.
  d. PHP will unserialize any floating-point number correctly, but will serialize them to their full decimal expansion. For example, 3.14 will be serialized to 3.140000000000000124344978758017532527446746826171875.
  e. XML data bindings and SOAP serialization tools provide type-safe XML serialization of programming data structures into XML. Shown are XML values that can be placed in XML elements and attributes.
  f. This syntax is not compatible with the Internet-Draft, but is used by some dialects of Lisp.
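
The Netstrings entries in the table above can be reproduced with a few lines of code. This is a minimal sketch: the framing rule (decimal length, colon, payload, comma) comes from the format itself, while the function names and the choice to represent a list as a netstring of netstrings are assumptions that mirror the table's example, since the specification covers only nested byte strings.

```python
def netstring(payload: bytes) -> bytes:
    """Frame a byte string as <decimal length>:<payload>,"""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def encode_list(items: list) -> bytes:
    """Nest items by concatenating their netstrings and framing the result."""
    return netstring(b"".join(netstring(item) for item in items))


def parse_netstring(data: bytes) -> tuple:
    """Return (payload, remaining bytes) for the first netstring in data."""
    length, _, rest = data.partition(b":")
    size = int(length)
    if rest[size:size + 1] != b",":
        raise ValueError("missing trailing comma")
    return rest[:size], rest[size + 1:]


# The array example from the table: true, an empty string, -42.1e7, "A to Z".
array_row = encode_list([b"true", b"", b"-42.1e7", b"A to Z"])
```

Because the length prefix counts only the payload octets, a reader can skip or slice a value without scanning it for delimiters.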

Comparison of binary formats

Format | Null | Booleans | Integer | Floating-point | String | Array | Associative array/object
ASN.1
(BER, PER or OER encoding)
NULL typeBOOLEAN:
  • BER: as 1 byte in binary form;
  • PER: as 1 bit;
  • OER: as 1 byte
INTEGER:
  • BER: variable-length big-endian binary representation (up to 2^(2^1024) bits);
  • PER Unaligned: a fixed number of bits if the integer type has a finite range; a variable number of bits otherwise;
  • PER Aligned: a fixed number of bits if the integer type has a finite range and the size of the range is less than 65536; a variable number of octets otherwise;
  • OER: 1, 2, or 4 octets (either signed or unsigned) if the integer type has a finite range that fits in that number of octets; a variable number of octets otherwise
REAL:
  • base-10 real values are represented as character strings in ISO 6093 format;
  • binary real values are represented in a binary format that includes the mantissa, the base (2, 8, or 16), and the exponent;
  • the special values NaN, -INF, +INF, and negative zero are also supported
Multiple valid types (VisibleString, PrintableString, GeneralString, UniversalString, UTF8String)
Data specifications SET OF (unordered) and SEQUENCE OF (guaranteed order)
User definable type
BSON \x0A
(1 byte)
True: \x08\x01
False: \x08\x00
(2 bytes)
int32: 32-bit little-endian 2's complement or int64: 64-bit little-endian 2's complement
Double: little-endian binary64
UTF-8-encoded, preceded by int32-encoded string length in bytes
BSON embedded document with numeric keys
BSON embedded document
Concise Binary Object Representation (CBOR)\xf6
(1 byte)
  • True: \xf5
  • False: \xf4

(1 byte)

  • Small positive/negative \x00–\x17 & \x20–\x37 (1 byte)
  • 8-bit: positive \x18, negative \x38 (+ 1 byte)
  • 16-bit: positive \x19, negative \x39 (+ 2 bytes)
  • 32-bit: positive \x1A, negative \x3A (+ 4 bytes)
  • 64-bit: positive \x1B, negative \x3B (+ 8 bytes)
  • Negative x encoded as (−x − 1)
  • IEEE half/single/double \xf9–\xfb (+ 2–8 bytes)
  • Decimals and bigfloats (4+ bytes) encoded as \xc4 tag + 2-item array of integer mantissa & exponent
  • Length and content (1–9 bytes overhead)
  • Bytestring \x40–\x5f
  • UTF-8 \x60–\x7f
  • Indefinite partial strings \x5f and \x7f stitched together until \xff.
  • Length and items \x80–\x9e
  • Indefinite list \x9f terminated by \xff entry.
  • Length (in pairs) and items \xa0–\xbe
  • Indefinite map \xbf terminated by \xff key.
Efficient XML Interchange (EXI) [lower-alpha 1]

(Unpreserved lexical values format)

xsi:nil is not allowed in binary context.
1–2 bit integer interpreted as boolean.
Boolean sign, plus arbitrary length 7-bit octets, parsed until most-significant bit is 0, in little-endian. The schema can set the zero-point to any arbitrary number.

Unsigned skips the boolean flag.

  • Float: integer mantissa and integer exponent.
  • Decimal: boolean sign, integer whole value, integer fractional.
Length prefixed integer-encoded Unicode. Integers may represent enumerations or string table entries instead.
Length prefixed set of items.
Not in protocol.
FlatBuffers Encoded as absence of field in parent object
  • True: \x01
  • False: \x00

(1 byte)

Little-endian 2's complement signed and unsigned 8/16/32/64 bits
UTF-8-encoded, preceded by 32-bit integer length of string in bytes
Vectors of any other type, preceded by 32-bit integer length of number of elements
Tables (schema defined types) or Vectors sorted by key (maps / dictionaries)
Ion [18] \x0f [lower-alpha 2]
  • True: \x11
  • False: \x10
  • Positive \x2x, negative \x3x
  • Zero is always encoded in tag byte.
  • BigInts over 13 bytes (104 bits) have 1+ byte overhead for length
  • \x44 (32-bit float)
  • \x48 (64-bit float)
  • Zero is always encoded in tag byte.
  • UTF-8: \x8x
  • Other strings: \x9x
  • Arbitrary length and overhead
\xbx Arbitrary length and overhead. Length in octets.
  • Structs (numbered fields): \xdx
  • Annotations (named fields): \xex
MessagePack \xc0
  • True: \xc3
  • False: \xc2
  • Single byte "fixnum" (values −32 – 127)
  • or typecode (1 byte) + big-endian (u)int8/16/32/64
Typecode (1 byte) + IEEE single/double
  • Typecode + up to 31 bytes
  • or typecode + length as uint8/16/32 + bytes;

encoding is unspecified [19]

  • As "fixarray" (single-byte prefix + up to 15 array items)
  • or typecode (1 byte) + 2–4 bytes length + array items
  • As "fixmap" (single-byte prefix + up to 15 key-value pairs)
  • or typecode (1 byte) + 2–4 bytes length + key-value pairs
Netstrings [lower-alpha 3] Not in protocol.Not in protocol.Not in protocol.Length-encoded as an ASCII string + ':' + data + ','

Length counts only octets between ':' and ','

Not in protocol.Not in protocol.Not in protocol.
OGDL Binary
Property list
(binary format)
Protocol Buffers
  • Variable encoding length signed 32-bit: varint encoding of "ZigZag"-encoded value (n << 1) XOR (n >> 31)
  • Variable encoding length signed 64-bit: varint encoding of "ZigZag"-encoded (n << 1) XOR (n >> 63)
  • Constant encoding length 32-bit: 32 bits in little-endian 2's complement
  • Constant encoding length 64-bit: 64 bits in little-endian 2's complement
UTF-8-encoded, preceded by varint-encoded integer length of string in bytes
Repeated value with the same tag or, for varint-encoded integers only, values packed contiguously and prefixed by tag and total byte length
Smile \x21
  • True: \x23
  • False: \x22
  • Single byte "small" (values −16 – 15 encoded as \xc0–\xdf),
  • zigzag-encoded varints (1–11 data bytes), or BigInteger
IEEE single/double, BigDecimal
Length-prefixed "short" Strings (up to 64 bytes), marker-terminated "long" Strings and (optional) back-references
Arbitrary-length heterogeneous arrays with end-marker
Arbitrary-length key/value pairs with end-marker
Structured Data eXchange Formats (SDXF)
Big-endian signed 24-bit or 32-bit integer
Big-endian IEEE double
Either UTF-8 or ISO 8859-1 encoded
List of elements with identical ID and size, preceded by array header with int16 length
Chunks can contain other chunks to arbitrary depth.
Thrift
  1. Any XML-based representation can be compressed with, or generated directly as, EXI ("Efficient XML Interchange (EXI) Format 1.0 (Second Edition)" [17]), which is a "schema-informed" (as opposed to schema-required or schema-less) binary compression standard for XML.
  2. All basic Ion types have a null variant, as its 0xXf tag. Any tag beginning with 0x0X other than 0x0f defines ignored padding.
  3. Interpretation of Netstrings is entirely application- or schema-dependent.
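
The Protocol Buffers integer entries above combine two steps: a "ZigZag" mapping from signed to unsigned values, then a variable-length (varint) encoding. The following is a minimal sketch of both steps for the 32-bit case; the function names are illustrative and are not taken from any Protocol Buffers library, and the mask is needed only because Python integers have unlimited width.

```python
def zigzag32(n: int) -> int:
    """Map a signed 32-bit integer to unsigned: (n << 1) XOR (n >> 31)."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def varint(value: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, least significant first."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_sint32(n: int) -> bytes:
    """ZigZag followed by varint, as described for signed 32-bit values."""
    return varint(zigzag32(n))
```

The ZigZag step keeps small negative numbers short: without it, a negative value would have its high bits set and always need the maximum number of varint bytes.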

See also

Related Research Articles

In computing, serialization is the process of translating a data structure or object state into a format that can be stored or transmitted and reconstructed later. When the resulting series of bits is reread according to the serialization format, it can be used to create a semantically identical clone of the original object. For many complex objects, such as those that make extensive use of references, this process is not straightforward. Serialization of object-oriented objects does not include any of their associated methods with which they were previously linked.

XML: Markup language by the W3C for encoding of data

Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications—all of them free open standards—define XML.

Abstract Syntax Notation One (ASN.1) is a standard interface description language (IDL) for defining data structures that can be serialized and deserialized in a cross-platform way. It is broadly used in telecommunications and computer networking, and especially in cryptography.

YAML (see § History and name) is a human-readable data serialization language. It is commonly used for configuration files and in applications where data is being stored or transmitted. YAML targets many of the same communications applications as Extensible Markup Language (XML) but has a minimal syntax that intentionally differs from Standard Generalized Markup Language (SGML). It uses Python-style indentation to indicate nesting and does not require quotes around most string values.

Configuration file: Software file used to configure the initial settings for a computer program

In computing, configuration files are files used to configure the parameters and initial settings for some computer programs. They are used for user applications, server processes and operating system settings.

Bencode is the encoding used by the peer-to-peer file sharing system BitTorrent for storing and transmitting loosely structured data.

Various binary formats have been proposed as compact representations for XML. Using a binary XML format generally reduces the verbosity of XML documents thereby also reducing the cost of parsing, but hinders the use of ordinary text editors and third-party tools to view and edit the document. There are several competing formats, but none has yet emerged as a de facto standard, although the World Wide Web Consortium adopted EXI as a Recommendation on 10 March 2011.

JSON: Open standard file format and data interchange

JSON is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays. It is a commonly used data format with diverse uses in electronic data interchange, including that of web applications with servers.
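
The JSON array and object examples from the syntax comparison can be produced with Python's standard `json` module. This is a small sketch rather than a reference encoder; the variable names are illustrative, and the compact `separators` argument is used only to match the spacing of the table's examples.

```python
import json

# The array and object values used throughout the syntax comparison.
array_value = [True, None, -42.1e7, "A to Z"]
object_value = {"42": True, "A to Z": [1, 2, 3]}

# separators=(",", ":") drops the spaces json.dumps inserts by default.
array_text = json.dumps(array_value, separators=(",", ":"))
object_text = json.dumps(object_value, separators=(",", ":"))

# Decoding the text yields values equal to the originals.
array_back = json.loads(array_text)
object_back = json.loads(object_text)
```

Python writes the float as `-421000000.0` rather than `-42.1e7`; both spellings are valid JSON numbers and decode to the same value.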

In the macOS, iOS, NeXTSTEP, and GNUstep programming frameworks, property list files are files that store serialized objects. Property list files use the filename extension .plist, and thus are often referred to as p-list files.

XML Information Set is a W3C specification describing an abstract data model of an XML document in terms of a set of information items. The definitions in the XML Information Set specification are meant to be used in other specifications that need to refer to the information in a well-formed XML document.

Data exchange is the process of taking data structured under a source schema and transforming it into a target schema, so that the target data is an accurate representation of the source data. Data exchange allows data to be shared between different computer programs.

Apache Avro: Open-source remote procedure call framework

Avro is a row-oriented remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format. Its primary use is in Apache Hadoop, where it can provide both a serialization format for persistent data, and a wire format for communication between Hadoop nodes, and from client programs to the Hadoop services. Avro uses a schema to structure the data that is being encoded. It has two different types of schema languages; one for human editing and another which is more machine-readable based on JSON.

Data Format Description Language, published as an Open Grid Forum Recommendation in February 2021, is a modeling language for describing general text and binary data in a standard way. A DFDL model or schema allows any text or binary data to be read from its native format and to be presented as an instance of an information set. The same DFDL schema also allows data to be taken from an instance of an information set and written out to its native format.

MessagePack is a computer data interchange format. It is a binary form for representing simple data structures like arrays and associative arrays. MessagePack aims to be as compact and simple as possible. The official implementation is available in a variety of languages such as C, C++, C#, D, Erlang, Go, Haskell, Java, JavaScript (NodeJS), Lua, OCaml, Perl, PHP, Python, Ruby, Rust, Scala, Smalltalk, and Swift.
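
To make the compact binary form concrete, here is a minimal sketch of an encoder for a small subset of MessagePack: nil, booleans, single-byte integers, and short arrays and maps. The type bytes `\xc0`, `\xc2`, and `\xc3` appear in the binary comparison above; the prefixes `0x90` (fixarray) and `0x80` (fixmap) are taken from the MessagePack format specification. The function name `pack` is illustrative and is not the API of any MessagePack library.

```python
import struct


def pack(value) -> bytes:
    """Encode None, bool, small ints, and short lists/dicts (sketch only)."""
    if value is None:
        return b"\xc0"
    if value is True:
        return b"\xc3"
    if value is False:
        return b"\xc2"
    if isinstance(value, int) and -32 <= value <= 127:
        # Single-byte "fixnum": the value's own two's-complement byte.
        return struct.pack("b", value)
    if isinstance(value, list) and len(value) <= 15:
        # "fixarray": prefix 0x90 plus the item count, then the items.
        return bytes([0x90 | len(value)]) + b"".join(pack(v) for v in value)
    if isinstance(value, dict) and len(value) <= 15:
        # "fixmap": prefix 0x80 plus the pair count, then key, value, ...
        body = b"".join(pack(k) + pack(v) for k, v in value.items())
        return bytes([0x80 | len(value)]) + body
    raise TypeError(f"not covered by this sketch: {value!r}")


# A three-item array costs four bytes: one prefix byte plus one byte per item.
packed = pack([True, None, 1])
```

Strings, floats, and larger integers or containers use additional type codes and are deliberately left out of this sketch.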

Universal Binary JSON (UBJSON) is a computer data interchange format. It is a binary form directly imitating JSON, but requiring fewer bytes of data. It aims to achieve the generality of JSON, combined with being much easier to process than JSON.

Smile is a computer data interchange format based on JSON. It can also be considered a binary serialization of the generic JSON data model, which means tools that operate on JSON may be used with Smile as well, as long as a proper encoder/decoder exists for the tool. The name comes from the first 2 bytes of the 4 byte header, which consist of Smiley ":)" followed by a linefeed: a choice made to make it easier to recognize Smile-encoded data files using textual command-line tools.
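
The header described above lends itself to a simple signature check. This is a minimal sketch that tests only what the paragraph states (a 4-byte header beginning with ":)" and a linefeed); the names are illustrative, and the fourth header byte is not interpreted here.

```python
SMILE_MAGIC = b":)\n"  # the smiley and the linefeed described above
HEADER_LENGTH = 4      # the fourth byte is not interpreted by this sketch


def looks_like_smile(data: bytes) -> bool:
    """Return True if data starts with a complete Smile-style header."""
    return len(data) >= HEADER_LENGTH and data.startswith(SMILE_MAGIC)
```

A check like this only suggests that the data may be Smile-encoded; it does not validate the content that follows the header.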

JData is a lightweight open standard for data annotation and exchange, designed to represent general-purpose and scientific data structures using human-readable (text-based) JSON and (binary) UBJSON formats. The JData specification aims to simplify the exchange of hierarchical and complex data between programming languages such as MATLAB, Python, and JavaScript. It defines a comprehensive list of JSON-compatible "name":value constructs to store a wide range of data structures, including scalars, N-dimensional arrays, sparse/complex-valued arrays, maps, tables, hashes, linked lists, trees and graphs, and supports optional data grouping and metadata for each data element. The generated data files are compatible with JSON/UBJSON specifications and can be readily processed by most existing parsers. JData-defined annotation keywords also permit storage of strongly-typed binary data streams in JSON, data compression, linking and referencing.

References

  1. Apache Thrift
  2. Bormann, Carsten (2018-12-26). "CBOR relationship with msgpack". GitHub. Retrieved 2023-08-14.
  3. "Implementations". GitHub.
  4. "HAPI FHIR - The Open Source FHIR API for Java". hapifhir.io.
  5. cpython/Lib/pickle.py
  6. "A Brief History of SOAP". www.xml.com.
  7. Ben-Kiki, Oren; Evans, Clark; Net, Ingy döt (2009-10-01). "YAML Ain't Markup Language (YAML) Version 1.2". The Official YAML Web Site. Retrieved 2012-02-10.
  8. "text_format.h - Protocol Buffers". Google Developers.
  9. "JSON Mapping - Protocol Buffers". Google Developers.
  10. "Avro Json Format".
  11. "NSPropertyListSerialization class documentation". www.gnustep.org. Archived from the original on 2011-05-19. Retrieved 2009-10-28.
  12. "Documentation Archive". developer.apple.com.
  13. Oren Ben-Kiki; Clark Evans; Brian Ingerson (2005-01-18). "Null Language-Independent Type for YAML Version 1.1". YAML.org. Retrieved 2009-09-12.
  14. 1 2 Oren Ben-Kiki; Clark Evans; Brian Ingerson (2005-01-18). "Boolean Language-Independent Type for YAML Version 1.1". YAML.org. Clark C. Evans. Retrieved 2009-09-12.
  15. Oren Ben-Kiki; Clark Evans; Brian Ingerson (2005-02-11). "Integer Language-Independent Type for YAML Version 1.1". YAML.org. Clark C. Evans. Retrieved 2009-09-12.
  16. Oren Ben-Kiki; Clark Evans; Brian Ingerson (2005-01-18). "Floating-Point Language-Independent Type for YAML Version 1.1". YAML.org. Clark C. Evans. Retrieved 2009-09-12.
  17. "Efficient Extensible Interchange".
  18. Ion Binary Encoding
  19. "MessagePack is an extremely efficient object serialization library. It's like JSON, but very fast and small.: msgpack/msgpack". 2 April 2019 via GitHub.