Encoding Control Notation

The Encoding Control Notation (ECN) is a standardized formal language that is part of the Abstract Syntax Notation One (ASN.1) family of international standards. [1] ECN is designed to be used along with ASN.1, and each ECN specification (a coherent set of encoding definitions) is explicitly related to a particular ASN.1 specification (a coherent set of type definitions).

The ECN standard is published by both the ITU-T and the ISO, and is officially named ITU-T Recommendation X.692 | ISO/IEC 8825-3, Information technology – ASN.1 encoding rules: Specification of Encoding Control Notation (ECN). [2]

ECN supports the formal specification of non-standard encoding rules for ASN.1 type definitions, and is intended to be used whenever it is necessary to use encodings that differ from those provided by standardized encoding rules such as BER or PER.

Uses of ECN

An ASN.1 type has a set of abstract values. Encoding rules specify the representation of these abstract values as a series of bits. There are applications in which special encodings, different from those obtainable through any of the standard sets of ASN.1 encoding rules, are required.
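As a rough illustration of how one abstract value can have several bit-level representations, the following Python sketch encodes the value 42 in two ways: a BER-style INTEGER (tag, length, content) and a fixed 16-bit little-endian field. The second encoding and the function names are hypothetical, chosen only to stand in for a "special encoding" that the standard rules do not produce; the BER helper handles only non-negative values with short-form lengths.

```python
def encode_ber_integer(value: int) -> bytes:
    """BER-style INTEGER: tag 0x02, one length octet, two's-complement content.

    Simplified sketch: non-negative values and short-form lengths only.
    """
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    # One extra bit is reserved for the sign, so 200 needs two content octets.
    length = (value.bit_length() + 8) // 8
    content = value.to_bytes(length, "big", signed=True)
    return bytes([0x02, len(content)]) + content


def encode_legacy_u16_le(value: int) -> bytes:
    """Hypothetical non-standard encoding: fixed 16-bit little-endian field."""
    return value.to_bytes(2, "little")


age = 42
ber = encode_ber_integer(age)       # three octets: 02 01 2A
legacy = encode_legacy_u16_le(age)  # two octets: 2A 00
print(ber.hex(), legacy.hex())
```

Both byte strings carry the same abstract value; only the encoding rule differs, which is the aspect that ECN lets a specification control.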

Here are some examples of possible situations that require some smaller or larger deviations from the standard encodings:

In the above cases and in many other similar cases, the combined use of ASN.1 and ECN makes it possible to create a full, formal specification of both abstract syntax (schema) and encodings. Encoders and decoders can then be automatically generated from the combined specifications. This is a significant factor in reducing both the amount of work and the possibility of errors in making interoperable systems. Another significant advantage of ECN is the ability to provide automatic tool support for testing. These advantages are available with ASN.1 alone when standardized encoding rules suffice, but ECN provides these advantages in circumstances where the standardized encoding rules are not sufficient.

Overview of ECN

Concepts

To understand how ECN works, it is useful to focus on four kinds of elements of the ASN.1 language: built-in types (e.g., INTEGER and UTF8String), built-in constructor keywords (e.g., SEQUENCE, CHOICE, SEQUENCE OF, OPTIONAL), user-defined simple types (e.g., Age ::= INTEGER (0..200), Color ::= ENUMERATED { green, yellow, red }), and user-defined complex types (e.g., Name ::= SEQUENCE { first UTF8String, middle UTF8String, last UTF8String }). There are other aspects of ASN.1 that are also reflected in ECN, but we will not discuss them here.

The ECN language also has built-in types, built-in constructor keywords, user-defined simple types, and user-defined complex types. These elements of the ECN language are similar to those of ASN.1, but their names always begin with a #. Officially they are called encoding classes but here we will simply call them ECN types and ECN constructor keywords. Examples of ECN types are: #INTEGER (built-in), #UTF8String (built-in), #Age (simple user-defined), #Name (complex user-defined). Examples of ECN constructor keywords are: #SEQUENCE, #CHOICE, #SEQUENCE-OF, and #OPTIONAL (all built-in).

Unlike ASN.1, ECN allows the user of the language to define synonyms of ECN constructor keywords (e.g., #InterleavedSequence ::= #SEQUENCE). Therefore, in ECN there are user-defined ECN constructor keywords as well as built-in ECN constructor keywords.

From the ECN viewpoint, every user-defined ASN.1 type occurring in an ASN.1 specification has a hidden ECN type implicitly associated with it. Officially this hidden ECN type is called an implicitly generated encoding structure but here we will simply call it the hidden ECN type of the ASN.1 type. Hidden ECN types are a special kind of user-defined ECN types (their ECN definition is automatically generated from a user-defined ASN.1 type rather than being provided by the user), but they behave like other user-defined ECN types.

The hidden ECN type of an ASN.1 type is almost identical to the original ASN.1 type (but slightly simplified) and is the starting point for an encoding process, specified in ECN, which ultimately generates the series of bits representing any given value of the original ASN.1 type. An ASN.1 type (or any of its parts) is not directly referenceable for the purpose of specifying an encoding in ECN, but its hidden ECN type is. ECN types and ECN constructor keywords can be explicitly referenced within an ECN specification and are encoded by applying the rules contained in the ECN specification.
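The relationship between an ASN.1 type and its hidden ECN type can be pictured with a small Python sketch. It only models the naming convention described above (every type and constructor name gains a leading #, and SEQUENCE OF becomes #SEQUENCE-OF); the helper names are hypothetical, and the simplifications that the ECN standard applies when generating the real structure are not modeled.

```python
def hidden_ecn_name(asn1_name: str) -> str:
    """Map an ASN.1 type or constructor name to its ECN counterpart."""
    return "#" + asn1_name.replace(" ", "-")  # "SEQUENCE OF" -> "#SEQUENCE-OF"


def hidden_ecn_type(name, constructor, components):
    """Render a hidden ECN type as text.

    components is a list of (identifier, ASN.1 type name) pairs.
    """
    fields = ", ".join(
        f"{ident} {hidden_ecn_name(type_name)}" for ident, type_name in components
    )
    return f"{hidden_ecn_name(name)} ::= {hidden_ecn_name(constructor)} {{ {fields} }}"


name_type = hidden_ecn_type(
    "Name",
    "SEQUENCE",
    [("first", "UTF8String"), ("middle", "UTF8String"), ("last", "UTF8String")],
)
print(name_type)
```

An ECN specification would then refer to #Name and its components, never to the ASN.1 type Name itself.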

Roughly speaking, an ECN specification does two things: it says how to modify a hidden ECN type to produce a new (colored; see below) hidden ECN type, and it says how an ECN type (as well as each of its components if it's a complex type) is to be encoded. The latter can be applied recursively, in the sense that an encoding step for a component of an ECN type may result in a further in-place modification of the remaining part of the ECN type that is being encoded. This process can go on through any number of cycles, until the final ECN type has been completely encoded, that is, all the bits representing the value of the original ASN.1 type have been generated.

Lastly, we introduce the concept of an encoding object. This is a very important element of the ECN language: an encoding object is an individual encoding rule that is part of an ECN specification and is applied to an ECN type or ECN constructor keyword, either built-in or user-defined, occurring in the specification.

Mechanisms

The first step of the encoding process is the automatic generation of hidden ECN types from all ASN.1 types present in the ASN.1 specification. The hidden ECN types corresponding to complex user-defined ASN.1 types can be modified by a mechanism called coloring, which consists of replacing the names of the types of some of their components with synonyms. It is also possible to replace the ECN built-in constructor keywords (e.g., #SEQUENCE, #OPTIONAL) occurring in a hidden ECN type with synonyms. ECN has a few built-in synonyms for both constructor keywords and built-in types (e.g., #CONCATENATION is a synonym of #SEQUENCE, #INT is a synonym of #INTEGER), and a user of the language can also define user-defined types and user-defined constructor keywords as synonyms of others.

The purpose of the coloring step is to prepare a hidden ECN type for the next step, the encoding of its components, when different occurrences of the same ECN type or of the same ECN constructor keyword within the hidden ECN type must be encoded in different ways. For example, a complex hidden ECN type might contain two lists (#SEQUENCE-OF), where one list is to be encoded by inserting a count field before the first item of the list, and the other by inserting a terminating pattern after the last item of the list. This can be done by replacing the first #SEQUENCE-OF keyword in the hidden ECN type with, say, #CountBasedRepetition, by replacing the second #SEQUENCE-OF keyword with, say, #TerminatingPatternBasedRepetition, and by declaring these two names as user-defined synonyms of the ECN constructor keyword #SEQUENCE-OF. Once these two different constructor keywords have been included in the hidden ECN type, each of the two lists can be encoded with a different encoding object.
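The coloring step for the two-list example can be sketched in Python as a renaming pass over a simplified model of a hidden ECN type. The component identifiers (samples, labels) and the tuple-based data model are assumptions made for illustration; only the two synonym names come from the example above.

```python
# User-defined synonyms and the built-in constructor keyword each one stands for.
SYNONYMS = {
    "#CountBasedRepetition": "#SEQUENCE-OF",
    "#TerminatingPatternBasedRepetition": "#SEQUENCE-OF",
}

# Simplified hidden ECN type: (identifier, constructor keyword, element type).
# The identifiers are hypothetical.
hidden = [
    ("samples", "#SEQUENCE-OF", "#INTEGER"),
    ("labels", "#SEQUENCE-OF", "#UTF8String"),
]


def color(components, renames):
    """Return a colored copy in which selected keywords are replaced by synonyms."""
    colored = []
    for ident, keyword, element in components:
        new_keyword = renames.get(ident, keyword)
        # A replacement is only legal if it was declared a synonym of the original.
        if new_keyword != keyword and SYNONYMS.get(new_keyword) != keyword:
            raise ValueError(f"{new_keyword} is not a synonym of {keyword}")
        colored.append((ident, new_keyword, element))
    return colored


colored = color(
    hidden,
    {
        "samples": "#CountBasedRepetition",
        "labels": "#TerminatingPatternBasedRepetition",
    },
)
for component in colored:
    print(component)
```

After coloring, the two lists carry distinct keywords, so a later step can select a different encoding object for each simply by looking at the keyword.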

The second step of the encoding process is the application of an encoding object to a hidden ECN type. The value to be encoded will be one of the possible values of an ASN.1 type defined in the ASN.1 specification, and the encoding process will select the hidden ECN type of that ASN.1 type and will apply the appropriate encoding object to it.

There may be further steps consisting in the recursive application of encoding objects that work by replacing an ECN type (or part of it) with another ECN type.

In ECN there are several kinds of encoding objects. Some encoding objects completely determine the actual bit-level encoding of simple ECN types and are the easiest to understand. Others apply to ECN constructor keywords rather than to ECN types, and determine some structural aspects of the encoding of the complex ECN type (or part of it) constructed by an ECN constructor keyword (but do not specify its entire encoding). Others work by replacing an ECN type (or a part of it) with another ECN type, which must then be encoded by applying a different encoding object to it.

The most important kinds of encoding objects in ECN are listed below:

Encoding objects that determine the bit-level encoding apply mostly to simple ECN types, and have several parameters specifying the bit-level encoding of a value, the size of the encoding, any preceding or trailing padding, any alignment to an octet or word boundary, any bit reversals, etc.
For encoding objects that replace an ECN type (or a part of it) with a user-defined ECN type, the replacement type must be specified in the ECN specification, not in the ASN.1 specification. The user-defined ECN type must have a name beginning with a #, which must not be the same as the name of any hidden ECN type.
Here are some typical ways in which encoding objects for the #OPTIONAL constructor keyword can represent the presence of the optional component:
  1. by utilizing a (typically boolean) field whose value indicates presence or absence of the optional component, and which was inserted in the ECN type by another encoding object applied at an earlier stage;
  2. by relying on a particular bit pattern that occurs at certain precise bit locations within the encodings of all the possible values of the optional component but never occurs within the encodings of any of the types which can come after the optional component according to the ECN specification;
  3. by relying on the size of the enclosing encoding to determine whether the encoding of the optional component will fit in the remaining space.
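The first of these approaches, a presence field, can be sketched in Python as follows. Bit strings are modeled as text made of "0" and "1" for readability. Placing the one-bit flag immediately before the component is an assumption of this sketch, as are the function names and the 8-bit component encoding; in ECN the field would be inserted into the ECN type by a separate encoding object and could be located elsewhere.

```python
def encode_u8(n: int) -> str:
    """Hypothetical component encoding: unsigned 8-bit field."""
    return format(n, "08b")


def decode_u8(bits: str):
    """Return the decoded value and the remaining bits."""
    return int(bits[:8], 2), bits[8:]


def encode_optional(value, encode_component) -> str:
    """Emit a one-bit presence flag, followed by the component if present."""
    if value is None:
        return "0"
    return "1" + encode_component(value)


def decode_optional(bits: str, decode_component):
    """Read the presence flag, then decode the component only if it is set."""
    if bits[0] == "0":
        return None, bits[1:]
    return decode_component(bits[1:])


present = encode_optional(5, encode_u8)    # flag bit "1" plus eight value bits
absent = encode_optional(None, encode_u8)  # flag bit "0" only
print(present, absent)
```

The decoder never has to guess: the flag tells it whether the next bits belong to the optional component or to whatever follows it.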
Here are some typical ways in which encoding objects for the #SEQUENCE-OF constructor keyword can represent the length of a list:
  1. by utilizing a field containing the length of the list, and which was inserted in the ECN type by another encoding object applied at an earlier stage;
  2. by relying on a particular bit pattern that occurs at certain precise bit locations within the encodings of all the possible values of the repeating component of the list but never occurs within the encodings of any of the types which can come after the list according to the ECN specification;
  3. by relying on the size of the enclosing encoding to determine how many instances of the encoding of the repeating component will fit in the remaining space;
  4. by choosing a bit string that does not match the encoding of any value of the repeating component of the list, and inserting that bit string after the last item of the list;
  5. by utilizing a (typically boolean) field within the repeating component, whose value indicates whether that item is the last item of the list.
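Two of these approaches, the count field (1) and the terminating bit string (4), can be contrasted in a short Python sketch. The field widths, the all-ones terminator, and the function names are assumptions chosen for illustration. Bit strings are again modeled as text made of "0" and "1".

```python
WIDTH = 4                 # bits per list item (assumption)
COUNT_BITS = 8            # bits in the count field (assumption)
TERMINATOR = "1" * WIDTH  # reserved pattern that no item may encode to


def encode_item(n: int) -> str:
    if not 0 <= n < 2**WIDTH - 1:
        raise ValueError("item out of range or collides with the terminator")
    return format(n, f"0{WIDTH}b")


def encode_count_based(items) -> str:
    """Count field first, then the items."""
    return format(len(items), f"0{COUNT_BITS}b") + "".join(map(encode_item, items))


def decode_count_based(bits: str):
    count = int(bits[:COUNT_BITS], 2)
    end = COUNT_BITS + count * WIDTH
    body = bits[COUNT_BITS:end]
    items = [int(body[i:i + WIDTH], 2) for i in range(0, len(body), WIDTH)]
    return items, bits[end:]


def encode_terminated(items) -> str:
    """Items first, then a pattern that cannot be mistaken for an item."""
    return "".join(map(encode_item, items)) + TERMINATOR


def decode_terminated(bits: str):
    items, pos = [], 0
    while bits[pos:pos + WIDTH] != TERMINATOR:
        items.append(int(bits[pos:pos + WIDTH], 2))
        pos += WIDTH
    return items, bits[pos + WIDTH:]


counted = encode_count_based([3, 9])
terminated = encode_terminated([3, 9])
print(counted, terminated)
```

The terminator approach costs one item value (here 15) that can no longer appear in the list, while the count approach bounds the list length by the width of the count field.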
Here are some typical ways in which encoding objects for the #CHOICE constructor keyword can indicate which of the alternatives is present:
  1. by utilizing a field containing the index of the alternative, and which was added to the ECN type by another encoding object applied at an earlier stage;
  2. by relying on a particular bit pattern that occurs at certain precise bit locations within the encodings of all the possible values of each alternative and is different for each alternative.
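The first approach, an index field, might look like the following Python sketch. The two alternatives (number, flag), the one-bit index width, and placing the index directly before the chosen alternative are all assumptions made for illustration. Bit strings are modeled as text made of "0" and "1".

```python
INDEX_BITS = 1  # enough to distinguish two alternatives (assumption)


def encode_u8(n: int) -> str:
    return format(n, "08b")


def decode_u8(bits: str):
    return int(bits[:8], 2), bits[8:]


def encode_bool(b: bool) -> str:
    return "1" if b else "0"


def decode_bool(bits: str):
    return bits[0] == "1", bits[1:]


# Hypothetical alternatives of a #CHOICE: (identifier, encoder, decoder).
ALTERNATIVES = [
    ("number", encode_u8, decode_u8),
    ("flag", encode_bool, decode_bool),
]


def encode_choice(name: str, value) -> str:
    """Emit the index of the chosen alternative, then its encoding."""
    for index, (alt, encode, _) in enumerate(ALTERNATIVES):
        if alt == name:
            return format(index, f"0{INDEX_BITS}b") + encode(value)
    raise KeyError(name)


def decode_choice(bits: str):
    """Read the index field and dispatch to the matching decoder."""
    index = int(bits[:INDEX_BITS], 2)
    alt, _, decode = ALTERNATIVES[index]
    value, rest = decode(bits[INDEX_BITS:])
    return alt, value, rest


as_number = encode_choice("number", 5)
as_flag = encode_choice("flag", True)
print(as_number, as_flag)
```

The second approach listed above would drop the index field and instead require the encodings of the alternatives themselves to be distinguishable at fixed bit positions.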

References

  1. "ITU-T Rec. X.680 / ISO/IEC 8824-1". Retrieved 2008-08-28.
  2. "ITU-T Rec. X.692 / ISO/IEC 8825-3". Retrieved 2008-08-28.