Encoding Control Notation

The Encoding Control Notation (ECN) is a standardized formal language that is part of the Abstract Syntax Notation One (ASN.1) family of international standards. [1] ECN is designed to be used along with ASN.1, and each ECN specification (a coherent set of encoding definitions) is explicitly related to a particular ASN.1 specification (a coherent set of type definitions).

The ECN standard is published by both the ITU-T and the ISO, and is officially named ITU-T Recommendation X.692 | ISO/IEC 8825-3, Information technology – ASN.1 encoding rules: Specification of Encoding Control Notation (ECN). [2]

ECN supports the formal specification of non-standard encoding rules for ASN.1 type definitions, and is intended to be used whenever it is necessary to use encodings that differ from those provided by standardized encoding rules such as BER or PER.

Uses of ECN

An ASN.1 type has a set of abstract values. Encoding rules specify the representation of these abstract values as a series of bits. There are applications in which special encodings, different from those obtainable through any of the standard sets of ASN.1 encoding rules, are required.

Here are some examples of possible situations that require some smaller or larger deviations from the standard encodings:

In the above cases and in many other similar cases, the combined use of ASN.1 and ECN makes it possible to create a full, formal specification of both abstract syntax (schema) and encodings. Encoders and decoders can then be automatically generated from the combined specifications. This is a significant factor in reducing both the amount of work and the possibility of errors in making interoperable systems. Another significant advantage of ECN is the ability to provide automatic tool support for testing. These advantages are available with ASN.1 alone when standardized encoding rules suffice, but ECN provides these advantages in circumstances where the standardized encoding rules are not sufficient.

Overview of ECN

Concepts

To understand how ECN works, it is useful to focus on four kinds of elements of the ASN.1 language: built-in types (e.g., INTEGER and UTF8String), built-in constructor keywords (e.g., SEQUENCE, CHOICE, SEQUENCE OF, OPTIONAL), user-defined simple types (e.g., Age ::= INTEGER (0..200), Color ::= ENUMERATED {green, yellow, red}), and user-defined complex types (e.g., Name ::= SEQUENCE {first UTF8String, middle UTF8String, last UTF8String}). There are other aspects of ASN.1 that are also reflected in ECN, but we will not discuss them here.

The ECN language also has built-in types, built-in constructor keywords, user-defined simple types, and user-defined complex types. These elements of the ECN language are similar to those of ASN.1, but their names always begin with a #. Officially they are called encoding classes but here we will simply call them ECN types and ECN constructor keywords. Examples of ECN types are: #INTEGER (built-in), #UTF8String (built-in), #Age (simple user-defined), #Name (complex user-defined). Examples of ECN constructor keywords are: #SEQUENCE, #CHOICE, #SEQUENCE-OF, and #OPTIONAL (all built-in).

Unlike ASN.1, ECN allows the user of the language to define synonyms of ECN constructor keywords (e.g., #InterleavedSequence ::= #SEQUENCE). Therefore, in ECN there are user-defined ECN constructor keywords as well as built-in ECN constructor keywords.

From the ECN viewpoint, every user-defined ASN.1 type occurring in an ASN.1 specification has a hidden ECN type implicitly associated with it. Officially this hidden ECN type is called an implicitly generated encoding structure but here we will simply call it the hidden ECN type of the ASN.1 type. Hidden ECN types are a special kind of user-defined ECN types (their ECN definition is automatically generated from a user-defined ASN.1 type rather than being provided by the user), but they behave like other user-defined ECN types.

The hidden ECN type of an ASN.1 type is almost identical to the original ASN.1 type (but slightly simplified) and is the starting point for an encoding process, specified in ECN, which ultimately generates the series of bits representing any given value of the original ASN.1 type. An ASN.1 type (or any of its parts) is not directly referenceable for the purpose of specifying an encoding in ECN, but its hidden ECN type is. ECN types and ECN constructor keywords can be explicitly referenced within an ECN specification and are encoded by applying the rules contained in the ECN specification.
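As a rough illustration of how a hidden ECN type mirrors its ASN.1 type, the following Python sketch models an ASN.1 type as nested tuples and derives a counterpart whose names begin with a #. The tuple representation, the function name hidden_ecn_type, and the OPTIONAL marker on the middle component are assumptions made for this example. The actual generation rules are those of the ECN standard, which also simplifies the type in ways not modeled here.

```python
# Toy model: an ASN.1 type is either a type name (str) or a tuple
# (constructor, components), where components is a list of
# (field_name, field_type, is_optional) entries.

def hidden_ecn_type(asn1_type):
    """Return a '#'-prefixed mirror of a toy ASN.1 type description."""
    if isinstance(asn1_type, str):
        # Built-in or user-defined type reference: INTEGER -> #INTEGER
        return "#" + asn1_type
    constructor, components = asn1_type
    # Constructor keyword: "SEQUENCE OF" -> "#SEQUENCE-OF"
    ecn_constructor = "#" + constructor.replace(" ", "-")
    ecn_components = []
    for name, field_type, is_optional in components:
        ecn_field = [name, hidden_ecn_type(field_type)]
        if is_optional:
            ecn_field.append("#OPTIONAL")
        ecn_components.append(tuple(ecn_field))
    return (ecn_constructor, ecn_components)

# The Name type from the text; marking "middle" as optional is an
# assumption added here only to show how #OPTIONAL would appear.
name_type = ("SEQUENCE", [
    ("first", "UTF8String", False),
    ("middle", "UTF8String", True),
    ("last", "UTF8String", False),
])

hidden_name = hidden_ecn_type(name_type)
print(hidden_name)
```

The result has the same shape as the input, which reflects the point that a hidden ECN type is almost identical to the ASN.1 type it was generated from.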

Roughly speaking, an ECN specification does two things: it says how to modify a hidden ECN type to produce a new (colored; see below) hidden ECN type, and it says how an ECN type (as well as each of its components if it's a complex type) is to be encoded. The latter can be applied recursively, in the sense that an encoding step for a component of an ECN type may result in a further in-place modification of the remaining part of the ECN type that is being encoded. This process can go on through any number of cycles, until the final ECN type has been completely encoded, that is, all the bits representing the value of the original ASN.1 type have been generated.

Lastly, we introduce the concept of an encoding object. This is a very important element of the ECN language: an encoding object is an individual encoding rule that is part of an ECN specification and is applied to an ECN type or ECN constructor keyword, either built-in or user-defined, occurring in the specification.

Mechanisms

The first step of the encoding process is the automatic generation of hidden ECN types from all ASN.1 types present in the ASN.1 specification. The hidden ECN types corresponding to complex user-defined ASN.1 types can be modified by a mechanism called coloring, which consists of replacing the names of the types of some of their components with synonyms. It is also possible to replace the ECN built-in constructor keywords (e.g., #SEQUENCE, #OPTIONAL) occurring in a hidden ECN type with synonyms. ECN has a few built-in synonyms for both constructor keywords and built-in types (e.g., #CONCATENATION is a synonym of #SEQUENCE, and #INT is a synonym of #INTEGER), and a user of the language can also define user-defined types and user-defined constructor keywords as synonyms of others.

The purpose of the coloring step is to prepare a hidden ECN type for the next step, the encoding of its components, when different occurrences of the same ECN type or of the same ECN constructor keyword within the hidden ECN type must be encoded in different ways. For example, a complex hidden ECN type might contain two lists (#SEQUENCE-OF), one of which is to be encoded by inserting a count field before the first item, and the other by inserting a terminating pattern after the last item. This can be done by replacing the first #SEQUENCE-OF keyword in the hidden ECN type with, say, #CountBasedRepetition, replacing the second with, say, #TerminatingPatternBasedRepetition, and declaring these two names as user-defined synonyms of the ECN constructor keyword #SEQUENCE-OF. Once these two different constructor keywords have been included in the hidden ECN type, each of the two lists can be encoded with a different encoding object.
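The coloring step and the later selection of an encoding object can be sketched in Python as follows. This is a toy model, not ECN syntax: the dictionary-based type representation, the field names phones and tags, the one-octet item format, and the two encoder functions are assumptions for illustration. Only the synonym names #CountBasedRepetition and #TerminatingPatternBasedRepetition come from the example above.

```python
# Hidden ECN type with two lists that initially share one constructor keyword.
hidden_type = {
    "phones": {"constructor": "#SEQUENCE-OF", "item": "#INTEGER"},
    "tags": {"constructor": "#SEQUENCE-OF", "item": "#INTEGER"},
}

# User-defined synonyms of #SEQUENCE-OF (the "colors").
synonyms = {
    "#CountBasedRepetition": "#SEQUENCE-OF",
    "#TerminatingPatternBasedRepetition": "#SEQUENCE-OF",
}

def color(ecn_type, renames):
    """Return a copy of ecn_type with constructor keywords replaced by synonyms."""
    colored_type = {}
    for field, spec in ecn_type.items():
        new_name = renames.get(field, spec["constructor"])
        # A replacement must be a declared synonym of the original keyword.
        if new_name != spec["constructor"] and synonyms.get(new_name) != spec["constructor"]:
            raise ValueError(f"{new_name} is not a synonym of {spec['constructor']}")
        colored_type[field] = {**spec, "constructor": new_name}
    return colored_type

def encode_with_count(items):
    # One count octet, then one octet per item (toy: at most 255 items, each 0..255).
    return bytes([len(items)]) + bytes(items)

def encode_with_terminator(items, terminator=0xFF):
    # One octet per item, then a terminator value that no item may equal.
    if terminator in items:
        raise ValueError("terminator collides with an item value")
    return bytes(items) + bytes([terminator])

# Toy "encoding objects", selected by the colored constructor keyword.
encoding_objects = {
    "#CountBasedRepetition": encode_with_count,
    "#TerminatingPatternBasedRepetition": encode_with_terminator,
}

colored = color(hidden_type, {
    "phones": "#CountBasedRepetition",
    "tags": "#TerminatingPatternBasedRepetition",
})

value = {"phones": [7, 9], "tags": [1, 2, 3]}
encoded = b"".join(
    encoding_objects[colored[field]["constructor"]](value[field])
    for field in colored
)
print(encoded.hex())  # 020709010203ff
```

Before coloring, both lists carry the same keyword and could not be told apart by a lookup on the keyword alone. After coloring, the keyword itself selects the encoder.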

The second step of the encoding process is the application of an encoding object to a hidden ECN type. The value to be encoded will be one of the possible values of an ASN.1 type defined in the ASN.1 specification, and the encoding process will select the hidden ECN type of that ASN.1 type and will apply the appropriate encoding object to it.

There may be further steps consisting of the recursive application of encoding objects that work by replacing an ECN type (or part of it) with another ECN type.

In ECN there are several kinds of encoding objects. Some encoding objects completely determine the actual bit-level encoding of simple ECN types and are the easiest to understand. Others apply to ECN constructor keywords rather than to ECN types, and determine some structural aspects of the encoding of the complex ECN type (or part of it) constructed by an ECN constructor keyword (but do not specify its entire encoding). Others work by replacing an ECN type (or a part of it) with another ECN type, which must then be encoded by applying a different encoding object to it.

The most important kinds of encoding objects in ECN are listed below:

Encoding objects that determine a bit-level encoding apply mostly to simple ECN types. They have several parameters specifying the bit-level encoding of a value, the size of the encoding, any leading or trailing padding, any alignment to an octet or word boundary, any bit reversals, etc.
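To make these parameters concrete, here is a minimal Python sketch of a fixed-width integer encoder with optional alignment padding and bit reversal. The function name encode_int_bits and its parameter names are hypothetical and do not correspond to ECN property names. Bits are written as a string of '0' and '1' characters for readability.

```python
def encode_int_bits(value, size, position=0, align_to=1, reverse_bits=False):
    """Encode a non-negative integer as a bit string.

    value        -- integer to encode
    size         -- number of bits in the value field (at least 1)
    position     -- number of bits already emitted before this field
    align_to     -- pad with '0' bits until position is a multiple of this
    reverse_bits -- emit the value least significant bit first
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    if value < 0 or value >= 1 << size:
        raise ValueError(f"{value} does not fit in {size} bits")
    # Leading padding so that the field starts on the requested boundary.
    padding = (-position) % align_to
    bits = format(value, f"0{size}b")
    if reverse_bits:
        bits = bits[::-1]
    return "0" * padding + bits

print(encode_int_bits(5, 4))                          # 0101
print(encode_int_bits(5, 4, reverse_bits=True))       # 1010
print(encode_int_bits(5, 4, position=3, align_to=8))  # 000000101
```

In the last call, three bits have already been emitted, so five padding bits are added to reach the next octet boundary before the four value bits.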
For encoding objects that replace an ECN type with a user-defined ECN type, the replacement type must be specified in the ECN specification, not in the ASN.1 specification. Its name must begin with a # and must not be the same as the name of any hidden ECN type.
Here are some typical ways in which encoding objects for an optional component (#OPTIONAL) can represent the presence of that component:
  1. by utilizing a (typically boolean) field whose value indicates presence or absence of the optional component, and which was inserted in the ECN type by another encoding object applied at an earlier stage;
  2. by relying on a particular bit pattern that occurs at certain precise bit locations within the encodings of all the possible values of the optional component but never occurs within the encodings of any of the types which can come after the optional component according to the ECN specification;
  3. by relying on the size of the enclosing encoding to determine whether the encoding of the optional component will fit in the remaining space.
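Mechanisms 1 and 3 can be sketched in Python as follows. The message layout (an 8-bit mandatory field followed by an 8-bit optional field) and the function names are assumptions chosen for this example, and bit strings are written as text for readability.

```python
FIELD_BITS = 8  # toy layout: every field is an 8-bit unsigned integer

def encode_with_presence_bit(mandatory, optional=None):
    """Mechanism 1: a leading 1-bit field says whether the optional field follows."""
    bits = "1" if optional is not None else "0"
    bits += format(mandatory, f"0{FIELD_BITS}b")
    if optional is not None:
        bits += format(optional, f"0{FIELD_BITS}b")
    return bits

def decode_with_presence_bit(bits):
    present = bits[0] == "1"
    mandatory = int(bits[1:1 + FIELD_BITS], 2)
    optional = int(bits[1 + FIELD_BITS:1 + 2 * FIELD_BITS], 2) if present else None
    return mandatory, optional

def decode_by_remaining_size(bits):
    """Mechanism 3: the optional field is present only if it still fits
    in what remains of the enclosing encoding."""
    mandatory = int(bits[:FIELD_BITS], 2)
    rest = bits[FIELD_BITS:]
    optional = int(rest[:FIELD_BITS], 2) if len(rest) >= FIELD_BITS else None
    return mandatory, optional

print(decode_with_presence_bit(encode_with_presence_bit(3, 9)))  # (3, 9)
print(decode_with_presence_bit(encode_with_presence_bit(3)))     # (3, None)
print(decode_by_remaining_size("00000011"))                      # (3, None)
print(decode_by_remaining_size("0000001100001001"))              # (3, 9)
```

The second decoder needs no extra bit, but it only works when the size of the enclosing encoding is known from elsewhere.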
Here are some typical ways in which encoding objects for a list (#SEQUENCE-OF) can represent the length of the list:
  1. by utilizing a field that contains the length of the list and was inserted in the ECN type by another encoding object applied at an earlier stage;
  2. by relying on a particular bit pattern that occurs at certain precise bit locations within the encodings of all the possible values of the repeating component of the list but never occurs within the encodings of any of the types which can come after the list according to the ECN specification;
  3. by relying on the size of the enclosing encoding to determine how many instances of the encoding of the repeating component will fit in the remaining space;
  4. by choosing a bit string that does not match the encoding of any value of the repeating component of the list, and inserting that bit string after the last item of the list;
  5. by utilizing a (typically boolean) field within the repeating component, whose value indicates whether that item is the last item of the list.
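As a hedged illustration of mechanisms 5 and 3, the Python sketch below encodes a list with a per-item "last" flag and decodes another list by counting how many whole items fit in the available bits. The 7-bit item width and the function names are assumptions for this example, and bit strings are written as text for readability.

```python
ITEM_BITS = 7  # toy layout: each item value occupies 7 bits

def encode_with_last_flag(items):
    """Mechanism 5: each item carries a 1-bit flag that is 1 only on the last item."""
    if not items:
        raise ValueError("this toy scheme cannot represent an empty list")
    out = ""
    for index, item in enumerate(items):
        if not 0 <= item < 1 << ITEM_BITS:
            raise ValueError(f"{item} does not fit in {ITEM_BITS} bits")
        is_last = index == len(items) - 1
        out += ("1" if is_last else "0") + format(item, f"0{ITEM_BITS}b")
    return out

def decode_with_last_flag(bits):
    """Return the decoded items and any bits that follow the list."""
    items, pos = [], 0
    while True:
        is_last = bits[pos] == "1"
        items.append(int(bits[pos + 1:pos + 1 + ITEM_BITS], 2))
        pos += 1 + ITEM_BITS
        if is_last:
            return items, bits[pos:]

def decode_by_remaining_space(bits):
    """Mechanism 3: keep decoding items while a whole item still fits."""
    return [int(bits[pos:pos + ITEM_BITS], 2)
            for pos in range(0, len(bits) - ITEM_BITS + 1, ITEM_BITS)]

flagged = encode_with_last_flag([1, 2, 3])
print(decode_with_last_flag(flagged + "1010"))      # ([1, 2, 3], '1010')
print(decode_by_remaining_space("00000010000010"))  # [1, 2]
```

With the flag-based scheme the decoder knows where the list ends even when other data follows it, at the cost of one extra bit per item.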
Here are some typical ways in which encoding objects for a #CHOICE can indicate which of its alternatives is present:
  1. by utilizing a field that contains the index of the alternative and was added to the ECN type by another encoding object applied at an earlier stage;
  2. by relying on a particular bit pattern that occurs at certain precise bit locations within the encodings of all the possible values of each alternative and is different for each alternative.
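Both ways of identifying the chosen alternative can be sketched in Python. The alternative names shortCode and longCode, their payload widths, and the leading bit patterns are hypothetical values chosen for this example, and bit strings are written as text for readability.

```python
# Toy #CHOICE with two alternatives, each carrying a small integer payload.
ALTERNATIVES = ["shortCode", "longCode"]          # hypothetical names
PAYLOAD_BITS = {"shortCode": 4, "longCode": 12}   # hypothetical widths

def encode_with_index(alternative, value):
    """Mechanism 1: a 1-bit index field precedes the chosen alternative."""
    index = ALTERNATIVES.index(alternative)
    return format(index, "01b") + format(value, f"0{PAYLOAD_BITS[alternative]}b")

def decode_with_index(bits):
    alternative = ALTERNATIVES[int(bits[0], 2)]
    width = PAYLOAD_BITS[alternative]
    return alternative, int(bits[1:1 + width], 2)

# Mechanism 2: every encoding of an alternative starts with a fixed pattern
# that differs between alternatives (and neither pattern is a prefix of the other).
PATTERNS = {"shortCode": "10", "longCode": "110"}

def decode_by_pattern(bits):
    for alternative, pattern in PATTERNS.items():
        if bits.startswith(pattern):
            start = len(pattern)
            width = PAYLOAD_BITS[alternative]
            return alternative, int(bits[start:start + width], 2)
    raise ValueError("no alternative matches the leading bit pattern")

print(decode_with_index(encode_with_index("longCode", 300)))  # ('longCode', 300)
print(decode_by_pattern("10" + "0111"))                       # ('shortCode', 7)
```

In the first scheme the index is an auxiliary field added to the structure, while in the second the distinguishing bits are part of each alternative's own encoding.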

References

  1. "ITU-T Rec. X.680 / ISO/IEC 8824-1". Retrieved 2008-08-28.
  2. "ITU-T Rec. X.692 / ISO/IEC 8825-3". Retrieved 2008-08-28.