Metadata (CLI)

Last updated

CLI metadata, on-disk representation
Filename extension
.exe, .dll, .winmd
Magic number 0x424A5342
Developed by Microsoft,
Ecma International
Standard ECMA-335 part II

Metadata, in the Common Language Infrastructure (CLI), refers to certain data structures embedded within the Common Intermediate Language (CIL) code that describes the high-level structure of the code. Metadata describes all classes and class members that are defined in the assembly, and the classes and class members that the current assembly will call from another assembly. The metadata for a method contains the complete description of the method, including the class (and the assembly that contains the class), the return type and all of the method parameters.

Contents

A CLI language compiler will generate the metadata and store this in the assembly containing the CIL. When the run-time executes CIL it will check to make sure that the metadata of the called method is the same as the metadata that is stored in the calling method. This ensures that a method can only be called with exactly the right number of parameters and exactly the right parameter types.

The Windows Runtime application platform, present in Windows 8 and Windows Phone 8, makes use of the CLI metadata format to describe component interfaces for code written in any of the supported programming languages. A difference in use within the Common Language Runtime is that an assembly typically does not contain any CIL instructions. [1]

Attributes

Developers can add metadata to their code through attributes. There are two types of attributes, custom and pseudo custom attributes, and to the developer these have the same syntax. Attributes in code are messages to the compiler to generate metadata. In CIL, metadata such as inheritance modifiers, scope modifiers, and almost anything that isn't either opcodes or streams, are also referred to as attributes.

A custom attribute is a regular class that inherits from the Attribute class. A custom attribute can be used on any method, property, class or entire assembly with the syntax: [AttributeName(optional parameter, optional name=value pairs)] as in:

[Custom][Custom(1)][Custom(1, Comment="yes")]

Custom attributes are used by CLI extensively. Windows Communication Framework uses attributes to define service contracts, ASP.NET uses these to expose methods as web services, LINQ to SQL uses them to define the mapping of classes to the underlying relational schema, Visual Studio uses them to group together properties of an object, the class developer indicates the category for the object's class by applying the [Category] custom attribute. Custom attributes are interpreted by application code and not the CLR. When the compiler sees a custom attribute it will generate custom metadata that is not recognised by the CLR. The developer has to provide code to read the metadata and act on it. As an example, the attribute shown in the example can be handled by the code:

classCustomAttribute:Attribute{privateintparamNumber=0;privatestringcomment="";publicCustomAttribute(){}publicCustomAttribute(intnum){paramNumber=num;}publicStringComment{set{comment=value;}}}

The name of the class is mapped to the attribute name. The Visual C# compiler automatically adds the string "Attribute" at the end of any attribute name. Consequently, every attribute class name should end with this string, but it is legal to define an attribute without the Attribute-suffix. When affixing an attribute to an item, the compiler will look for both the literal name and the name with Attribute added to the end, i. e. if you were to write [Custom] the compiler would look for both Custom and CustomAttribute. If both exist, the compiler fails. The attribute can be prefixed with "@" if you don't want to risk ambiguity, so writing [@Custom] will not match CustomAttribute. Using the attribute invokes the constructor of the class. Overloaded constructors are supported. Name-Value pairs are mapped to properties, the name denotes the name of the property and the value supplied is set by the property.

Sometimes there is ambiguity concerning to what you are affixing the attribute. Consider the following code:

[Orange]publicintExampleMethod(stringinput){//method body goes here}

What has been marked as orange? Is it the ExampleMethod, its return value, or perhaps the entire assembly? In this case, the compiler will default, and treat the attribute as being affixed to the method. If this is not what was intended, or if the author wishes to clarify their code, an attribute target may be specified. Writing [return: Orange] will mark the return value as orange, [assembly: Orange] will mark the entire assembly. The valid targets are assembly, field, event, method, module, param, property, return and type.

A pseudo-custom attribute is used just like regular custom attributes but they do not have a custom handler; rather the compiler has intrinsic awareness of the attributes and handles the code marked with such attributes differently. Attributes such as Serializable and Obsolete are implemented as pseudo-custom attributes. Pseudo-custom attributes should never be used by ILAsm, as it has adequate syntax to describe the metadata.[ clarification needed ]

Metadata storage

Assemblies contain tables of metadata. These tables are described by the CIL specification. The metadata tables will have zero or more entries and the position of an entry determines its index. When CIL code uses metadata it does so through a metadata token. This is a 32-bit value where the top 8 bits identify the appropriate metadata table, and the remaining 24 bits give the index of the metadata in the table. The Framework SDK contains a sample called metainfo that will list the metadata tables in an assembly, however, this information is rarely of use to a developer. Metadata in an assembly may be viewed using the ILDASM tool provided by the .NET Framework SDK.

In the CIL standard, metadata is defined in ILAsm (assembly language) form, an on-disk representation form for storage, and a form that is embedded into assemblies of the Portable Executable (PE, .exe or .dll) format. The PE form is based on the on-disk form.

Reflection

Reflection is the API used to read CLI metadata. The reflection API provides a logical view of metadata rather than the literal view provided by tools like metainfo. Reflection in version 1.1 of the .NET framework can be used to inspect the descriptions of classes and their members, and invoke methods. However, it does not allow runtime access to the CIL for a method. Version 2.0 of the framework allows the CIL for a method to be obtained.

Other metadata tools

Besides the System.Reflection namespace, other tools are also available that can be used to handle metadata. The Microsoft .NET Framework ships a CLR metadata manipulation library that is implemented in native code. Third-party tools to retrieve and manipulate metadata include PostSharp and Mono Cecil can also be used.

See also

Related Research Articles

The Common Language Infrastructure (CLI) is an open specification and technical standard originally developed by Microsoft and standardized by ISO and Ecma International that describes executable code and a runtime environment that allows multiple high-level languages to be used on different computer platforms without being rewritten for specific architectures. This implies it is platform agnostic. The .NET Framework, .NET and Mono are implementations of the CLI. The metadata format is also used to specify the API definitions exposed by the Windows Runtime.

Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL) or Intermediate Language (IL), is the intermediate language binary instruction set defined within the Common Language Infrastructure (CLI) specification. CIL instructions are executed by a CLI-compatible runtime environment such as the Common Language Runtime. Languages which target the CLI compile to CIL. CIL is object-oriented, stack-based bytecode. Runtimes typically just-in-time compile CIL instructions into native code.

JScript .NET is a .NET programming language developed by Microsoft.

Managed Extensions for C++ or Managed C++ is a now-deprecated set of language extensions for C++, including grammatical and syntactic extensions, keywords and attributes, to bring the C++ syntax and language to the .NET Framework. These extensions were created by Microsoft to allow C++ code to be targeted to the Common Language Runtime (CLR) in the form of managed code, as well as continue to interoperate with native code.

Defined by Microsoft for use in recent versions of Windows, an assembly in the Common Language Infrastructure (CLI) is a compiled code library used for deployment, versioning, and security. There are two types: process assemblies (EXE) and library assemblies (DLL). A process assembly represents a process that will use classes defined in library assemblies. CLI assemblies contain code in CIL, which is usually generated from a CLI language, and then compiled into machine language at run time by the just-in-time compiler. In the .NET Framework implementation, this compiler is part of the Common Language Runtime (CLR).

C Sharp (programming language) Multi-paradigm (object-oriented) programming language

C# is a general-purpose, multi-paradigm programming language encompassing static typing, strong typing, lexically scoped, imperative, declarative, functional, generic, object-oriented (class-based), and component-oriented programming disciplines.

In the Java computer programming language, an annotation is a form of syntactic metadata that can be added to Java source code. Classes, methods, variables, parameters and Java packages may be annotated. Like Javadoc tags, Java annotations can be read from source files. Unlike Javadoc tags, Java annotations can also be embedded in and read from Java class files generated by the Java compiler. This allows annotations to be retained by the Java virtual machine at run-time and read via reflection. It is possible to create meta-annotations out of the existing ones in Java.

Platform Invocation Services, commonly referred to as P/Invoke, is a feature of Common Language Infrastructure implementations, like Microsoft's Common Language Runtime, that enables managed code to call native code.

C# and Visual Basic .NET are the two primary languages used to program on the .NET Framework.

In computing, an attribute is a specification that defines a property of an object, element, or file. It may also refer to or set the specific value for a given instance of such. For clarity, attributes should more correctly be considered metadata. An attribute is frequently and generally a property of a property. However, in actual usage, the term attribute can and is often treated as equivalent to a property depending on the technology being discussed. An attribute of an object usually consists of a name and a value; of an element, a type or class name; of a file, a name and extension.

Visual Studio Tools for Office (VSTO) is a set of development tools available in the form of a Visual Studio add-in and a runtime that allows Microsoft Office 2003 and later versions of Office applications to host the .NET Framework Common Language Runtime (CLR) to expose their functionality via .NET.

This article describes the syntax of the C# programming language. The features described are compatible with .NET Framework and Mono.

Component Object Model (COM) is a binary-interface standard for software components introduced by Microsoft in 1993. It is used to enable inter-process communication object creation in a large range of programming languages. COM is the basis for several other Microsoft technologies and frameworks, including OLE, OLE Automation, Browser Helper Object, ActiveX, COM+, DCOM, the Windows shell, DirectX, UMDF and Windows Runtime. The essence of COM is a language-neutral way of implementing objects that can be used in environments different from the one in which they were created, even across machine boundaries. For well-authored components, COM allows reuse of objects with no knowledge of their internal implementation, as it forces component implementers to provide well-defined interfaces that are separated from the implementation. The different allocation semantics of languages are accommodated by making objects responsible for their own creation and destruction through reference-counting. Type conversion casting between different interfaces of an object is achieved through the QueryInterface method. The preferred method of "inheritance" within COM is the creation of sub-objects to which method "calls" are delegated.

.NET Framework Software platform developed by Microsoft

The .NET Framework is a software framework developed by Microsoft that runs primarily on Microsoft Windows. It includes a large class library called Framework Class Library (FCL) and provides language interoperability across several programming languages. Programs written for .NET Framework execute in a software environment named the Common Language Runtime (CLR). The CLR is an application virtual machine that provides services such as security, memory management, and exception handling. As such, computer code written using .NET Framework is called "managed code". FCL and CLR together constitute the .NET Framework.

Windows Runtime (WinRT) is a platform-agnostic application and component architecture first introduced with Windows 8 and Windows Server 2012 in 2012. WinRT supports development in C++/WinRT, C++/CX, Rust/WinRT, Python via the winrt package, JavaScript-TypeScript, and CLI languages such as C# and Visual Basic (VB.NET). WinRT applications natively support both the x86 and ARM processors however native apps need to be separately compiled for each architecture, and may run inside a sandboxed environment to allow greater security and stability. WinRT components are designed with interoperability among multiple languages and APIs in mind, including native, managed and scripting languages.

C++/CX(C++ component extensions) is a language projection for Microsoft's Windows Runtime platform. It takes the form of a language extension for C++ compilers, and it enables C++ programmers to write programs that call Windows Runtime (WinRT) APIs. C++/CX is superseded by the C++/WinRT language projection, which is not an extension to the C++ language; rather, it's an entirely standard modern ISO C++17 header-file-based library.

Objective-C is a general-purpose, object-oriented programming language that adds Smalltalk-style messaging to the C programming language. It was the main programming language supported by Apple for macOS, iOS, and their respective application programming interfaces (APIs), Cocoa and Cocoa Touch, until the introduction of Swift in 2014.

The Standard Libraries is a set of libraries included into the Common Language Infrastructure (CLI) in order to encapsulate many common functions, such as file reading and writing, XML document manipulation, exception handling, application globalization, network communication, threading and reflection, which makes the programmer's job easier. It is much larger in scope than standard libraries for most other languages, including C++, and is comparable in scope and coverage to the standard libraries of Java.

References

  1. "Windows Metadata (WinMD) files". Windows UWP applications.