Assembly (CLI)

Last updated

Defined by Microsoft for use in recent versions of Windows, an assembly in the Common Language Infrastructure (CLI) is a compiled code library used for deployment, versioning, and security. There are two types: process assemblies (EXE) and library assemblies (DLL). A process assembly represents a process that will use classes defined in library assemblies. CLI assemblies contain code in CIL, which is usually generated from a CLI language, and then compiled into machine language at run time by the just-in-time compiler. In the .NET Framework implementation, this compiler is part of the Common Language Runtime (CLR).

Contents

An assembly can consist of one or more files. Code files are called modules. An assembly can contain more than one code module. And since it is possible to use different languages to create code modules, it is technically possible to use several different languages to create an assembly. Visual Studio however does not support using different languages in one assembly.

Assembly names

The name of an assembly consists of four parts

  1. The short name. On Windows this is the name of the Portable Executable (PE) file without the extension.
  2. The culture. This is an RFC 1766 identifier of the locale for the assembly. In general, library and process assemblies should be culture neutral; the culture should only be used for satellite assemblies.
  3. The version. This is a dotted number made up of four values – major, minor, build and revision.
  4. A public key token. This is a 64-bit hash of the public key that corresponds to the private key used to sign [1] the assembly. A signed assembly is said to have a strong name.

The public key token is used to make the assembly name unique. Thus, two strong named assemblies can have the same PE file name and yet the CLI will recognize them as different assemblies. The Windows file system (FAT32 and NTFS) only recognizes the PE file name, so two assemblies with the same PE file name (but different culture, version or public key token) cannot exist in the same Windows folder. To solve this issue, the CLI introduces the GAC (Global Assembly Cache) that is treated as a single folder by run-time, but is actually implemented using nested file system folders.

To prevent spoofing attacks, where a cracker would try to pass off an assembly appearing as something else, the assembly is signed with a private key. The developer of the intended assembly keeps the private key secret, so a cracker cannot have access to it nor simply guess it. Thus the cracker cannot make his assembly impersonate something else, lacking the possibility to sign it correctly after the change. Signing the assembly involves taking a hash of important parts of the assembly and then encrypting the hash with the private key. The signed hash is stored in the assembly along with the public key. The public key will decrypt the signed hash. When the CLR loads a strongly named assembly it will generate a hash from the assembly and then compare this with the decrypted hash. If the comparison succeeds then it means that the public key in the file (and hence the public key token) is associated with the private key used to sign the assembly. This will mean that the public key in the assembly is the public key of the assembly publisher and hence a spoofing attack is prevented.

Assembly versions

CLI assemblies can have version information, allowing them to eliminate most conflicts between applications caused by shared assemblies. [2] However, this does not eliminate all possible versioning conflicts between assemblies. [3]

Assemblies and CLI security

CLI Code Access Security is based on assemblies and evidence. Evidence can be anything deduced from the assembly, but typically it is created from the source of the assembly – whether the assembly was downloaded from the Internet, an intranet, or installed on the local machine (if the assembly is downloaded from another machine it will be stored in a sandboxed location within the GAC and hence is not treated as being installed locally). Permissions are applied to entire assemblies, and an assembly can specify the minimum permissions it requires through custom attributes (see CLI metadata). When the assembly is loaded the CLR will use the evidence for the assembly to create a permission set of one or more code access permissions. The CLR will then check to make sure that this permission set contains the required permissions specified by the assembly.

CLI code can perform a code access security demand. This means that the code will perform some privileged action only if all of the assemblies of all of the methods in the call stack have the specified permission. If one assembly does not have the permission a security exception is thrown.

The CLI code can also perform Linked Demand for getting the permission from the call stack. In this case the CLR will look at only one method in the call stack in the TOP position for the specified permission. Here the stack walk-through is bound to one method in the call stack by which the CLR assumes that all the other methods in the CALL STACK have the specified permission. The Assembly is a combination of METADATA and MSIL file.

Satellite assemblies

In general, assemblies should contain culture-neutral resources. If you want to localize your assembly (for example use different strings for different locales) you should use satellite assemblies – special, resource-only assemblies. As the name suggests, a satellite is associated with an assembly called the main assembly. That assembly (say, lib.dll) will contain the neutral resources (that Microsoft says is International English, but implies to be US English). Each satellite has the name of the associated library appended with .resources (for example lib.resources.dll). The satellite is given a non-neutral culture name, but since this is ignored by existing Windows file systems (FAT32 and NTFS) this would mean that there could be several files with the same PE name in one folder. Since this is not possible, satellites must be stored in subfolders under the application folder. For example, a satellite with the UK English resources will have a CLI name of "lib.resources Version=0.0.0.0 Culture=en-GB PublicKeyToken=null", a PE file name of lib.resources.dll, and will be stored in a subfolder called en-GB.

Satellites are loaded by a CLI class called System.Resources.ResourceManager. The developer has to provide the name of the resource and information about the main assembly (with the neutral resources). The ResourceManager class will read the locale of the machine and use this information and the name of the main assembly to get the name of the satellite and the name of the subfolder that contains it. ResourceManager can then load the satellite and obtain the localized resource.

Referencing assemblies

One can reference an executable code library by using the /reference flag of the C# compiler.

Delay-signing of an assembly

The shared assemblies need to give a strong name for uniquely identifying the assembly that might be shared among the applications. The strong naming consists of the public key token, culture, version and PE file name. If an assembly is likely to be used for the development purpose which is a shared assembly, the strong naming procedure contains only public key generation. The private key is not generated at that time. It is generated only when the assembly is deployed.

Language of an assembly

The assembly is built up with the CIL code, which is an intermediate language. The framework internally converts the CIL (bytecode) into native assembly code. If we have a program that prints "Hello World", the equivalent CIL code for the method is:

.methodprivatehidebysigstaticvoidMain(string[]args)cilmanaged{.entrypoint.custominstancevoid[mscorlib]System.STAThreadAttribute::.ctor()=(01000000)// Code size       11 (0xb).maxstack1IL_0000:ldstr"Hello World"IL_0005:callvoid[mscorlib]System.Console::WriteLine(string)IL_000a:ret}// end of method Class1::Main

The CIL code loads the String onto the stack, then calls the WriteLine function and returns.

See also

Related Research Articles

Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL) or Intermediate Language (IL), is the intermediate language binary instruction set defined within the Common Language Infrastructure (CLI) specification. CIL instructions are executed by a CLI-compatible runtime environment such as the Common Language Runtime. Languages which target the CLI compile to CIL. CIL is object-oriented, stack-based bytecode. Runtimes typically just-in-time compile CIL instructions into native code.

In computing, DLL Hell is a term for the complications that arise when one works with dynamic-link libraries (DLLs) used with Microsoft Windows operating systems, particularly legacy 16-bit editions, which all run in a single memory space.

The Portable Executable (PE) format is a file format for executables, object code, DLLs and others used in 32-bit and 64-bit versions of Windows operating systems. The PE format is a data structure that encapsulates the information necessary for the Windows OS loader to manage the wrapped executable code. This includes dynamic library references for linking, API export and import tables, resource management data and thread-local storage (TLS) data. On NT operating systems, the PE format is used for EXE, DLL, SYS, MUI and other file types. The Unified Extensible Firmware Interface (UEFI) specification states that PE is the standard executable format in EFI environments.

<span class="mw-page-title-main">Windows Registry</span> Database for Microsoft Windows

The Windows Registry is a hierarchical database that stores low-level settings for the Microsoft Windows operating system and for applications that opt to use the registry. The kernel, device drivers, services, Security Accounts Manager, and user interfaces can all use the registry. The registry also allows access to counters for profiling system performance.

Win32/Simile is a metamorphic computer virus written in assembly language for Microsoft Windows. The virus was released in its most recent version in early March 2002. It was written by the virus writer "Mental Driller". Some of his previous viruses, such as Win95/Drill, have proved very challenging to detect.

The Global Assembly Cache (GAC) is a machine-wide CLI assembly cache for the Common Language Infrastructure (CLI) in Microsoft's .NET Framework. The approach of having a specially controlled central repository addresses the flaws in the shared library concept and helps to avoid pitfalls of other solutions that led to drawbacks like DLL hell.

Managed Extensions for C++ or Managed C++ is a now-deprecated set of language extensions for C++, including grammatical and syntactic extensions, keywords and attributes, to bring the C++ syntax and language to the .NET Framework. These extensions were created by Microsoft to allow C++ code to be targeted to the Common Language Runtime (CLR) in the form of managed code, as well as continue to interoperate with native code.

Extension conflicts were sometimes a common nuisance on Apple Macintosh computers running the classic Mac OS, especially System 7. Extensions were bundles of code that extended the operating system's capabilities by directly patching OS calls, thus receiving control instead of the operating system when applications made system calls. Generally, once an extension completed its task, it was supposed to pass on the system call to the operating system's routine. If multiple extensions want to patch the same system call, they end up receiving the call in a chain, the first extension in line passing it on to the next, and so on in the order they are loaded, until the last extension passes to the operating system. If an extension does not hand the next extension in line what it is expecting, problems occur; ranging from unexpected behavior to full system crashes. This is triggered by several factors such as carelessly programmed and malicious extensions that change or disrupt the way part of the system software works.

C++/CLI is a variant of the C++ programming language, modified for Common Language Infrastructure. It has been part of Visual Studio 2005 and later, and provides interoperability with other .NET languages such as C#. Microsoft created C++/CLI to supersede Managed Extensions for C++. In December 2005, Ecma International published C++/CLI specifications as the ECMA-372 standard.

Metadata, in the Common Language Infrastructure (CLI), refers to certain data structures embedded within the Common Intermediate Language (CIL) code that describes the high-level structure of the code. Metadata describes all classes and class members that are defined in the assembly, and the classes and class members that the current assembly will call from another assembly. The metadata for a method contains the complete description of the method, including the class, the return type and all of the method parameters.

Code Access Security (CAS), in the Microsoft .NET framework, is Microsoft's solution to prevent untrusted code from performing privileged actions. When the CLR loads an assembly it will obtain evidence for the assembly and use this to identify the code group that the assembly belongs to. A code group contains a permission set. Code that performs a privileged action will perform a code access demand which will cause the CLR to walk up the call stack and examine the permission set granted to the assembly of each method in the call stack. The code groups and permission sets are determined by the administrator of the machine who defines the security policy. Microsoft considers CAS as obsolete and discourages its use. It is also not available in .NET Core and .NET.

In NeXTSTEP, OPENSTEP, and their lineal descendants macOS, iOS, iPadOS, tvOS, and watchOS, and in GNUstep, a bundle is a file directory with a defined structure and file extension, allowing related files to be grouped together as a conceptually single item.

Dynamic-link library (DLL) is Microsoft's implementation of the shared library concept in the Microsoft Windows and OS/2 operating systems. These libraries usually have the file extension DLL, OCX, or DRV . The file formats for DLLs are the same as for Windows EXE files – that is, Portable Executable (PE) for 32-bit and 64-bit Windows, and New Executable (NE) for 16-bit Windows. As with EXEs, DLLs can contain code, data, and resources, in any combination.

As the next version of Windows NT after Windows 2000, as well as the successor to Windows Me, Windows XP introduced many new features but it also removed some others.

Platform Invocation Services, commonly referred to as P/Invoke, is a feature of Common Language Infrastructure implementations, like Microsoft's Common Language Runtime, that enables managed code to call native code.

C# and Visual Basic .NET are the two primary languages used to program on the .NET Framework.

Windows File Protection (WFP), a sub-system included in Microsoft Windows operating systems of the Windows 2000 and Windows XP era, aims to prevent programs from replacing critical Windows system files. Protecting core system files mitigates problems such as DLL hell with programs and the operating system. Windows 2000, Windows XP and Windows Server 2003 include WFP under the name of Windows File Protection; Windows Me includes it as System File Protection (SFP).

Windows Resource Protection is a feature first introduced in Windows Vista and Windows Server 2008. It is available in all subsequent Windows operating systems, and replaces Windows File Protection. Windows Resource Protection prevents the replacement of critical system files, registry keys and folders. Protecting these resources prevents system crashes. The way it protects resources differs entirely from the method used by Windows File Protection.

Side-by-side assembly technology is a standard for executable files in Windows 98 Second Edition, Windows 2000, and later versions of Windows that attempts to alleviate problems that arise from the use of dynamic-link libraries (DLLs) in Microsoft Windows. Such problems include version conflicts, missing DLLs, duplicate DLLs, and incorrect or missing registration. In side-by-side, Windows stores multiple versions of a DLL in the %systemroot%\WinSxS directory, and loads them on demand. This reduces dependency problems for applications that include a side-by-side manifest.

<span class="mw-page-title-main">.NET Framework</span> Software platform developed by Microsoft

The .NET Framework is a proprietary software framework developed by Microsoft that runs primarily on Microsoft Windows. It was the predominant implementation of the Common Language Infrastructure (CLI) until being superseded by the cross-platform .NET project. It includes a large class library called Framework Class Library (FCL) and provides language interoperability across several programming languages. Programs written for .NET Framework execute in a software environment named the Common Language Runtime (CLR). The CLR is an application virtual machine that provides services such as security, memory management, and exception handling. As such, computer code written using .NET Framework is called "managed code". FCL and CLR together constitute the .NET Framework.

References

  1. "Giving a .NET Assembly a Strong Name". Archived from the original on 24 February 2012. Retrieved 29 March 2007.
  2. Truche, Philippe (12 August 2008). ".NET Assembly Versioning Lifecycle". Archived from the original on 24 October 2008. Retrieved 21 September 2008.
  3. Pierson, Harry (17 September 2008). "DLR Namespace Change Fire Drill". Archived from the original on 1 November 2008. Retrieved 21 September 2008.