Cosmos (operating system)

Cosmos
Figure: Screenshot of AuraOS, an OS made with Cosmos, presenting a GUI creation possibility
Developer: Cosmos Project
Written in: C#, X#
Working state: Active
Source model: Open source
Latest release: Release 20221121 / 21 November 2022
Repository: github.com/CosmosOS
Available in: English
Platforms: x86
Kernel type: Monolithic
License: BSD
Official website: www.gocosmos.org

C# Open Source Managed Operating System (Cosmos) is a toolkit for building GUI and command-line based operating systems, written mostly in the programming language C# and small amounts of a high-level assembly language named X#. Cosmos is a backronym, [1] in that the acronym was chosen before the meaning. It is open-source software released under a BSD license.

As of 2022, Cosmos encompasses an ahead-of-time (AOT) compiler named IL2CPU to translate Common Intermediate Language (CIL) into native instructions. Cosmos compiles user-made programs and associated libraries using IL2CPU to create a bootable native executable that can run independently. The resulting output can be booted from a USB flash drive, CD-ROM, over a network via Preboot Execution Environment (PXE), or inside a virtual machine. Recent releases also allow deploying to certain x86 embedded devices over Universal Serial Bus (USB). While C# is the primary language used by developers (both on the backend and by end users of Cosmos), many CLI languages can be used, provided they compile to pure CIL without the use of Platform Invocation Services (P/Invokes). Cosmos is mainly intended for use with .NET.

Cosmos does not aim to become a full operating system, but rather a toolkit to allow other developers to simply and easily build their own operating systems using .NET. It also functions as an abstraction layer, hiding much of the inner workings of the hardware from the eventual developer.

Older versions of Cosmos were released in Milestones, with the last being Milestone 5 (released August 2010). More recently, the project switched to simply naming new releases after the latest commit number.

Releases of Cosmos are divided into two types: the Userkit and the Devkit. The Userkit is a pre-packaged release that is updated irregularly, as new and improved features are added. Userkits are generally considered stable, but do not include recent changes and may lack features. The Devkit, which refers to the source code of Cosmos, is usually stable but may have some bugs. It can be acquired on GitHub and must be built manually.[1] Git is used for source control management.

Most work on Cosmos is currently aimed at improving debugger functionality and Microsoft Visual Studio integration. Kernel work is focused on implementing file systems, memory management, and developing a reliable network interface. Limine serves as the project's bootloader; in older versions of the toolkit, GRUB was used instead. [2]

Origin

The idea for Cosmos was created by Chad Hower and was initially co-developed by Hower and Matthijs ter Woord. Over time, Cosmos has been maintained and improved by many other individuals.

Developing with Cosmos

Cosmos has many facilities to improve the experience of developing operating systems, and is designed to make the process as fast and painless as possible. Knowledge of assembly language is not required to use Cosmos.

Visual Studio integration

A key feature of Cosmos, which separates it from other operating systems of its type, is its tight integration with Microsoft Visual Studio. Code can be written, compiled, debugged, and run entirely through Visual Studio, with only a few keypresses. Cosmos no longer supports Visual Studio 2015, 2017, or 2019; only Visual Studio 2022 is supported.

Debugging

Cosmos can be seamlessly debugged through Visual Studio when running over PXE or in a virtual machine. Many standard debugging features are present, such as breakpoints, tracing, and logging. Debugging can also be done via serial cables when running on physical hardware. When running in VMware, Cosmos supports stepping and breakpoints, even while an operating system is running.

Running

Cosmos uses virtualisation to help speed development by allowing developers to test their operating systems without having to restart their computers as often. By default, VMware Player is used, due to its ease of use in terms of integration with the project. Other virtualisation environments are supported as well, such as Bochs and Hyper-V. An ISO disk image may also be generated that can be burned to a USB flash drive, CD-ROM, or similar media.

PXE booting is also supported, allowing for remote machines to run Cosmos over a network connection.

Compile process

IL2CPU

To compile .NET CIL into assembly language, Cosmos developers created an ahead-of-time (AOT) compiler named IL2CPU (IL To CPU), designed to parse CIL and output x86 opcodes. IL2CPU is itself written in C#, a language that compiles to Common Intermediate Language.

X#

X# is a low-level programming language designed for the x86 processor architecture as part of the Cosmos operating system. It aims to simplify operating system development by bringing C-like syntax to assembly language. Initially, X# was used for debugging services in Cosmos. The X# compiler is an open source command-line interface (console) program that parses code lines into tokens, compares them with patterns, and translates matched patterns to Intel-syntax x86 assembly, typically for the YASM assembler. Early versions of X# operated mostly 1:1 with assembly code, but this is no longer the case.
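The tokenize, match, and translate flow described above can be sketched in a few lines of Python. This is only an illustration of the general idea: the pattern table, the function name translate_line, and the exact output spelling are assumptions made for this example and are not taken from the real X# compiler.

```python
import re

# Hypothetical pattern table: each entry pairs a regular expression for one
# X#-style line with a template for the assembly text it maps to.
PATTERNS = [
    (re.compile(r"^(\w+)\s*=\s*\$([0-9A-Fa-f]+)$"), "mov {0}, 0x{1}"),
    (re.compile(r"^(\w+)\s*=\s*(\w+)$"), "mov {0}, {1}"),
    (re.compile(r"^(\w+)\+\+$"), "inc {0}"),
    (re.compile(r"^(\w+)--$"), "dec {0}"),
    (re.compile(r"^goto\s+(\w+)$"), "jmp {0}"),
]


def translate_line(line):
    """Translate one X#-style line to assembly text, or raise ValueError."""
    code = line.split("//", 1)[0].strip()  # drop a trailing // comment
    for pattern, template in PATTERNS:
        match = pattern.match(code)
        if match:
            return template.format(*match.groups())
    raise ValueError(f"no pattern matches: {line!r}")


if __name__ == "__main__":
    for source in ("EAX = EBX", "EAX++ // increment", "goto Loop"):
        print(source, "=>", translate_line(source))
```

Because every line is matched against a fixed set of patterns, a line that fits none of them is rejected outright, which is one plausible reason a pattern-driven syntax ends up stricter than C.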

Syntax

The syntax of X# is straightforward but stricter compared to C.

Comments

X# supports only single-line comments in the C++ style, starting with //.

Constants

X# allows the definition of named constants declared outside functions. Numeric constants are defined similarly to C++; for example:

    const i = 0

Referencing them elsewhere requires a # before the name; for example, #i.

  • String constants use single quotes ('). To include a single quote in a string constant, escape it with a backslash (e.g., 'I\'m so happy'). X# strings are null terminated.
  • Hexadecimal constants are prefixed with a dollar sign ($), followed by the constant (e.g., $B8000).
  • Decimal constants are not prefixed but cannot start with 0.
  • Binary and octal constants aren't supported yet.
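As a rough illustration of the numeric rules listed above, the following Python sketch parses a literal the way those rules describe. The function name parse_numeric_constant is hypothetical, and accepting a lone 0 as a valid decimal constant is an assumption based on the const i = 0 example, not a documented rule.

```python
import re

HEX_CONSTANT = re.compile(r"\$([0-9A-Fa-f]+)")   # e.g. $B8000
DEC_CONSTANT = re.compile(r"0|[1-9][0-9]*")      # no leading zero (assumed: lone 0 is fine)


def parse_numeric_constant(text):
    """Return the integer value of an X#-style numeric literal."""
    match = HEX_CONSTANT.fullmatch(text)
    if match:
        return int(match.group(1), 16)
    if DEC_CONSTANT.fullmatch(text):
        return int(text)
    # Binary, octal, and zero-prefixed decimals are rejected.
    raise ValueError(f"not a valid numeric constant: {text!r}")


if __name__ == "__main__":
    print(hex(parse_numeric_constant("$B8000")))
    print(parse_numeric_constant("12345"))
```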

Labels

Labels in X# function similarly to labels in other assembly languages. The goto mnemonic is used to jump to a label instead of the conventional jump or jmp mnemonic.

    CodeLabel1:
        goto CodeLabel2:

Namespaces

X# program files must start with a namespace directive. X# lacks a namespace hierarchy, so the current namespace changes with each directive until the file ends. Variables or constants in different namespaces can have the same name, as the namespace is prefixed to the member's name in the assembly output. Namespaces cannot reference each other except through low-level operations.

    namespace FIRST
    // Every variable or constant name will be prefixed with FIRST and an underscore.
    // Hence the true full name of the variable below is FIRST_aVar.
    var aVar

    namespace SECOND
    // It's not a problem to name another variable aVar. Its true name is SECOND_aVar.
    var aVar

    namespace FIRST
    // This code is now back in the FIRST namespace until the file ends.
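The prefixing rule can be modelled with a short Python sketch. It assumes, for illustration only, that each declaration sits on its own line and that the emitted name is simply the namespace, an underscore, and the member name; the function mangle_names is hypothetical and is not part of the X# toolchain.

```python
def mangle_names(lines):
    """Return the prefixed names of all var/const declarations, in order."""
    namespace = None
    names = []
    for line in lines:
        words = line.split("//", 1)[0].split()  # ignore // comments
        if not words:
            continue
        if words[0] == "namespace":
            # A new directive replaces the current namespace; there is no nesting.
            namespace = words[1]
        elif words[0] in ("var", "const"):
            if namespace is None:
                raise ValueError("declaration before any namespace directive")
            member = words[1].split("=", 1)[0]  # tolerate 'var x=$10'
            names.append(f"{namespace}_{member}")
    return names


if __name__ == "__main__":
    source = ["namespace FIRST", "var aVar", "namespace SECOND", "var aVar"]
    print(mangle_names(source))
```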

Functions

All X# executable code should be placed in functions defined by the 'function' keyword. Unlike C, X# does not support any formal parameter declaration in the header of the functions, so the conventional parentheses after the function name are omitted. Because the parser matches each line against fixed patterns, the opening curly bracket can't be placed on the next line, unlike in many other C-style languages.

    function xSharpFunction {
        // function code
    }

Because X# is a low-level language, there are no stack frames inserted, so by default, the return EIP address should be on the top of the stack. X# function calls do contain arguments enclosed in parentheses, unlike in function headers. Arguments passed to functions can be registers, addresses, or constants. These arguments are pushed onto the stack in reverse order. Note that the stack on x86 platforms cannot push or pop one-byte registers.

    function xSharpFunction {
        EAX = $10
        anotherFunction(EAX);
        return
    }

    function anotherFunction {
        //function code
    }

The return keyword returns execution to the return EIP address saved in the stack.
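To make the calling convention concrete, here is a small Python model of the stack as the callee sees it on entry. It assumes 4-byte stack slots and that the return address is pushed after the arguments, as on a normal x86 call; the function name call_stack_layout is hypothetical.

```python
def call_stack_layout(args, return_address):
    """Return {offset_from_ESP: value} as seen on entry to the callee."""
    stack = []  # the last element is the top of the stack
    for arg in reversed(args):   # arguments are pushed in reverse order
        stack.append(arg)
    stack.append(return_address)  # pushed last by the call itself
    # Offset 0 is the top of the stack; each slot is assumed to be 4 bytes.
    return {4 * i: value for i, value in enumerate(reversed(stack))}


if __name__ == "__main__":
    layout = call_stack_layout([0x10, 0x20], return_address=0x1000)
    for offset, value in layout.items():
        print(f"ESP[{offset}] = {value:#x}")
```

Under these assumptions the return address sits at ESP[0] and the first argument at ESP[4], which is why a callee can fetch its first argument with an expression such as ESP[4].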

Arithmetic and bitwise operations

X# can work with three low-level data structures: the registers, the stack and the memory, on different ports. The registers are the base of all normal operations for X#. A register can be copied to another by writing DST = SRC as opposed to mov or load/store instructions. Registers can be incremented or decremented just as easily. Arithmetic operations (add, subtract, multiply, divide) are written as dest op src where src is a constant, variable, or register, and dest is both an operand and the location where the result is stored.

Examples of assignment and arithmetic operations are shown below.

    ESI = 12345            // assign 12345 to ESI
    EDX = #constantForEDX  // assign #ConstantForEDX to EDX
    EAX = EBX              // move EBX to EAX              => mov eax, ebx
    EAX--                  // decrement EAX                => dec eax
    EAX++                  // increment EAX                => inc eax
    EAX + 2                // add 2 to eax                 => add eax, 2
    EAX - $80              // subtract 0x80 from eax       => sub eax, 0x80
    BX * CX                // multiply BX by CX            => mul cx      -- division, multiplication and modulo should preserve registers
    CX / BX                // divide CX by BX              => div bx
    CX mod BX              // remainder of CX/BX to BX     => div bx

Register shifting and rotation use operators similar to those in C.

    DX << 10   // shift left by 10 bits
    CX >> 8    // shift right by 8 bits
    EAX <~ 6   // rotate left by 6 bits
    EAX ~> 4   // rotate right by 4 bits
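Rotation has no direct counterpart among C's operators, so a short model may help. The Python sketch below shows what a 32-bit rotate does to a value such as EAX; the helper names are hypothetical, and the code models the arithmetic only, not any CPU flags.

```python
MASK32 = 0xFFFFFFFF


def rotate_left32(value, count):
    """Rotate a 32-bit value left; bits shifted out re-enter on the right."""
    count %= 32
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


def rotate_right32(value, count):
    """Rotate a 32-bit value right; bits shifted out re-enter on the left."""
    count %= 32
    value &= MASK32
    return ((value >> count) | (value << (32 - count))) & MASK32


if __name__ == "__main__":
    print(hex(rotate_left32(0x80000001, 6)))
    print(hex(rotate_right32(0x00000001, 4)))
```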

Other bitwise operations are similar to arithmetic operations.

    DL & $08    // perform bitwise AND on DL with 0x08 and store the result in DL
    CX | 1      // set the lowest bit of CX to 1 (make it odd)
    EAX = ~ECX  // perform bitwise NOT on ECX and store the result in EAX
    EAX ^ EAX   // erase EAX by XORing it with itself

Stack

Stack manipulation in X# is performed using + and - prefixes, where + pushes a register, value, constant or all registers onto the stack and - pops a value to some register. All constants are pushed on stack as double words, unless stated otherwise (pushing single bytes is not supported).

    +ESI            // push esi
    -EDI            // pop into edi
    +All            // save all registers   => pushad
    -All            // load all registers   => popad
    +$1badboo2      // push 0x1badboo2 on the stack
    +$cafe as word  //          \/
    +$babe as word  // push 0xcafebabe
    +#VideoMemory   // push value of constant VideoMemory
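The two word-sized pushes combine into 0xcafebabe because the x86 stack grows downward and values are stored little-endian. The Python sketch below models this with a small byte buffer; the buffer size and the function name are arbitrary choices for illustration.

```python
import struct


def push_words_then_read_dword(first_word, second_word):
    """Push two 16-bit words, then read the 32-bit value at the stack top."""
    memory = bytearray(8)
    esp = len(memory)                  # empty stack: ESP points past the end
    for word in (first_word, second_word):
        esp -= 2                       # the stack grows toward lower addresses
        struct.pack_into("<H", memory, esp, word)   # little-endian 16-bit store
    return struct.unpack_from("<I", memory, esp)[0]  # little-endian 32-bit load


if __name__ == "__main__":
    print(hex(push_words_then_read_dword(0xCAFE, 0xBABE)))
```

The word pushed first lands at the higher address and so becomes the upper half of the double word.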

Variables

Variables are defined within namespaces using the var keyword. Arrays are defined by specifying the type and size. Variables and arrays are zeroed by default. To reference a variable's value, use a dot ('.'), and to reference its address, use @.

    namespace XSharpVariables
    var zeroVar                       // variable will be assigned zero
    var myVar1 = $f000beef            // variable will be assigned 0xf000beef
    var someString = 'Hello XSharp!'  // variable will be assigned 'Hello XSharp!\0'
    var buffer byte[1024]             // variable of size 1024 bytes will be assigned 1024 zero bytes
    ...
    EAX = .myVar1                     // moves value of myVar1 (0xf000beef) to EAX
    ESI = @.someString                // moves address of someString to ESI
    CL = .someString                  // moves first character of someString ('H') to CL
    .zeroVar = EAX                    // assigns the value of EAX to zeroVar

X# can access an address with a specified offset using square brackets:

    var someString = 'Hello XSharp!'  // variable will be assigned 'Hello XSharp!\0'
    ...
    ESI = @.someString                // load address of someString to ESI
    CL = 'B'                          // set CL to 'B' (rewrite 'H' on the start)
    CH = ESI[1]                       // move second character ('e') from string to CH
    ESI[4] = $00                      // end string
    // Value of someString will be 'Bell' (or 'Bell\0 XSharp!\0')

Comparison

There are two ways to compare values in X#: pure comparison and if-comparison.

  • Pure comparison leaves the result in FLAGS, which can be used in native assembly or with the if keyword without specifying comparison members.
  • If comparison directly compares two members after an if keyword.

Here are two ways of writing a (slow) X# string length (strlen) function:

    // Method 1: using pure comparison
    function strlen {
        ESI = ESP[4]       // get pointer to string passed as first argument
        ECX ^ ECX          // clear ECX
    Loop:
        AL = ESI[ECX]      // get next character
        AL ?= 0            // is it 0? save to FLAGS
        if = return        // if ZF is set, return
        ECX++              // else increment ECX
        goto Loop          // loop...
    }

    // Method 2: using if
    function strlen {
        ESI = ESP[4]       // get pointer to string passed as first argument
        ECX ^ ECX          // clear ECX
    Loop:
        AL = ESI[ECX]
        if AL = 0 return   // AL = 0? return
        ECX++
        goto Loop          // loop....
    }
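For readers less used to register-level code, the same loop can be written as a Python model. Here memory stands in for RAM, pointer for the address loaded into ESI, and the local ecx for the counter register; all three names are illustrative.

```python
def strlen(memory, pointer):
    """Count bytes from pointer up to, but not including, the first zero byte."""
    ecx = 0                              # ECX ^ ECX
    while memory[pointer + ecx] != 0:    # AL = ESI[ECX]; compare AL with 0
        ecx += 1                         # ECX++
    return ecx                           # the count left in ECX


if __name__ == "__main__":
    ram = b"Hello XSharp!\0"
    print(strlen(ram, 0))
```

The function is slow in the sense the text mentions: it inspects one byte per iteration rather than a whole machine word.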

There are six available comparison operators: < > = <= >= !=. These operators can be used in both comparisons and loops. Note that there's also a bitwise AND operator which tests bits:

    AL ?& $80    // test AL MSB
    if = return  // if ZF is set, the test instruction resulted in 0 and the MSB is not set
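The bit test can be modelled as a masked AND whose result is discarded, leaving only the zero flag. The Python sketch below returns what ZF would be after such a test; the function name zero_flag_after_test is hypothetical.

```python
def zero_flag_after_test(value, mask):
    """Model ZF after a bit test: set (True) when value AND mask is zero."""
    return (value & mask) == 0


if __name__ == "__main__":
    print(zero_flag_after_test(0x7F, 0x80))  # MSB clear
    print(zero_flag_after_test(0x80, 0x80))  # MSB set
```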

Writing Cosmos code

An operating system built with Cosmos is developed in a similar fashion to any .NET C# console program. Additional references are added at the start of the program to give access to the Cosmos libraries.

User Kit and Visual Studio

The Cosmos User Kit is a part of Cosmos designed to make Cosmos easier to use for developers using Microsoft Visual Studio. When installed, the user kit adds a new project type to Visual Studio, called a Cosmos Project. This is a modified version of a console application, with the Cosmos compiler and bootup stub code already added.

Compiling a project

Once the code is complete, it may be compiled using Roslyn, the .NET compiler, either via Microsoft Visual Studio or the .NET command-line tools (dotnet).

This converts the application from the original source code (C# or otherwise) into Common Intermediate Language (CIL), the native intermediate language of .NET.

The build process then invokes the IL2CPU compiler, which systematically scans through all of the application's CIL code (excluding the Cosmos compiler code) and converts it into assembly language for the selected processor architecture. As of 2022, only the x86 architecture is supported. Next, Cosmos invokes the selected assembler to convert this assembly language code into native central processing unit (CPU) opcodes. Finally, the desired output option is activated, be it starting a virtual machine, starting a PXE engine, or producing an ISO disk image file.

Debug options

Cosmos offers several options as to how to deploy the resulting OS and how to debug the output.

Virtualization

Figure: The default Cosmos template as seen in QEMU.

Cosmos allows users to boot the operating system in an emulated environment using a virtual machine. This lets developers test the system on their own computer without rebooting, so no extra hardware is needed and developers do not have to leave their integrated development environment (IDE). VMware is the primary virtualisation method, but others are supported, such as QEMU and Hyper-V.

Disk images

This option writes the operating system to a disk image (ISO image) file, which can be loaded into some emulators (such as Bochs, QEMU or more commonly VMware) or written to a USB flash drive and booted on physical hardware.

PXE network boot

This option allows the operating system to boot on physical hardware. The data is sent via a local area network (LAN) to the client machine. This requires two computers: one as the client machine (on which the OS is booted) and one as the server (usually the development machine). It also requires a network connecting the two computers, a client machine with a network card, and a Basic Input/Output System (BIOS) that can boot with PXE. As of 2022, debugging over a network is no longer supported.

References

  1. Cosmos website: project repository at GitHub
  2. "Change bootloader to Limine · Pull Request #2521 · CosmosOS/Cosmos · GitHub". GitHub.
