MagmaFS

Magma network filesystem
Developer(s): Tx0
Written in: C
Operating system: Linux and BSD kernels
Available in: English
Type: Distributed file system
License: GNU GPL
Website: www.magmafs.net

Magma is a distributed file system based on a distributed hash table, written in C, compatible with Linux and BSD kernels using FUSE.


Terminology and basic principles

Magma binds several hosts interconnected by a TCP/IP network into a common storage space called a lava ring. Each host (or node) is called a vulcano. Each vulcano hosts a portion of a common key space, delimited by two SHA-1 keys, and is also in charge of mirroring the key space of the previous node to ensure data redundancy. Each key can represent one or more objects inside the storage space; these objects are called flares.
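
The key-space arrangement can be pictured with a short sketch. The struct layout and names below (magma_vulcano, key_in_range) are illustrative assumptions, not Magma's actual types:

```c
#include <stdint.h>
#include <string.h>

#define SHA1_BYTES 20

/* Hypothetical representation of a vulcano node: each node owns the
 * key slice [start_key, end_key] and mirrors its predecessor's slice
 * for redundancy. */
typedef struct magma_vulcano {
    char    hostname[256];
    uint8_t start_key[SHA1_BYTES];   /* lower bound of owned key space */
    uint8_t end_key[SHA1_BYTES];     /* upper bound of owned key space */
    struct magma_vulcano *prev;      /* node whose key space we mirror */
    struct magma_vulcano *next;
} magma_vulcano;

/* A key belongs to a node when start_key <= key <= end_key, comparing
 * the 20-byte SHA-1 digests as big-endian integers. */
static int key_in_range(const uint8_t *key, const magma_vulcano *node)
{
    return memcmp(key, node->start_key, SHA1_BYTES) >= 0 &&
           memcmp(key, node->end_key,   SHA1_BYTES) <= 0;
}
```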

Magma can store a range of different objects: files, directories, symbolic links, block and character devices, and FIFO pipes. Each object is bound to a flare and vice versa. A flare of any of the six types listed above is described by some basic properties common to all flares, such as a path and a hash key, but each type also has its own specific properties; for example, directory flares carry information that does not apply to symbolic links. A flare with only the generic information is called uncast, while a complete flare is called cast.
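
The cast/uncast distinction can be pictured as a tagged record: the generic fields are always present, while the type-specific fields only become meaningful once the flare is cast. All names here are hypothetical:

```c
#include <stdint.h>
#include <sys/types.h>

typedef enum {
    FLARE_UNCAST = 0,    /* generic container: movable but not operable */
    FLARE_FILE, FLARE_DIR, FLARE_SYMLINK,
    FLARE_BLOCKDEV, FLARE_CHARDEV, FLARE_FIFO
} flare_type;

typedef struct {
    char       path[4096];   /* properties common to every flare */
    uint8_t    key[20];      /* SHA-1 hash key */
    flare_type type;         /* FLARE_UNCAST until the flare is cast */
    union {                  /* type-specific part, valid only once cast */
        struct { uint64_t entries; }  dir;      /* directory-only data */
        struct { char target[4096]; } symlink;  /* symlink-only data   */
        struct { dev_t rdev; }        device;   /* block/char devices  */
    } u;
} flare;
```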

An uncast flare does not contain enough information to operate on data, but contains enough to be moved between vulcano nodes as a sort of opaque container. To be easily movable, each type of flare, including directories, is implemented as a set of two files, the first containing the flare's information (metadata) and the second containing its content. Moving flares across the lava ring is called load balancing and is done to even out load inequalities between nodes in an attempt to provide the best performance.
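
A sketch of the two-file layout: each flare would live on disk as a metadata file plus a contents file, making it an opaque, easily movable pair. The directory layout and naming are assumptions for illustration:

```c
#include <stdio.h>

/* Build the two on-disk paths of a flare from its hex hash key:
 * "<key>.metadata" holds the flare's properties, "<key>.contents"
 * holds its payload. Moving a flare means moving these two files. */
static void flare_paths(const char *storage_dir, const char *hex_key,
                        char *meta, size_t meta_len,
                        char *data, size_t data_len)
{
    snprintf(meta, meta_len, "%s/%s.metadata", storage_dir, hex_key);
    snprintf(data, data_len, "%s/%s.contents", storage_dir, hex_key);
}
```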

Flare system

The internal engine of Magma is called the flare system and is implemented as a layered stack.

Flare system layers:
1. Public API: flare_system_init(), magma_open(), magma_mknod(), magma_lstat(), ...
2. Lava network: magma_new_node(), route_key(), join_network()
3. Flare objects: magma_new_flare(), magma_search_or_create(), magma_add_to_cache(), ...

magma_mkdir() can be used as an example of layer traversal. In this example it is assumed that a directory called /example is to be created. magma_mkdir() is part of the public API layer; it is used to create a new directory, like its standard libc counterpart mkdir().

magma_mkdir() first routes the request to decide whether it can be managed locally or requires network operations. To perform routing, the path /example is translated into the corresponding SHA-1 hash key 81f762fd59d88768b06b8e9de56aef8a95962045. If routing determines that the request can be managed locally, it does not leave this layer; otherwise, the lava network layer forwards the request to the node owning the key, and the flow of operations continues on the remote node. Routing is half the role of the lava network layer, which also covers network monitoring and the creation, update and removal of vulcano nodes.
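
The path-to-key translation can be reproduced with any SHA-1 implementation. The sketch below uses OpenSSL (link with -lcrypto) and prints the hex digest of "/example", which per the example above is 81f762fd59d88768b06b8e9de56aef8a95962045 (the exact bytes Magma hashes are an assumption here):

```c
#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

int main(void)
{
    const char *path = "/example";
    unsigned char digest[SHA_DIGEST_LENGTH];

    /* The routing key is simply the SHA-1 digest of the path string. */
    SHA1((const unsigned char *)path, strlen(path), digest);

    for (int i = 0; i < SHA_DIGEST_LENGTH; i++)
        printf("%02x", digest[i]);
    putchar('\n');
    return 0;
}
```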

Whether the request is local or remote, the last step is performed by the flare layer. The flare corresponding to key 81f762fd59d88768b06b8e9de56aef8a95962045 is first searched for in the cache. If not found, it is created, and loaded from disk if it already exists there. Permission checks are then applied to the resulting flare object. If permission to operate is granted, the initial request is fulfilled: in this example, the flare is cast to a directory (if it was not one already) and saved to disk.
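
The whole traversal can be condensed into a toy, compilable sketch. Apart from the documented entry point's name, every identifier below is invented, and the cache, routing and permission logic are reduced to stand-ins:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

typedef struct { char path[4096]; int is_directory; } flare_entry;

static flare_entry cache[64];   /* toy stand-in for the flare cache */
static int cached = 0;

/* 3a. Search the cache; 3b. if missing, create the flare (a real node
 * would first try to load it from disk). */
static flare_entry *search_or_create(const char *path)
{
    for (int i = 0; i < cached; i++)
        if (strcmp(cache[i].path, path) == 0)
            return &cache[i];
    if (cached == (int)(sizeof cache / sizeof cache[0]))
        return NULL;
    flare_entry *f = &cache[cached++];
    snprintf(f->path, sizeof f->path, "%s", path);
    f->is_directory = 0;
    return f;
}

/* Toy routing: pretend every key is local. */
static int key_is_local(const char *path) { (void)path; return 1; }

static int magma_mkdir_sketch(const char *path)
{
    if (!key_is_local(path))      /* 2. lava network layer would forward */
        return -EREMOTE;          /*    the request to the owning node   */
    flare_entry *f = search_or_create(path);    /* 3. flare layer        */
    if (f == NULL)
        return -ENOMEM;
    /* permission checks would run here, before any mutation */
    f->is_directory = 1;          /* cast the flare to a directory       */
    return 0;                     /* a real node would now save to disk  */
}

int main(void)
{
    printf("mkdir /example -> %d\n", magma_mkdir_sketch("/example"));
    return 0;
}
```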

Routing

Since each vulcano node has the complete network topology available, routing is just a matter of matching flare keys against node key spaces to find the node holding the flare. The network topology is also saved in the distributed directory /.dht/ inside the Magma filesystem. Vulcano nodes can periodically check their information against the contents of the /.dht/ directory to learn whether anything has changed; nodes also periodically save their own information inside /.dht/.
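
With the full topology at hand, routing reduces to scanning the node list for the range that contains the key. This sketch reuses the hypothetical magma_vulcano and key_in_range definitions from the earlier sketch:

```c
/* Find the vulcano owning a key by walking the locally known node list;
 * since topology is complete, no multi-hop lookup is needed. */
static magma_vulcano *route_key_sketch(const uint8_t *key,
                                       magma_vulcano *first)
{
    for (magma_vulcano *n = first; n != NULL; n = n->next)
        if (key_in_range(key, n))
            return n;
    return NULL;    /* unroutable: topology is stale or incomplete */
}
```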

Load balancing

Each vulcano node has some parameters declared at boot, such as available bandwidth and storage. A separate thread called the balancer is devoted to redistributing keys so that nodes are neither overloaded nor underloaded. Each node has an associated dynamic load value, computed by a formula of the form:

$$\mathrm{load} = K \cdot \frac{\bar{b}}{b} \cdot \frac{\bar{s}}{s}$$

where $K$ is the node key load calculated on a logarithmic scale; $b$ is the node bandwidth and $\bar{b}$ is the average bandwidth; $s$ is the node storage and $\bar{s}$ is the average storage.
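
A direct transcription of the formula above into C (the logarithmic key-load term and the exact weighting are assumptions carried over from the reconstruction):

```c
#include <math.h>

/* Dynamic load of a node: key load on a logarithmic scale, discounted
 * by how the node's bandwidth and storage compare to the network
 * averages. Above-average resources lower the effective load. */
static double node_load(double keys_served,
                        double bandwidth, double avg_bandwidth,
                        double storage,   double avg_storage)
{
    double key_load = log(1.0 + keys_served);   /* logarithmic scale */
    return key_load * (avg_bandwidth / bandwidth)
                    * (avg_storage   / storage);
}
```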

Magma software distribution

Magma is distributed in the form of a server called magmad and a client called mount.magma.

Magma server

The Magma server magmad manages intercommunication between DHT nodes and Magma clients. The flare system provides a network event loop that accepts incoming connections. Three kinds of connections are accepted.
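
The event loop can be pictured as a classic accept-and-dispatch TCP server. The port and the dispatch-by-first-byte framing below are invented for this sketch; the article does not specify the flare protocol's wire format:

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(12000);          /* hypothetical port */

    if (srv < 0 ||
        bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(srv, 16) < 0) {
        perror("magmad sketch");
        return 1;
    }
    for (;;) {
        int conn = accept(srv, NULL, NULL);
        if (conn < 0)
            continue;
        char kind;              /* pretend the first byte selects one   */
        if (read(conn, &kind, 1) == 1)   /* of the accepted kinds       */
            printf("connection of kind '%c'\n", kind);
        close(conn);            /* a real server would dispatch instead */
    }
}
```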

Magma client

The Magma client mount.magma is based on FUSE and is compatible with Linux and BSD kernels. It uses a flare protocol connection to contact and operate with a nearby Magma server. Network topology and flare location are totally transparent to clients: a client simply queries one server exactly as if all information were located on that host alone.

A cryptographic layer is planned for the Magma client, allowing encryption of file contents only. Implementing cryptography on the client side is motivated by scalability (computational power grows at the same rate as computational demand) and by cryptographic key privacy (keys or passphrases never reach the server).
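
Since the layer is only planned, the following is no more than a sketch of the stated design: file contents are encrypted on the client and the key material never leaves it. AES-256-CTR via OpenSSL's EVP API (link with -lcrypto) is an arbitrary cipher choice for illustration:

```c
#include <openssl/evp.h>

/* Encrypt a file-content buffer before it is sent to the server.
 * Metadata stays in the clear; key (32 bytes) and iv (16 bytes) stay
 * on the client. Returns the ciphertext length, or -1 on error. */
static int encrypt_contents(const unsigned char *key, const unsigned char *iv,
                            const unsigned char *in, int in_len,
                            unsigned char *out)
{
    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    int len = 0, total = -1;

    if (ctx != NULL &&
        EVP_EncryptInit_ex(ctx, EVP_aes_256_ctr(), NULL, key, iv) == 1 &&
        EVP_EncryptUpdate(ctx, out, &len, in, in_len) == 1) {
        total = len;
        if (EVP_EncryptFinal_ex(ctx, out + total, &len) == 1)
            total += len;
        else
            total = -1;
    }
    EVP_CIPHER_CTX_free(ctx);
    return total;
}
```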

Alternative NFS interface

As an alternative to the Magma client, which is supported only by Linux and BSD kernels, the Magma server plans to offer an NFS interface for other Unices. Since NFS is an established standard, no new features can be added; for example, the cryptographic layer will be unavailable to clients mounting Magma shares through NFS.
