Distributed File System (Microsoft)


Distributed File System (DFS) is a set of client and server services that allow an organization using Microsoft Windows servers to organize many distributed SMB file shares into a distributed file system. DFS provides two services: location transparency (via the namespace component) and redundancy (via the file replication component). Together, these components improve data availability under failure or heavy load by allowing shares in different locations to be logically grouped under one folder, the "DFS root".


Microsoft's DFS is referred to interchangeably as 'DFS' and 'Dfs' by Microsoft and is unrelated to the DCE Distributed File System, which held the 'DFS' trademark [1] but was discontinued in 2005.

It is also called "MS-DFS" or "MSDFS" in some contexts, e.g. in the Samba user space project. [2]

Overview

The two components of DFS need not be used together: the logical namespace component can be used without DFS file replication, and file replication can be used between servers without combining them into one namespace.

A DFS root can exist only on a server version of Windows (Windows NT 4.0 and later), on OpenSolaris [3] (in kernel space), or on a computer running Samba (in user space). The Enterprise and Datacenter Editions of Windows Server can host multiple DFS roots on the same server. The OpenSolaris project stated an intention to support multiple DFS roots in "a future project based on Active Directory (AD) domain-based DFS namespaces". [4]

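There are two ways of implementing DFS on a server: as a standalone DFS namespace, whose configuration is held on the server hosting the root, or as a domain-based DFS namespace, whose configuration is stored in Active Directory.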

DFS namespaces

Traditional file shares, associated with a single server, have SMB paths of the form

\\<SERVER>\<path>\<subpath>

Domain-based DFS file share paths are distinguished by using the domain name in place of the server name, in the form

\\<DOMAIN.NAME>\<dfsroot>\<path>
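
The two path forms above share the same syntax, so a path alone does not reveal whether its first component is a server or a domain. The following Python sketch splits a UNC path into its parts and checks the host against a list of domain names. The names `UncPath`, `split_unc`, `is_domain_based`, and the `known_domains` argument are hypothetical and exist only for this illustration; they are not part of any Windows or Samba API.

```python
from typing import NamedTuple


class UncPath(NamedTuple):
    host: str  # server name, or domain name for a domain-based namespace
    root: str  # share name, or DFS root name
    rest: str  # remaining path below the share or root (may be empty)


def split_unc(path: str) -> UncPath:
    r"""Split a UNC path such as \\host\root\sub\file into its parts."""
    if not path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {path!r}")
    # Drop the leading double backslash, then split off host and root.
    parts = path[2:].split("\\", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"UNC path needs a host and a share or root: {path!r}")
    rest = parts[2] if len(parts) == 3 else ""
    return UncPath(parts[0], parts[1], rest)


def is_domain_based(path: str, known_domains: set[str]) -> bool:
    """Guess whether a path is domain-based by comparing its host
    with a caller-supplied set of domain names (an assumption of this sketch)."""
    return split_unc(path).host.lower() in {d.lower() for d in known_domains}
```

For example, `split_unc(r"\\corp.example.com\public\docs\a.txt")` yields the host `corp.example.com`, the root `public`, and the remainder `docs\a.txt`. A real client does not rely on a static list; it learns which names are DFS roots by asking the server or domain controller for a referral.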

When a user accesses such a share, either directly or by mapping a drive, their computer will access one of the available servers associated with that share, following rules which can be configured by the network administrator. For example, the default behaviour is that users will access the closest server to them; but this can be overridden to prefer a particular server.
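
The selection rule described above can be pictured as a sort over the candidate servers. This Python sketch is a simplified model, not the actual DFS referral algorithm: the `Target` class, its `site_cost` field (a stand-in for network distance, with 0 meaning the client's own site), and the `preferred` flag (a stand-in for an administrator override) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    server: str
    site_cost: int           # hypothetical distance from the client; 0 = same site
    preferred: bool = False  # hypothetical administrator override


def order_targets(targets: list[Target]) -> list[Target]:
    """Return targets in the order a client would try them.

    Preferred targets come first; within each group the lowest cost wins,
    and the server name breaks ties so the result is deterministic.
    """
    return sorted(targets, key=lambda t: (not t.preferred, t.site_cost, t.server))
```

With no override, a client in the same site as `fs-paris` would be sent there first; marking a more distant target as `preferred=True` moves it to the front of the list regardless of cost.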

If a server fails, the client can select a different server transparently to the user. One major caveat is that files open at the time of the failure may become unusable, because open files cannot be failed over.
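
The failover behaviour for new connections can be sketched as a loop over an ordered list of servers. This is a minimal Python illustration under stated assumptions: `connect_with_failover` is a hypothetical name, and `connect` is a caller-supplied function that is assumed to raise `OSError` when a server is unreachable.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def connect_with_failover(
    servers: Iterable[str], connect: Callable[[str], T]
) -> tuple[str, T]:
    """Try each server in order and return the first that connects.

    Raises ConnectionError listing every failure if no server is reachable.
    """
    errors = []
    for server in servers:
        try:
            return server, connect(server)
        except OSError as exc:
            # Remember why this target failed, then move on to the next one.
            errors.append(f"{server}: {exc}")
    raise ConnectionError("no target reachable: " + "; ".join(errors))
```

The sketch covers only the establishment of a new session. It does nothing for handles that were already open on the failed server, which is exactly the caveat noted above.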

DFS replication

Early versions of DFS used Microsoft's File Replication Service (FRS), which provides basic file replication between servers. FRS identifies changed or new files and copies the latest version of the entire file to all servers.

Windows Server 2003 R2 introduced "DFS Replication" (DFSR), which improves on FRS by copying only those parts of files which have changed (remote differential compression), by using data compression to reduce network traffic, and by giving administrators a customizable schedule for limiting network traffic.
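
The difference between copying a whole file and copying only its changed parts can be illustrated with a block-level delta. The Python sketch below is not remote differential compression as DFSR implements it; it swaps in a much simpler fixed-size block comparison to show the general idea. The function names and the `BLOCK_SIZE` value are assumptions chosen for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative value, not a DFSR setting


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Hash each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def make_delta(old: bytes, new: bytes, block_size: int = BLOCK_SIZE):
    """Return (new_length, {block_index: block_bytes}) for blocks that differ.

    In a real exchange only the hashes of `old` would cross the network;
    the sketch takes `old` directly for brevity.
    """
    old_hashes = block_hashes(old, block_size)
    changed = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        if index >= len(old_hashes) or hashlib.sha256(block).digest() != old_hashes[index]:
            changed[index] = block
    return len(new), changed


def apply_delta(old: bytes, new_length: int, changed: dict[int, bytes],
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new file from the old copy plus the changed blocks."""
    out = bytearray()
    for index, start in enumerate(range(0, new_length, block_size)):
        out += changed.get(index, old[start:start + block_size])
    return bytes(out[:new_length])
```

Flipping a single byte in a 16 KiB file produces a delta containing one 4 KiB block, whereas a whole-file copy in the style of FRS would resend all 16 KiB. A fixed-block scheme like this one handles in-place edits well but resends everything after an insertion that shifts the data, which is one reason production systems use more elaborate chunking.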

History

The server component of Distributed File System was first introduced as an add-on to Windows NT 4.0 Server, called "DFS 4.1", [5] and was later included as a standard component of all editions of Windows 2000 Server. Client-side support is included in Windows NT 4.0 and later versions of Windows.

Linux kernels 2.6.14 and later [6] come with an SMB client VFS called "cifs" that supports DFS.

Mac OS X supports DFS natively from version 10.7 ("Lion") onward. [7]

Specifications

A number of specifications relevant to DFS are available through the Microsoft Open Specifications program. [8]



References

  1. "Dfs vs. DFS". Archived from the original on 2016-03-03. Retrieved 2014-02-02.
  2. "smb.conf man page, section host msdfs". Retrieved 2018-03-07.
  3. "PSARC/2009/534 SMB/CIFS Standalone DFS". Archived from the original on 2010-06-15. Retrieved 2010-03-27.
  4. Template Version: @(#)onepager.txt 1.35 07/11/07 SMI Copyright 2007 Sun Micro-systems
  5. "DFS: When, Why, and How". Archived from the original on 2005-08-25.
  6. "LinuxCIFS utils - SambaWiki". Wiki.samba.org. Retrieved 2013-07-08.
  7. "OS X Lion: Guidelines for connecting to a DFS namespace via SMB". 2014-07-15. Retrieved 2016-12-06.
  8. "[MS-OPENSPECLP]: Open Specifications | Microsoft Docs". Microsoft. Retrieved 2020-10-22.