File Transfer Protocol

The File Transfer Protocol (FTP) is a standard network protocol used for the transfer of computer files between a client and server on a computer network.

A computer file is a computer resource for recording data discretely in a computer storage device. Just as words can be written to paper, so can information be written to a computer file. Files can be edited and transferred through the internet on that particular computer system.

Client–server model: distributed application structure in computing

The client–server model is a distributed application structure that partitions tasks or workloads between the providers of a resource or service, called servers, and service requesters, called clients. Often clients and servers communicate over a computer network on separate hardware, but both client and server may reside in the same system. A server host runs one or more server programs, which share their resources with clients. A client does not share any of its resources, but it requests content or service from a server. Clients therefore initiate communication sessions with servers, which await incoming requests. Examples of computer applications that use the client–server model are email, network printing, and the World Wide Web.

Computer network: collection of autonomous computers interconnected by a single technology

A computer network is a digital telecommunications network which allows nodes to share resources. In computer networks, computing devices exchange data with each other using connections between nodes. These data links are established over cable media such as wires or optic cables, or wireless media such as Wi-Fi.

FTP is built on a client-server model architecture using separate control and data connections between the client and the server. [1] FTP users may authenticate themselves with a clear-text sign-in protocol, normally in the form of a username and password, but can connect anonymously if the server is configured to allow it. For secure transmission that protects the username and password, and encrypts the content, FTP is often secured with SSL/TLS (FTPS) or replaced with SSH File Transfer Protocol (SFTP).

Transport Layer Security (TLS), and its now-deprecated predecessor, Secure Sockets Layer (SSL), are cryptographic protocols designed to provide communications security over a computer network. Several versions of the protocols find widespread use in applications such as web browsing, email, instant messaging, and voice over IP (VoIP). Websites can use TLS to secure all communications between their servers and web browsers.

FTPS is an extension to the commonly used File Transfer Protocol (FTP) that adds support for the Transport Layer Security (TLS) and, formerly, the Secure Sockets Layer cryptographic protocols.

In computing, the SSH File Transfer Protocol is a network protocol that provides file access, file transfer, and file management over any reliable data stream. It was designed by the Internet Engineering Task Force (IETF) as an extension of the Secure Shell protocol (SSH) version 2.0 to provide secure file transfer capabilities. The IETF Internet Draft states that, even though this protocol is described in the context of the SSH-2 protocol, it could be used in a number of different applications, such as secure file transfer over Transport Layer Security (TLS) and transfer of management information in VPN applications.

The first FTP client applications were command-line programs developed before operating systems had graphical user interfaces, and are still shipped with most Windows, Unix, and Linux operating systems. [2] [3] Many FTP clients and automation utilities have since been developed for desktops, servers, mobile devices, and hardware, and FTP has been incorporated into productivity applications, such as HTML editors.

Command-line interface: type of computer interface based on entering text commands and viewing text output

A command-line interface (CLI) is a means of interacting with a computer program where the user issues commands to the program in the form of successive lines of text. The program which handles the interface is called a command-line interpreter or command-line processor.

Operating system: software that manages computer hardware resources

An operating system (OS) is system software that manages computer hardware and software resources and provides common services for computer programs.

Graphical user interface: user interface allowing interaction through graphical icons and visual indicators

The graphical user interface is a form of user interface that allows users to interact with electronic devices through graphical icons and visual indicators such as secondary notation, instead of text-based user interfaces, typed command labels or text navigation. GUIs were introduced in reaction to the perceived steep learning curve of command-line interfaces (CLIs), which require commands to be typed on a computer keyboard.

History of FTP servers

The original specification for the File Transfer Protocol was written by Abhay Bhushan and published as RFC 114 on 16 April 1971. Until 1980, FTP ran on NCP, the predecessor of TCP/IP. [2] The protocol was later replaced by a TCP/IP version, RFC 765 (June 1980), and then by RFC 959 (October 1985), the current specification. Several proposed standards amend RFC 959: RFC 1579 (February 1994) enables Firewall-Friendly FTP (passive mode), RFC 2228 (June 1997) proposes security extensions, and RFC 2428 (September 1998) adds support for IPv6 and defines a new type of passive mode. [4]

Abhay K. Bhushan has been a major contributor to the development of the Internet TCP/IP architecture, and is the author of the File Transfer Protocol and the early versions of email protocols. He is currently chairman of Asquare Inc. and President of the IIT-Kanpur Foundation.

The Network Control Program (NCP) provided the middle layers of the protocol stack running on host computers of the ARPANET, the predecessor to the modern Internet.

The Internet protocol suite is the conceptual model and set of communications protocols used in the Internet and similar computer networks. It is commonly known as TCP/IP because the foundational protocols in the suite are the Transmission Control Protocol (TCP) and the Internet Protocol (IP). During its development, versions of it were known as the Department of Defense (DoD) model because the development of the networking method was funded by the United States Department of Defense through DARPA.

Protocol overview

Communication and data transfer

Illustration of starting a passive connection using port 21

FTP may run in active or passive mode, which determines how the data connection is established. [5] In both cases, the client creates a TCP control connection from a random, usually an unprivileged, port N to the FTP server command port 21.

In computer networking, a port is a communication endpoint. Physical as well as wireless connections are terminated at ports of hardware devices. At the software level, within an operating system, a port is a logical construct that identifies a specific process or a type of network service. Ports are identified for each protocol and address combination by 16-bit unsigned numbers, commonly known as the port number. The most common protocols that use port numbers are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP).

In computing, a firewall is a network security system that monitors and controls incoming and outgoing network traffic based on predetermined security rules. A firewall typically establishes a barrier between a trusted internal network and untrusted external network, such as the Internet.

Both modes were updated in September 1998 to support IPv6. Further changes were introduced to the passive mode at that time, updating it to extended passive mode. [7]
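
The extended commands from RFC 2428 can be sketched as follows. This is a minimal illustration, not a full client: the helper names `build_eprt` and `parse_epsv_reply` are hypothetical, and the sample address and port are taken from the examples in RFC 2428. In extended passive mode the server announces only a port, so the client reuses the address of the control connection.

```python
import ipaddress
import re

# RFC 2428 reply text looks like: 229 Entering Extended Passive Mode (|||6446|)
# The delimiter is normally "|", but any single repeated character is accepted here.
_EPSV_RE = re.compile(r"\((.)\1\1(\d+)\1\)")


def build_eprt(address: str, port: int) -> str:
    """Build an EPRT command; protocol number 1 means IPv4 and 2 means IPv6."""
    version = ipaddress.ip_address(address).version
    protocol = 1 if version == 4 else 2
    return f"EPRT |{protocol}|{address}|{port}|"


def parse_epsv_reply(reply: str) -> int:
    """Return the data port announced in a 229 reply to EPSV."""
    if not reply.startswith("229"):
        raise ValueError(f"not an extended passive mode reply: {reply!r}")
    match = _EPSV_RE.search(reply)
    if match is None:
        raise ValueError(f"no port found in reply: {reply!r}")
    port = int(match.group(2))
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port
```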

The server responds over the control connection with three-digit status codes in ASCII with an optional text message. For example, "200" (or "200 OK") means that the last command was successful. The numbers represent the code for the response and the optional text represents a human-readable explanation or request (e.g. <Need account for storing file>). [1] An ongoing transfer of file data over the data connection can be aborted using an interrupt message sent over the control connection.
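
As a rough sketch of how a client might split such a reply into its numeric code and optional text, consider the helper below. The names `Reply` and `parse_reply_line` are hypothetical, and the sketch handles only single-line replies.

```python
from typing import NamedTuple


class Reply(NamedTuple):
    code: int   # three-digit status code, e.g. 200
    text: str   # optional human-readable message, may be empty


def parse_reply_line(line: str) -> Reply:
    """Split one control-connection reply line into code and text."""
    line = line.rstrip("\r\n")
    digits = line[:3]
    if len(digits) != 3 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"malformed reply: {line!r}")
    # Character 3 is the separator after the code; the text, if any, follows it.
    return Reply(int(digits), line[4:])
```

A bare `200` and `200 OK` both parse to code 200; only the text differs.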

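While transferring data over the network, four data representations can be used: [2] [3] [4]

  1. ASCII mode (TYPE A): used for text. Data is converted to 8-bit ASCII for transmission and back to the receiving host's character representation.
  2. Image mode (TYPE I, commonly called binary mode): the sending machine sends each file byte by byte, and the recipient stores the byte stream as it receives it.
  3. EBCDIC mode (TYPE E): used for plain text between hosts that use the EBCDIC character set.
  4. Local mode (TYPE L n): allows machines with a logical byte size of n bits to exchange data without converting it.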

For text files, different format control and record structure options are provided. These features were designed to facilitate files containing Telnet or ASA carriage-control formatting.

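Data transfer can be done in any of three modes: [1] [2]

  1. Stream mode (MODE S): data is sent as a continuous stream, and FTP does no further processing of it.
  2. Block mode (MODE B): FTP breaks the data into blocks, each with a header, and passes them to TCP.
  3. Compressed mode (MODE C): the data is compressed with a simple run-length encoding before transmission.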

Some FTP software also implements a DEFLATE-based compressed mode, sometimes called "Mode Z" after the command that enables it. This mode was described in an Internet Draft, but not standardized. [8]

Login

FTP login uses a normal username and password scheme for granting access. [2] The username is sent to the server using the USER command, and the password is sent using the PASS command. [2] This sequence is unencrypted "on the wire", so it may be vulnerable to a network sniffing attack. [9] If the information provided by the client is accepted by the server, the server will send a greeting to the client and the session will commence. [2] If the server supports it, users may log in without providing login credentials, but the same server may authorize only limited access for such sessions. [2]
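
To make the exposure concrete, the sketch below builds the exact bytes a client would write to the control connection during login. The function name `login_commands` and the sample credentials are hypothetical; each command line ends with CRLF as required by RFC 959.

```python
def login_commands(user: str, password: str) -> list[bytes]:
    """Return the raw USER and PASS command lines, in the order they are sent."""
    return [
        f"USER {user}\r\n".encode("ascii"),
        # The password travels as plain bytes; nothing here encrypts or hashes it.
        f"PASS {password}\r\n".encode("ascii"),
    ]
```

Anyone who can capture these bytes on the network can read the password directly from the second line.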

Anonymous FTP

A host that provides an FTP service may provide anonymous FTP access. [2] Users typically log into the service with an 'anonymous' (lower-case and case-sensitive in some FTP servers) account when prompted for user name. Although users are commonly asked to send their email address instead of a password, [3] no verification is actually performed on the supplied data. [10] Many FTP hosts whose purpose is to provide software updates will allow anonymous logins. [3]

NAT and firewall traversal

FTP normally transfers data by having the server connect back to the client, after the PORT command is sent by the client. This is problematic for both NATs and firewalls, which do not allow connections from the Internet towards internal hosts. [11] For NATs, an additional complication is that the representation of the IP address and port number in the PORT command refers to the internal host's IP address and port, rather than the public IP address and port of the NAT.

There are two approaches to solve this problem. One is that the FTP client and FTP server use the PASV command, which causes the data connection to be established from the FTP client to the server. [11] This is widely used by modern FTP clients. Another approach is for the NAT to alter the values of the PORT command, using an application-level gateway for this purpose. [11]
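
The address encoding used by both commands can be sketched as below. In RFC 959 the argument is six decimal numbers: the four octets of the IPv4 address followed by the high and low bytes of the port. The helper names and sample addresses are hypothetical.

```python
import re

_SIX_NUMBERS = re.compile(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)")


def build_port_command(ip: str, port: int) -> str:
    """Encode an IPv4 address and port as a PORT command (active mode)."""
    octets = ip.split(".")
    if len(octets) != 4 or not 0 < port < 65536:
        raise ValueError("expected a dotted IPv4 address and a 16-bit port")
    # The port is split into its high byte and low byte.
    return "PORT " + ",".join(octets + [str(port >> 8), str(port & 0xFF)])


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a 227 reply to PASV (passive mode)."""
    match = _SIX_NUMBERS.search(reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a passive mode reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    return host, numbers[4] * 256 + numbers[5]
```

A client behind a NAT that calls `build_port_command("192.168.1.2", 50000)` advertises its private address inside the command text, which is the value an application-level gateway would have to rewrite.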

Differences from HTTP

HTTP avoids several design traits of FTP that make FTP inconvenient for the many small, ephemeral transfers that are typical of web pages.

FTP has a stateful control connection which maintains a current working directory and other flags, and each transfer requires a secondary connection through which the data are transferred. In "passive" mode this secondary connection is from client to server, whereas in the default "active" mode this connection is from server to client. This apparent role reversal when in active mode, and random port numbers for all transfers, is why firewalls and NAT gateways have such a hard time with FTP. HTTP is stateless and multiplexes control and data over a single connection from client to server on well-known port numbers, which trivially passes through NAT gateways and is simple for firewalls to manage.

Setting up an FTP control connection is quite slow due to the round-trip delays of sending all of the required commands and awaiting responses, so it is customary to bring up a control connection and hold it open for multiple file transfers rather than drop and re-establish the session afresh each time. In contrast, HTTP originally dropped the connection after each transfer because doing so was so cheap. While HTTP has subsequently gained the ability to reuse the TCP connection for multiple transfers, the conceptual model is still of independent requests rather than a session.

When FTP is transferring over the data connection, the control connection is idle. If the transfer takes too long, the firewall or NAT may decide that the control connection is dead and stop tracking it, effectively breaking the connection and confusing the download. The single HTTP connection is only idle between requests and it is normal and expected for such connections to be dropped after a time-out.

Web browser support

Most common web browsers can retrieve files hosted on FTP servers, although they may not support protocol extensions such as FTPS. [3] [12] When an FTP URL, rather than an HTTP URL, is supplied, the accessible contents on the remote server are presented in a manner that is similar to that used for other web content. A full-featured FTP client can be run within Firefox in the form of an extension called FireFTP.

As of 2019, major browsers such as Chrome and Firefox are deprecating FTP support to varying degrees, [13] with Google planning to remove it entirely by Chrome 82. Mozilla is currently discussing proposals, including one that would remove support only for old FTP implementations that are no longer in use, in order to simplify its code. [14] [15]

Syntax

FTP URL syntax is described in RFC 1738, taking the form: ftp://[user[:password]@]host[:port]/url-path (the bracketed parts are optional).

For example, the URL ftp://public.ftp-servers.example.com/mydirectory/myfile.txt represents the file myfile.txt from the directory mydirectory on the server public.ftp-servers.example.com as an FTP resource. The URL ftp://user001:secretpassword@private.ftp-servers.example.com/mydirectory/myfile.txt adds a specification of the username and password that must be used to access this resource.
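
The two example URLs above can be taken apart with Python's standard `urllib.parse` module, as in the sketch below. The wrapper `split_ftp_url` is a hypothetical helper, and falling back to port 21 when the URL names no port is an assumption based on the FTP command port.

```python
from urllib.parse import urlsplit


def split_ftp_url(url: str) -> dict:
    """Break an ftp:// URL into its optional and required components."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url!r}")
    return {
        "user": parts.username,      # None when the URL carries no user
        "password": parts.password,  # None when the URL carries no password
        "host": parts.hostname,
        "port": parts.port or 21,    # assumed default: the FTP command port
        "path": parts.path,
    }
```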

More details on specifying a username and password may be found in the browsers' documentation (e.g., Firefox [16] and Internet Explorer [17] ). By default, most web browsers use passive (PASV) mode, which more easily traverses end-user firewalls.

Some variation has existed in how different browsers treat path resolution in cases where there is a non-root home directory for a user. [18]

Security

FTP was not designed to be a secure protocol, and has many security weaknesses. [19] In May 1999, the authors of RFC 2577 listed a vulnerability to the following problems:

FTP does not encrypt its traffic; all transmissions are in clear text, and usernames, passwords, commands and data can be read by anyone able to perform packet capture (sniffing) on the network. [2] [19] This problem is common to many of the Internet Protocol specifications (such as SMTP, Telnet, POP and IMAP) that were designed prior to the creation of encryption mechanisms such as TLS or SSL. [4]

Common solutions to this problem include:

  1. Using the secure versions of the insecure protocols, e.g., FTPS instead of FTP and TelnetS instead of Telnet.
  2. Using a different, more secure protocol that can handle the job, e.g. SSH File Transfer Protocol or Secure Copy Protocol.
  3. Using a secure tunnel such as Secure Shell (SSH) or virtual private network (VPN).

FTP over SSH

FTP over SSH is the practice of tunneling a normal FTP session over a Secure Shell connection. [19] Because FTP uses multiple TCP connections (unusual for a TCP/IP protocol that is still in use), it is particularly difficult to tunnel over SSH. With many SSH clients, attempting to set up a tunnel for the control channel (the initial client-to-server connection on port 21) will protect only that channel; when data is transferred, the FTP software at either end sets up new TCP connections (data channels), which therefore have no confidentiality or integrity protection.

Otherwise, it is necessary for the SSH client software to have specific knowledge of the FTP protocol, to monitor and rewrite FTP control channel messages and autonomously open new packet forwardings for FTP data channels. Software packages that support this mode include:

Derivatives

FTPS

Explicit FTPS is an extension to the FTP standard that allows clients to request that FTP sessions be encrypted. This is done by sending the "AUTH TLS" command. The server has the option of allowing or denying connections that do not request TLS. This protocol extension is defined in RFC 4217. Implicit FTPS is an outdated standard for FTP that required the use of an SSL or TLS connection. It was specified to use different ports than plain FTP.
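
A minimal sketch of an explicit FTPS session using `ftplib.FTP_TLS` from Python's standard library follows. The function name and its parameters are hypothetical, and the sketch assumes a server that accepts AUTH TLS on the standard control port; it is one possible client, not a reference implementation.

```python
from ftplib import FTP_TLS


def list_directory_over_tls(host: str, user: str, password: str) -> list[str]:
    """Log in over explicit FTPS and return the names in the initial directory."""
    with FTP_TLS(host) as ftps:
        # login() first secures the control connection (AUTH TLS),
        # so the USER and PASS commands are sent encrypted.
        ftps.login(user, password)
        # prot_p() switches the data connection to protected (encrypted) mode.
        ftps.prot_p()
        return ftps.nlst()
```

Without the `prot_p()` call, the control connection is encrypted but the data connection stays in clear text.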

SSH File Transfer Protocol

The SSH file transfer protocol (chronologically the second of the two protocols abbreviated SFTP) transfers files and has a similar command set for users, but uses the Secure Shell protocol (SSH) to transfer files. Unlike FTP, it encrypts both commands and data, preventing passwords and sensitive information from being transmitted openly over the network. It cannot interoperate with FTP software.

Trivial File Transfer Protocol

Trivial File Transfer Protocol (TFTP) is a simple, lock-step FTP that allows a client to get a file from or put a file onto a remote host. One of its primary uses is in the early stages of booting from a local area network, because TFTP is very simple to implement. TFTP lacks security and most of the advanced features offered by more robust file transfer protocols such as FTP. TFTP was first standardized in 1981 and the current specification for the protocol can be found in RFC 1350.
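
How simple the protocol is can be seen from its packet layout in RFC 1350. The sketch below builds a read request and an acknowledgement; the function names and the file name are hypothetical. "Lock-step" means that each data block must be acknowledged with such an ACK before the next block is sent.

```python
import struct

OP_RRQ = 1  # read request opcode
OP_ACK = 4  # acknowledgement opcode


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Read request: 2-byte opcode, file name, NUL, transfer mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def build_ack(block_number: int) -> bytes:
    """Acknowledgement: 2-byte opcode followed by the 2-byte block number."""
    return struct.pack("!HH", OP_ACK, block_number)
```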

Simple File Transfer Protocol

Simple File Transfer Protocol (the first protocol abbreviated SFTP), as defined by RFC 913, was proposed as an (unsecured) file transfer protocol with a level of complexity intermediate between TFTP and FTP. It was never widely accepted on the Internet, and is now assigned Historic status by the IETF. It runs through port 115. It has a command set of 11 commands and supports three types of data transmission: ASCII, binary and continuous. For systems with a word size that is a multiple of 8 bits, the implementation of binary and continuous is the same. The protocol also supports login with user ID and password, hierarchical folders and file management (including rename, delete, upload, download, download with overwrite, and download with append).

FTP commands

FTP reply codes

Below is a summary of FTP reply codes that may be returned by an FTP server. These codes have been standardized in RFC 959 by the IETF. The reply code is a three-digit value. The first digit indicates one of three possible outcomes (success, failure, or an error or incomplete reply):

The second digit defines the kind of error:

The third digit of the reply code is used to provide additional detail for each of the categories defined by the second digit.
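
The digit scheme can be sketched as a small lookup. The category names below follow the wording of RFC 959, which groups the first digit into five classes that fall under the outcomes described above; the function name `describe_reply_code` is hypothetical.

```python
# First digit: overall outcome of the command (RFC 959 wording).
FIRST_DIGIT = {
    1: "positive preliminary",
    2: "positive completion",
    3: "positive intermediate",
    4: "transient negative completion",
    5: "permanent negative completion",
}

# Second digit: functional grouping of the reply (RFC 959 wording).
SECOND_DIGIT = {
    0: "syntax",
    1: "information",
    2: "connections",
    3: "authentication and accounting",
    4: "unspecified",
    5: "file system",
}


def describe_reply_code(code: int) -> tuple[str, str]:
    """Return (outcome, grouping) for a three-digit FTP reply code."""
    first, second = code // 100, (code // 10) % 10
    if first not in FIRST_DIGIT or second not in SECOND_DIGIT:
        raise ValueError(f"not a reply code covered by this table: {code}")
    return FIRST_DIGIT[first], SECOND_DIGIT[second]
```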

See also

Related Research Articles

The Simple Mail Transfer Protocol (SMTP) is a communication protocol for electronic mail transmission. As an Internet standard, SMTP was first defined in 1982 by RFC 821, and updated in 2008 by RFC 5321 to Extended SMTP additions, which is the protocol variety in widespread use today. Mail servers and other message transfer agents use SMTP to send and receive mail messages. Proprietary systems such as Microsoft Exchange and IBM Notes and webmail systems such as Outlook.com, Gmail and Yahoo! Mail may use non-standard protocols internally, but all use SMTP when sending to or receiving email from outside their own systems. SMTP servers commonly use the Transmission Control Protocol on port number 25.

Secure Shell (SSH) is a cryptographic network protocol for operating network services securely over an unsecured network. Typical applications include remote command-line, login, and remote command execution, but any network service can be secured with SSH.

Telnet is an application protocol used on the Internet or local area network to provide a bidirectional interactive text-oriented communication facility using a virtual terminal connection. User data is interspersed in-band with Telnet control information in an 8-bit byte oriented data connection over the Transmission Control Protocol (TCP).

Email client: computer software that allows sending and receiving emails

An email client, email reader or more formally mail user agent (MUA) is a computer program used to access and manage a user's email.

Trivial File Transfer Protocol (TFTP) is a simple lockstep File Transfer Protocol which allows a client to get a file from or put a file onto a remote host. One of its primary uses is in the early stages of nodes booting from a local area network. TFTP has been used for this application because it is very simple to implement.

Network address translation: protocol facilitating connection of one IP address space to another

Network address translation (NAT) is a method of remapping one IP address space into another by modifying network address information in the IP header of packets while they are in transit across a traffic routing device. The technique was originally used as a shortcut to avoid the need to readdress every host when a network was moved. It has become a popular and essential tool in conserving global address space in the face of IPv4 address exhaustion. One Internet-routable IP address of a NAT gateway can be used for an entire private network.

SOCKS is an Internet protocol that exchanges network packets between a client and server through a proxy server. SOCKS5 additionally provides authentication so only authorized users may access a server. Practically, a SOCKS server proxies TCP connections to an arbitrary IP address, and provides a means for UDP packets to be forwarded.

Direct Client-to-Client (DCC) is an IRC-related sub-protocol enabling peers to interconnect using an IRC server for handshaking in order to exchange files or perform non-relayed chats. Once established, a typical DCC session runs independently from the IRC server. Originally designed to be used with ircII it is now supported by many IRC clients. Some peer-to-peer clients on napster-protocol servers also have DCC send/get capability, including TekNap, SunshineUN and Lopster. A variation of the DCC protocol called SDCC, also known as DCC SCHAT supports encrypted connections. An RFC specification on the use of DCC does not exist.

Secure copy protocol (SCP) is a means of securely transferring computer files between a local host and a remote host or between two remote hosts. It is based on the Secure Shell (SSH) protocol. "SCP" commonly refers to both the Secure Copy Protocol and the program itself. According to OpenSSH developers in April 2019 the scp protocol is outdated, inflexible and not readily fixed.

Port forwarding

In computer networking, port forwarding or port mapping is an application of network address translation (NAT) that redirects a communication request from one address and port number combination to another while the packets are traversing a network gateway, such as a router or firewall. This technique is most commonly used to make services on a host residing on a protected or masqueraded (internal) network available to hosts on the opposite side of the gateway, by remapping the destination IP address and port number of the communication to an internal host.

File eXchange Protocol (FXP or FXSP) is a method of data transfer which uses FTP to transfer data from one remote server to another (inter-server) without routing this data through the client's connection. Conventional FTP involves a single server and a single client; all data transmission is done between these two. In the FXP session, a client maintains a standard FTP connection to two servers, and can direct either server to connect to the other to initiate a data transfer. The advantage of using FXP over FTP is evident when a high-bandwidth server demands resources from another high-bandwidth server, but only a low-bandwidth client, such as a network administrator working away from location, has the authority to access the resources on both servers.

OpenVPN is an open-source commercial software that implements virtual private network (VPN) techniques to create secure point-to-point or site-to-site connections in routed or bridged configurations and remote access facilities. It uses a custom security protocol that utilizes SSL/TLS for key exchange. It is capable of traversing network address translators (NATs) and firewalls. It was written by James Yonan and is published under the GNU General Public License (GPL).

In computer networks, a tunneling protocol is a communications protocol that allows for the movement of data from one network to another. It involves allowing private network communications to be sent across a public network through a process called encapsulation.

This article lists communication protocols that are designed for file transfer over a telecommunications network.

Ettercap is a free and open source network security tool for man-in-the-middle attacks on LAN. It can be used for computer network protocol analysis and security auditing. It runs on various Unix-like operating systems including Linux, Mac OS X, BSD and Solaris, and on Microsoft Windows. It is capable of intercepting traffic on a network segment, capturing passwords, and conducting active eavesdropping against a number of common protocols. Its original developers later founded Hacking Team.

Network address translator traversal is a computer networking technique of establishing and maintaining Internet protocol connections across gateways that implement network address translation (NAT).

In the context of computer networking, an application-level gateway consists of a security component that augments a firewall or NAT employed in a computer network. It allows customized NAT traversal filters to be plugged into the gateway to support address and port translation for certain application layer "control/data" protocols such as FTP, BitTorrent, SIP, RTSP, file transfer in IM applications, etc. In order for these protocols to work through NAT or a firewall, either the application has to know about an address/port number combination that allows incoming packets, or the NAT has to monitor the control traffic and open up port mappings dynamically as required. Legitimate application data can thus be passed through the security checks of the firewall or NAT that would have otherwise restricted the traffic for not meeting its limited filter criteria.

curl-loader is an open-source software performance testing tool written in the C programming language.

The Ident Protocol, specified in RFC 1413, is an Internet protocol that helps identify the user of a particular TCP connection. One popular daemon program for providing the ident service is identd.

References

  1. Forouzan, B.A. (2000). TCP/IP: Protocol Suite (1st ed.). New Delhi, India: Tata McGraw-Hill Publishing Company Limited.
  2. Kozierok, Charles M. (2005). "The TCP/IP Guide v3.0". Tcpipguide.com.
  3. Dean, Tamara (2010). Network+ Guide to Networks. Delmar. pp. 168–171.
  4. Clark, M.P. (2003). Data Networks IP and the Internet (1st ed.). West Sussex, England: John Wiley & Sons Ltd.
  5. "Active FTP vs. Passive FTP, a Definitive Explanation". Slacksite.com.
  6. RFC 959 (Standard) File Transfer Protocol (FTP). Postel, J. & Reynolds, J. (October 1985).
  7. RFC 2428 (Proposed Standard) Extensions for IPv6, NAT, and Extended Passive Mode. Allman, M. & Metz, C. & Ostermann, S. (September 1998).
  8. Preston, J. (January 2005). Deflate transmission mode for FTP. IETF. I-D draft-preston-ftpext-deflate-03.txt. Retrieved 27 January 2016.
  9. Prince, Brian. "Should Organizations Retire FTP for Security?". Security Week. Retrieved 14 September 2017.
  10. RFC 1635 (Informational) How to Use Anonymous FTP. P. & Emtage, A. & Marine, A. (May 1994).
  11. Gleason, Mike (2005). "The File Transfer Protocol and Your Firewall/NAT". Ncftp.com.
  12. Matthews, J. (2005). Computer Networking: Internet Protocols in Action (1st ed.). Danvers, MA: John Wiley & Sons Inc.
  13. https://www.bleepingcomputer.com/news/google/chrome-and-firefox-developers-aim-to-remove-support-for-ftp/
  14. https://bugzilla.mozilla.org/show_bug.cgi?id=1574475
  15. https://www.chromestatus.com/feature/6246151319715840
  16. "Accessing FTP servers | How to | Firefox Help". Support.mozilla.com. 5 September 2012. Retrieved 16 January 2013.
  17. "How to Enter FTP Site Password in Internet Explorer". Support.microsoft.com. 23 September 2011. Retrieved 28 March 2015. Written for IE versions 6 and earlier. Might work with newer versions.
  18. Jukka “Yucca” Korpela (18 September 1997). "FTP URLs". "IT and communication" (www.cs.tut.fi/~jkorpela/). Retrieved 6 January 2016.
  19. "Securing FTP using SSH". Nurdletech.com.
  20. "Access using SSH keys & PCI DSS compliance". ssh.com.

Further reading