Unix2dos

Last updated
unix2dos & dos2unix
Original author(s) John Birchfield
Developer(s) Benjamin Lin, Bernd Johannes Wuebben, Christian Wurll, Erwin Waterlander
Initial release1989;35 years ago (1989)
Stable release
7.5.2 [1]   OOjs UI icon edit-ltr-progressive.svg / 22 January 2024;8 days ago (22 January 2024)
Repository
Operating system Unix-like, DOS, OS/2, Windows
Platform Cross-platform
Type Command
License FreeBSD style license
Website waterlan.home.xs4all.nl/dos2unix.html

unix2dos (sometimes named todos or u2d) is a tool to convert line breaks in a text file from Unix format (Line feed) to DOS format (carriage return + Line feed) and vice versa. When invoked as unix2dos the program will convert a Unix text file to DOS format, when invoked as dos2unix it will convert a DOS text file to Unix format. [2]

Contents

Usage

Unix2dos and dos2unix are not part of the Unix standard. Commercial Unixes usually come with their own implementation of unix2dos/dos2unix, like SunOS/Solaris's dos2unix/unix2dos, HP-UX's dos2ux/ux2dos and Irix's to_unix/to_dos.

There exist many open source alternatives with different command names and options like dos2unix/unix2dos, d2u/u2d, fromdos/todos, endlines, flip. The multi-call binary busybox includes an implementation of unix2dos/dos2unix.

See the manual page of the respective commands.

Alternatives to unix2dos conversion

Alternatives to dos2unix conversion

For the opposite conversion ( dos2unix ) it is possible to use, for example, the utility tr with the -d '\r' flag to remove the carriage return characters:

Note: The above method assumes there are only DOS line breaks in the input file. Any Mac line breaks (\r) present in the input will be removed.

An alternative to the dos2unix conversion is possible by using the col command that is available on Linux and other Unix-like operating systems, including Mac OS X. In the following case, InFile contains the undesired DOS (^M) line endings. After execution, OutFile is either created or replaced, and contains UNIX line endings. The -b option tells col not to output backspace characters.

$ col-b<InFile>OutFile 

See also

Related Research Articles

<span class="mw-page-title-main">AWK</span> Programming language

AWK is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and is a standard feature of most Unix-like operating systems.

sed Standard UNIX utility for editing streams of data

sed is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed was based on the scripting features of the interactive editor ed and the earlier qed. It was one of the earliest tools to support regular expressions, and remains in use for text processing, most notably with the substitution command. Popular alternative tools for plaintext string manipulation and "stream editing" include AWK and Perl.

<span class="mw-page-title-main">Shell script</span> Script written for the shell, or command line interpreter, of an operating system

A shell script is a computer program designed to be run by a Unix shell, a command-line interpreter. The various dialects of shell scripts are considered to be scripting languages. Typical operations performed by shell scripts include file manipulation, program execution, and printing text. A script which sets up the environment, runs the program, and does any necessary cleanup or logging, is called a wrapper.

Linux has several filesystem drivers for the File Allocation Table (FAT) filesystem format. These are commonly known by the names used in the mount command to invoke particular drivers in the kernel: msdos, vfat, and umsdos.

comm Standard UNIX utility for comparing files

The comm command in the Unix family of computer operating systems is a utility that is used to compare two files for common and distinct lines. comm is specified in the POSIX standard. It has been widely available on Unix-like operating systems since the mid to late 1980s.

<span class="mw-page-title-main">Newline</span> Special characters in computing signifying the end of a line of text

A newline is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or a sequence of characters, is used to signify the end of a line of text and the start of a new one.

uuencoding is a form of binary-to-text encoding that originated in the Unix programs uuencode and uudecode written by Mary Ann Horton at the University of California, Berkeley in 1980, for encoding binary data for transmission in email systems.

The archiver, also known simply as ar, is a Unix utility that maintains groups of files as a single archive file. Today, ar is generally used only to create and update static library files that the link editor or linker uses and for generating .deb packages for the Debian family; it can be used to create archives for any purpose, but has been largely replaced by tar for purposes other than static libraries. An implementation of ar is included as one of the GNU Binutils.

xargs is a command on Unix and most Unix-like operating systems used to build and execute commands from standard input. It converts input from standard input into arguments to a command.

dc is a cross-platform reverse-Polish calculator which supports arbitrary-precision arithmetic. It was written by Lorinda Cherry and Robert Morris at Bell Labs. It is one of the oldest Unix utilities, preceding even the invention of the C programming language. Like other utilities of that vintage, it has a powerful set of features but terse syntax. Traditionally, the bc calculator program was implemented on top of dc.

tr (Unix) Unix text formatting utility

tr is a command in Unix, Plan 9, Inferno, and Unix-like operating systems. It is an abbreviation of translate or transliterate, indicating its operation of replacing or removing specific characters in its input data set.

wc (Unix) Unix command utility

wc is a command in Unix, Plan 9, Inferno, and Unix-like operating systems. The program reads either standard input or a list of computer files and generates one or more of the following statistics: newline count, word count, and byte count. If a list of files is provided, both individual file and total statistics follow.

paste is a Unix command line utility which is used to join files horizontally by outputting lines consisting of the sequentially corresponding lines of each file specified, separated by tabs, to the standard output.

<span class="mw-page-title-main">Pipeline (Unix)</span> Mechanism for inter-process communication using message passing

In Unix-like computer operating systems, a pipeline is a mechanism for inter-process communication using message passing. A pipeline is a set of processes chained together by their standard streams, so that the output text of each process (stdout) is passed directly as input (stdin) to the next one. The second process is started as the first process is still executing, and they are executed concurrently. The concept of pipelines was championed by Douglas McIlroy at Unix's ancestral home of Bell Labs, during the development of Unix, shaping its toolbox philosophy. It is named by analogy to a physical pipeline. A key feature of these pipelines is their "hiding of internals". This in turn allows for more clarity and simplicity in the system.

In computing, a here document is a file literal or input stream literal: it is a section of a source code file that is treated as if it were a separate file. The term is also used for a form of multiline string literals that use similar syntax, preserving line breaks and other whitespace in the text.

more (command) Terminal pager available on various operating systems

In computing, more is a command to view the contents of a text file one screen at a time. It is available on Unix and Unix-like systems, DOS, Digital Research FlexOS, IBM/Toshiba 4690 OS, IBM OS/2, Microsoft Windows and ReactOS. Programs of this sort are called pagers. more is a very basic pager, originally allowing only forward navigation through a file, though newer implementations do allow for limited backward movement.

tail is a program available on Unix, Unix-like systems, FreeDOS and MSX-DOS used to display the tail end of a text file or piped data.

sort (Unix) Standard UNIX utility

In computing, sort is a standard command line program of Unix and Unix-like operating systems, that prints the lines of its input or concatenation of all files listed in its argument list in sorted order. Sorting is done based on one or more sort keys extracted from each line of input. By default, the entire input is taken as sort key. Blank space is the default field separator. The command supports a number of command-line options that can vary by implementation. For instance the "-r" flag will reverse the sort order.

Toybox is a free and open-source software implementation of over 200 Unix command line utilities such as ls, cp, and mv. The Toybox project was started in 2006, and became a 0BSD licensed BusyBox alternative. Toybox is used for most of Android's command-line tools in all currently supported Android versions, and is also used to build Android on Linux and macOS. All of the tools are tested on Linux, and many of them also work on BSD and macOS.

<span class="mw-page-title-main">Command-line interface</span> Computer interface that uses text

A command-line interface (CLI) is a means of interacting with a computer program by inputting lines of text called command-lines. Command-line interfaces emerged in the mid-1960s, on computer terminals, as a user-friendly alternative to punched cards.

References

  1. "dos2unix / unix2dos - Text file format converters" . Retrieved 30 January 2024.
  2. A practical guide to Linux. Mark G. Sobell ISBN   0-13-147823-0 Page 55