Command-line argument parsing

Programming languages differ in how they expose command-line arguments to a program and in the facilities they provide for parsing them.

Programming languages

C

C passes command-line arguments to main through argc, the argument count, and argv, an array of strings. [1] [2]

An example of C argument parsing would be:

    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int count;
        for (count = 0; count < argc; count++)
            puts(argv[count]);
    }

POSIX systems also provide the getopt function for parsing options, and the GNU C Library adds getopt_long for long options.
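Python's standard getopt module is modeled on the C getopt interface, so the option-string convention can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name parse and the options -v and -o are hypothetical and chosen only for the example.

```python
import getopt
import sys


def parse(argv):
    """Parse -v, -o FILE, --verbose and --output=FILE in the getopt style."""
    # In "vo:", -v takes no value; the colon after o means -o requires one.
    opts, rest = getopt.getopt(argv, "vo:", ["verbose", "output="])
    settings = {"verbose": False, "output": None}
    for flag, value in opts:
        if flag in ("-v", "--verbose"):
            settings["verbose"] = True
        elif flag in ("-o", "--output"):
            settings["output"] = value
    # rest holds the remaining non-option arguments.
    return settings, rest


if __name__ == "__main__":
    print(parse(sys.argv[1:]))
```

As in C, parsing stops at the first argument that is not an option, and everything after it is returned untouched.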

C#

An example of C# argument parsing would be:

    using System;

    class Program
    {
        static void Main(string[] args)
        {
            foreach (var arg in args)
                Console.WriteLine(arg);
        }
    }

Java

An example of Java argument parsing would be:

    public class Echo {
        public static void main(String[] args) {
            for (String s : args) {
                System.out.println(s);
            }
        }
    }

Kotlin

Here are some possible ways to print arguments in Kotlin: [3]

    fun main(args: Array<String>) = println(args.joinToString())

    fun main(args: Array<String>) = println(args.contentToString())

    fun main(args: Array<String>) { for (arg in args) println(arg) }

Perl

Perl uses @ARGV.

    foreach $arg (@ARGV) {
        print $arg;
    }

or

    foreach $argnum (0 .. $#ARGV) {
        print $ARGV[$argnum];
    }

AWK

AWK also uses ARGV, with the number of arguments held in ARGC.

    BEGIN {
        for (i = 0; i < ARGC; i++) {
            print ARGV[i]
        }
    }

PHP

PHP uses argc as a count of arguments and argv as an array containing the values of the arguments. [4] [5] To create an array from command-line arguments in the -foo:bar format, the following might be used:

    $args = parseArgs($argv);
    echo getArg($args, "foo");

    function parseArgs(array $args)
    {
        foreach ($args as $arg) {
            $tmp = explode(":", $arg, 2);
            if ($arg[0] === "-") {
                $args[substr($tmp[0], 1)] = $tmp[1];
            }
        }
        return $args;
    }

    function getArg(array $args, string $arg)
    {
        if (isset($args[$arg])) {
            return $args[$arg];
        }
        return false;
    }
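The same -foo:bar convention can be sketched in Python for comparison. The function name parse_args is hypothetical, and the treatment of arguments without a colon (stored with an empty value) is an assumption of this sketch rather than something the PHP version specifies.

```python
import sys


def parse_args(argv):
    """Collect arguments of the form -name:value into a dictionary."""
    parsed = {}
    for arg in argv:
        if not arg.startswith("-"):
            continue  # positional argument, ignored in this sketch
        # Split on the first colon only, so values may contain colons.
        name, _, value = arg[1:].partition(":")
        parsed[name] = value  # value is "" when no colon is present
    return parsed


if __name__ == "__main__":
    args = parse_args(sys.argv[1:])
    print(args.get("foo", False))
```

Splitting on the first colon only matches the limit of 2 passed to explode in the PHP code.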

PHP can also use getopt(). [6]

Python

Python uses sys.argv, e.g.:

    import sys

    for arg in sys.argv:
        print(arg)

Python also has a module called argparse in the standard library for parsing command-line arguments. [7]
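A minimal argparse sketch might look like the following. The program name greet, its options, and the helper functions are hypothetical and exist only to illustrate a positional argument, a typed option, and a Boolean flag.

```python
import argparse
import sys


def build_parser():
    """Describe the accepted arguments; argparse derives --help from this."""
    parser = argparse.ArgumentParser(prog="greet", description="Print a greeting.")
    parser.add_argument("name", help="who to greet")
    parser.add_argument("-c", "--count", type=int, default=1,
                        help="number of repetitions")
    parser.add_argument("-u", "--upper", action="store_true",
                        help="print in upper case")
    return parser


def greet(argv):
    """Parse argv (without the program name) and return the lines to print."""
    args = build_parser().parse_args(argv)
    text = f"Hello, {args.name}!"
    if args.upper:
        text = text.upper()
    return [text] * args.count


if __name__ == "__main__":
    for line in greet(sys.argv[1:]):
        print(line)
```

Passing the argument list explicitly to parse_args, instead of letting it read sys.argv, keeps the parsing logic easy to exercise from other code.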

Racket

Racket uses a current-command-line-arguments parameter, and provides a racket/cmdline [8] library for parsing these arguments. Example:

    #lang racket

    (require racket/cmdline)

    (define smile? (make-parameter #t))
    (define nose?  (make-parameter #false))
    (define eyes   (make-parameter ":"))

    (command-line #:program "emoticon"

                  #:once-any ; the following two are mutually exclusive
                  [("-s" "--smile") "smile mode" (smile? #true)]
                  [("-f" "--frown") "frown mode" (smile? #false)]

                  #:once-each
                  [("-n" "--nose") "add a nose" (nose? #true)]
                  [("-e" "--eyes") char "use <char> for the eyes" (eyes char)])

    (printf "~a~a~a\n"
            (eyes)
            (if (nose?) "-" "")
            (if (smile?) ")" "("))

The library parses long and short flags, handles arguments, allows combining short flags, and handles -h and --help automatically:

    $ racket /tmp/c -nfe 8
    8-(
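For comparison, a rough Python counterpart of the emoticon program can be written with argparse from the standard library. This is only a sketch: the function name emoticon is an assumption made for illustration, and argparse's help layout and error messages differ from Racket's.

```python
import argparse
import sys


def emoticon(argv):
    """Build the emoticon string from a list of command-line arguments."""
    parser = argparse.ArgumentParser(prog="emoticon")
    # -s and -f are mutually exclusive, like #:once-any in the Racket version.
    mood = parser.add_mutually_exclusive_group()
    mood.add_argument("-s", "--smile", dest="smile", action="store_true",
                      default=True, help="smile mode")
    mood.add_argument("-f", "--frown", dest="smile", action="store_false",
                      help="frown mode")
    parser.add_argument("-n", "--nose", action="store_true", help="add a nose")
    parser.add_argument("-e", "--eyes", metavar="char", default=":",
                        help="use <char> for the eyes")
    args = parser.parse_args(argv)
    return args.eyes + ("-" if args.nose else "") + (")" if args.smile else "(")


if __name__ == "__main__":
    print(emoticon(sys.argv[1:]))
```

Like the Racket library, argparse accepts combined short flags such as -nfe 8 and generates -h and --help automatically.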

Rexx

Rexx uses arg, e.g.:

    do i = 1 to words(arg(1))
        say word(arg(1), i)
    end

Rust

Rust provides the arguments through std::env::args(), which returns an iterator over them. [9]

    use std::env;

    fn main() {
        let args: Vec<String> = env::args().collect();

        let query = &args[1];
        let file_path = &args[2];

        println!("Searching for {}", query);
        println!("In file {}", file_path);
    }

Node.js

JavaScript programs written for Node.js use the process.argv global variable. [10]

    // argv.js
    console.log(process.argv);

    $ node argv.js one two three four five
    ['node', '/home/avian/argvdemo/argv.js', 'one', 'two', 'three', 'four', 'five']

Node.js programs are invoked by running the node interpreter with a given file, so the first two arguments will be node and the name of the JavaScript source file. It is often useful to extract the rest of the arguments by slicing a sub-array from process.argv. [11]

    // process-args.js
    console.log(process.argv.slice(2));

    $ node process-args.js one two=three four
    ['one', 'two=three', 'four']

References

  1. "The C Book — Arguments to main". Publications.gbdirect.co.uk. Retrieved 2010-05-31.
  2. An example of parsing C arguments and options
  3. "Kotlin: Basic syntax". Retrieved 2022-05-13.
  4. "PHP Manual". PHP. Retrieved 2010-05-31.
  5. wikibooks:PHP Programming/CLI
  6. "PHP: Getopt - Manual".
  7. "argparse — Parser for command-line options, arguments and sub-commands". Python v3.10.0 documentation. Archived from the original on 2012-11-01. Retrieved 15 October 2021.
  8. The Racket reference manual, Command-Line Parsing
  9. "Accepting Command Line Arguments - The Rust Programming Language". doc.rust-lang.org. Retrieved 22 December 2022.
  10. "process.argv". Node.js v10.16.3 Documentation. Retrieved 3 October 2019.
  11. "How to parse command line arguments". Node.js Foundation Documentation. Retrieved 3 October 2019.