Lex (software)

Lex
Original author(s)	Mike Lesk, Eric Schmidt
Initial release	1975
Repository	minnie.tuhs.org/cgi-bin/utree.pl?file=4BSD%2Fusr%2Fsrc%2Fcmd%2Flex
Written in	C
Operating system	Unix, Unix-like, Plan 9
Platform	Cross-platform
Type	Command
License	Plan 9: MIT License

Last updated April 11, 2025

Lex is a computer program that generates lexical analyzers ("scanners" or "lexers").^[1]^[2] It is commonly used with the yacc parser generator and is the standard lexical analyzer generator on many Unix and Unix-like systems. An equivalent tool is specified as part of the POSIX standard.^[3]

History

Lex was originally written by Mike Lesk and Eric Schmidt^[5] and described in 1975.^[6]^[7] In the following years, Lex became the standard lexical analyzer generator on many Unix and Unix-like systems. In 1983, Lex was one of several UNIX tools available for Charles River Data Systems' UNOS operating system under a Bell Laboratories license.^[8] Although originally distributed as proprietary software, some versions of Lex are now open-source. Open-source versions of Lex, based on the original proprietary code, are distributed with open-source operating systems such as OpenSolaris and Plan 9 from Bell Labs. One popular open-source version of Lex, called flex (the "fast lexical analyzer"), is not derived from the proprietary code.

Structure of a Lex file

The structure of a Lex file is intentionally similar to that of a yacc file: files are divided into three sections, separated by lines that contain only two percent signs, as follows:

The definitions section defines macros and imports header files written in C. It is also possible to write any C code here, which will be copied verbatim into the generated source file.
The rules section associates regular expression patterns with C statements. When the lexer sees text in the input matching a given pattern, it will execute the associated C code.
The C code section contains C statements and functions that are copied verbatim to the generated source file. These statements typically contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file that is linked in at compile time.

Example of a Lex file

The following is an example Lex file for the flex version of Lex. It recognizes strings of digits (non-negative integers) in the input and prints them out.

/*** Definition section ***/

%{
/* C code to be copied verbatim */
#include <stdio.h>
%}

%%
    /*** Rules section ***/

    /* [0-9]+ matches a string of one or more digits */
[0-9]+  {
            /* yytext is a string containing the matched text. */
            printf("Saw an integer: %s\n", yytext);
        }

.|\n    {   /* Ignore all other characters. */   }

%%
/*** C Code section ***/

int main(void)
{
    /* Call the lexer, then quit. */
    yylex();
    return 0;
}

If this input is given to flex, it will be converted into a C file, lex.yy.c. This can be compiled into an executable which matches and outputs strings of integers. For example, given the input:

abc123z.!&*2gj6

the program will print:

Saw an integer: 123
Saw an integer: 2
Saw an integer: 6

Using Lex with other programming tools

Using Lex with parser generators

Lex, like other lexical analyzer generators, limits rules to those that can be described by regular expressions. Because of this, a Lex-generated scanner can be implemented as a finite-state automaton; regular languages sit at the lowest level of the Chomsky hierarchy of languages. To recognize more complex languages, Lex is often used with parser generators such as Yacc or Bison. Parser generators use a formal grammar to parse an input stream.
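The link between a regular expression and a finite-state automaton can be sketched by hand. The following C fragment is only an illustration, not the code Lex actually emits: it hard-codes a two-state automaton for the pattern [0-9]+ and reports each maximal run of digits. The names `state`, `step`, and `scan_integers` are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Two states are enough for [0-9]+ (state names are illustrative). */
enum state { OUTSIDE, IN_NUMBER };

/* Transition function: the next state is determined by the current
   state and the next input character. For this simple pattern both
   rows of the table happen to be identical. */
static enum state step(enum state st, char c)
{
    int is_digit = (c >= '0' && c <= '9');
    switch (st) {
    case OUTSIDE:   return is_digit ? IN_NUMBER : OUTSIDE;
    case IN_NUMBER: return is_digit ? IN_NUMBER : OUTSIDE;
    }
    return OUTSIDE;
}

/* Scan `input`, print each maximal run of digits, and return the
   number of runs found. */
int scan_integers(const char *input)
{
    assert(input != NULL);
    enum state st = OUTSIDE;
    size_t start = 0;
    int count = 0;

    for (size_t i = 0; ; i++) {
        /* The terminating NUL is treated as a non-digit so that a
           number at the very end of the input is still reported. */
        enum state next = input[i] ? step(st, input[i]) : OUTSIDE;

        if (st == OUTSIDE && next == IN_NUMBER) {
            start = i;                      /* a number begins here */
        } else if (st == IN_NUMBER && next == OUTSIDE) {
            printf("Saw an integer: %.*s\n",
                   (int)(i - start), input + start);
            count++;
        }
        st = next;
        if (input[i] == '\0')
            break;
    }
    return count;
}
```

Calling `scan_integers("abc123z.!&*2gj6")` reports the integers 123, 2, and 6 and returns 3. A generated scanner differs in that its transition table is computed from all the rules at once rather than written by hand.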

It is typically preferable to have a parser, one generated by Yacc for instance, accept a stream of tokens (a "token-stream") as input, rather than having to process a stream of characters (a "character-stream") directly. Lex is often used to produce such a token-stream.
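The division of labour between scanner and parser can be illustrated with a small C sketch. It is written by hand under stated assumptions: `next_token` stands in for the `yylex()` function that Lex would generate, the token codes are hypothetical (a real Yacc grammar defines its own), and `parse_sum` plays the role of the parser for the toy grammar "NUMBER followed by any number of '+' NUMBER".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes for this sketch. */
enum token { TOK_END, TOK_NUMBER, TOK_PLUS, TOK_ERROR };

/* Scanner state: the unread input and the value of the last number. */
struct scanner {
    const char *pos;
    long value;
};

/* Stand-in for yylex(): skip blanks and return one token per call. */
enum token next_token(struct scanner *s)
{
    assert(s != NULL && s->pos != NULL);

    while (*s->pos == ' ' || *s->pos == '\t')
        s->pos++;
    if (*s->pos == '\0')
        return TOK_END;
    if (*s->pos == '+') {
        s->pos++;
        return TOK_PLUS;
    }
    if (isdigit((unsigned char)*s->pos)) {
        s->value = 0;
        while (isdigit((unsigned char)*s->pos))
            s->value = s->value * 10 + (*s->pos++ - '0');
        return TOK_NUMBER;
    }
    s->pos++;               /* consume the unrecognized character */
    return TOK_ERROR;
}

/* Parser for:  sum : NUMBER ( '+' NUMBER )*
   It only ever sees tokens, never individual characters.
   Returns 0 and stores the total in *result on success,
   or -1 on a syntax error. */
int parse_sum(const char *text, long *result)
{
    struct scanner s = { text, 0 };
    long total;

    if (next_token(&s) != TOK_NUMBER)
        return -1;
    total = s.value;

    for (;;) {
        enum token t = next_token(&s);
        if (t == TOK_END)
            break;
        if (t != TOK_PLUS || next_token(&s) != TOK_NUMBER)
            return -1;
        total += s.value;
    }
    *result = total;
    return 0;
}
```

Because whitespace handling and digit grouping live entirely in `next_token`, the parser's logic stays close to the grammar. This is the main practical reason for feeding a parser a token-stream rather than raw characters.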

Scannerless parsing refers to parsing the input character-stream directly, without a distinct lexer.

Lex and make

make is a utility that can be used to maintain programs involving Lex. Make assumes that a file with the extension .l is a Lex source file. The make internal macro LFLAGS can be used to specify Lex options that make passes to Lex automatically.^[9]

References

[1] Levine, John R.; Mason, Tony; Brown, Doug (1992). lex & yacc (2 ed.). O'Reilly. pp. 1–2. ISBN 1-56592-000-7.
[2] Levine, John (August 2009). flex & bison. O'Reilly Media. p. 304. ISBN 978-0-596-15597-1.
[3] The Open Group Base Specifications Issue 7, 2018 edition § Shell & Utilities § Utilities § lex
[4] John R. Levine; John Mason; Doug Brown (1992). Lex & Yacc. O'Reilly. ISBN 9781565920002.
[5] Lesk, M.E.; Schmidt, E. "Lex – A Lexical Analyzer Generator". Archived from the original on 2012-07-28. Retrieved August 16, 2010.
[6] Lesk, M.E.; Schmidt, E. (July 21, 1975). "Lex – A Lexical Analyzer Generator" (PDF). UNIX TIME-SHARING SYSTEM: UNIX PROGRAMMER’S MANUAL, Seventh Edition, Volume 2B. bell-labs.com. Retrieved Dec 20, 2011.
[7] Lesk, M.E. (October 1975). "Lex – A Lexical Analyzer Generator". Comp. Sci. Tech. Rep. No. 39. Murray Hill, New Jersey: Bell Laboratories.
[8] The Insider's Guide To The Universe (PDF). Charles River Data Systems, Inc. 1983. p. 13.
[9] "make". The Open Group Base Specifications (6). The IEEE and The Open Group. 2004. IEEE Std 1003.1, 2004 Edition.

External links

Using Flex and Bison at Macworld.com
lex(1) – Solaris 11.4 User Commands Reference Manual
lex(1) – Plan 9 Programmer's Manual, Volume 1


