MPS (format)

Last updated

MPS (Mathematical Programming System) is a file format for presenting and archiving linear programming (LP) and mixed integer programming problems.

Contents

Overview

NEOS input statistics for January 2011. Neos-stats-2011-01.png
NEOS input statistics for January 2011.

The format was named after an early IBM LP product [1] and has emerged as a de facto standard ASCII medium among most of the commercial LP solvers. Essentially all commercial LP solvers accept this format, and it is also accepted by the open-source COIN-OR system. Other software may require a customized reader routine in order to read MPS files. However, with the acceptance of algebraic modeling languages MPS usage has declined. For example, according to the NEOS server statistics in January 2011 less than 1% of submissions were in MPS form compared to 59.4% of AMPL and 29.7% of GAMS submissions.

MPS is column-oriented (as opposed to entering the model as equations), and all model components (variables, rows, etc.) receive names. MPS is an old format, so it is set up for punch cards: Fields start in column 2, 5, 15, 25, 40 and 50. Sections of an MPS file are marked by so-called header cards, which are distinguished by their starting in column 1. Although it is typical to use upper-case throughout the file for historical reasons, many MPS-readers will accept mixed-case for anything except the header cards, and some allow mixed-case anywhere. The names that you choose for the individual entities (constraints or variables) are not important to the solver; one should pick meaningful names, or easy names for a post-processing code to read.

MPS format

Here is a little sample model written in MPS format (explained in more detail below):

NAME          TESTPROB ROWS  N  COST  L  LIM1  G  LIM2  E  MYEQN COLUMNS     XONE      COST                 1   LIM1                 1     XONE      LIM2                 1     YTWO      COST                 4   LIM1                 1     YTWO      MYEQN               -1     ZTHREE    COST                 9   LIM2                 1     ZTHREE    MYEQN                1 RHS     RHS1      LIM1                 5   LIM2                10     RHS1      MYEQN                7 BOUNDS  UP BND1      XONE                 4  LO BND1      YTWO                -1  UP BND1      YTWO                 1 ENDATA

For comparison, here is the same model written out in an equation-oriented format:

Optimize  COST:    XONE + 4*YTWO + 9*ZTHREE Subject To  LIM1:    XONE + YTWO          <= 5  LIM2:    XONE        + ZTHREE >= 10  MYEQN:        - YTWO + ZTHREE  = 7 Bounds        XONE <= 4  -1 <= YTWO <= 1 End

As mentioned below, the lower bound on XONE is either zero or -infinity, depending upon implementation, because it is not specified. Strangely, nothing in MPS format specifies the direction of optimization, and there is no standard "default" direction; some LP solvers will maximize if not instructed otherwise, others will minimize, [2] and still others put safety first and have no default and require a selection somewhere in a control program or by a calling parameter. If the model is formulated for minimization and the solver requires maximization (or vice versa), it is easy to convert between the two by negating all coefficients of the objective function. The optimal value of the objective function will then be the negative of the original optimal value, but the values of the variables themselves will be correct. Some programs support specifying minimization/maximization within the MPS file.

OBJSENSE  MAX

Variables

The NAME record can have any value, starting in column 15.

The ROWS section defines the names of all the constraints; entries in column 2 or 3 are E for equality ( = ) rows, L for less-than ( <= ) rows, G for greater-than ( >= ) rows, and N for non-constraining rows. The order of the rows named in this section is unimportant, except for non-constraining rows marked N, the first of which would be interpreted as the objective function.

The COLUMNS section contains the entries of the A-matrix. All entries for a given column must be placed consecutively, although within a column the order of the entries (rows) is irrelevant. Rows not mentioned for a column are implied to have a coefficient of zero.

The RHS section allows one or more right-hand-side vectors to be defined; there is seldom more than one. In the above example, the name of the RHS vector is RHS1, and has non-zero values in all 3 of the constraint rows of the problem. Rows not mentioned in an RHS vector would be assumed to have a right-hand-side of zero.

The optional BOUNDS section specifies lower and upper bounds on individual variables, if they are not given by rows in the matrix. All the bounds that have a given name in column 5 are taken together as a set. Variables not mentioned in a given BOUNDS set are taken to be non-negative (lower bound zero, no upper bound). A bound of type UP means an upper bound is applied to the variable. A bound of type LO means a lower bound is applied. A bound type of FX ("fixed") means that the variable has upper and lower bounds equal to a single value. A bound type of FR ("free") means the variable has neither lower nor upper bounds and so can take on negative values. A variation on that is MI for free negative, giving an upper bound of 0 but no lower bound. Bound type PL is for a free positive from zero to plus infinity, but as this is the normal default, it is seldom used. There are also bound types for use in MIP models – BV for binary, being 0 or 1. UI for upper integer and LI for lower integer. SC stands for semi-continuous and indicates that the variable may be zero, but if not must be equal to at least the value given.

Another optional section called RANGES specifies double-inequalities, in a somewhat counterintuitive way not described here. Ways to mark integer variables are also beyond the scope of this article (keyword MARKER and possibly SOS are involved). The final card must be ENDATA (notice the odd spelling).

A few special cases of the MPS standard are not consistently handled by implementations. In the BOUNDS section, if a variable is given a nonpositive upper bound but no lower bound, its lower bound may default to zero or to minus infinity (also, if the upper bound is given as zero, the lower bound might be zero or negative infinity). [3] If an integer variable has no upper bound specified, its upper bound may default to one rather than to plus infinity.

Limitations

MPS has many limitations. It does not specify the direction of optimization which is handled differently by solvers. Numeric fields have 12 characters width therefore limiting the precision. The representation is neither easy for human interpretation nor compact (although reserves column / row order information, which is often beneficial for LP solver behaviour reproducibility). One of the alternatives to MPS that does not have its limitations and is supported by most solvers is the nl file format.

Extensions

Many LP products include extensions to the MPS format. The free format MPS allows for long names and more accurate data by allowing fields to exceed the columns defined by the original standard, and apply whitespaces as separators instead of fixed column positions (note that this makes some MPS files that included whitespaces as part of names to be no longer valid). Some extensions include adding new kind of data to the MPS file (e.g. sections to include objective sense, integrality requirements, quadratic data or advanced MIP modelling constructs). There is also a compressed MPSC file format. [4] SMPS [5] is a specialized extension, designed to represent stochastic programming problem instances, in use especially in research environments.

Although some extensions are not standardized, the format is still in general use.

See also

Related Research Articles

<span class="mw-page-title-main">Spreadsheet</span> Computer application for organization, analysis, and storage of data in tabular form

A spreadsheet is a computer application for computation, organization, analysis and storage of data in tabular form. Spreadsheets were developed as computerized analogs of paper accounting worksheets. The program operates on data entered in cells of a table. Each cell may contain either numeric or text data, or the results of formulas that automatically calculate and display a value based on the contents of other cells. The term spreadsheet may also refer to one such electronic document.

<span class="mw-page-title-main">Linear programming</span> Method to solve optimization problems

Linear programming (LP), also called linear optimization, is a method to achieve the best outcome in a mathematical model whose requirements and objective are represented by linear relationships. Linear programming is a special case of mathematical programming.

<span class="mw-page-title-main">Sparse matrix</span> Matrix in which most of the elements are zero

In numerical analysis and scientific computing, a sparse matrix or sparse array is a matrix in which most of the elements are zero. There is no strict definition regarding the proportion of zero-value elements for a matrix to qualify as sparse but a common criterion is that the number of non-zero elements is roughly equal to the number of rows or columns. By contrast, if most of the elements are non-zero, the matrix is considered dense. The number of zero-valued elements divided by the total number of elements is sometimes referred to as the sparsity of the matrix.

In mathematical optimization, Dantzig's simplex algorithm is a popular algorithm for linear programming.

The IEEE Standard for Floating-Point Arithmetic is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard addressed many problems found in the diverse floating-point implementations that made them difficult to use reliably and portably. Many hardware floating-point units use the IEEE 754 standard.

An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objective function and the constraints are linear.

Branch and bound is a method for solving optimization problems by breaking them down into smaller sub-problems and using a bounding function to eliminate sub-problems that cannot contain the optimal solution. It is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">IBM 1130</span> 16-bit IBM minicomputer introduced in 1965

The IBM 1130 Computing System, introduced in 1965, was IBM's least expensive computer at that time. A binary 16-bit machine, it was marketed to price-sensitive, computing-intensive technical markets, like education and engineering, succeeding the decimal IBM 1620 in that market segment. Typical installations included a 1 megabyte disk drive that stored the operating system, compilers and object programs, with program source generated and maintained on punched cards. Fortran was the most common programming language used, but several others, including APL, were available.

In computer programming, array slicing is an operation that extracts a subset of elements from an array and packages them as another array, possibly in a different dimension from the original.

Branch and cut is a method of combinatorial optimization for solving integer linear programs (ILPs), that is, linear programming (LP) problems where some or all the unknowns are restricted to integer values. Branch and cut involves running a branch and bound algorithm and using cutting planes to tighten the linear programming relaxations. Note that if cuts are only used to tighten the initial LP relaxation, the algorithm is called cut and branch.

This is an overview of Fortran 95 language features. Included are the additional features of TR-15581:Enhanced Data Type Facilities, which have been universally implemented. Old features that have been superseded by new ones are not described – few of those historic features are used in modern programs although most have been retained in the language to maintain backward compatibility. The current standard is Fortran 2023; many of its new features are still being implemented in compilers. The additional features of Fortran 2003, Fortran 2008, Fortran 2018 and Fortran 2023 are described by Metcalf, Reid, Cohen and Bader.

ORVYL is a time-sharing monitor developed by Stanford University for IBM System/360 and System/370 computers in 1967–68. ORVYL was one of the first time-sharing systems to be made available for IBM computers. Wylbur is a text editor and word processor program designed to work either without ORVYL, or in conjunction with ORVYL.

In computer science, the term range may refer to one of three things:

  1. The possible values that may be stored in a variable.
  2. The upper and lower bounds of an array.
  3. An alternative to iterator.
<span class="mw-page-title-main">COIN-OR</span>

Computational Infrastructure for Operations Research (COIN-OR), is a project that aims to "create for mathematical software what the open literature is for mathematical theory." The open literature provides the operations research (OR) community with a peer-review process and an archive. Papers in operations research journals on mathematical theory often contain supporting numerical results from computational studies. The software implementations, models, and data used to produce the numerical results are typically not published. The status quo impeded researchers needing to reproduce computational results, make fair comparisons, and extend the state of the art.

OptimJ is an extension for Java with language support for writing optimization models and abstractions for bulk data processing. The extensions and the proprietary product implementing the extensions were developed by Ateji which went out of business in September 2011. OptimJ aims at providing a clear and concise algebraic notation for optimization modeling, removing compatibility barriers between optimization modeling and application programming tools, and bringing software engineering techniques such as object-orientation and modern IDE support to optimization experts.

The GOFF specification was developed for IBM's MVS operating system to supersede the IBM OS/360 Object File Format to compensate for weaknesses in the older format.

Convex Over and Under ENvelopes for Nonlinear Estimation (Couenne) is an open-source library for solving global optimization problems, also termed mixed integer nonlinear optimization problems. A global optimization problem requires to minimize a function, called objective function, subject to a set of constraints. Both the objective function and the constraints might be nonlinear and nonconvex. For solving these problems, Couenne uses a reformulation procedure and provides a linear programming approximation of any nonconvex optimization problem.

ANTIGONE, is a deterministic global optimization solver for general Mixed-Integer Nonlinear Programs (MINLP).

The Hack Computer is a theoretical computer design created by Noam Nisan and Shimon Schocken and described in their book, The Elements of Computing Systems: Building a Modern Computer from First Principles.  In using the term “modern”, the authors refer to a digital, binary machine that is patterned according to the von Neumann architecture model.

References

  1. IBM, Optimization Subroutine Library, Guide and Reference, document SC23-0519, IBM[ dead link ]
  2. ILOG CPLEX 10.0 File Formats (PDF). January 2006. p. 28.{{cite book}}: |work= ignored (help)
  3. IBM CPLEX document
  4. "EMPS – Expand a compressed MPS file". People.sc.fsu.edu. 2005-08-31. Archived from the original on 2012-12-23. Retrieved 2013-01-22.
  5. "The SMPS format for stochastic linear programs". Myweb.dal.ca. 2006-07-11. Archived from the original on 2013-06-28. Retrieved 2014-05-28.