Formal methods

Last updated

In computer science, specifically software engineering and hardware engineering, formal methods are a particular kind of mathematically based technique for the specification, development and verification of software and hardware systems. [1] The use of formal methods for software and hardware design is motivated by the expectation that, as in other engineering disciplines, performing appropriate mathematical analysis can contribute to the reliability and robustness of a design. [2]

Computer science Study of the theoretical foundations of information and computation

Computer science is the study of processes that interact with data and that can be represented as data in the form of programs. It enables the use of algorithms to manipulate, store, and communicate digital information. A computer scientist studies the theory of computation and the practice of designing software systems.

Computer science is no more about computers than astronomy is about telescopes.

Software engineering is the application of engineering to the development of software in a systematic method.

In computer science, formal specifications are mathematically based techniques whose purpose are to help with the implementation of systems and software. They are used to describe a system, to analyze its behavior, and to aid in its design by verifying key properties of interest through rigorous and effective reasoning tools. These specifications are formal in the sense that they have a syntax, their semantics fall within one domain, and they are able to be used to infer useful information.

Contents

Formal methods are best described as the application of a fairly broad variety of theoretical computer science fundamentals, in particular logic calculi, formal languages, automata theory, discrete event dynamic system and program semantics, but also type systems and algebraic data types to problems in software and hardware specification and verification. [3]

Theoretical computer science subfield of computer science and of mathematics

Theoretical computer science (TCS) is a subset of general computer science and mathematics that focuses on more mathematical topics of computing and includes the theory of computation.

Logic in computer science Academic discipline

Logic in computer science covers the overlap between the field of logic and that of computer science. The topic can essentially be divided into three main areas:

Formal language set of strings of symbols that may be constrained by rules that are specific to it

In mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules.

Background

Semi-Formal Methods are formalisms and languages that are not considered fully “formal”. It defers the task of completing the semantics to a later stage, which is then done either by human interpretation or by interpretation through software like code or test case generators. [4]

Taxonomy

Formal methods can be used at a number of levels:

Level 0: Formal specification may be undertaken and then a program developed from this informally. This has been dubbed formal methods lite. This may be the most cost-effective option in many cases.

Level 1: Formal development and formal verification may be used to produce a program in a more formal manner. For example, proofs of properties or refinement from the specification to a program may be undertaken. This may be most appropriate in high-integrity systems involving safety or security.

In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal methods of mathematics.

Safety state of being secure from harm, injury, danger or risk

Safety is the state of being "safe", the condition of being protected from harm or other non-desirable outcomes. Safety can also refer to the control of recognized hazards in order to achieve an acceptable level of risk.

Security Degree of resistance to, or protection from, harm

Security is freedom from, or resilience against, potential harm caused by others. Beneficiaries of security may be of persons and social groups, objects and institutions, ecosystems or any other entity or phenomenon vulnerable to unwanted change by its environment.

Level 2: Theorem provers may be used to undertake fully formal machine-checked proofs. This can be very expensive and is only practically worthwhile if the cost of mistakes is extremely high (e.g., in critical parts of microprocessor design).

Further information on this is expanded below.

As with programming language semantics, styles of formal methods may be roughly classified as follows:

Lightweight formal methods

Some practitioners believe that the formal methods community has overemphasized full formalization of a specification or design. [5] [6] They contend that the expressiveness of the languages involved, as well as the complexity of the systems being modelled, make full formalization a difficult and expensive task. As an alternative, various lightweight formal methods, which emphasize partial specification and focused application, have been proposed. Examples of this lightweight approach to formal methods include the Alloy object modelling notation, [7] Denney's synthesis of some aspects of the Z notation with use case driven development, [8] and the CSK VDM Tools. [9]

Uses

Formal methods can be applied at various points through the development process.

Specification

Formal methods may be used to give a description of the system to be developed, at whatever level(s) of detail desired. This formal description can be used to guide further development activities (see following sections); additionally, it can be used to verify that the requirements for the system being developed have been completely and accurately specified. or formalising system requirements by expressing them in a formal language with a precise and unambiguously defined syntax and semantics.

The need for formal specification systems has been noted for years. In the ALGOL 58 report, [10] John Backus presented a formal notation for describing programming language syntax, later named Backus normal form then renamed Backus–Naur form (BNF). [11] Backus also wrote that a formal description of the meaning of syntactically valid ALGOL programs wasn't completed in time for inclusion in the report. "Therefore the formal treatment of the semantics of legal programs will be included in a subsequent paper." It never appeared.

Development

Formal development using formal methods as an integrated part of a tool-supported system development process.

Once a formal specification has been produced, the specification may be used as a guide while the concrete system is developed during the design process (i.e., realized typically in software, but also potentially in hardware). For example:

Verification

Formal verification using a software tool to prove properties of a formal specification, or that a formal model of a system implementation satisfies its specification.

Once a formal specification has been developed, the specification may be used as the basis for proving properties of the specification (and hopefully by inference the developed system).

Sign-off verification

A formal verification tool for sign-off verification is a tool that is highly trusted such that it can replace traditional verification methods (the tool may even be certified).

Human-directed proof

Sometimes, the motivation for proving the correctness of a system is not the obvious need for reassurance of the correctness of the system, but a desire to understand the system better. Consequently, some proofs of correctness are produced in the style of mathematical proof: handwritten (or typeset) using natural language, using a level of informality common to such proofs. A "good" proof is one which is readable and understandable by other human readers.

Critics of such approaches point out that the ambiguity inherent in natural language allows errors to be undetected in such proofs; often, subtle errors can be present in the low-level details typically overlooked by such proofs. Additionally, the work involved in producing such a good proof requires a high level of mathematical sophistication and expertise.

Automated proof

In contrast, there is increasing interest in producing proofs of correctness of such systems by automated means. Automated techniques fall into three general categories:

  • Automated theorem proving, in which a system attempts to produce a formal proof from scratch, given a description of the system, a set of logical axioms, and a set of inference rules.
  • Model checking, in which a system verifies certain properties by means of an exhaustive search of all possible states that a system could enter during its execution.
  • Abstract interpretation, in which a system verifies an over-approximation of a behavioural property of the program, using a fixpoint computation over a (possibly complete) lattice representing it.

Some automated theorem provers require guidance as to which properties are "interesting" enough to pursue, while others work without human intervention. Model checkers can quickly get bogged down in checking millions of uninteresting states if not given a sufficiently abstract model.

Proponents of such systems argue that the results have greater mathematical certainty than human-produced proofs, since all the tedious details have been algorithmically verified. The training required to use such systems is also less than that required to produce good mathematical proofs by hand, making the techniques accessible to a wider variety of practitioners.

Critics note that some of those systems are like oracles: they make a pronouncement of truth, yet give no explanation of that truth. There is also the problem of "verifying the verifier"; if the program which aids in the verification is itself unproven, there may be reason to doubt the soundness of the produced results. Some modern model checking tools produce a "proof log" detailing each step in their proof, making it possible to perform, given suitable tools, independent verification.

The main feature of the abstract interpretation approach is that it provides a sound analysis, i.e. no false negatives are returned. Moreover, it is efficiently scalable, by tuning the abstract domain representing the property to be analyzed, and by applying widening operators [12] to get fast convergence.

Applications

Formal methods are applied in different areas of hardware and software, including routers, Ethernet switches, routing protocols, security applications, and operating system microkernels such as seL4. There are several examples in which they have been used to verify the functionality of the hardware and software used in DCs[ clarification needed ]. IBM used ACL2, a theorem prover, in AMD x86 processor development process[ citation needed ]. Intel uses such methods to verify its hardware and firmware (permanent software programmed into a read-only memory)[ citation needed ]. Dansk Datamatik Center used formal methods in the 1980s to develop a compiler system for the Ada programming language that went on to become a long-lived commercial product. [13] [14]

There are several other projects of NASA in which formal methods are applied, such as Next Generation Air Transportation System [ citation needed ], Unmanned Aircraft System integration in National Airspace System, [15] and Airborne Coordinated Conflict Resolution and Detection (ACCoRD). [16] B-Method with AtelierB, [17] is used to develop safety automatisms for the various subways installed throughout the world by Alstom and Siemens, and also for Common Criteria certification and the development of system models by ATMEL and STMicroelectronics.

Formal verification has been frequently used in hardware by most of the well-known hardware vendors, such as IBM, Intel, and AMD. There are many areas of hardware, where Intel have used FMs to verify the working of the products, such as parameterized verification of cache coherent protocol, [18] Intel Core i7 processor execution engine validation [19] (using theorem proving, BDDs, and symbolic evaluation), optimization for Intel IA-64 architecture using HOL light theorem prover, [20] and verification of high performance dual-port gigabit Ethernet controller with a support for PCI express protocol and Intel advance management technology using Cadence. [21] Similarly, IBM has used formal methods in the verification of power gates, [22] registers, [23] and functional verification of the IBM Power7 microprocessor. [24]

In software development

In software development, formal methods are mathematical approaches to solving software (and hardware) problems at the requirements, specification, and design levels. Formal methods are most likely to be applied to safety-critical or security-critical software and systems, such as avionics software. Software safety assurance standards, such as DO-178C allows the usage of formal methods through supplementation, and Common Criteria mandates formal methods at the highest levels of categorization.

For sequential software, examples of formal methods include the B-Method, the specification languages used in automated theorem proving, RAISE, and the Z notation.

In functional programming, property-based testing has allowed the mathematical specification and testing (if not exhaustive testing) of the expected behaviour of individual functions.

The Object Constraint Language (and specializations such as Java Modeling Language) has allowed object-oriented systems to be formally specified, if not necessarily formally verified.

For concurrent software and systems, Petri nets, process algebra, and finite state machines (which are based on automata theory - see also virtual finite state machine or event driven finite state machine) allow executable software specification and can be used to build up and validate application behavior.

Another approach to formal methods in software development is to write a specification in some form of logicusually a variation of first-order logic (FOL)and then to directly execute the logic as though it were a program. The OWL language, based on Description Logic (DL), is an example. There is also work on mapping some version of English (or another natural language) automatically to and from logic, and executing the logic directly. Examples are Attempto Controlled English, and Internet Business Logic, which do not seek to control the vocabulary or syntax. A feature of systems that support bidirectional English-logic mapping and direct execution of the logic is that they can be made to explain their results, in English, at the business or scientific level.[ citation needed ]

Formal methods and notations

There are a variety of formal methods and notations available.

Specification languages

Model checkers

See also

Related Research Articles

Automated theorem proving is a subfield of automated reasoning and mathematical logic dealing with proving mathematical theorems by computer programs. Automated reasoning over mathematical proof was a major impetus for the development of computer science.

Z notation

The Z notation is a formal specification language used for describing and modelling computing systems. It is targeted at the clear specification of computer programs and computer-based systems in general.

In computer engineering, a hardware description language (HDL) is a specialized computer language used to describe the structure and behavior of electronic circuits, and most commonly, digital logic circuits.

The Vienna Development Method (VDM) is one of the longest-established formal methods for the development of computer-based systems. Originating in work done at the IBM Laboratory Vienna in the 1970s, it has grown to include a group of techniques and tools based on a formal specification language—the VDM Specification Language (VDM-SL). It has an extended form, VDM++, which supports the modeling of object-oriented and concurrent systems. Support for VDM includes commercial and academic tools for analyzing models, including support for testing and proving properties of models and generating program code from validated VDM models. There is a history of industrial usage of VDM and its tools and a growing body of research in the formalism has led to notable contributions to the engineering of critical systems, compilers, concurrent systems and in logic for computer science.

Isabelle (proof assistant) software

The Isabelle automated theorem prover is an interactive theorem prover, a higher order logic (HOL) theorem prover. It is an LCF-style theorem prover, so it is based on a small logical core to ease logical correctness. Isabelle is generic: it provides a meta-logic, which is used to encode object logics like first-order logic (FOL), higher-order logic (HOL) or Zermelo–Fraenkel set theory (ZFC). Isabelle's main proof method is a higher-order version of resolution, based on higher-order unification. Though interactive, Isabelle also features efficient automatic reasoning tools, such as a term rewriting engine and a tableaux prover, as well as various decision procedures.

HOL denotes a family of interactive theorem proving systems using similar (higher-order) logics and implementation strategies. Systems in this family follow the LCF approach as they are implemented as a library in some programming language. This library implements an abstract data type of proven theorems so that new objects of this type can only be created using the functions in the library which correspond to inference rules in higher-order logic. As long as these functions are correctly implemented, all theorems proven in the system must be valid. In this way, a large system can be built on top of a small trusted kernel.

ACL2 software system consisting of a programming language, an extensible theory in a first-order logic, and an automated theorem prover

ACL2 is a software system consisting of a programming language, an extensible theory in a first-order logic, and an automated theorem prover. ACL2 is designed to support automated reasoning in inductive logical theories, mostly for the purpose of software and hardware verification. The input language and implementation of ACL2 are written in Common Lisp. ACL2 is free and open-source software.

Computer science is the study of the theoretical foundations of information and computation and their implementation and application in computer systems. One well known subject classification system for computer science is the ACM Computing Classification System devised by the Association for Computing Machinery.

In computer science, model checking, or property checking, is, for a given finite-state model of a system, exhaustively and automatically checking whether this model meets a given specification. Typically, one has hardware or software systems in mind, whereas the specification contains safety requirements such as the absence of deadlocks and similar critical states that can cause the system to crash, as well as liveness requirements.

A modeling language is any artificial language that can be used to express information or knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in the structure.

A formal system is used to infer theorems from axioms according to a set of rules. These rules used to carry out the inference of theorems from axioms are known as the logical calculus of the formal system. A formal system is essentially an "axiomatic system". In 1921, David Hilbert proposed to use such system as the foundation for the knowledge in mathematics. A formal system may represent a well-defined system of abstract thought.

The B method is a method of software development based on B, a tool-supported formal method based on an abstract machine notation, used in the development of computer software. It was originally developed in the 1980s by Jean-Raymond Abrial in France and the UK. B is related to the Z notation and supports development of programming language code from specifications. B has been used in major safety-critical system applications in Europe. It has robust, commercially available tool support for specification, design, proof and code generation.

Automated reasoning is an area of computer science, cognitive science, and mathematical logic dedicated to understanding different aspects of reasoning. The study of automated reasoning helps produce computer programs that allow computers to reason completely, or nearly completely, automatically. Although automated reasoning is considered a sub-field of artificial intelligence, it also has connections with theoretical computer science, and even philosophy.

Programming language theory branch of computer science that deals with the design, implementation, analysis, characterization, and classification of programming languages and their individual features

Programming language theory (PLT) is a branch of computer science that deals with the design, implementation, analysis, characterization, and classification of programming languages and their individual features. It falls within the discipline of computer science, both depending on and affecting mathematics, software engineering, linguistics and even cognitive science. It is a well-recognized branch of computer science, and an active research area, with results published in numerous journals dedicated to PLT, as well as in general computer science and engineering publications.

The Meta-IV was an early version of the specification language of the Vienna Development Method formal method for the development of computer-based systems.

LePUS3

LePUS3 is a language for modelling and visualizing object-oriented programs and design patterns. It is defined as a formal specification language, formulated as an axiomatized subset of First-order predicate logic. A diagram in LePUS3 is also called a Codechart. LePUS, the name of the first version of the language, is an abbreviation for Language for Pattern Uniform Specification.

References

  1. R. W. Butler (2001-08-06). "What is Formal Methods?" . Retrieved 2006-11-16.
  2. C. Michael Holloway. "Why Engineers Should Consider Formal Methods" (PDF). 16th Digital Avionics Systems Conference (27–30 October 1997). Archived from the original (PDF) on 16 November 2006. Retrieved 2006-11-16.
  3. Monin, pp.3-4
  4. X2R-2, deliverable D5.1.
  5. Daniel Jackson and Jeannette Wing, "Lightweight Formal Methods", IEEE Computer, April 1996
  6. Vinu George and Rayford Vaughn, "Application of Lightweight Formal Methods in Requirement Engineering" Archived 2006-03-01 at the Wayback Machine , Crosstalk: The Journal of Defense Software Engineering, January 2003
  7. Daniel Jackson, "Alloy: A Lightweight Object Modelling Notation", ACM Transactions on Software Engineering and Methodology (TOSEM), Volume 11, Issue 2 (April 2002), pp. 256-290
  8. Richard Denney, Succeeding with Use Cases: Working Smart to Deliver Quality, Addison-Wesley Professional Publishing, 2005, ISBN   0-321-31643-6.
  9. Sten Agerholm and Peter G. Larsen, "A Lightweight Approach to Formal Methods" Archived 2006-03-09 at the Wayback Machine , In Proceedings of the International Workshop on Current Trends in Applied Formal Methods, Boppard, Germany, Springer-Verlag, October 1998
  10. Backus, J.W. (1959). "The Syntax and Semantics of the Proposed International Algebraic Language of Zürich ACM-GAMM Conference". Proceedings of the International Conference on Information Processing. UNESCO.
  11. Knuth, Donald E. (1964), Backus Normal Form vs Backus Naur Form. Communications of the ACM , 7(12):735–736.
  12. A.Cortesi and M.Zanioli, Widening and Narrowing Operators for Abstract Interpretation. Computer Languages, Systems and Structures. Volume 37(1), pp. 24–42, Elsevier, ISSN   1477-8424 (2011).
  13. Bjørner, Dines; Gram, Christian; Oest, Ole N.; Rystrøm, Leif (2011). "Dansk Datamatik Center". In Impagliazzo, John; Lundin, Per; Wangler, Benkt (eds.). History of Nordic Computing 3: IFIP Advances in Information and Communication Technology. Springer. pp. 350–359.
  14. Bjørner, Dines; Havelund, Klaus. "40 Years of Formal Methods: Some Obstacles and Some Possibilities?". FM 2014: Formal Methods: 19th International Symposium, Singapore, May 12–16, 2014. Proceedings (PDF). Springer. pp. 42–61.
  15. Gheorghe, A. V., & Ancel, E. (2008, November). Unmanned aerial systems integration to National Airspace System. In Infrastructure Systems and Services: Building Networks for a Brighter Future (INFRA), 2008 First International Conference on (pp. 1-5). IEEE.
  16. Airborne Coordinated Conflict Resolution and Detection, http://shemesh.larc.nasa.gov/people/cam/ACCoRD/
  17. website : http://www.atelierb.eu/en/
  18. C. T. Chou, P. K. Mannava, S. Park, “A simple method for parameterized verification of cache coherence protocols”, Formal Methods in Computer-Aided Design, pp. 382–398, 2004.
  19. Formal Verification in Intel® Core™ i7 Processor Execution Engine Validation, http://cps-vo.org/node/1371, accessed at Sep. 13, 2013.
  20. J. Grundy, “Verified optimizations for the Intel IA-64 architecture”, In Theorem Proving in Higher Order Logics, Springer Berlin Heidelberg, 2004, pp. 215–232.
  21. E. Seligman, I. Yarom, “Best known methods for using Cadence Conformal LEC”, at Intel.
  22. C. Eisner, A. Nahir, K. Yorav, “Functional verification of power gated designs by compositional reasoning”, Computer Aided Verification Springer Berlin Heidelberg, pp. 433–445.
  23. P. C. Attie, H. Chockler, “Automatic verification of fault-tolerant register emulations”, Electronic Notes in Theoretical Computer Science, vol. 149, no. 1, pp. 49–60.
  24. K. D. Schubert, W. Roesner, J. M. Ludden, J. Jackson, J. Buchert, V. Paruthi, B. Brock, “Functional verification of the IBM POWER7 microprocessor and POWER7 multiprocessor systems”, IBM Journal of Research and Development, vol. 55, no 3.

Further reading