Software diversity

Last updated September 02, 2023

Software diversity is a research field about the comprehension and engineering of diversity in the context of software.

Areas

The different areas of software diversity are discussed in surveys on diversity for fault tolerance^[1] or for security.^[2]^[3]

The main areas are:

design diversity, n-version programming, data diversity for fault tolerance
randomization
software variability^[4]

Techniques

Code transformations

It is possible to amplify software diversity through automated transformation processes that create synthetic diversity. A "multicompiler" is a compiler that embeds a diversification engine.^[5] A multi-variant execution environment (MVEE) is responsible for selecting the variants to execute and comparing their output.^[6]
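The comparison step of a multi-variant execution environment can be illustrated with a minimal sketch. It is not taken from any particular MVEE: the three summing variants, `DivergenceError`, and `run_variants` are hypothetical names chosen for illustration, and real MVEEs typically compare behavior at a much lower level (for example, system calls) rather than function return values.

```python
from typing import Callable, Sequence


def variant_a(xs: Sequence[int]) -> int:
    """Sum using the built-in function."""
    return sum(xs)


def variant_b(xs: Sequence[int]) -> int:
    """Sum using an explicit loop (same behavior, different code)."""
    total = 0
    for x in xs:
        total += x
    return total


def variant_c(xs: Sequence[int]) -> int:
    """Sum while traversing the input in reverse order."""
    total = 0
    for x in reversed(xs):
        total += x
    return total


class DivergenceError(RuntimeError):
    """Raised when the variants disagree on an output (hypothetical name)."""


def run_variants(variants: Sequence[Callable], *args):
    """Run every variant on the same input and compare the outputs.

    If all outputs agree, return the shared result; otherwise signal
    a divergence, which may indicate a fault or an attack that only
    affected some of the variants.
    """
    outputs = [variant(*args) for variant in variants]
    if any(out != outputs[0] for out in outputs[1:]):
        raise DivergenceError(f"variants disagree: {outputs}")
    return outputs[0]


result = run_variants([variant_a, variant_b, variant_c], [1, 2, 3])
```

In this sketch a single disagreeing variant is enough to stop execution; a voting scheme that accepts the majority answer would be an alternative design.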

Fred Cohen was among the very early promoters of such an approach. He proposed a series of rewriting and code reordering transformations that aim at producing massive quantities of different versions of operating system functions.^[7] These ideas have been developed over the years and have led to the construction of integrated obfuscation schemes to protect key functions in large software systems.^[8]

Another approach to increasing software diversity for protection consists of adding randomness to certain core processes, such as memory loading. Randomness implies that all versions of the same program run differently from each other, which in turn creates a diversity of program behaviors. This idea was initially proposed and tested experimentally by Stephanie Forrest and her colleagues.^[9]

Recent work on automatic software diversity explores different forms of program transformations that slightly vary the behavior of programs. The goal is to evolve one program into a population of diverse programs that all provide similar services to users, but with different code.^[10] This diversity of code enhances the protection of users against a single attack that could crash all programs at the same time.

Transformation operators include:^[11]

code layout randomization: reorder functions in code
globals layout randomization: reorder and pad globals
stack variable randomization: reorder variables in each stack frame
heap layout randomization
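The first two operators in the list above can be sketched on a toy model of a program. This is an illustrative assumption, not how a real multicompiler works: the `Layout` class, the `diversify` function, and the `max_pad` parameter are hypothetical, and a real tool would reorder actual machine code and data rather than lists of names.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Toy description of one program variant (hypothetical model)."""
    functions: tuple  # order of functions in the code section
    globals_: tuple   # (name, padding_bytes) pairs, in memory order


def diversify(functions, global_vars, seed, max_pad=16):
    """Derive one variant layout from a seed.

    Code layout randomization: shuffle the order of the functions.
    Globals layout randomization: shuffle the globals and follow each
    one with a random amount of padding (0..max_pad bytes).
    """
    rng = random.Random(seed)  # seeded, so a variant can be rebuilt

    funcs = list(functions)
    rng.shuffle(funcs)

    globs = list(global_vars)
    rng.shuffle(globs)
    padded = tuple((name, rng.randrange(0, max_pad + 1)) for name in globs)

    return Layout(tuple(funcs), padded)


# Build a small population of variants from different seeds.
population = [
    diversify(["main", "parse", "render", "save"],
              ["config", "cache", "counter"],
              seed=s)
    for s in range(5)
]
```

Every variant contains the same functions and globals, so its behavior is meant to be unchanged; only the positions differ, which is what makes an address-dependent attack harder to reuse across variants.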

Natural software diversity

Some functionality is available in multiple interchangeable implementations. This natural diversity can be exploited; for example, it has been shown to be valuable for increasing security in cloud systems.^[12]
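As a minimal sketch of exploiting interchangeable implementations, the example below picks one of two sorting routines per deployment. The scenario is an assumption made for illustration: `IMPLEMENTATIONS` and `pick_implementation` are hypothetical names, and in practice the interchangeable components would more likely be whole libraries, servers, or operating systems.

```python
import heapq
import random


def sort_builtin(xs):
    """Sort with Python's built-in sorted()."""
    return sorted(xs)


def sort_heap(xs):
    """Sort by repeatedly popping the smallest item from a heap."""
    heap = list(xs)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


# Interchangeable implementations of the same functionality.
IMPLEMENTATIONS = {
    "builtin": sort_builtin,
    "heap": sort_heap,
}


def pick_implementation(rng):
    """Choose one implementation, e.g. once per deployment."""
    name = rng.choice(sorted(IMPLEMENTATIONS))
    return name, IMPLEMENTATIONS[name]


name, sort_fn = pick_implementation(random.Random(2023))
ordered = sort_fn([3, 1, 2])
```

Because both routines satisfy the same contract, callers are unaffected by the choice, while a flaw specific to one implementation would not be present in every deployment.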

References

[1] Deswarte, Y.; Kanoun, K.; Laprie, J.-C. (July 1998). "Diversity against accidental and deliberate faults". Proceedings Computer Security, Dependability, and Assurance: From Needs to Solutions (Cat. No.98EX358). IEEE Comput. Soc. pp. 171–181. CiteSeerX 10.1.1.27.9420. doi:10.1109/csda.1998.798364. ISBN 978-0769503370. S2CID 5597924.
[2] Knight, John C. (2011), "Diversity", Dependable and Historic Computing, Lecture Notes in Computer Science, vol. 6875, Springer Berlin Heidelberg, pp. 298–312, doi:10.1007/978-3-642-24541-1_23, ISBN 9783642245404
[3] Just, James E.; Cornwell, Mark (2004-10-29). "Review and analysis of synthetic diversity for breaking monocultures". Proceedings of the 2004 ACM workshop on Rapid malcode. ACM. pp. 23–32. CiteSeerX 10.1.1.76.3691. doi:10.1145/1029618.1029623. ISBN 978-1581139709. S2CID 358885.
[4] Schaefer, Ina; Rabiser, Rick; Clarke, Dave; Bettini, Lorenzo; Benavides, David; Botterweck, Goetz; Pathak, Animesh; Trujillo, Salvador; Villela, Karina (2012-07-28). "Software diversity: state of the art and perspectives". International Journal on Software Tools for Technology Transfer. 14 (5): 477–495. CiteSeerX 10.1.1.645.1960. doi:10.1007/s10009-012-0253-y. ISSN 1433-2779. S2CID 7347285.
[5] "Protecting Applications with Automated Software Diversity". Galois, Inc. 2018-09-10. Retrieved 2019-02-12.
[6] Coppens, Bart; De Sutter, Bjorn; Volckaert, Stijn (2018-03-01), "Multi-variant execution environments", The Continuing Arms Race: Code-Reuse Attacks and Defenses, ACM, pp. 211–258, doi:10.1145/3129743.3129752, ISBN 9781970001839, S2CID 189007860
[7] Cohen, Frederick B. (1993). "Operating system protection through program evolution" (PDF). Computers & Security. 12 (6): 565–584. doi:10.1016/0167-4048(93)90054-9. ISSN 0167-4048.
[8] Chenxi Wang; Davidson, J.; Hill, J.; Knight, J. (2001). "Protection of software-based survivability mechanisms". Proceedings International Conference on Dependable Systems and Networks (PDF). IEEE Comput. Soc. pp. 193–202. CiteSeerX 10.1.1.1.7416. doi:10.1109/dsn.2001.941405. ISBN 978-0769511016. S2CID 15860593. Archived (PDF) from the original on April 30, 2017.
[9] Forrest, S.; Somayaji, A.; Ackley, D.H. (1997). "Building diverse computer systems". Proceedings. The Sixth Workshop on Hot Topics in Operating Systems (Cat. No.97TB100133) (PDF). IEEE Comput. Soc. Press. pp. 67–72. CiteSeerX 10.1.1.131.3961. doi:10.1109/hotos.1997.595185. ISBN 978-0818678349. S2CID 1332487.
[10] Schulte, Eric; Fry, Zachary P.; Fast, Ethan; Weimer, Westley; Forrest, Stephanie (2013-07-28). "Software mutational robustness" (PDF). Genetic Programming and Evolvable Machines. 15 (3): 281–312. arXiv:1204.4224. doi:10.1007/s10710-013-9195-8. ISSN 1389-2576. S2CID 11520214.
[11] "Automated Software Diversity: Sometimes More Isn't Merrier". Galois, Inc. 2018-09-10. Retrieved 2019-02-12.
[12] Gorbenko, Anatoliy; Kharchenko, Vyacheslav; Tarasyuk, Olga; Romanovsky, Alexander (2011), Using Diversity in Cloud-Based Deployment Environment to Avoid Intrusions, Lecture Notes in Computer Science, vol. 6968, Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 145–155, doi:10.1007/978-3-642-24124-6_14, ISBN 978-3-642-24123-9

