System usability scale

Standard version of the system usability scale: ten statements, each rated on a five-point scale from 1 (strongly disagree) to 5 (strongly agree).

1. I think that I would like to use this system frequently.
2. I found the system unnecessarily complex.
3. I thought the system was easy to use.
4. I think that I would need the support of a technical person to be able to use this system.
5. I found the various functions in this system were well integrated.
6. I thought there was too much inconsistency in this system.
7. I would imagine that most people would learn to use this system very quickly.
8. I found the system very cumbersome to use.
9. I felt very confident using the system.
10. I needed to learn a lot of things before I could get going with this system.

In systems engineering, the system usability scale (SUS) is a simple, ten-item attitude Likert scale giving a global view of subjective assessments of usability. It was developed by John Brooke [1] at Digital Equipment Corporation in the UK in 1986 as a tool to be used in usability engineering of electronic office systems.


The usability of a system, as defined by the ISO standard ISO 9241 Part 11, can be measured only by taking into account the context of use of the system—i.e., who is using the system, what they are using it for, and the environment in which they are using it. Furthermore, measurements of usability have several different aspects: effectiveness (can users successfully achieve their objectives), efficiency (how much effort and resource is expended in achieving those objectives), and satisfaction (was the experience satisfactory).

Measures of effectiveness and efficiency are also context specific. Effectiveness in using a system for controlling a continuous industrial process would generally be measured in very different terms to, say, effectiveness in using a text editor. Thus, it can be difficult, if not impossible, to answer the question "is system A more usable than system B", because the measures of effectiveness and efficiency may be very different. However, it can be argued that given a sufficiently high-level definition of subjective assessments of usability, comparisons can be made between systems.

The formula for computing the final SUS score requires converting the raw scores, depending on whether an item is positively worded (odd-numbered) or negatively worded (even-numbered), and then utilizing the following equation [2]:

SUS score = (X + Y) × 2.5

where X is the sum of the points for all odd-numbered items minus 5, and Y is 25 minus the sum of the points for all even-numbered items. Equivalently, each odd-numbered item contributes its scale position minus 1, each even-numbered item contributes 5 minus its scale position, and the sum of these 0–4 contributions is multiplied by 2.5 to yield a score between 0 and 100.
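
As a concrete illustration, the scoring rule can be expressed in a few lines of Python. This is a minimal sketch only; the function name and the example responses are illustrative and not taken from the source.

def sus_score(responses):
    """Compute the overall SUS score from ten raw responses (each 1-5, item 1 first)."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("expected ten responses on a 1-5 scale")
    contributions = []
    for i, r in enumerate(responses, start=1):
        if i % 2 == 1:                      # positively worded items 1, 3, 5, 7, 9
            contributions.append(r - 1)
        else:                               # negatively worded items 2, 4, 6, 8, 10
            contributions.append(5 - r)
    return 2.5 * sum(contributions)         # final score on the 0-100 scale

# Example with illustrative responses: prints 80.0
print(sus_score([5, 2, 4, 1, 4, 2, 5, 2, 4, 3]))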

SUS has generally been seen as providing this type of high-level subjective view of usability and is thus often used in carrying out comparisons of usability between systems. Because it yields a single score on a scale of 0–100, it can be used to compare even systems that are outwardly dissimilar. This one-dimensional aspect of the SUS is both a benefit and a drawback, because the questionnaire is necessarily quite general.

Recently, Lewis and Sauro [3] suggested a two-factor orthogonal structure, which practitioners may use to score the SUS on independent Usability and Learnability dimensions. In an independent analysis, Borsci, Federici and Lauriola [4] confirmed the two-factor structure of the SUS, while also showing that those factors (Usability and Learnability) are correlated.
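
A sketch of how such subscale scoring might be implemented is shown below, assuming (as reported by Lewis and Sauro) that items 4 and 10 form the Learnability factor and the remaining eight items form the Usability factor; the multipliers 12.5 and 3.125 simply rescale each subscale to a 0–100 range, and the function name is illustrative.

def sus_subscales(responses):
    """Return (usability, learnability) subscale scores, each on a 0-100 scale."""
    # Same per-item 0-4 contributions as for the overall SUS score.
    contrib = [(r - 1) if i % 2 == 1 else (5 - r)
               for i, r in enumerate(responses, start=1)]
    learnability = 12.5 * (contrib[3] + contrib[9])               # items 4 and 10
    usability = 3.125 * (sum(contrib) - contrib[3] - contrib[9])  # remaining eight items
    return usability, learnability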

The SUS has been widely used in the evaluation of a range of systems. Bangor, Kortum and Miller [5] have used the scale extensively over a ten-year period and have produced normative data that allow SUS ratings to be positioned relative to other systems. They propose an extension to SUS to provide an adjective rating that correlates with a given score. Based on a review of hundreds of usability studies, Sauro and Lewis [6] proposed a curved grading scale for mean SUS scores.

Related Research Articles

Usability testing is a technique used in user-centered interaction design to evaluate a product by testing it on users. This can be seen as an irreplaceable usability practice, since it gives direct input on how real users use the system. It is more concerned with the design intuitiveness of the product and is tested with users who have no prior exposure to it. Such testing is paramount to the success of an end product, as a fully functioning application that creates confusion amongst its users will not last for long. This is in contrast with usability inspection methods, where experts use different methods to evaluate a user interface without involving users.

Questionnaire construction refers to the design of a questionnaire to gather statistically useful information about a given topic. When properly constructed and responsibly administered, questionnaires can provide valuable data about any given subject.

Usability: capacity of a system for its users to perform tasks

Usability can be described as the capacity of a system to provide a condition for its users to perform the tasks safely, effectively, and efficiently while enjoying the experience. In software engineering, usability is the degree to which software can be used by specified consumers to achieve quantified objectives with effectiveness, efficiency, and satisfaction in a quantified context of use.

Likert scale: psychometric measurement scale

A Likert scale is a psychometric scale named after its inventor, American social psychologist Rensis Likert, which is commonly used in research questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term is often used interchangeably with rating scale, although there are other types of rating scales.

ISO 9241 is a multi-part standard from the International Organization for Standardization (ISO) covering ergonomics of human-system interaction and related, human-centered design processes. It is managed by the ISO Technical Committee 159. It was originally titled Ergonomic requirements for office work with visual display terminals (VDTs). From 2006 onwards, the standards were retitled to the more generic Ergonomics of Human System Interaction.

Personality test: method of assessing human personality constructs

A personality test is a method of assessing human personality constructs. Most personality assessment instruments are in fact introspective self-report questionnaire measures or reports from life records (L-data) such as rating scales. Attempts to construct actual performance tests of personality have been very limited, even though Raymond Cattell and his colleague Frank Warburton compiled a list of over 2000 separate objective tests that could be used in constructing objective personality tests. One exception, however, was the Objective-Analytic Test Battery, a performance test designed to quantitatively measure 10 factor-analytically discerned personality trait dimensions. A major problem with both L-data and Q-data methods is that, because of item transparency, rating scales and self-report questionnaires are highly susceptible to motivational and response distortion, ranging from lack of adequate self-insight to outright dissimulation, depending on the reason or motivation for the assessment being undertaken.

User experience (UX) is how a user interacts with and experiences a product, system or service. It includes a person's perceptions of utility, ease of use, and efficiency. Improving user experience is important to most companies, designers, and creators when creating and refining products because negative user experience can diminish the use of the product and, therefore, any desired positive impacts. Conversely, designing toward profitability as a main objective often conflicts with ethical user experience objectives and even causes harm. User experience is subjective. However, the attributes that make up the user experience are objective.

Mean opinion score (MOS) is a measure used in the domain of Quality of Experience and telecommunications engineering, representing overall quality of a stimulus or system. It is the arithmetic mean over all individual "values on a predefined scale that a subject assigns to his opinion of the performance of a system quality". Such ratings are usually gathered in a subjective quality evaluation test, but they can also be algorithmically estimated.

Sentiment analysis is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models such as RoBERTa, more difficult data domains can also be analyzed, e.g., news texts, where authors typically express their opinions and sentiments less explicitly.

The Vividness of Visual Imagery Questionnaire (VVIQ) was developed in 1973 by the British psychologist David Marks. The VVIQ consists of 16 items in four groups of 4 items in which the participant is invited to consider the mental image formed in thinking about specific scenes and situations. The vividness of the image is rated along a 5-point scale. The questionnaire has been widely used as a measure of individual differences in vividness of visual imagery. The large body of evidence confirms that the VVIQ is a valid and reliable psychometric measure of visual image vividness.

The self-perceived quality-of-life scale is a psychological assessment instrument which is based on a comprehensive theory of the self-perceived quality of life (SPQL) and provides a multi-faceted measurement of health-related and non-health-related aspects of well-being. The scale has become an instrument of choice for monitoring quality of life in some clinical populations, for example, it was adopted by the Positively Sound network for women living with HIV.

User experience evaluation (UXE) or user experience assessment (UXA) refers to a collection of methods, skills and tools utilized to uncover how a person perceives a system before, during and after interacting with it. It is non-trivial to assess user experience since user experience is subjective, context-dependent and dynamic over time. For a UXA study to be successful, the researcher has to select the right dimensions, constructs, and methods and target the research for the specific area of interest such as game, transportation, mobile, etc.

The NASA Task Load Index (NASA-TLX) is a widely used, subjective, multidimensional assessment tool that rates perceived workload in order to assess a task, system, or team's effectiveness or other aspects of performance. It was developed by the Human Performance Group at NASA's Ames Research Center over a three-year development cycle that included more than 40 laboratory simulations. It has been cited in over 4,400 studies, highlighting the influence the NASA-TLX has had in human factors research. It has been used in a variety of domains, including aviation, healthcare and other complex socio-technical domains. It is a subjective, self-reported set of scores and is not an objective measure of task load, which would instead be measured using objective metrics that examine the product of the speed and accuracy with which users perform a task.

The Health Dynamics Inventory (HDI) is a 50-item self-report questionnaire developed to evaluate mental health functioning and change over time and treatment. The HDI was written to evaluate the three aspects of mental disorders as described in the Diagnostic and Statistical Manual of Mental Disorders (DSM): "clinically significant behavioral or psychological syndrome or pattern...associated with present distress...or disability". This also corresponds to the phase model described by Howard and colleagues. Accordingly, the HDI assesses (1) the experience of emotional or behavioral symptoms that define mental illness, such as dysphoria, worry, angry outbursts, low self-esteem, or excessive drinking, (2) the level of emotional distress related to these symptoms, and (3) the impairment or problems fulfilling the major roles of one's life.

Component-based usability testing (CBUT) is a testing approach which aims at empirically testing the usability of an interaction component. The latter is defined as an elementary unit of an interactive system on which behavior-based evaluation is possible. For this, a component needs to have an independent state that the user can perceive and control, such as a radio button, a slider, or a whole word-processor application. The CBUT approach can be regarded as part of the component-based software engineering branch of software engineering.

The Questionnaire For User Interaction Satisfaction (QUIS) is a tool developed to assess users' subjective satisfaction with specific aspects of the human-computer interface. It was developed in 1987 by a multi-disciplinary team of researchers at the University of Maryland Human–Computer Interaction Lab. The QUIS is currently at Version 7.0, with a demographic questionnaire, a measure of overall system satisfaction along 6 scales, and measures of 9 specific interface factors. These 9 factors are: screen factors, terminology and system feedback, learning factors, system capabilities, technical manuals, on-line tutorials, multimedia, teleconferencing, and software installation. It is currently available in German, Italian, Portuguese, and Spanish.

The International Personality Item Pool (IPIP) is a public domain collection of items for use in personality tests. It is managed by the Oregon Research Institute.

Tools, devices, and software must be evaluated from different points of view, such as their technical properties or their usability, before their release on the market. Usability evaluation allows assessing whether the product under evaluation is efficient enough, effective enough, and sufficiently satisfactory for the users. For this assessment to be objective, there is a need for measurable goals that the system must achieve. Such goals are called usability goals; they are objective criteria against which the results of the usability evaluation are compared in order to assess the usability of the product under evaluation.

The six-factor model of psychological well-being is a theory developed by Carol Ryff that determines six factors that contribute to an individual's psychological well-being, contentment, and happiness. Psychological well-being consists of self-acceptance, positive relationships with others, autonomy, environmental mastery, a feeling of purpose and meaning in life, and personal growth and development. Psychological well-being is attained by achieving a state of balance affected by both challenging and rewarding life events.

The Pittsburgh Sleep Quality Index (PSQI) is a self-report questionnaire that assesses sleep quality over a 1-month time interval. The measure consists of 19 individual items, creating 7 components that produce one global score, and takes 5–10 minutes to complete. Developed by researchers at the University of Pittsburgh, the PSQI is intended to be a standardized sleep questionnaire for clinicians and researchers to use with ease and is used for multiple populations. The questionnaire has been used in many settings, including research and clinical activities, and has been used in the diagnosis of sleep disorders. Clinical studies have found the PSQI to be reliable and valid in the assessment of sleep problems to some degree, but more so with self-reported sleep problems and depression-related symptoms than actigraphic measures.

References

  1. Brooke, John (1996). "SUS: a 'quick and dirty' usability scale". In P. W. Jordan; B. Thomas; B. A. Weerdmeester; A. L. McClelland (eds.). Usability Evaluation in Industry. London: Taylor and Francis.
  2. Lewis, James R. (2018). "The System Usability Scale: Past, Present, and Future". International Journal of Human–Computer Interaction. 34 (7): 577–590. doi:10.1080/10447318.2018.1455307. ISSN 1044-7318.
  3. Lewis, J. R.; Sauro, J. (2009). "The Factor Structure of the System Usability Scale". Human Computer Interaction International Conference (HCII 2009). San Diego, California.
  4. Borsci, Simone; Federici, Stefano; Lauriola, Marco (2009). "On the dimensionality of the System Usability Scale: a test of alternative measurement models". Cognitive Processing. 10 (3): 193–197. doi:10.1007/s10339-009-0268-9. PMID 19565283. S2CID 1330990.
  5. Bangor, Aaron; Kortum, Philip T.; Miller, James T. (2008). "An Empirical Evaluation of the System Usability Scale". International Journal of Human–Computer Interaction. 24 (6): 574–594. doi:10.1080/10447310802205776. S2CID 29843973.
  6. Sauro, J.; Lewis, J. R. (2012). Quantifying the User Experience: Practical Statistics for User Research. Waltham, Massachusetts: Morgan Kaufmann. doi:10.1016/C2010-0-65192-3. ISBN 9780123849687.

Further reading