NASA-TLX


The NASA Task Load Index (NASA-TLX) is a widely used [1] subjective, multidimensional assessment tool that rates perceived workload in order to assess the effectiveness or other aspects of performance of a task, system, or team (task loading). It was developed by the Human Performance Group at NASA's Ames Research Center over a three-year development cycle that included more than 40 laboratory simulations. [2] [3] It has been cited in over 4,400 studies, [4] which highlights its influence on human factors research. It has been used in a variety of domains, including aviation, healthcare and other complex socio-technical domains. [1] The NASA-TLX is a set of subjective, self-reported scores and is not an objective measure of task load; objective task load should instead be measured with metrics that examine the product of the speed and accuracy of users performing a task.


Scales

Paper-and-pencil version of the NASA-TLX rating scale

NASA-TLX originally consisted of two parts. In the first part, the total workload is divided into six subjective subscales that are presented on a single page: Mental Demand, Physical Demand, Temporal Demand, Own Performance, Effort, and Frustration Level.

Each subscale has a description that the subject should read before rating. The subscales are rated for each task on a 100-point range in 5-point steps, and the ratings are then combined into the task load index. Providing a description for each measurement can help participants answer accurately. [5] The descriptions are as follows:

Mental Demand
How much mental and perceptual activity was required? Was the task easy or demanding, simple or complex?
Physical Demand
How much physical activity was required? Was the task easy or demanding, slack or strenuous?
Temporal Demand
How much time pressure did you feel due to the pace at which the tasks or task elements occurred? Was the pace slow or rapid?
Own Performance
How successful were you in performing the task? How satisfied were you with your performance?
Effort
How hard did you have to work (mentally and physically) to accomplish your level of performance?
Frustration Level
How irritated, stressed, and annoyed versus content, relaxed, and complacent did you feel during the task?
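
The rating format described above can be checked mechanically before scoring. The following is a minimal sketch, not part of the official instrument: it assumes the 100-point range runs from 0 to 100 inclusive, and the names `SUBSCALES` and `validate_ratings` are hypothetical.

```python
# The six subscales, named as in the list above.
SUBSCALES = (
    "Mental Demand",
    "Physical Demand",
    "Temporal Demand",
    "Own Performance",
    "Effort",
    "Frustration Level",
)


def validate_ratings(ratings):
    """Check one participant's ratings for a single task.

    Assumption: each rating lies between 0 and 100 inclusive
    and is a multiple of 5 (the 5-point steps).
    """
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    for name in SUBSCALES:
        value = ratings[name]
        if not 0 <= value <= 100 or value % 5 != 0:
            raise ValueError(f"{name}: {value} is not on the 5-point grid")
    # Return the ratings in a fixed subscale order.
    return {name: ratings[name] for name in SUBSCALES}
```

A rating of 52 for any subscale, for example, would be rejected because it does not fall on a 5-point step.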

Analysis

The second part of the TLX creates an individual weighting of these subscales by having subjects compare them pairwise according to their perceived importance. For each pair, the subject chooses which measurement is more relevant to workload. The number of times each subscale is chosen is its weight. [6] Each weight is multiplied by the scale score for that dimension, and the sum of these products is divided by 15 (the number of pairwise comparisons) to give a workload score from 0 to 100, the overall task load index. Many researchers omit the pairwise comparisons, however, and refer to the resulting test as "Raw TLX". [7] There is evidence supporting this shortened version over the full one, since it might increase experimental validity. [8]
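
The weighting procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than an official implementation: the function names, the dictionary layout and the example numbers are all hypothetical. Six subscales give 15 distinct pairs, which is where the divisor of 15 comes from.

```python
from itertools import combinations

SUBSCALES = (
    "Mental Demand",
    "Physical Demand",
    "Temporal Demand",
    "Own Performance",
    "Effort",
    "Frustration Level",
)


def tally_weights(choices):
    """Count how often each subscale was chosen in the pairwise comparisons.

    `choices` maps each pair (in the order produced by
    combinations(SUBSCALES, 2)) to the subscale the subject picked.
    """
    weights = {name: 0 for name in SUBSCALES}
    for pair in combinations(SUBSCALES, 2):
        winner = choices[pair]
        if winner not in pair:
            raise ValueError(f"{winner!r} is not a member of {pair}")
        weights[winner] += 1
    return weights


def weighted_tlx(ratings, weights):
    """Sum of weight x rating over the six subscales, divided by 15."""
    if sum(weights.values()) != 15:
        raise ValueError("weights must come from all 15 comparisons")
    return sum(ratings[name] * weights[name] for name in SUBSCALES) / 15


# Made-up example data for one participant and one task.
ratings = {
    "Mental Demand": 70, "Physical Demand": 10, "Temporal Demand": 55,
    "Own Performance": 40, "Effort": 60, "Frustration Level": 30,
}
weights = {
    "Mental Demand": 5, "Physical Demand": 0, "Temporal Demand": 3,
    "Own Performance": 2, "Effort": 4, "Frustration Level": 1,
}
print(round(weighted_tlx(ratings, weights), 1))  # 57.7
```

Because the weights always sum to 15, the result stays within the range of the ratings themselves.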

When the Raw TLX is used, individual subscales may be dropped if they are less relevant to the task. [1] [7]
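
A Raw TLX score with optional subscale omission might be computed as below. This sketch assumes the score is the unweighted mean of the retained ratings, which is one common way of scoring the Raw TLX; the function name `raw_tlx` and the example ratings are hypothetical.

```python
def raw_tlx(ratings, dropped=()):
    """Unweighted mean of the subscale ratings that were kept.

    Assumption: Raw TLX is scored as a simple average; subscales
    listed in `dropped` are left out of the average entirely.
    """
    kept = [name for name in ratings if name not in dropped]
    if not kept:
        raise ValueError("at least one subscale must be kept")
    return sum(ratings[name] for name in kept) / len(kept)


# Made-up example ratings for one participant and one task.
ratings = {
    "Mental Demand": 70, "Physical Demand": 10, "Temporal Demand": 55,
    "Own Performance": 40, "Effort": 60, "Frustration Level": 30,
}
print(round(raw_tlx(ratings), 1))                       # 44.2
print(raw_tlx(ratings, dropped=("Physical Demand",)))   # 51.0
```

Dropping a subscale changes the score, so scores computed with different subscale sets are not directly comparable.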

Administration

The official NASA-TLX can be administered using a paper-and-pencil version or the official NASA TLX app for Apple iOS. [9] There are also numerous unofficial computerized implementations of the NASA-TLX. These unofficial versions may collect personally identifiable information (PII), which would violate the NASA Human Subject Research Guidelines for the Collection of PII [10] as set down by the NASA Institutional Review Board (IRB). [11]

If a participant is required to answer the TLX questions multiple times, the 15 pairwise comparisons need to be answered only once per task type. [3] If a participant's workload needs to be measured for intrinsically different tasks, the pairwise comparisons may need to be repeated. In every case, the subject should rate all six subjective subscales. These successive ratings, scored using the original pairwise comparisons as weighting factors, are what give an understanding of how overall workload changes. [2]
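
The reuse of one set of weights across repeated ratings can be sketched as follows. This is a hypothetical illustration: the task-type label, the function name `score_sessions` and all numbers are invented. The weights are stored once per task type, while a fresh set of six ratings is collected for every session.

```python
def score_sessions(weights_by_task, sessions):
    """Score successive rating sessions with per-task-type weights.

    `weights_by_task` maps a task type to the weights obtained from
    its 15 pairwise comparisons (answered once per task type).
    `sessions` is a list of (task_type, ratings) in the order given.
    """
    scores = []
    for task_type, ratings in sessions:
        weights = weights_by_task[task_type]
        # Every session must rate all six subscales; a missing
        # rating raises KeyError here.
        total = sum(ratings[name] * w for name, w in weights.items())
        scores.append(total / 15)
    return scores


# Made-up weights, collected once for a hypothetical task type.
weights_by_task = {
    "data entry": {
        "Mental Demand": 5, "Physical Demand": 0, "Temporal Demand": 3,
        "Own Performance": 2, "Effort": 4, "Frustration Level": 1,
    },
}
# Two made-up rating sessions for the same task type.
sessions = [
    ("data entry", {
        "Mental Demand": 70, "Physical Demand": 10, "Temporal Demand": 55,
        "Own Performance": 40, "Effort": 60, "Frustration Level": 30,
    }),
    ("data entry", {
        "Mental Demand": 50, "Physical Demand": 10, "Temporal Demand": 40,
        "Own Performance": 30, "Effort": 45, "Frustration Level": 20,
    }),
]
first, second = score_sessions(weights_by_task, sessions)
print(round(second - first, 1))  # change in overall workload
```

An intrinsically different task would get its own entry in `weights_by_task`, built from a new round of pairwise comparisons.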

While there are multiple ways to administer the NASA-TLX, some may change the results of the test. One study showed that a paper-and-pencil version led to less cognitive workload than processing the information on a computer screen. [12] However, other studies found that versions administered on computer screens, as well as on wearables, can nonetheless stably capture relative changes in workload. [13] To overcome the delay in administering the test, the official NASA TLX Apple iOS app [9] can be used to capture both the pairwise comparison answers and a subject's subscale ratings, and to calculate the final weighted and unweighted results. The official app features a new computer-interface rating scale, termed a Subjective Analogue Equivalent Rating (SAER) scale, which is intended to provide the closest possible user experience to that of the paper-and-pencil version of the NASA-TLX. No other computerized version of the NASA-TLX has successfully implemented this critical element for properly capturing a user's subjective input. [citation needed] Many unofficial computerized versions (both web and software applications) instead use an anchored or locking scale, which defeats the subjective purpose of the original paper-and-pencil implementation of the NASA-TLX.


References

  1. Colligan, L; Potts, HWW; Finn, CT; Sinkin, RA (July 2015). "Cognitive workload changes for nurses transitioning from a legacy system with paper documentation to a commercial electronic health record". International Journal of Medical Informatics. 84 (7): 469–476. doi:10.1016/j.ijmedinf.2015.03.003. PMID 25868807. (subscription required)
  2. NASA (1986). NASA Task Load Index (TLX) v. 1.0 Manual.
  3. Hart, Sandra G.; Staveland, Lowell E. (1988). "Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research" (PDF). In Hancock, Peter A.; Meshkati, Najmedin (eds.). Human Mental Workload. Advances in Psychology. Vol. 52. Amsterdam: North Holland. pp. 139–183. doi:10.1016/S0166-4115(08)62386-9. ISBN 978-0-444-70388-0. Archived from the original (PDF) on 2010-01-07. Retrieved 2010-08-10.
  4. External link to Google Scholar.
  5. Schuff, D; Corral, K; Turetken, O (December 2011). "Comparing the understandability of alternative data warehouse schemas: An empirical study". Decision Support Systems. 52 (1): 9–20. doi:10.1016/j.dss.2011.04.003. (subscription required)
  6. Rubio, S; Diaz, E; Martin, J; Puente, JM (January 2004). "Evaluation of subjective mental workload: A comparison of SWAT, NASA-TLX, and workload profile methods". Applied Psychology. 53 (1): 61–86. doi:10.1111/j.1464-0597.2004.00161.x. (subscription required)
  7. Hart, Sandra G. (October 2006). "NASA-Task Load Index (NASA-TLX); 20 Years Later" (PDF). Proceedings of the Human Factors and Ergonomics Society Annual Meeting. 50 (9): 904–908. doi:10.1177/154193120605000909. S2CID 6292200.
  8. Bustamante, EA; Spain, RD (September 2008). "Measurement Invariance of the NASA TLX". Proceedings of the Human Factors and Ergonomics Society Annual Meeting. 52 (19): 1522–1526. doi:10.1177/154193120805201946. S2CID 143607921. (subscription required)
  9. Division, Human Systems Integration. "Human Systems Integration Division @ NASA Ames - Outreach & Publications". hsi.arc.nasa.gov. Retrieved 2018-05-09.
  10. "NPD 1382.17J - main". nodis3.gsfc.nasa.gov. Retrieved 2018-05-09.
  11. MD, Kathleen McMonigal. "IRB - Conducting Research - Working with Other NASA Centers and Other Institutions". irb.nasa.gov. Retrieved 2018-05-09.
  12. Noyes, JM; Bruneau, DPJ (April 2007). "A self-analysis of the NASA-TLX workload measure". Ergonomics. 50 (4): 514–519. doi:10.1080/00140130701235232. PMID 17575712. S2CID 26806345. (subscription required)
  13. Mach, Sebastian; Gründling, Jan P.; Schmalfuß, Franziska; Krems, Josef F. (2019). "How to Assess Mental Workload Quick and Easy at Work: A Method Comparison". Proceedings of the 20th Congress of the International Ergonomics Association (IEA 2018). Advances in Intelligent Systems and Computing. Vol. 825. pp. 978–984. doi:10.1007/978-3-319-96068-5_106. ISBN 978-3-319-96067-8. S2CID 58681265.