NASA-TLX

The NASA Task Load Index (NASA-TLX) is a widely used, [1] subjective, multidimensional assessment tool that rates perceived workload in order to assess the effectiveness, or other aspects of performance, of a task, system, or team. It was developed by the Human Performance Group at NASA's Ames Research Center over a three-year development cycle that included more than 40 laboratory simulations. [2] [3] It has been cited in over 4,400 studies, [4] which highlights its influence on human factors research. It has been used in a variety of complex socio-technical domains, including aviation and healthcare. [1] The NASA-TLX is a set of subjective, self-reported scores and is not an objective measure of task load; objective task load should instead be measured with metrics that examine the product of the speed and accuracy of users performing a task.

Scales

Paper-and-pencil version of the NASA-TLX rating scale

NASA-TLX originally consisted of two parts. In the first part, total workload is divided into six subjective subscales, presented on a single page: Mental Demand, Physical Demand, Temporal Demand, Own Performance, Effort, and Frustration Level.

Each subscale has a description that the subject should read before rating. Each task is rated on every subscale within a 100-point range in 5-point steps. These ratings are then combined into the task load index. Providing a description for each subscale can help participants answer accurately. [5] The descriptions are as follows:

Mental Demand
How much mental and perceptual activity was required? Was the task easy or demanding, simple or complex?
Physical Demand
How much physical activity was required? Was the task easy or demanding, slack or strenuous?
Temporal Demand
How much time pressure did you feel due to the pace at which the tasks or task elements occurred? Was the pace slow or rapid?
Own Performance
How successful were you in performing the task? How satisfied were you with your performance?
Effort
How hard did you have to work (mentally and physically) to accomplish your level of performance?
Frustration Level
How irritated, stressed, and annoyed versus content, relaxed, and complacent did you feel during the task?
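The rating format described above (a 100-point range in 5-point steps) can be checked mechanically when responses are entered by hand. The following is a minimal sketch only; the endpoints 0 and 100 and the names `ALLOWED_RATINGS` and `is_valid_rating` are assumptions made for illustration and are not taken from the NASA-TLX manual.

```python
# Assumed scale: integers from 0 to 100 inclusive, in steps of 5.
ALLOWED_RATINGS = list(range(0, 101, 5))


def is_valid_rating(value):
    """Return True if `value` is an integer rating on the assumed 0-100 scale in 5-point steps."""
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value <= 100 and value % 5 == 0
```

Under these assumptions the scale has 21 possible values, so a rating such as 65 is accepted while 63 or 105 is rejected.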

Analysis

The second part of the TLX creates an individual weighting of these subscales by having subjects compare them pairwise according to their perceived importance. For each of the 15 pairs, the subject chooses which subscale is more relevant to workload. The number of times a subscale is chosen is its weight. [6] Each subscale's rating is multiplied by its weight, and the products are summed and divided by 15 to give a workload score from 0 to 100, the overall task load index. Many researchers omit the pairwise comparisons, however, and refer to the resulting test as "Raw TLX". [7] There is evidence supporting this shortened version over the full one, since it may increase experimental validity. [8]
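The weighting procedure can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not an official implementation: the function names, the dictionary-based data layout, and the example ratings are all hypothetical.

```python
from itertools import combinations

SUBSCALES = (
    "Mental Demand", "Physical Demand", "Temporal Demand",
    "Own Performance", "Effort", "Frustration Level",
)


def tally_weights(choices):
    """Count how often each subscale was chosen across the 15 pairwise comparisons.

    `choices` maps each pair (a frozenset of two subscale names) to the
    subscale the subject judged more relevant to workload.
    """
    expected_pairs = {frozenset(pair) for pair in combinations(SUBSCALES, 2)}
    if set(choices) != expected_pairs:
        raise ValueError("expected exactly one choice for each of the 15 pairs")
    weights = {name: 0 for name in SUBSCALES}
    for pair, chosen in choices.items():
        if chosen not in pair:
            raise ValueError(f"{chosen!r} is not a member of its pair")
        weights[chosen] += 1
    return weights


def weighted_tlx(ratings, weights):
    """Sum of rating * weight over the six subscales, divided by 15."""
    return sum(ratings[name] * weights[name] for name in SUBSCALES) / 15


# Hypothetical subject who always picks the subscale listed earlier in SUBSCALES,
# giving weights of 5, 4, 3, 2, 1 and 0.
example_choices = {frozenset(pair): pair[0] for pair in combinations(SUBSCALES, 2)}
example_ratings = dict(zip(SUBSCALES, (70, 10, 60, 40, 65, 30)))
example_weights = tally_weights(example_choices)
example_score = weighted_tlx(example_ratings, example_weights)
```

The weights always sum to 15 because exactly one subscale is chosen in each comparison, which is why dividing by 15 keeps the score in the same 0 to 100 range as the ratings. For the hypothetical data above the score is 715 / 15, or about 47.7.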

When using the "Raw TLX", individual subscales may be dropped if they are less relevant to the task. [1] [7]
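A Raw TLX score can be sketched as follows, assuming the unweighted score is the simple mean of the retained subscale ratings (a common convention, but stated here as an assumption). The function name `raw_tlx`, its `include` parameter, and the example ratings are hypothetical.

```python
def raw_tlx(ratings, include=None):
    """Unweighted mean of subscale ratings.

    `ratings` maps subscale names to ratings. `include` optionally lists the
    subscales to keep, so that less relevant ones can be dropped.
    """
    names = list(ratings) if include is None else list(include)
    if not names:
        raise ValueError("at least one subscale is required")
    missing = [name for name in names if name not in ratings]
    if missing:
        raise KeyError(f"no rating for: {missing}")
    return sum(ratings[name] for name in names) / len(names)


# Hypothetical ratings for one task.
ratings = {
    "Mental Demand": 70, "Physical Demand": 10, "Temporal Demand": 60,
    "Own Performance": 40, "Effort": 65, "Frustration Level": 30,
}
all_six = raw_tlx(ratings)  # 275 / 6
without_physical = raw_tlx(
    ratings, include=[name for name in ratings if name != "Physical Demand"]
)  # 265 / 5
```

Dropping a subscale changes the denominator as well as the sum, so scores computed from different subsets of subscales are not directly comparable.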

Administration

The Official NASA-TLX can be administered using a paper-and-pencil version or using the Official NASA TLX for Apple iOS App. [9] There are also numerous unofficial computerized implementations of the NASA-TLX. These unofficial versions may collect personally identifiable information (PII), which is a violation of the NASA Human Subject Research Guidelines for the Collection of PII [10] as set down by the NASA Institutional Review Board (IRB). [11]

If a participant is required to answer the TLX questions multiple times, they only need to answer the 15 pairwise comparisons once per task type. [3] If a participant's workload needs to be measured for intrinsically different tasks, the pairwise comparisons may need to be revisited. In every case, the subject should rate all 6 subjective subscales. Scoring these successive ratings with the original pairwise comparisons as weighting factors shows how overall workload changes. [2]
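Reusing one set of weights across repeated administrations of the same task type might look like the sketch below. The function name `workload_over_sessions`, the weights, and the ratings are hypothetical values chosen for illustration.

```python
def workload_over_sessions(weights, rating_sessions):
    """Score each session's ratings with the same pairwise-comparison weights.

    `weights` maps subscale names to the number of times each was chosen
    (summing to 15). `rating_sessions` is a list of dicts, one per
    administration, each rating all six subscales.
    """
    if sum(weights.values()) != 15:
        raise ValueError("weights from 15 pairwise comparisons must sum to 15")
    scores = []
    for ratings in rating_sessions:
        if set(ratings) != set(weights):
            raise ValueError("every session must rate all six subscales")
        scores.append(sum(ratings[name] * weights[name] for name in weights) / 15)
    return scores


# Hypothetical weights collected once for this task type.
weights = {
    "Mental Demand": 5, "Physical Demand": 0, "Temporal Demand": 4,
    "Own Performance": 2, "Effort": 3, "Frustration Level": 1,
}
# Hypothetical ratings from two successive administrations.
sessions = [
    {"Mental Demand": 70, "Physical Demand": 10, "Temporal Demand": 60,
     "Own Performance": 40, "Effort": 65, "Frustration Level": 30},
    {"Mental Demand": 50, "Physical Demand": 10, "Temporal Demand": 40,
     "Own Performance": 30, "Effort": 45, "Frustration Level": 20},
]
scores = workload_over_sessions(weights, sessions)
change = scores[1] - scores[0]
```

With these made-up numbers the two sessions score 895 / 15 and 625 / 15, so the overall workload drops by 18 points between administrations.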

While there are multiple ways to administer the NASA-TLX, some may change the results of the test. One study showed that a paper-and-pencil version led to less cognitive workload than processing the information on a computer screen. [12] However, other studies found that computerized versions, including those on wearables, can nonetheless stably capture relative changes in workload. [13] To overcome the delay in administering the test, the Official NASA TLX Apple iOS App [9] can be used to capture both the pairwise comparison answers and a subject's subscale ratings, and to calculate the final weighted and unweighted results. The Official NASA TLX App features a new computer-interface response rating scale, termed a Subjective Analogue Equivalent Rating (SAER) scale, that provides the closest possible user experience to that of the paper-and-pencil version of the NASA-TLX. No other computerized version of the NASA-TLX has successfully implemented this critical element for properly capturing a user's subjective input. [citation needed] Many unofficial computerized versions (both web and software applications) instead use an anchored or locking scale, which defeats the subjective purpose of the original paper-and-pencil implementation of the NASA-TLX.


References

  1. Colligan, L; Potts, HWW; Finn, CT; Sinkin, RA (July 2015). "Cognitive workload changes for nurses transitioning from a legacy system with paper documentation to a commercial electronic health record". International Journal of Medical Informatics. 84 (7): 469–476. doi:10.1016/j.ijmedinf.2015.03.003. PMID 25868807. (subscription required)
  2. NASA (1986). NASA Task Load Index (TLX) v. 1.0 Manual.
  3. Hart, Sandra G.; Staveland, Lowell E. (1988). "Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research" (PDF). In Hancock, Peter A.; Meshkati, Najmedin (eds.). Human Mental Workload. Advances in Psychology. Vol. 52. Amsterdam: North Holland. pp. 139–183. doi:10.1016/S0166-4115(08)62386-9. ISBN 978-0-444-70388-0.
  4. External link to Google Scholar.
  5. Schuff, D; Corral, K; Turetken, O (December 2011). "Comparing the understandability of alternative data warehouse schemas: An empirical study". Decision Support Systems. 52 (1): 9–20. doi:10.1016/j.dss.2011.04.003. (subscription required)
  6. Rubio, S; Diaz, E; Martin, J; Puente, JM (January 2004). "Evaluation of subjective mental workload: A comparison of SWAT, NASA-TLX, and workload profile methods". Applied Psychology. 53 (1): 61–86. doi:10.1111/j.1464-0597.2004.00161.x. (subscription required)
  7. Hart, Sandra G. (October 2006). "NASA-Task Load Index (NASA-TLX); 20 Years Later" (PDF). Proceedings of the Human Factors and Ergonomics Society Annual Meeting. 50 (9): 904–908. doi:10.1177/154193120605000909. S2CID 6292200.
  8. Bustamante, EA; Spain, RD (September 2008). "Measurement Invariance of the NASA TLX". Proceedings of the Human Factors and Ergonomics Society Annual Meeting. 52 (19): 1522–1526. doi:10.1177/154193120805201946. S2CID 143607921. (subscription required)
  9. Human Systems Integration Division. "Human Systems Integration Division @ NASA Ames - Outreach & Publications". hsi.arc.nasa.gov. Retrieved 2018-05-09.
  10. "NPD 1382.17J - main". nodis3.gsfc.nasa.gov. Retrieved 2018-05-09.
  11. McMonigal, Kathleen, MD. "IRB - Conducting Research - Working with Other NASA Centers and Other Institutions". irb.nasa.gov. Retrieved 2018-05-09.
  12. Noyes, JM; Bruneau, DPJ (April 2007). "A self-analysis of the NASA-TLX workload measure". Ergonomics. 50 (4): 514–519. doi:10.1080/00140130701235232. PMID 17575712. S2CID 26806345. (subscription required)
  13. Mach, Sebastian; Gründling, Jan P.; Schmalfuß, Franziska; Krems, Josef F. (2019). "How to Assess Mental Workload Quick and Easy at Work: A Method Comparison". Proceedings of the 20th Congress of the International Ergonomics Association (IEA 2018). Advances in Intelligent Systems and Computing. 825: 978–984. doi:10.1007/978-3-319-96068-5_106. ISBN 978-3-319-96067-8. S2CID 58681265.