John K. Kruschke

Alma mater: University of California, Berkeley
Known for: Connectionist models of learning; Bayesian data analysis
Fields: Psychology; statistics
Institutions: Indiana University Bloomington
Thesis: A connectionist model of category learning (1990)
Doctoral advisors: Stephen E. Palmer, Robert Nosofsky
Website: jkkweb.sitehost.iu.edu

John Kendall Kruschke is an American psychologist and statistician known for his work on connectionist models of human learning [1] and on Bayesian statistical analysis. [2] He is Provost Professor Emeritus [3] [4] in the Department of Psychological and Brain Sciences at Indiana University Bloomington. He won the Troland Research Award from the National Academy of Sciences in 2002. [5]

Research

Bayesian statistical analysis

Dissemination

Kruschke's popular textbook, Doing Bayesian Data Analysis, [2] was notable for its accessibility and unique scaffolding of concepts. The first half of the book used the simplest type of data (i.e., dichotomous values) for presenting all the fundamental concepts of Bayesian analysis, including generalized Bayesian power analysis and sample-size planning. The second half of the book used the generalized linear model as a framework for explaining applications to a spectrum of other types of data.
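As a minimal illustration of the dichotomous-data setting used in the book's first half, the sketch below shows conjugate Bayesian updating of a binomial proportion in Python; the prior and data values are hypothetical and are not taken from the textbook.

```python
# Minimal sketch: Bayesian updating of a binomial proportion (dichotomous data).
# The prior and the data counts below are hypothetical illustrations.
from scipy import stats

a_prior, b_prior = 1, 1            # uniform Beta(1, 1) prior on the proportion
successes, failures = 14, 6        # hypothetical dichotomous observations

# Conjugate update: Beta prior + binomial likelihood -> Beta posterior
a_post, b_post = a_prior + successes, b_prior + failures
posterior = stats.beta(a_post, b_post)

print("posterior mean:", posterior.mean())               # about 0.68
print("95% central interval:", posterior.interval(0.95))
```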

Kruschke has written many tutorial articles about Bayesian data analysis, including an open-access article that explains Bayesian and frequentist concepts side by side. [6] An accompanying online app interactively runs frequentist and Bayesian analyses simultaneously. Kruschke gave a video-recorded plenary talk on this topic at the United States Conference on Teaching Statistics (USCOTS).

Bayesian analysis reporting guidelines

Bayesian data analyses are increasing in popularity but are still relatively novel in many fields, and guidelines for reporting Bayesian analyses are useful for researchers, reviewers, and students. Kruschke's open-access Bayesian analysis reporting guidelines (BARG) [7] provide a step-by-step list with explanation. For instance, the BARG recommend that if the analyst uses Bayesian hypothesis testing, then the report should include not only the Bayes factor but also the minimum prior model probability for the posterior model probability to exceed a decision criterion.
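The relationship the BARG point to follows directly from Bayes' rule: posterior odds equal the Bayes factor times the prior odds, so the minimum prior model probability needed to reach a posterior criterion can be computed in one line. The sketch below is an illustration with hypothetical numbers, not code from the guidelines.

```python
# Minimum prior probability of model M1 such that its posterior probability
# reaches a decision criterion c, given a Bayes factor BF in favor of M1.
# From p(M1|D) = BF*pi / (BF*pi + 1 - pi) >= c, solving for pi gives
#   pi >= c / (BF*(1 - c) + c)
def min_prior_prob(bayes_factor: float, criterion: float) -> float:
    return criterion / (bayes_factor * (1.0 - criterion) + criterion)

# Hypothetical example: a Bayes factor of 19 and a 0.95 posterior criterion
# require a prior model probability of at least 0.5.
print(min_prior_prob(19.0, 0.95))   # 0.5
```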

Assessing null values of parameters

Kruschke proposed a decision procedure for assessing null values of parameters based on the uncertainty of the posterior estimate of the parameter: the posterior highest density interval (HDI) is compared with a region of practical equivalence (ROPE) around the null value. [8] This approach contrasts with Bayesian hypothesis testing as model comparison. [9]
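A minimal sketch of an HDI-plus-ROPE style decision from posterior samples is shown below; the ROPE limits, the 95% mass, and the simulated posterior are hypothetical choices for illustration, not prescriptions from the cited article.

```python
# Sketch of an HDI + ROPE style decision from posterior samples. The ROPE
# limits, the 95% mass, and the simulated posterior are hypothetical choices.
import numpy as np

def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples."""
    s = np.sort(samples)
    n_in = int(np.ceil(mass * len(s)))
    widths = s[n_in - 1:] - s[:len(s) - n_in + 1]
    i = int(np.argmin(widths))
    return s[i], s[i + n_in - 1]

def decide(samples, rope=(-0.1, 0.1), mass=0.95):
    lo, hi = hdi(samples, mass)
    if hi < rope[0] or lo > rope[1]:
        return "reject the null value"     # HDI entirely outside the ROPE
    if lo >= rope[0] and hi <= rope[1]:
        return "accept the null value"     # HDI entirely inside the ROPE
    return "withhold a decision"

rng = np.random.default_rng(0)
posterior_samples = rng.normal(loc=0.3, scale=0.08, size=20_000)  # hypothetical posterior
print(decide(posterior_samples))           # "reject the null value" here
```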

Ordinal data

Liddell and Kruschke [10] showed that the common practice of treating ordinal data (such as subjective ratings) as if they were metric values can systematically lead to errors of interpretation, even inversions of means. The problems were addressed by treating ordinal data with ordinal models, in particular an ordered-probit model. Frequentist techniques can also use ordered-probit models, but the authors favored Bayesian techniques for their robustness.
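The sketch below illustrates the ordered-probit idea: a latent normal variable is cut at thresholds to produce ordinal response probabilities. The thresholds and group parameters are hypothetical and chosen to show how observed rating means can order opposite to the latent means.

```python
# Sketch: ordered-probit response probabilities. A latent normal variable with
# mean mu and standard deviation sigma is cut at fixed thresholds; the
# probability of ordinal response k is the normal mass between adjacent cuts.
# Thresholds and group parameters below are hypothetical.
import numpy as np
from scipy.stats import norm

def ordered_probit_probs(mu, sigma, thresholds):
    cuts = np.concatenate(([-np.inf], thresholds, [np.inf]))
    cdf = norm.cdf((cuts - mu) / sigma)
    return np.diff(cdf)                      # one probability per category

thresholds = [1.5, 2.5, 3.5, 4.5]            # a 5-point rating scale
cats = np.arange(1, 6)
p_a = ordered_probit_probs(mu=2.6, sigma=0.5, thresholds=thresholds)
p_b = ordered_probit_probs(mu=2.4, sigma=2.5, thresholds=thresholds)
print((cats * p_a).sum())   # about 2.60: observed-rating mean of group A
print((cats * p_b).sum())   # about 2.66: observed-rating mean of group B
# Group A has the higher latent mean (2.6 vs 2.4), yet group B has the higher
# mean of the observed ratings: treating the ratings as metric inverts the order.
```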

Models of learning

An overview of Kruschke's models of attentional learning through 2010 is provided in a book chapter, [11] which summarizes numerous findings from human learning that suggest attentional learning and presents a series of Kruschke's models within a general framework.

Dimensionality in backpropagation networks

Back-propagation networks are a type of connectionist model at the core of deep-learning neural networks. Kruschke's early work with back-propagation networks created algorithms for expanding or contracting the dimensionality of hidden layers in the network, thereby affecting how the network generalized from training cases to testing cases. [12] The algorithms also improved the speed of learning. [13]

Exemplar-based models and learned attention

The ALCOVE model of associative learning [1] used gradient descent on error, as in back-propagation networks, to learn what stimulus dimensions to attend to or to ignore. The ALCOVE model was derived from the generalized context model [14] of R. M. Nosofsky. These models mathematically represent stimuli in a multi-dimensional space based on human perceived dimensions (such as color, size, etc.), and assume that training examples are stored in memory as complete exemplars (that is, as combinations of values on the dimensions). The ALCOVE model is trained with input-output pairs and gradually associates exemplars with trained outputs while simultaneously shifting attention toward relevant dimensions and away from irrelevant dimensions.
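A simplified sketch of ALCOVE-style activation and learning rules is given below. It keeps the core ideas named above (attention-weighted similarity to stored exemplars, linear association to category nodes, and gradient descent on error for both association weights and attention strengths), but it omits details of the published model such as the humble-teacher values, the response-mapping parameters, and the original parameter settings, so it should be read as an illustration rather than a reimplementation.

```python
# Simplified sketch of ALCOVE-style exemplar activation and learning.
# Not a reimplementation of the published model: the humble-teacher rule,
# response mapping, and original parameter values are omitted or simplified.
import numpy as np

class AlcoveSketch:
    def __init__(self, exemplars, n_categories, c=1.0, lr_w=0.1, lr_a=0.05):
        self.ex = np.asarray(exemplars, dtype=float)      # stored exemplars
        self.alpha = np.ones(self.ex.shape[1])            # attention strengths
        self.w = np.zeros((n_categories, len(self.ex)))   # association weights
        self.c, self.lr_w, self.lr_a = c, lr_w, lr_a      # specificity, learning rates

    def hidden(self, x):
        # Exemplar-node activation: similarity falls off with the
        # attention-weighted city-block distance from the stimulus.
        dist = np.abs(self.ex - x) @ self.alpha
        return np.exp(-self.c * dist)

    def forward(self, x):
        h = self.hidden(x)
        return h, self.w @ h                              # category activations

    def train_trial(self, x, target):
        # target: +1 at the correct category node, -1 elsewhere
        h, out = self.forward(x)
        err = target - out
        # Gradient descent on squared error for association weights ...
        self.w += self.lr_w * np.outer(err, h)
        # ... and for attention strengths, which shift toward dimensions that
        # reduce error; they are kept nonnegative.
        back = err @ self.w                               # error at each exemplar node
        grad = self.c * (back * h) @ np.abs(self.ex - x)
        self.alpha = np.maximum(0.0, self.alpha - self.lr_a * grad)

# Hypothetical usage: four 2-D stimuli where only dimension 0 predicts the category.
exemplars = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 0, 1, 1]
net = AlcoveSketch(exemplars, n_categories=2)
for _ in range(50):
    for x, lab in zip(exemplars, labels):
        t = np.where(np.arange(2) == lab, 1.0, -1.0)
        net.train_trial(np.asarray(x, dtype=float), t)
print(net.alpha)   # attention grows for relevant dimension 0, shrinks for dimension 1
```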

An enhancement of the ALCOVE model, called RASHNL, provided a mathematically coherent mechanism for gradient descent with limited-capacity attention. [15] The RASHNL model assumed that attention is shifted rapidly when a stimulus is presented, while learning of attention across trials is more gradual.

These models were fitted to empirical data from numerous human learning experiments and provided good accounts of the relative difficulty of learning different types of associations and of the accuracy for individual stimuli during training and generalization. The models cannot explain all aspects of learning; for example, an additional mechanism was needed to account for the rapidity of human learning of reversal shift (i.e., what was "A" is now "B" and vice versa). [16]

The highlighting effect

When people learn to categorize combinations of discrete features successively across a training session, they tend to learn about the distinctive features of the later-learned items rather than about their complete combination of features. This attention to distinctive features of later-learned items is called "the highlighting effect", and it derives from an earlier finding known as "the inverse base-rate effect". [17]

Kruschke conducted an extensive series of novel learning experiments with human participants and developed two connectionist models to account for the findings. The ADIT model [18] learned to attend to distinctive features, and the EXIT model [19] used rapid shifts of attention on each trial. A canonical highlighting experiment and a review of findings are presented in a book chapter. [20]

Hybrid representation models for rules or functions with exceptions

People can learn to classify stimuli according to rules such as "a container for liquids that is wider than it is tall is called a bowl", along with exceptions to the rule such as "unless it is this specific case, which is called a mug". A series of experiments demonstrated that people tend to classify novel items that are relatively close to an exceptional case according to the rule, more often than exemplar-based models would predict. To account for the data, Erickson and Kruschke developed hybrid models that shifted attention between rule-based representation and exemplar-based representation. [21] [22] [23]

People can also learn continuous relationships between variables, called functions, such as "a page's height is about 1.5 times its width". When people are trained with examples of functions that have exceptional cases, the data are accounted for by hybrid models that combine locally applicable functional rules. [24]

Bayesian models of learning

Kruschke also explored Bayesian models of human-learning results that were addressed by his connectionist models. The effects of sequential or successive learning (such as highlighting, mentioned above) can be especially challenging for Bayesian models, which typically assume order-independence. Instead of assuming that the entire learning system is globally Bayesian, Kruschke developed models in which layers of the system are locally Bayesian. [25] This "locally Bayesian learning" accounted for combinations of phenomena that are difficult for non-Bayesian learning models or for globally-Bayesian learning models.

Another advantage of Bayesian representations is that they inherently represent uncertainty of parameter values, unlike typical connectionist models that save only a single value for each parameter. The representation of uncertainty can be used to guide active learning in which the learner decides which cases would be most useful to learn about next. [26]
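As a generic illustration of that idea (not the specific model of the cited work), the sketch below scores candidate queries by their expected information gain under a posterior over discrete hypotheses; the hypotheses, outcome likelihoods, and prior are hypothetical.

```python
# Generic active-learning illustration (not the cited model): score candidate
# queries by the expected reduction in posterior entropy (information gain).
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def expected_info_gain(posterior, likelihoods):
    # likelihoods[h, y]: probability of outcome y for this query if hypothesis h is true
    p_y = posterior @ likelihoods                  # predictive distribution over outcomes
    gain = entropy(posterior)
    for y, py in enumerate(p_y):
        if py > 0:
            post_y = posterior * likelihoods[:, y] / py   # Bayes update for outcome y
            gain -= py * entropy(post_y)
    return gain

# Hypothetical example: three hypotheses and two candidate queries with binary
# outcomes. Query B separates the hypotheses, so it has the higher expected gain.
posterior = np.array([0.5, 0.3, 0.2])
query_a = np.array([[0.5, 0.5], [0.5, 0.5], [0.6, 0.4]])   # nearly uninformative
query_b = np.array([[0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])   # diagnostic
print(expected_info_gain(posterior, query_a))
print(expected_info_gain(posterior, query_b))
```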

Career

Kruschke joined the faculty of the Department of Psychological and Brain Sciences at Indiana University Bloomington as a lecturer in 1989. He remained at IU until his retirement in 2022, when he became Provost Professor Emeritus.

Education

Kruschke earned a B.A. in mathematics, with High Distinction in General Scholarship, from the University of California, Berkeley in 1983, and received a Ph.D. in psychology from the same university in 1990.

Kruschke attended the 1978 Summer Science Program at The Thacher School in Ojai, California, which focused on astrophysics and celestial mechanics. He attended the 1988 Connectionist Models Summer School [27] at Carnegie Mellon University.

Awards

Kruschke received the Troland Research Award from the National Academy of Sciences in 2002. [5] At Indiana University he received a Trustees Teaching Award [28] and was named Provost Professor. [3] [4]

References

  1. Kruschke, John K. (1992). "ALCOVE: An exemplar-based connectionist model of category learning". Psychological Review. 99 (1): 22–44. doi:10.1037/0033-295X.99.1.22. PMID 1546117.
  2. Kruschke, John K. (2015). Doing Bayesian Data Analysis: A tutorial with R, JAGS, and Stan (2nd ed.). Academic Press. ISBN 9780124058880.
  3. "Provost Professor Award". Office of the Vice Provost for Faculty & Academic Affairs. Retrieved 2022-05-27.
  4. Hinnefeld, Steve (2018-03-19). "IU Bloomington announces Sonneborn Award recipient, Provost Professors". News at IU. Retrieved 2021-10-01.
  5. "Troland Research Awards". National Academy of Sciences. Retrieved 22 January 2022.
  6. Kruschke, John K.; Liddell, Torrin M. (2018). "The Bayesian new statistics: hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective". Psychonomic Bulletin & Review. 25 (1): 178–206. doi: 10.3758/s13423-016-1221-4 . PMID   28176294. S2CID   4523799.
  7. Kruschke, John K. (2021). "Bayesian analysis reporting guidelines". Nature Human Behaviour. 5 (10): 1282–1291. doi:10.1038/s41562-021-01177-7. PMC   8526359 . PMID   34400814.
  8. Kruschke, John K. (2018). "Rejecting or Accepting Parameter Values in Bayesian Estimation" (PDF). Advances in Methods and Practices in Psychological Science. 1 (2): 270–280. doi:10.1177/2515245918771304. S2CID   125788648.
  9. Kruschke, John K.; Liddell, Torrin M. (2018). "Bayesian data analysis for newcomers". Psychonomic Bulletin & Review. 25 (1): 155–177. doi: 10.3758/s13423-017-1272-1 . PMID   28405907. S2CID   4117798.
  10. Liddell, Torrin M; Kruschke, John K. (2018). "Analyzing ordinal data with metric models: What could possibly go wrong?" (PDF). Journal of Experimental Social Psychology. 79: 328–348. doi:10.1016/j.jesp.2018.08.009. S2CID   149652068.
  11. Kruschke, John K. (2011). "Models of attentional learning". In Pothos, E. M.; Wills, A. J. (eds.). Formal approaches to categorization (PDF). Cambridge University Press. pp. 120–152. ISBN   9781139493970.
  12. Kruschke, John K. (1989). "Distributed bottlenecks for improved generalization in back-propagation networks" (PDF). International Journal of Neural Networks Research and Applications. 1: 187–193.
  13. Kruschke, John K.; Movellan, J. R. (1991). "Benefits of Gain: Speeded learning and minimal hidden layers in back-propagation networks" (PDF). IEEE Transactions on Systems, Man, and Cybernetics. 21: 273–280. doi:10.1109/21.101159.
  14. Nosofsky, R. M. (1986). "Attention, similarity, and the identification–categorization relationship". Journal of Experimental Psychology: General. 115 (1): 39–57. doi:10.1037/0096-3445.115.1.39. PMID 2937873.
  15. Kruschke, John K.; Johansen, M. K. (1999). "A model of probabilistic category learning". Journal of Experimental Psychology: Learning, Memory, and Cognition. 25 (5): 1083–1119. doi:10.1037/0278-7393.25.5.1083. PMID   10505339.
  16. Kruschke, John K. (1996). "Dimensional relevance shifts in category learning". Connection Science. 8 (2): 201–223. doi:10.1080/095400996116893.
  17. Medin, D. L.; Edelson, S. M. (1988). "Problem structure and the use of base-rate information from experience". Journal of Experimental Psychology: General. 117 (1): 68–85. doi:10.1037/0096-3445.117.1.68. PMID   2966231.
  18. Kruschke, John K. (1996). "Base rates in category learning". Journal of Experimental Psychology: Learning, Memory, and Cognition. 22 (1): 3–26. doi:10.1037/0278-7393.22.1.3. PMID   8648289.
  19. Kruschke, John K. (2001). "The inverse base rate effect is not explained by eliminative inference". Journal of Experimental Psychology: Learning, Memory, and Cognition. 27 (6): 1385–1400. doi:10.1037/0278-7393.27.6.1385. PMID   11713874.
  20. Kruschke, John K. (2009). "Highlighting: A canonical experiment". In Ross, Brian (ed.). The Psychology of Learning and Motivation, Volume 51 (PDF). Vol. 51. Academic Press. pp. 153–185. doi:10.1016/S0079-7421(09)51005-5.
  21. Erickson, M. A.; Kruschke, John K. (1998). "Rules and exemplars in category learning". Journal of Experimental Psychology: General. 127 (2): 107–140. doi:10.1037/0096-3445.127.2.107. PMID   9622910.
  22. Erickson, M. A.; Kruschke, John K. (2002). "Rule-based extrapolation in perceptual categorization". Psychonomic Bulletin & Review. 9 (1): 160–168. doi: 10.3758/BF03196273 . PMID   12026949. S2CID   2388327.
  23. Denton, S. E.; Kruschke, John K.; Erickson, M. A. (2008). "Rule-based extrapolation: A continuing challenge for exemplar models". Psychonomic Bulletin & Review. 15 (4): 780–786. doi: 10.3758/PBR.15.4.780 . PMID   18792504. S2CID   559864.
  24. Kalish, M. L.; Lewandowsky, S. (2004). "Population of Linear Experts: Knowledge Partitioning and Function Learning". Psychological Review. 111 (4): 1072–1099. doi:10.1037/0033-295X.111.4.1072. PMID   15482074.
  25. Kruschke, John K. (2006). "Locally Bayesian Learning with Applications to Retrospective Revaluation and Highlighting". Psychological Review. 113 (4): 677–699. doi:10.1037/0033-295X.113.4.677. PMID   17014300.
  26. Kruschke, John K. (2008). "Bayesian approaches to associative learning: From passive to active learning". Learning & Behavior. 36 (3): 210–226. doi: 10.3758/LB.36.3.210 . PMID   18683466. S2CID   16668044.
  27. Touretzky, D; Hinton, GE; Sejnowski, T, eds. (1989). Proceedings of the 1988 Connectionist Models Summer School (PDF). Morgan Kaufmann. ISBN   978-9999081214.
  28. "Office of the Vice Provost for Faculty and Academic Affairs: Trustees Teaching Award".