Hierarchical Bayesian Cognitive Modeling

Link to Source: link

  • February 2011

Quick Summary:

  • Methodology: What was their experimental design? How did they produce their results?
  • Findings: What did they show with their research? What was surprising?

Takeaway:

  • What is the relevance to our use case?
  • What do they do that we should emulate?

Scratch Notes:

Abstract

Most important benefits of Hierarchical Bayesian modeling:

  1. Development of more complete theories; can account for individual differences in cognition
  2. Can account for observed behavior in terms of multiple cognitive processes
  3. Involves using a few key psych variables to explain behavior on a wide range of cognitive tasks
  4. Conceptual unification and integration of disparate cognitive models

1. Introduction

Bayesian statistics provides a compelling and influential framework for representing and processing information.

Multiple theoretical ways to conduct Bayesian analyses:

  • Objective Bayesian - Jaynes (2003)
  • Subjective Bayesian - de Finetti (1974)

Bayesian statistics represents knowledge and uncertainty about variables in probability distributions. These representations are updated/manipulated using probability theory.
This is distinct from how frequentist / sampling distributions handle uncertainty? (how?)
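
A minimal worked example of the "update with probability theory" idea, using my own toy numbers and a conjugate beta-binomial case: uncertainty about a recall probability is represented as a distribution, and Bayes' rule turns the prior distribution into a posterior one.

```python
from scipy.stats import beta

# Prior knowledge about a recall probability, expressed as a distribution.
a, b = 2, 2  # Beta(2, 2): centered on 0.5, fairly uncertain

# Observe 7 recalls out of 10 trials. With a binomial likelihood, Bayes'
# rule updates the Beta prior to a Beta posterior in closed form.
recalls, trials = 7, 10
a_post, b_post = a + recalls, b + (trials - recalls)

print(beta(a, b).mean())            # prior mean: 0.5
print(beta(a_post, b_post).mean())  # posterior mean: ~0.64
```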

Bayesian methods can be used statistically, for analyzing data.
Can also be used theoretically, to guide interpretation of the mind and its inferences

  • Helps explain the brain computationally, without committing to mechanisms.
    Can also be used to relate models of psychological processes to data.
  • The intent is not to build statistical models like a generalized linear model (GLM), but to fit some aspect of cognition to behavioral or other data.
  • Statistical models describe observed dependent variables (recall rates, reaction times); modeling applications infer latent psychological variables (capacities, learning rates).
    The authors argue that this approach of using Bayesian methods to relate psychological processes to data is valuable, because Bayesian methods can handle models of cognition directly, without forcing them into standard statistical forms.

2. Benefits of using hierarchical Bayes in cognitive modeling

Defining hierarchical Bayes vs non-hierarchical Bayes:

Non-Hierarchical modeling

The simplest possible type of model: a set of parameters θ generates a set of data y through a likelihood function f, so that y ~ f(θ). (Hierarchical models, below, are then any models more complicated than this simplest form.)


  • This is a very restrictive definition of a non-hierarchical model, yet it encompasses the broad majority of successful models of cognition.
  • Accommodates Signal Detection Theory (SDT), which measures discriminability and a response criterion or bias (see the sketch after this list).
    • Data y = (h, f): hit and false-alarm counts from s signal and n noise trials.
    • Parameters θ = (d′, c): discriminability and criterion. The likelihood function that formalizes SDT then gives y ~ f(θ).
  • Generalized Context Model of category learning.
  • Multidimensional scaling models provide a mapping between latent coordinate locations of stimuli and their observed judged pairwise similarities.
  • Ratcliff diffusion model maps from parameters of bias, caution, evidence, and more to accuracy and response times for simple decisions.
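
A minimal sketch of the SDT example as a non-hierarchical y ~ f(θ) generative model, assuming the standard equal-variance Gaussian formalization (hit rate Φ(d′/2 − c), false-alarm rate Φ(−d′/2 − c)); the parameter values are made up for illustration.

```python
import numpy as np
from scipy.stats import norm

def sdt_sample(d_prime, c, s, n, rng):
    """y ~ f(theta): theta = (d', c) generates hit and false-alarm
    counts under the equal-variance Gaussian SDT likelihood."""
    hit_rate = norm.cdf(d_prime / 2 - c)     # P("signal" | signal trial)
    fa_rate = norm.cdf(-d_prime / 2 - c)     # P("signal" | noise trial)
    hits = rng.binomial(s, hit_rate)         # h out of s signal trials
    false_alarms = rng.binomial(n, fa_rate)  # f out of n noise trials
    return hits, false_alarms

rng = np.random.default_rng(0)
print(sdt_sample(d_prime=1.5, c=0.2, s=100, n=100, rng=rng))
```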
Developing deeper theories

Hierarchical models are more complicated, and one example is:

  • Parameters θ are generated by some other process g, parameterized by φ (sometimes called hyper-parameters).
  • In a hierarchical model, it is also important to say how these basic parameters are generated. Instead of just giving a mapping y ~ f(θ), we need θ ~ g(φ) as well.

The best example of a problem for this structure is accommodating individual differences. These are clearly present in cognition and in practice, but are difficult to handle with non-hierarchical models (I'm just going to call them NHMs and HMs from now on).

Shiffrin et al. (2008) do this in practice for a memory retention task. y is how often memory items are recalled over time, f is a memory retention function like an exponential or power function, and θ contains starting points, decay rates, and other properties of retention functions. Hierarchically, g might be a normal distribution, with a mean and variance in φ that describe the distribution of starting points and decay rates across individuals.
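
A sketch of that hierarchical structure, assuming an exponential retention function for f and a (truncated) normal g over individual decay rates; all the numbers are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# phi: group-level hyper-parameters (mean and sd of the decay rate).
mu_decay, sd_decay = 0.5, 0.1

# g: each subject's theta (decay rate) is drawn from the group distribution.
n_subjects = 20
decay = np.clip(rng.normal(mu_decay, sd_decay, n_subjects), 0.01, None)

# f: an exponential retention function maps theta to recall probability.
lags = np.array([1, 2, 5, 10, 20])         # retention intervals
p_recall = np.exp(-np.outer(decay, lags))  # subjects x lags

# y: observed recall counts out of 40 studied items at each lag.
recalled = rng.binomial(40, p_recall)
print(recalled[:3])
```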

Other uses:

  • Memory
  • Decision-making
  • Confidence
  • Emotional states

Kemp et al. 2007 use this HM to model “basic inductive processes in cognitive development that require learning what they term ‘overhypotheses’.”

  • Overhypotheses: mappings g that constrain θ; "constraints on the hypotheses considered by the learner".
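
A toy sketch in the spirit of Kemp et al.'s marbles example (a beta-binomial hierarchy; the specific numbers are mine): the hyper-parameters encode the overhypothesis "bags of marbles tend to be uniform in color", and learning them constrains θ for any new bag.

```python
import numpy as np

rng = np.random.default_rng(2)

# Overhypothesis level: small alpha + beta means bags are near-pure in
# color. Learning this is learning a constraint on future hypotheses.
alpha, beta = 0.2, 0.2

# g: each bag's theta (proportion of black marbles) ~ Beta(alpha, beta).
thetas = rng.beta(alpha, beta, size=5)

# f: marble draws from each bag are binomial in that bag's theta.
draws = rng.binomial(20, thetas)
print(np.round(thetas, 2), draws)  # thetas cluster near 0 or 1
```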
Linking psychological variables to multiple phenomena

A different sort of HM, where parameters θ generate many sorts of data y₁, …, yₙ through a range of likelihood functions f₁, …, fₙ. The same psychological variables influence behavior on multiple tasks.

  • Seeks the unification of cognitive science
  • f₁ could be recognition, f₂ could be recall, and so on.
    *Essentially saying that people have qualities θ that are expressed in different tasks through different likelihood functions f.*
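
A sketch of the same-θ-many-likelihoods structure, with a single memory-strength parameter feeding hypothetical recognition (f₁) and recall (f₂) likelihoods; both link functions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

theta = 1.2  # one latent psychological variable: memory strength

def f1_recognition(theta, n_trials, rng):
    # Hypothetical mapping: stronger memory -> higher hit probability.
    return rng.binomial(n_trials, 1 / (1 + np.exp(-theta)))

def f2_recall(theta, n_items, rng):
    # A different, harder task driven by the *same* theta.
    return rng.binomial(n_items, 1 / (1 + np.exp(-(theta - 1.0))))

y1 = f1_recognition(theta, 50, rng)  # y1 ~ f1(theta)
y2 = f2_recall(theta, 50, rng)       # y2 ~ f2(theta)
print(y1, y2)
```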
Linking psychological phenomena to multiple processes

A hierarchical model that allows multiple cognitive processes to contribute to a single set of observed data.

Processes f₁, …, fₙ with associated parameters θ₁, …, θₙ. They mix to form the observed data y through a mixing process h parameterized by φ.
Possibilities for h:

  • Choosing one of the processes fᵢ with probabilities given by φ
  • Mixing all of the processes together according to proportions φ
    Essentially a melting pot: a large number of processes and parameters blended through a process h.

Example: writing and word selection as a mixture process over different semantic topics. The mixture assumption is important for explaining semantic context and homonyms. The paper also describes a study that used mixtures to explain development, showing that knowledge of concepts can be described as a mixture of different sorts of behavior corresponding to underlying developmental stages.
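
A sketch of the mixing idea, with h choosing one of two processes per trial; the "remember vs. guess" framing is my example, not the paper's, and φ is the mixing proportion.

```python
import numpy as np

rng = np.random.default_rng(4)

phi = 0.7          # mixing proportion: P(a trial uses the memory process)
theta_mem = 0.9    # P(correct) under the memory process f1
theta_guess = 0.5  # P(correct) under the guessing process f2

# h: per trial, pick which process generated the response, then sample it.
n_trials = 1000
use_memory = rng.random(n_trials) < phi
p_correct = np.where(use_memory, theta_mem, theta_guess)
y = rng.binomial(1, p_correct)
print(y.mean())  # close to phi*0.9 + (1-phi)*0.5 = 0.78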

Unifying different models

Take it one step further, and also generate the parameters for the multiple-process model:

  • y ~ h(f₁(θ₁), …, fₙ(θₙ), φ), a combination rule governed by φ.
  • The parameters θᵢ are themselves generated by some process g, governed by hyper-parameters ψ.

Perhaps most fundamentally, Vanpaemel (2011) argues in this special issue that linking models hierarchically is one way to address the basic Bayesian need to specify theoretically meaningful priors. The key idea is that the prior predictive distribution of the hierarchical part of the model, which indexes different basic models, naturally constitutes a psychologically interpretable prior over those models. This is a powerful idea, running counter to a current prejudice for making priors as uninformative as possible, and deserves to be an active area of research in using hierarchical Bayesian methods to model cognition.
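
A sketch of the prior-predictive idea: pushing samples down the hierarchy (φ → θ → y) yields a distribution over data before seeing any, and when φ indexes different basic models this acts as an interpretable prior over them. The distributions below are placeholders, not Vanpaemel's.

```python
import numpy as np

rng = np.random.default_rng(5)

def prior_predictive(n_sims, rng):
    """phi ~ hyperprior, theta ~ g(phi), y ~ f(theta); the distribution
    of simulated y is the prior predictive distribution."""
    phi = rng.uniform(0.1, 2.0, n_sims)                 # hyper-parameters
    theta = rng.normal(phi, 0.2)                        # g: theta | phi
    return rng.binomial(20, 1 / (1 + np.exp(-theta)))  # f: y | theta

y_sim = prior_predictive(10_000, rng)
print(np.bincount(y_sim, minlength=21) / y_sim.size)
```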

Conclusion

NHMs are very popular for modeling: create a task, find some parameters that fit the model to data and behavior, call it a day. But this is limited in what it can explain about human cognition.

HM methods can broaden scope of cognitive models.

  1. Model development can be done at multiple levels of abstraction
  2. Allow the same psych variables to account for behavior over multiple tasks
  3. Allow data to be understood quantitatively and qualitatively as a mixture
  4. Can unify disparate models, and ground the specification of priors as well

In short, hierarchical Bayesian approaches demand our accounts of cognition become deeper and better integrated. The aim of this special issue is to provide some concrete examples of the potential of hierarchical Bayes in practice, for models ranging from memory, to category learning, to decision-making. We hope that they are useful early exemplars of what should become an important and widespread way of building and analyzing models of cognition.

❓-> Questions during reading

The Bayesian statistical approach is distinct from how frequentist / sampling distributions handle uncertainty? (how?)

What the hell does this have to do with Bayes' Theorem?

🧪 -> Refresh the Info

Did you generally find the overall content understandable, compelling, or relevant, and why? Which aspects of the reading were most novel or challenging for you, and which were most familiar or straightforward?

Did a specific aspect of the reading raise questions for you or relate to other ideas and findings you've encountered? Are there other related issues you wish had been covered?