Bayesian inference in marketing


In marketing, Bayesian inference allows for decision making and market research evaluation under uncertainty and with limited data. The communication between marketer and market can be seen as a form of Bayesian persuasion.


Introduction

Bayes' theorem is fundamental to Bayesian inference, a branch of statistics that provides a mathematical framework for forming inferences through the concept of probability, in which evidence about the true state of the world is expressed in terms of degrees of belief through subjectively assessed numerical probabilities. Such a probability is known as a Bayesian probability. The fundamental ideas and concepts behind Bayes' theorem, and its use within Bayesian inference, have been developed and added to over the past centuries by Thomas Bayes, Richard Price and Pierre-Simon Laplace, as well as numerous other mathematicians, statisticians and scientists. [1] Bayesian inference has experienced both spikes and troughs in popularity, having at times been dismissed as vague and controversial by rival frequentist statisticians. [2] In the past few decades Bayesian inference has become widespread in many scientific and social science fields such as marketing, where it allows for decision making and market research evaluation under uncertainty and with limited data. [3] [ self-published source? ]

Bayes' theorem

Bayesian probability specifies that there is some prior probability. Bayesian statisticians can use both an objective and a subjective approach when interpreting the prior probability, which is then updated in light of new relevant information. The concept is a manipulation of conditional probabilities: [3]

P(A | B) = P(B | A) · P(A) / P(B)

Alternatively, a simpler understanding of the formula may be reached by substituting the events A and B to become, respectively, the hypothesis H and the data D. The rule allows for a judgment of the relative truth of the hypothesis given the data. [3]

This is done through the calculation shown below:

P(H | D) = P(D | H) · P(H) / P(D)

Here P(D | H) is the likelihood function, which assesses the probability of the observed data D arising from the hypothesis H; P(H) is the assigned prior probability, or initial belief about the hypothesis; the denominator P(D) is formed by integrating or summing P(D | H) · P(H) over the hypothesis space; and P(H | D) is known as the posterior, the recalculated probability or updated belief about the hypothesis. It is the result of the prior beliefs as well as the sample information; the posterior is a conditional distribution, conditioned on the newly collected data. [3]

To sum up this formula: the posterior probability of the hypothesis is equal to the prior probability of the hypothesis multiplied by the conditional probability of the evidence given the hypothesis, divided by the probability of the new evidence. [4]
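The update described above can be computed directly for a binary hypothesis. The following sketch (not drawn from the cited sources; all numbers are hypothetical) applies it to a marketing question, namely whether a campaign is effective given an observed sales lift in a test market:

```python
# Discrete Bayes update: P(H | D) = P(D | H) * P(H) / P(D).
# All probabilities below are hypothetical, for illustration only.

def posterior(prior_h, p_d_given_h, p_d_given_not_h):
    """Return P(H | D) for a binary hypothesis H and evidence D."""
    p_d = p_d_given_h * prior_h + p_d_given_not_h * (1 - prior_h)
    return p_d_given_h * prior_h / p_d

# Prior belief that a new campaign is effective: 0.30.
# A test market shows a sales lift; P(lift | effective) = 0.8,
# P(lift | not effective) = 0.2.
p = posterior(0.30, 0.8, 0.2)
print(round(p, 3))  # 0.632
```

The denominator is the total probability of observing the lift under both hypotheses, so the evidence raises the belief from 0.30 to roughly 0.63.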

Use in marketing

History

While the concepts of Bayesian statistics are thought to date back to 1763, marketers' exposure to the concepts is relatively recent, dating from 1959. [2] Subsequently, many books [5] [6] [7] and articles [8] [9] have been written about the application of Bayesian statistics to marketing decision-making and market research. It was predicted that the Bayesian approach would be used widely in the marketing field, but up until the mid-1980s the methods were considered impractical. [10] The resurgence in the use of Bayesian methods is largely due to developments in computational methods over the last few decades and to the expanded availability of detailed marketplace data, primarily a result of the birth of the World Wide Web and the explosion of the internet.

Application in marketing

Bayesian decision theory can be applied to all four areas of the marketing mix. [11] Assessments are made by a decision maker on the probabilities of events that determine the profitability of alternative actions where the outcomes are uncertain. Assessments are also made of the profit (utility) for each possible combination of action and event. The decision maker can decide how much research, if any, needs to be conducted in order to investigate the consequences associated with the courses of action under evaluation before a final decision is made; gathering such information, however, incurs costs, takes time, and may still be unreliable. For each possible action, an expected profit can be computed: a weighted mean of the possible profits, with the probabilities as weights. The decision maker can then choose the action for which the expected profit is highest. The theorem provides a formal reconciliation between judgment, expressed quantitatively in the prior distribution, and the statistical evidence of the experiment.
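The expected-profit calculation can be sketched as follows. The actions, market states, probabilities and payoffs are all hypothetical illustrations, not figures from the cited literature:

```python
# Expected-profit comparison of alternative actions under uncertainty.
# States, probabilities and payoffs are hypothetical.

states = {"strong demand": 0.3, "average demand": 0.5, "weak demand": 0.2}

# Profit (in $k) of each action under each market state.
profits = {
    "launch nationally": {"strong demand": 500, "average demand": 100, "weak demand": -300},
    "launch regionally": {"strong demand": 200, "average demand": 80, "weak demand": -50},
}

def expected_profit(action):
    """Weighted mean of possible profits, weights = state probabilities."""
    return sum(states[s] * profits[action][s] for s in states)

for action in profits:
    print(action, expected_profit(action))

best = max(profits, key=expected_profit)
print("choose:", best)  # launch nationally (140 vs 90)
```

The decision rule is simply to pick the action with the highest probability-weighted payoff, which is what the paragraph above describes in words.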

New product development

The use of Bayesian decision theory in new product development allows for the use of subjective prior information. Applied to new product development, it allows the cost of additional project reviews to be compared with the value of the additional information, so as to reduce the costs of uncertainty. The methodology used for this analysis takes the form of decision trees and 'stop'/'go' procedures. If the predicted payoff (based on the posterior) is acceptable for the organisation, the project should go ahead; if not, development should stop. By reviewing the posterior (which then becomes the new prior) at regular intervals throughout the development stage, managers are able to make the best possible decision with the information at hand. Although the review process may delay further development and increase costs, it can help greatly to reduce uncertainty in high-risk decisions.
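A 'stop'/'go' review loop of this kind might look like the following sketch, in which the posterior success probability after each stage review becomes the prior for the next stage. The threshold, prior and review likelihoods are hypothetical:

```python
# 'Stop'/'go' review loop: the posterior after each stage review
# becomes the prior for the next stage. All probabilities are hypothetical.

def update(prior, p_pass_s, p_pass_f, passed):
    """Bayes update of the success probability after a stage review.

    p_pass_s = P(review passed | product will succeed)
    p_pass_f = P(review passed | product will fail)
    """
    if passed:
        num, alt = p_pass_s * prior, p_pass_f * (1 - prior)
    else:
        num, alt = (1 - p_pass_s) * prior, (1 - p_pass_f) * (1 - prior)
    return num / (num + alt)

THRESHOLD = 0.5   # minimum success probability required to continue
belief = 0.4      # initial prior that the product will succeed

for stage, result in enumerate(["pass", "fail"], start=1):
    belief = update(belief, 0.7, 0.3, result == "pass")
    decision = "go" if belief >= THRESHOLD else "stop"
    print(f"stage {stage}: belief={belief:.3f} -> {decision}")
    if decision == "stop":
        break
```

A passed review raises the belief above the threshold ('go'); the subsequent failed review pulls it back below and halts development, mirroring the review cycle described above.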

Pricing decisions

Bayesian decision theory can be used in examining pricing decisions. Field information such as retail and wholesale prices, as well as the size of the market and market share, is incorporated into the prior information. Managerial judgement is included in order to evaluate different pricing strategies. This method of evaluating possible pricing strategies has its limitations, as it requires a number of assumptions to be made about the marketplace in which an organisation operates. As markets are dynamic environments, it is often difficult to fully apply Bayesian decision theory to pricing strategies without simplifying the model.[ citation needed ]

Promotional campaigns

When dealing with promotion, a marketing manager must account for all the market complexities involved in a decision. As it is difficult to account for all aspects of the market, a manager should look to incorporate the experienced judgements of senior executives, as well as to modify those judgements in light of economically justifiable information gathering. An example of the application of Bayesian decision theory for promotional purposes is the use of a test sample to assess the effectiveness of a promotion prior to a full-scale rollout. By combining prior subjective data about the occurrence of possible events with experimental empirical evidence gained through a test market, the resultant data can be used to make decisions under risk.[ citation needed ]

Channel decisions and the logistics of distribution

Bayesian decision analysis can also be applied to the channel selection process. The method can be used to provide further information by producing results expressed in terms of profit or loss. Prior information can include costs, expected profit, training expenses and any other costs relevant to the decision, as well as managerial experience, which can be represented as a normal distribution. Bayesian decision making under uncertainty lets a marketing manager assess his or her options for channel logistics by computing the most profitable choice of method. A number of different costs can be entered into the model, helping to assess the ramifications of a change in distribution method. Identifying and quantifying all of the relevant information for this process can be very time-consuming and costly if the analysis delays possible future earnings.[ citation needed ]

Strengths

The Bayesian approach is well suited to decision making when there is a high level of uncertainty or limited information on which to base decisions, and where expert opinion or historical knowledge is available. It is also useful for explaining findings in probability terms to people who are less familiar and comfortable with statistics. It is in this sense that Bayesian methods are thought of as having created a bridge between business judgments and statistics for the purpose of decision-making. [12]

The three principal strengths of Bayes' theorem that have been identified by scholars are that it is prescriptive, complete and coherent. [13] It is prescriptive in that the theorem provides a simple prescription for the conclusions a consistent decision maker should reach on the basis of evidence and reasoning. It is complete because, for a given choice of model and prior distribution, the solution is often clear and unambiguous. It allows for the incorporation of prior information, when available, to increase the robustness of the solutions, as well as taking into consideration the costs and risks associated with choosing alternative decisions. [14]

Lastly, Bayes' theorem is coherent. It is considered the most appropriate way to update beliefs by welcoming the incorporation of new information, as is seen through the probability distributions (see Savage [15] and De Finetti [16] ). This is further complemented by the fact that Bayesian inference satisfies the likelihood principle, [17] which states that models or inferences for datasets leading to the same likelihood function should generate the same statistical information.

Bayesian methods are often more cost-effective than the traditional frequentist approach to marketing research and subsequent decision making. A probability can be assessed from a degree of belief before and after accounting for evidence, instead of being calculated by carrying out a large number of trials, each producing an outcome from a set of possible outcomes. The planning and implementation of trials to see how a decision plays out in the 'field', e.g. observing consumers' reactions to the relabelling of a product, is time-consuming and costly, a method many firms cannot afford. In place of taking the frequentist route of aiming for a universally acceptable conclusion through iteration, [18] it is sometimes more effective to take advantage of all the information available to the firm to work out the 'best' decision at the time, and then, when new knowledge is obtained, to revise the posterior distribution for use as the next prior; the inferences thus continue to build logically on one another through Bayes' theorem. [19]
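The posterior-becomes-prior cycle is especially simple with a conjugate prior. The following sketch (hypothetical counts; a Beta-Binomial model is chosen here for illustration, it is not prescribed by the cited sources) updates a belief about a conversion rate after each batch of market data:

```python
# Sequential conjugate updating: a Beta prior on a conversion rate is
# revised after each batch of data, and the resulting posterior serves
# as the prior for the next batch. All counts are hypothetical.

def beta_update(alpha, beta, successes, failures):
    """Beta(alpha, beta) prior + binomial data -> Beta posterior."""
    return alpha + successes, beta + failures

alpha, beta = 2, 2  # weak prior centred on a 0.5 conversion rate
batches = [(12, 88), (30, 170)]  # (conversions, non-conversions) per batch

for conversions, non_conversions in batches:
    alpha, beta = beta_update(alpha, beta, conversions, non_conversions)
    print(f"posterior mean conversion rate: {alpha / (alpha + beta):.3f}")
```

Each batch shifts the posterior mean toward the observed rate, and no batch of evidence is wasted: the firm's accumulated knowledge is always carried forward in the prior.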

Weaknesses

In marketing situations, it is important that the prior probability is (1) chosen correctly and (2) understood. A disadvantage of Bayesian analysis is that there is no 'correct' way to choose a prior; the inferences therefore require a thorough analysis to translate subjective prior beliefs into a mathematically formulated prior, to ensure that the results will not be misleading and consequently lead to disproportionate preposterior analysis. [2] The subjective definition of probability, and the selection and use of priors, have led statisticians to critique this foundation of the Bayesian approach. [13]

Bayesian probability is often found difficult to analyse and assess because of its initially counter-intuitive nature. A common error when deciding between strategies is base-rate neglect: where evidence X suggests that condition A might hold true, A's likelihood is judged by how well the evidence X matches A, but crucially without considering the prior frequency of A. [13] In alignment with falsification, which aims to question and falsify rather than prove hypotheses, very strong evidence X does not necessarily mean there is a very high probability that A leads to B; it should instead be interpreted as a very low probability of A not leading to B.

In the field of marketing, behavioural experiments dealing with managerial decision-making [20] and with risk perception in consumer decisions [21] [22] have utilised the Bayesian model, or similar models, but found that it may not be quantitatively relevant for predicting human information-processing behaviour. Instead, the model has proven useful as a qualitative means of describing how individuals combine new evidence with their predetermined judgements. Therefore, "the model may have some value as a first approximation to the development of descriptive choice theory" in consumer and managerial instances. [2]

Example

An advertising manager is deciding whether or not to increase the advertising for a product in a particular market. The Bayesian approach to this decision suggests: 1) alternative courses of action with uncertain consequences are a necessary condition for applying Bayes' theorem; 2) the advertising manager will pick the course of action which allows him to achieve some objective, i.e. a maximum return on his advertising investment in the form of profit; 3) he must translate the possible consequences of each action into some measure of the success (or loss) with which the objective is achieved.

This three-component example shows how the payoffs are conditional upon which outcomes occur. The advertising manager can characterise the outcomes based on past experience and knowledge, and judge some possible events to be more likely than others. He can then assign these events prior probabilities in the form of numerical weights. [23]

He can test his predictions (prior probabilities) through an experiment. For example, he can run a test campaign to decide whether the total level of advertising should in fact be increased. Based on the outcome of the experiment he can re-evaluate his prior probabilities and decide whether to go ahead with increasing the advertising in the market. However, gathering this additional data is costly, time-consuming and may not lead to perfectly reliable results. As a decision maker he has to deal with experimental and systematic error, and this is where Bayes' theorem comes in.

The Bayesian approach frames the experimental problem by asking: is additional data required? If so, how much needs to be collected, and by what means? Finally, how does the decision maker revise his prior judgment in light of the new experimental evidence? In this example the advertising manager can use the Bayesian approach to update his prior judgments as new information arrives. He needs to take into account the profit (utility) attached to the alternative acts under different events, and the value versus the cost of information, in order to make his optimal decision on how to proceed.
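The manager's "is the test worth it?" question can be sketched as a preposterior calculation: compare the expected profit of deciding now with the expected profit of deciding after seeing the test result, net of the test's cost. Every figure below is hypothetical:

```python
# Preposterior sketch: is a test campaign worth its cost?
# All figures ($k) and probabilities are hypothetical.

prior_success = 0.5        # prior P(increased advertising pays off)
profit_if_success = 400    # payoff if it pays off
loss_if_failure = -200     # payoff if it does not
test_cost = 20             # cost of running the test campaign

# Test reliability: P(positive result | success), P(positive | failure).
p_pos_s, p_pos_f = 0.8, 0.3

def best_ev(p):
    """Expected value of the better act: increase advertising, or don't (0)."""
    return max(p * profit_if_success + (1 - p) * loss_if_failure, 0.0)

# Option 1: decide now, without testing.
ev_now = best_ev(prior_success)

# Option 2: run the test, then take the best act given each possible result.
p_pos = p_pos_s * prior_success + p_pos_f * (1 - prior_success)
post_pos = p_pos_s * prior_success / p_pos
post_neg = (1 - p_pos_s) * prior_success / (1 - p_pos)
ev_with_test = p_pos * best_ev(post_pos) + (1 - p_pos) * best_ev(post_neg) - test_cost

print(f"decide now: {ev_now:.1f}k, test first: {ev_with_test:.1f}k")
```

With these numbers the test raises the expected profit from 100k to 110k, so the information is worth more than it costs; with a dearer or less reliable test the conclusion could reverse.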

Bayes in computational models

Markov chain Monte Carlo (MCMC) is a flexible procedure designed to fit a variety of Bayesian models. It is the underlying method used in computational software such as the LaplacesDemon R package and WinBUGS. The advancement of this type of statistical software has enabled the growth of Bayesian methods by offering ease of calculation. This is achieved by the generation of samples from the posterior distributions, which are then used to produce a range of options or strategies which are allocated numerical weights. MCMC obtains these samples and produces summary and diagnostic statistics while also saving the posterior samples in the output. The decision maker can then assess the results from the output data set and choose the best option to proceed. [19]
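The core of such a sampler can be written in a few lines. The following is a minimal random-walk Metropolis sketch (not the WinBUGS or LaplacesDemon implementation) drawing from the posterior of a conversion rate with a uniform prior and hypothetical binomial test-market data:

```python
# Minimal random-walk Metropolis sampler for the posterior of a
# conversion rate p, with a uniform prior and binomial data.
# Data are hypothetical; this is an illustrative sketch only.

import math
import random

random.seed(42)

successes, trials = 30, 200  # hypothetical test-market data

def log_posterior(p):
    if not 0.0 < p < 1.0:
        return float("-inf")  # outside the prior's support
    # Uniform prior; binomial likelihood with constants dropped.
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)

def metropolis(n_samples, start=0.5, step=0.05):
    samples, current = [], start
    for _ in range(n_samples):
        proposal = current + random.gauss(0, step)
        # Accept with probability min(1, posterior ratio).
        if math.log(random.random()) < log_posterior(proposal) - log_posterior(current):
            current = proposal
        samples.append(current)
    return samples

draws = metropolis(20000)[5000:]  # discard burn-in
print(f"posterior mean: {sum(draws) / len(draws):.3f}")  # near 30/200 = 0.15
```

The retained draws are the "posterior samples in the output" referred to above; summary statistics such as the mean, or quantiles for a credible interval, are computed directly from them.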


References

  1. McGrayne, Sharon Bertsch (2011). The Theory That Would Not Die. Yale University Press. ISBN   978-0-300-17509-7.[ page needed ]
  2. Green, Paul E.; Frank, Ronald E. (1966). "Bayesian Statistics and Marketing Research". Journal of the Royal Statistical Society. Series C (Applied Statistics). 15 (3): 173–190. doi:10.2307/2985299. JSTOR   2985299.
  3. Olshausen, Bruno A. (1 March 2004). "Bayesian probability theory" (PDF).
  4. Paulos, John Allen (5 August 2011). "The Mathematics of Changing Your Mind". The New York Times.
  5. Chernoff, H. and Moses, L. E. (1959). Elementary Decision Theory. New York: Wiley; London: Chapman & Hall[ page needed ]
  6. Schlaifer, R. (1959). Probability and Statistics for Business Decisions, New York: McGraw Hill[ page needed ]
  7. Rossi, P. E., Allenby, G. M. and McCulloch, R. (2005). Bayesian Statistics and Marketing, New York: Wiley[ page needed ]
  8. Roberts, Harry V. (1960). "The New Business Statistics". The Journal of Business. 33 (1): 21–30. doi:10.1086/294306. JSTOR   2350532.
  9. Pratt, John W.; Raiffa, Howard; Schlaifer, Robert (June 1964). "The Foundations of Decision under Uncertainty: An Elementary Exposition". Journal of the American Statistical Association. 59 (306): 353–375. doi:10.1080/01621459.1964.10482164.
  10. Rossi, Peter E.; Allenby, Greg M. (August 2003). "Bayesian Statistics and Marketing". Marketing Science. 22 (3): 304–328. doi:10.1287/mksc.22.3.304.17739.
  11. Alderson, W., Green, P. E. (1964) Planning and Problem Solving in Marketing. Richard D. Irwin Inc Illinois[ page needed ]
  12. Roberts, Harry V. (January 1963). "Bayesian Statistics in Marketing". Journal of Marketing. 27 (1): 1–4. doi:10.1177/002224296302700101. S2CID   167494436.
  13. Little, Roderick J (August 2006). "Calibrated Bayes: A Bayes/Frequentist Roadmap". The American Statistician. 60 (3): 213–223. doi:10.1198/000313006X117837. S2CID   53505632.
  14. Wald, Abraham (1993). "Statistical Decision Functions". In Kotz, Samuel; Johnson, Norman L. (eds.). Breakthroughs in Statistics: Foundations and Basic Theory. Springer Science & Business Media. pp. 342–357. ISBN   978-0-387-94037-3.
  15. Savage, L. J. (1954). The Foundations of Statistics, New York: Wiley[ page needed ]
  16. De Finetti, B. (1974). The Theory of Probability, New York: Wiley[ page needed ]
  17. Birnbaum, Allan (1962). "On the Foundations of Statistical Inference". Journal of the American Statistical Association. 57 (298): 269–306. doi:10.2307/2281640. JSTOR   2281640.
  18. Efron, Bradley (March 2005). "Bayesians, Frequentists, and Scientists". Journal of the American Statistical Association. 100 (469): 1–5. CiteSeerX   10.1.1.179.1454 . doi:10.1198/016214505000000033. S2CID   123082238. ProQuest   274829688.
  19. SAS Institute Inc. (2009). SAS/STAT® 9.2 User's Guide, Second Edition, Cary, NC: SAS Institute Inc.
  20. Green, Paul E.; Peters, William S.; Robinson, Patrick J. (February 1966). "A Behavioral Experiment In Decision-Making Under Uncertainty". Journal of Purchasing. 2 (1): 18–31. doi:10.1111/j.1745-493X.1966.tb00819.x.
  21. Bauer, R. A. (1960). "Consumer Behavior as Risk Taking". Dynamic marketing for a changing world: proceedings of the 43rd national conference of the American Marketing Association, June 15, 16, 17, 1960. American Marketing Association. pp. 389–398. OCLC   2326074.
  22. Cox, Donald F.; Rich, Stuart U. (November 1964). "Perceived Risk and Consumer Decision-Making—The Case of Telephone Shopping". Journal of Marketing Research. 1 (4): 32–39. doi:10.1177/002224376400100405. S2CID   167671294.
  23. Green, Paul E. (1976). "Bayesian Decision Theory in Advertising". Mathematical Models in Marketing. Lecture Notes in Economics and Mathematical Systems. Vol. 132. pp. 234–235. doi:10.1007/978-3-642-51565-1_75. ISBN   978-3-540-07869-2.
