7.5. Chapter Summary

Key Take-Aways

Bayes’ Rule

  • A stochastic system is a system for which the outputs are not a deterministic function of the inputs.

  • Sensitivity is the probability of detecting a phenomenon when it is actually present.

  • Given a system with inputs \(A_i\) and outputs \(B_j\):

    • \(P(A_i)\) is an a priori probability (or prior).

    • \(P(B_j|A_i)\) is a likelihood.

    • \(P(A_i|B_j)\) is an a posteriori probability (APP, also called a posterior).

  • We are often given likelihoods and a priori probabilities. A posteriori probabilities usually have to be calculated using Bayes’ Rule.

  • Bayes’ Rule:

\[ P(A_i|B_j) = \frac{P(B_j | A_i)\, P(A_i)}{\sum_k P(B_j | A_k)\, P(A_k)}. \]
  • The base rate fallacy is the tendency to ignore a priori information (base rates) when evaluating data. This is especially problematic when the a priori probabilities for some events are close to 0 or 1. To illustrate this, we studied an example about breast cancer detection. Mammograms are very reliable for detecting breast cancer, but for younger women, where the probability of a woman having breast cancer is low, most cases where breast cancer is detected will turn out to be false alarms. A numerical sketch of this effect follows this list.
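To see the base rate fallacy numerically, the short Python sketch below applies Bayes’ Rule to a detection problem with a small a priori probability. The prevalence, sensitivity, and false-alarm rate are assumed values chosen for illustration, not the chapter’s exact mammogram figures:

```python
# A sketch of Bayes' Rule for a detection problem with a low base rate.
# All three numbers below are assumed for illustration.

prior = 0.01         # P(cancer): assumed low prevalence for younger women
sensitivity = 0.90   # P(detect | cancer): assumed
false_alarm = 0.08   # P(detect | no cancer): assumed

# Law of total probability gives the overall detection probability:
p_detect = sensitivity * prior + false_alarm * (1 - prior)

# Bayes' Rule gives the a posteriori probability of cancer:
posterior = sensitivity * prior / p_detect
print(f"P(cancer | detect) = {posterior:.3f}")  # about 0.102
```

Even with a fairly reliable test, roughly 90% of detections in this scenario are false alarms because the a priori probability of disease is so small.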

Systems with Hidden State

  • A system with hidden state has internal state (for example, memory) that is not directly observable but that can affect the system’s outputs.

  • Even though the hidden state in a system is not directly observable, information about that state can be determined using the inputs and outputs of the system.

  • We showed that in the Magician’s Coin problem, the selected coin can be treated as hidden state, and we showed that Bayes’ Rule can be used to determine how observations of the system output change the probabilities of the hidden state; a sketch of this update follows this list.
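As a concrete illustration, the sketch below performs this Bayesian update, assuming the common Magician’s Coin setup of one fair coin and one two-headed coin chosen equiprobably (the chapter’s exact setup may differ):

```python
# Bayesian update of hidden state for an assumed Magician's Coin setup:
# one fair coin and one two-headed coin, selected equiprobably.
# The selected coin is the hidden state.

p_fair, p_two_headed = 0.5, 0.5          # a priori probabilities
like_heads = {"fair": 0.5, "two": 1.0}   # P(heads | coin)

# Each observed heads updates the hidden-state probabilities via Bayes' Rule:
for flip in range(1, 4):
    denom = like_heads["fair"] * p_fair + like_heads["two"] * p_two_headed
    p_fair = like_heads["fair"] * p_fair / denom
    p_two_headed = like_heads["two"] * p_two_headed / denom
    print(f"after {flip} heads in a row: P(two-headed) = {p_two_headed:.3f}")
```

Each heads observed raises the a posteriori probability of the two-headed coin (to 0.667, 0.800, then 0.889), even though the hidden state itself is never observed directly.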

Optimal Decisions for Discrete Stochastic Systems

  • A common problem in stochastic systems is to estimate a system’s input based on an observation of the output. For example, almost all communication systems fit this model.

  • We showed how to formulate the optimal decision problem for such systems in two common ways: maximum likelihood (ML) decisions and maximum a posteriori probability (MAP) decisions.

  • Consider a system with inputs \(A_i\) and outputs \(B_j\). If the output \(B_j\) is observed, then:

    • The ML decision rule is \begin{equation*} \widehat{A}_i, \mbox{ where } i = \arg \max_{i \in \{0,1\}} P(B_j|A_i). \end{equation*}

    • The MAP decision rule is \begin{equation*} \widehat{A}_i, ~ \mbox{where } i = \arg \max_{i \in \{0,1\}} P \left( A_i \vert B_j \right). \end{equation*}

  • The MAP decision rule usually offers the best performance but requires knowing the a priori probabilities. If the a priori probabilities are not known, the ML decision rule is usually used instead. The sketch below compares the two rules on the same observation.
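Here is a minimal sketch of the two rules for a binary-input, binary-output system. The likelihood matrix and the heavily skewed priors are assumed values for illustration; note that because the denominator of Bayes’ Rule does not depend on \(i\), the MAP rule can maximize \(P(B_j|A_i)P(A_i)\) directly:

```python
import numpy as np

# ML vs. MAP decisions for a binary-input, binary-output system.
# The likelihoods and priors below are assumed values for illustration.

likelihoods = np.array([[0.9, 0.1],    # row i gives P(B_0 | A_i), P(B_1 | A_i)
                        [0.2, 0.8]])
priors = np.array([0.95, 0.05])        # P(A_0), P(A_1)

j = 1  # suppose output B_1 is observed

# ML maximizes the likelihood P(B_j | A_i); MAP maximizes
# P(A_i | B_j), which is proportional to P(B_j | A_i) P(A_i).
ml = np.argmax(likelihoods[:, j])
map_ = np.argmax(likelihoods[:, j] * priors)
print(f"ML decides A_{ml}, MAP decides A_{map_}")  # ML: A_1, MAP: A_0
```

In this example the two rules disagree: the skewed priors pull the MAP decision toward \(A_0\) even though \(A_1\) is more likely to have produced the observed output.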

Bayesian Hypothesis Testing and Confidence Intervals

  • Bayesian statistics can be computed easily by using bootstrap resampling to estimate the a posteriori probabilities.

  • When calculating Bayesian statistics, the a priori probabilities (or prior) must be chosen by the statistician.

  • Priors may either be:

    • Uninformative, which usually means that the probability is distributed uniformly among the possible values

    • Informative, in which the statistician uses information from other sources to choose a reasonable prior

  • A C% credible interval is a set of values that contains C% of the total a posteriori probability; the sketch after this list shows one way to compute such an interval by simulation.

  • Unlike confidence intervals, credible intervals have a more straightforward interpretation: given a C% credible interval for some parameter of interest, there is a C% probability that the true value will lie within the credible interval (given the data and choice of prior).

  • Bayesian statistics are gaining in popularity but can be subject to criticism because they require the statistician to specify a set of a priori probabilities.
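The following sketch estimates a posterior and a 95% credible interval by simulation for a binomial experiment with a uniform (uninformative) prior. The simple rejection-sampling approach and the data (32 successes in 50 trials) are assumptions for illustration, not necessarily the chapter’s exact procedure:

```python
import numpy as np

# Posterior and credible interval by simulation, assuming a binomial
# experiment with an uninformative (uniform) prior on the success
# probability p. Data and method are assumed for illustration.

rng = np.random.default_rng(42)
n, k = 50, 32                      # assumed data: 32 successes in 50 trials

# Draw candidate p values from the prior, simulate the experiment for
# each candidate, and keep only those that reproduce the observed count:
candidates = rng.uniform(0, 1, size=1_000_000)
keep = rng.binomial(n, candidates) == k
posterior_samples = candidates[keep]

# A 95% credible interval contains 95% of the a posteriori probability:
lo, hi = np.percentile(posterior_samples, [2.5, 97.5])
print(f"95% credible interval for p: ({lo:.3f}, {hi:.3f})")
```

Because the kept samples are draws from the a posteriori distribution, the interval has the direct interpretation described above: given the data and the uniform prior, the true value of \(p\) lies inside it with 95% probability.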

Self Assessment Exercises

Answer the questions below to assess your understanding of the material in this chapter. Because there are many questions for this chapter, a random sample of 12 questions is shown. Reload this page to get a new random sample of 12 questions.

Liver Cancer Test Problems 2

The following problems all rely on this information – note that the numbers for this review are different from those used earlier in this chapter:

A new computerized scan of the liver is used to help identify whether a person has a normal liver (\(N\)), benign tumors (\(B\)), or liver cancer (\(C\)). The computer will automatically classify the output to indicate tumors (\(T\)) or no tumors (\(\overline{T}\)).

For a normal liver, the computer will indicate that there are NO tumors with probability 0.95. For a benign tumor, the computer will indicate that there are tumors with probability 0.25. For a cancerous liver, the computer will indicate that there are tumors with probability 0.99.

If the test is repeated, the scan will observe the liver at a different angle, and the results will be conditionally independent given the patient’s true condition (i.e., \(N\), \(B\), or \(C\)).

A healthy patient is one who has no other indications of liver tumors or cancer before having the computerized scan. Such a patient has

  • probability 0.98 of having a normal liver,

  • probability 0.018 of having benign liver tumors,

  • and probability 0.002 of having liver cancer.
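As a starting point for these problems, here is a sketch of how Bayes’ Rule combines the numbers above for a single scan of a healthy patient that indicates tumors (\(T\)); extending it to repeated scans is left for the exercises:

```python
import numpy as np

# Bayes' Rule for the liver-scan setup above: a posteriori probability
# of each condition given that one scan of a healthy patient shows tumors.

priors = np.array([0.98, 0.018, 0.002])  # P(N), P(B), P(C)
p_T = np.array([0.05, 0.25, 0.99])       # P(T|N) = 1 - 0.95, P(T|B), P(T|C)

posterior = p_T * priors / np.sum(p_T * priors)
for label, p in zip(["N", "B", "C"], posterior):
    print(f"P({label} | T) = {p:.4f}")
```

Note how the large a priori probability of a normal liver keeps \(P(N|T)\) high even after a positive scan, echoing the base rate discussion earlier in the chapter.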

Terminology Review

Use the flashcards below to help you review the terminology introduced in this chapter.

Spaced Repetition

Use these questions to review material from previous chapters: