Day 12
Review
Why do we study probability in statistics?
Probability
A number between \(0\) and \(1\) that tells us how likely a given “event” is to occur
Probability equal to \(0\) means the event cannot occur
- \(P(x)=0\)
Probability equal to \(1\) means the event must occur
- \(P(x)=1\)
Probability equal to \(1/2\) means the event is as likely to occur as it is to not occur
- \(P(x)=0.5\)
Probability close to 0 (but not equal to 0) means the event is very unlikely to occur
- The event may still occur, but we’d tend to be surprised if it did
Probability close to 1 (but not equal to 1) means the event is very likely to occur
- The event may not occur, but we’d tend to be surprised if it didn’t
Terminology
Experiment (in context of probability):
- An activity that results in a definite outcome where the observed outcome is determined by chance
Sample space:
- The set of ALL possible outcomes of an experiment; denoted by \(S\)
- Flip a coin once
\[S = \{H, T\}\]
- Randomly select a person and then determine blood type
\[S = \{A, B, AB, O\}\]
Event
A subset of outcomes belonging to sample space \(S\)
A capital letter towards the beginning of the alphabet is used to denote an event
- i.e. \(A\), \(B\), \(C\), etc.
- Suppose we flip a coin twice
\[S = \{HH, HT, TH, TT\}\]
- Let \(A\) be the event we observe at least one tails
\[A = \{HT, TH, TT\}\]
- Let \(B\) be the event we observe at most one tails
\[B = \{HH, HT, TH\}\]
Simple event: An event containing a single outcome in the sample space \(S\)
\[S = \{HH, HT, TH, TT\}\]
\(A = \text{we observe two heads} = \{HH\}\)
- Simple event
Compound event: An event formed by combining two or more events (thereby containing two or more outcomes in the sample space \(S\))
\[S = \{HH, HT, TH, TT\}\]
\(B = \text{we observe a head in the first or in the second flip} = \{HT, TH, HH\}\)
- Compound event
Probability Methods
Subjective Probability
- Probability is assigned based on judgement or experience
- e.g. expert opinion, personal experience, “vibe math”
Classical Probability
- Make some assumptions in order to build a mathematical model from which we can derive probabilities
- It’s not vibe math but it can definitely feel like it
\[P(A) = \frac{\text{number of outcomes in event } A}{\text{total number of outcomes in } S}\]
Relative or Empirical Probability
- Think of the probability of an event as the proportion of times that the event occurs
\[P(x) \approx \frac{\text{number of times } x \text{ is observed}}{\text{number of samples}}\]
Law of Large Numbers
- As the size of our sample (i.e., number of experiments) gets larger and larger:
- The relative frequency of the event of our interest gets closer and closer to the true probability
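The Law of Large Numbers is easy to see in a simulation. Below is a minimal sketch (not part of the lecture) that flips a fair coin 100,000 times and prints the relative frequency of heads at a few checkpoints; the running proportion drifts toward the true probability 0.5 as the number of flips grows.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

# Flip a fair coin 100,000 times and track the relative frequency
# of heads as the number of flips grows (Law of Large Numbers).
heads = 0
for n, flip in enumerate(random.choices("HT", k=100_000), start=1):
    if flip == "H":
        heads += 1
    if n in (10, 100, 10_000, 100_000):
        print(f"n = {n:>7}: relative frequency of heads = {heads / n:.4f}")
```

Early on the relative frequency can be far from 0.5; by the last checkpoint it is typically within a fraction of a percent.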
Questions?
Goals for Today:
Introduce simple probability set theory
Define fundamental rules and laws for probability mathematics
Probability
Basic Concepts of Probability II
You decide you want to go to Dirty Dawgs this Friday to try and pretend you don’t have midterms next week. You’ve heard that a decent number of your friends who’ve gone in the past 3 weeks have ended up sick with some kind of Flu/Covid. This doesn’t stop you.
How can you measure the likelihood that you end up sick as well, and what does that measurement tell us about the possible outcomes of your night out?
Probability model: Assigns a probability to each possible event constructed from the simple events in a particular sample space describing a particular experiment
For a finite sample space with \(n\) simple events, i.e. \(S = \{E_1, E_2, \dots, E_n\}\):
- The probability model assigns a number \(p_i\) to event \(E_i\) where \(P(E_i) = p_i\) so that:
\[0 \leq p_i \leq 1\]
\[p_1 + p_2 + \dots + p_n = 1 \ \ \text{(as a consequence, }P(S) = 1\text{)}\]
For an equally-likely probability model, the probability of observing \(E_i\) is:
\[P(E_i) = p_i = \frac{1}{n}\]
If \(A\) is an event in an equally-likely sample space \(S\) and contains \(k\) outcomes, then:
\[P(A) = \frac{\text{No. of outcomes in } A}{\text{No. of outcomes in } S} = \frac{k}{n}\]
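As a quick illustrative sketch, the equally-likely formula \(P(A) = k/n\) can be checked by enumerating the two-coin-flip sample space from earlier:

```python
from itertools import product

# Two flips of a fair coin: an equally-likely sample space with n = 4.
S = list(product("HT", repeat=2))
A = [outcome for outcome in S if "T" in outcome]  # at least one tails

# P(A) = (number of outcomes in A) / (number of outcomes in S) = k / n
p_A = len(A) / len(S)
print(p_A)  # 0.75
```

Here \(k = 3\) of the \(n = 4\) outcomes contain at least one tails, so \(P(A) = 3/4\).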
You take a count of your friends and their friends who’ve been to Dawgs in the past 3 weeks and whether they were sick the next day. You list anyone who wasn’t sick as “Not Sick”, anyone who was briefly sick the next day as “Maybe Sick”, anyone who was sick for three or more days after as “Likely Sick”, and anyone who ended up at LaFene getting antivirals/antibiotics as “Definitely Sick”.
\[ \begin{array}{|c|c|c|c|c|c|} \hline \textbf{Status} & \textbf{Not Sick} & \textbf{Maybe Sick} & \textbf{Likely Sick} & \textbf{Definitely Sick} & \textbf{Total}\\ \hline \text{Count} & 7 & 13 & 20 & 10 & 50 \\ \hline \end{array} \]
You run an experiment with this data: select one individual and determine what their “Sick Status” was after going to Dirty Dawgs
- You make a random selection so that everyone has the same chance of being selected
- What kind of sample is this?
The sample space \(S\) is the set of all 50 individuals
- The probability model for this experiment is given as follows:
\[ \begin{array}{|c|c|c|c|c|c|} \hline \textbf{Status} & \textbf{Not Sick} & \textbf{Maybe Sick} & \textbf{Likely Sick} & \textbf{Definitely Sick} & \textbf{Total}\\ \hline \text{Probability} & 0.14 & 0.26 & 0.40 & 0.20 & 1.00 \\ \hline \end{array} \]
If \(A\) is an event in \(S\), then the event where \(A\) does not occur is called the complement of \(A\)
Denote the complement of \(A\) by \(A^c\) – read this as “A-complement”
Complement Rule
\[P(A^c) = 1 - P(A) \quad \text{or} \quad P(A) = 1 - P(A^c)\]
This rule is useful when \(P(A)\) is difficult to calculate but \(P(A^c)\) is easy (or vice versa)
Suppose we roll a fair 6-sided die twice, then \(S\) contains 36 equally-likely outcomes in the form of 36 ordered pairs, i.e. \((1, 1), (1, 2), \dots, (6, 5), (6, 6)\)
Let \(A\) be “roll doubles”
- Then \(P(A) = \frac{6}{36} = \frac{1}{6}\)
- \(A^c\) is the event we “do not roll doubles”, and:
\[P(A^c) = 1 - P(A) = 1 - \frac{1}{6} = \frac{5}{6}\]
We could have counted the number of non-doubles in \(S\), but this requires more effort
This rule can also let us circumvent excess arithmetic
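For this small example the brute-force count is still feasible, and a short sketch confirms the complement-rule answer:

```python
from itertools import product

# All 36 equally-likely outcomes of rolling a fair die twice.
S = list(product(range(1, 7), repeat=2))

p_doubles = sum(a == b for a, b in S) / len(S)  # P(A) = 6/36
p_no_doubles = 1 - p_doubles                    # complement rule
print(p_doubles, p_no_doubles)
```

Both routes agree: \(P(A^c) = 30/36 = 5/6\), with no need to list the 30 non-doubles by hand.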
What’s the probability that you won’t end up “Definitely Sick”?
\[ \begin{array}{|c|c|c|c|c|c|} \hline \textbf{Status} & \textbf{Not Sick} & \textbf{Maybe Sick} & \textbf{Likely Sick} & \textbf{Definitely Sick} & \textbf{Total}\\ \hline \text{Probability} & 0.14 & 0.26 & 0.40 & 0.20 & 1.00 \\ \hline \end{array} \]
\[0.14 + 0.26 + 0.40 = 0.80\]
\[1.00 - 0.20 = 0.80\]
Unions and Intersections
The union of two events \(A\) and \(B\), denoted \(A \cup B\), is the set of all outcomes that belong to \(A\), to \(B\), or to both
- Saying \(A \cup B\) is equivalent to saying “A or B”
The intersection of two events \(A\) and \(B\), denoted \(A \cap B\), is the set of all outcomes that belong to both \(A\) and \(B\)
- Saying \(A \cap B\) is equivalent to saying “A and B”
In rolling a die once, consider events \(A\) and \(B\):
\(A\): Roll an even number: \(\{2, 4, 6\}\)
\(B\): Roll a number greater than 4: \(\{5, 6\}\)
\[A \cup B = A \text{ or } B = \{2, 4, 5, 6\}\]
\[A \cap B = A \text{ and } B = \{6\}\]
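These definitions map directly onto Python's set operators, which makes for a handy sanity check (a sketch, not part of the lecture): `|` is union and `&` is intersection.

```python
# One roll of a die: events as Python sets.
A = {2, 4, 6}  # roll an even number
B = {5, 6}     # roll a number greater than 4

print(sorted(A | B))  # A or B  -> [2, 4, 5, 6]
print(sorted(A & B))  # A and B -> [6]
```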
You’re not certain that Dirty Dawgs is the cause of the issue. You decide to double-check by expanding your sample with 150 individuals who spent most of their time at other bars.
\[ \begin{array}{|c|c|c|c|c|c|} \hline \textbf{Bar} & \textbf{Not Sick} & \textbf{Maybe Sick} & \textbf{Likely Sick} & \textbf{Definitely Sick} & \textbf{Total}\\ \hline \text{Dawgs} & 7 & 13 & 20 & 10 & 50 \\ \hline \text{Yard Bar} & 17 & 20 & 7 & 13 & 57 \\ \hline \text{Kaw's} & 7 & 6 & 13 & 9 & 35 \\ \hline \text{Tubby's} & 0 & 19 & 20 & 19 & 58 \\ \hline \text{Total} & 31 & 58 & 60 & 51 & 200 \\ \hline \end{array} \]
You select an individual at random
- What’s the probability that person went to Yard Bar?
- What’s the probability that person went to Dirty Dawgs or Yard Bar?
- What’s the probability that person went to Tubby’s and was definitely sick?
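One way to check your answers is a quick sketch that works straight from the cell counts (row totals are recomputed from the cells rather than copied):

```python
# Cell counts copied from the table; columns are
# Not Sick, Maybe Sick, Likely Sick, Definitely Sick.
counts = {
    "Dawgs":    [7, 13, 20, 10],
    "Yard Bar": [17, 20, 7, 13],
    "Kaw's":    [7, 6, 13, 9],
    "Tubby's":  [0, 19, 20, 19],
}
n = sum(sum(row) for row in counts.values())  # 200 individuals in total

p_yard = sum(counts["Yard Bar"]) / n
# No one appears in more than one bar's row, so "or" is just addition:
p_dawgs_or_yard = (sum(counts["Dawgs"]) + sum(counts["Yard Bar"])) / n
p_tubbys_and_definitely = counts["Tubby's"][3] / n  # last column

print(p_yard, p_dawgs_or_yard, p_tubbys_and_definitely)
```

Since every cell count is divided by the same \(n = 200\), each answer is just the relevant count (or sum of counts) over 200.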
Mutual Exclusivity
Two events \(A\) and \(B\) are mutually exclusive if they do not share any common outcomes
Roll a die:
\(A\): Roll a 1 or a 2: \(\{1, 2\}\)
\(B\): Roll an even number: \(\{2, 4, 6\}\)
\(C\): Roll a 3, 4, or 5: \(\{3, 4, 5\}\)
Events \(A\) and \(C\) are mutually exclusive:
If \(A\) occurred, a \(1\) or a \(2\) was rolled
- Thus, none of the outcomes in \(C\) could have occurred
Events \(A\) and \(B\) are not mutually exclusive:
\[A \text{ and } B = \{2\}\]
Addition rule for mutually exclusive events
In general, for any two events \(A\) and \(B\):
\[P(A\cup B)=P(A)+P(B)-P(A\cap B)\]
If \(A\) and \(B\) are mutually exclusive, then \(P(A \cap B) = 0\), and the rule simplifies to:
\[P(A \text{ or } B) = P(A) + P(B)\]
\[P(A \cup B) = P(A) + P(B)\]
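A small sketch with the die events from above verifies both versions of the rule against a direct count of the union:

```python
A = {1, 2}      # roll a 1 or a 2
B = {2, 4, 6}   # roll an even number
C = {3, 4, 5}   # roll a 3, 4, or 5
n = 6           # equally-likely outcomes of one die roll

# General addition rule: subtract the overlap so it isn't counted twice.
p_A_or_B = len(A) / n + len(B) / n - len(A & B) / n
# A and C are mutually exclusive (A & C is empty), so the overlap term vanishes.
p_A_or_C = len(A) / n + len(C) / n

print(p_A_or_B, len(A | B) / n)  # both equal 4/6: rule vs. direct count
print(p_A_or_C, len(A | C) / n)  # both equal 5/6
```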
\[ \begin{array}{|c|c|c|c|c|c|} \hline \textbf{Bar} & \textbf{Not Sick} & \textbf{Maybe Sick} & \textbf{Likely Sick} & \textbf{Definitely Sick} & \textbf{Total}\\ \hline \text{Dawgs} & 7 & 13 & 20 & 10 & 50 \\ \hline \text{Yard Bar} & 17 & 20 & 7 & 13 & 57 \\ \hline \text{Kaw's} & 7 & 6 & 13 & 9 & 35 \\ \hline \text{Tubby's} & 0 & 19 & 20 & 19 & 58 \\ \hline \text{Total} & 31 & 58 & 60 & 51 & 200 \\ \hline \end{array} \]
What events in this sample space could be considered mutually exclusive?
Let \(A = \{\text{Someone went to Tubby's and didn't get sick}\}\)
Let \(B = \{\text{Someone went to Kaw's}\}\)
Find \(P(A\cup B)\)
Conditional Probability
A conditional probability of an event is a probability obtained with the additional information that some other event has already occurred
\(P(A|B)\) denotes the conditional probability of event \(A\) given that event \(B\) has already occurred
\[P(A|B) = \frac{P(A \text{ and } B)}{P(B)}\]
\[P(A|B) = \frac{P(A \cap B)}{P(B)}\]
Similarly:
\[P(B|A) = \frac{P(A \text{ and } B)}{P(A)}\]
\[P(B|A) = \frac{P(A \cap B)}{P(A)}\]
An economist predicts a 60% chance that stock \(A\) will perform poorly and a 25% chance that stock \(B\) will perform poorly. There is also a 16% chance that both stocks will perform poorly
What is the probability that stock \(A\) performs poorly given that stock \(B\) performs poorly?
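Plugging the given values straight into the definition of conditional probability:

\[P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.16}{0.25} = 0.64\]

So there is a 64% chance that stock \(A\) performs poorly given that stock \(B\) performs poorly; conditioning on \(B\) raised the probability from the unconditional 60%.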
\[ \begin{array}{|c|c|c|c|c|c|} \hline \textbf{Bar} & \textbf{Not Sick} & \textbf{Maybe Sick} & \textbf{Likely Sick} & \textbf{Definitely Sick} & \textbf{Total}\\ \hline \text{Dawgs} & 7 & 13 & 20 & 10 & 50 \\ \hline \text{Yard Bar} & 17 & 20 & 7 & 13 & 57 \\ \hline \text{Kaw's} & 7 & 6 & 13 & 9 & 35 \\ \hline \text{Tubby's} & 0 & 19 & 20 & 19 & 58 \\ \hline \text{Total} & 31 & 58 & 60 & 51 & 200 \\ \hline \end{array} \]
Select an individual at random
- What’s the probability that someone went to Dirty Dawgs?
- What’s the probability that someone went to Dirty Dawgs and was “Likely Sick”?
- What’s the probability that someone is “Likely Sick” given that they went to Dirty Dawgs?
- What’s the probability that someone went to Dirty Dawgs, given that they’re “Likely Sick”?
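A sketch working from the table's counts shows how the four questions relate; in particular, the two conditional probabilities differ because they divide the same intersection by different marginals:

```python
# Counts read from the table above.
n = 200                # total individuals
dawgs = 50             # row total: went to Dirty Dawgs
likely = 60            # column total: "Likely Sick"
dawgs_and_likely = 20  # cell: Dirty Dawgs AND "Likely Sick"

p_dawgs = dawgs / n
p_dawgs_and_likely = dawgs_and_likely / n

# Conditioning rescales by the probability of the given event:
p_likely_given_dawgs = p_dawgs_and_likely / p_dawgs        # = 20/50
p_dawgs_given_likely = p_dawgs_and_likely / (likely / n)   # = 20/60

print(p_dawgs, p_dawgs_and_likely)
print(p_likely_given_dawgs, p_dawgs_given_likely)
```

Note that \(P(\text{Likely} \,|\, \text{Dawgs}) = 20/50 = 0.40\) while \(P(\text{Dawgs} \,|\, \text{Likely}) = 20/60 \approx 0.33\): the order of conditioning matters.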
Multiplication Rule
Using the definition of conditional probability:
\[ P(A|B) = \frac{P(A \cap B)}{P(B)}, \]
We can do some simple algebra and find ourselves at the multiplication rule:
\[P(A \cap B) = P(A|B) P(B)\]
Independence
Events \(A\) and \(B\) are independent if whether \(A\) occurs does not affect the probability that \(B\) occurs, and vice versa
In terms of conditional probability:
- The probability of \(A\) does not change given \(B\) happened and vice versa
That is, \(A\) and \(B\) are independent if one of the following is true:
\[P(A|B) = P(A)\]
\[P(B|A) = P(B)\]
\[P(A \cap B) = P(A)P(B)\]
(When \(P(A) > 0\) and \(P(B) > 0\), you can show that all three statements are equivalent)
Multiplication Rule for Independent Events
Given:
\[A \text{ and } B \Rightarrow \text{ Independent}\]
\[P(A \cap B) = P(A)P(B)\]
Suppose we roll a fair die twice. What is the probability that the first roll is a 1 and the second roll is a 6?
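The two rolls don't influence each other, so the multiplication rule for independent events applies; a tiny sketch using exact fractions:

```python
from fractions import Fraction

# Independent rolls: multiply the individual probabilities.
p_first_is_1 = Fraction(1, 6)
p_second_is_6 = Fraction(1, 6)
p_both = p_first_is_1 * p_second_is_6

print(p_both)  # 1/36
```

This matches the counting argument: \((1, 6)\) is exactly one of the 36 equally-likely ordered pairs.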
\[ \begin{array}{|c|c|c|c|c|c|} \hline \textbf{Bar} & \textbf{Not Sick} & \textbf{Maybe Sick} & \textbf{Likely Sick} & \textbf{Definitely Sick} & \textbf{Total}\\ \hline \text{Dawgs} & 7 & 13 & 20 & 10 & 50 \\ \hline \text{Yard Bar} & 17 & 20 & 7 & 13 & 57 \\ \hline \text{Kaw's} & 7 & 6 & 13 & 9 & 35 \\ \hline \text{Tubby's} & 0 & 19 & 20 & 19 & 58 \\ \hline \text{Total} & 31 & 58 & 60 & 51 & 200 \\ \hline \end{array} \]
Suppose that every individual in this sample spent their entire night out at the bar they’re associated with in the data set, and had no interaction with one another
Select two individuals at random
Let \(A=\{\text{Individual 1 went to Tubby's and was some form of sick}\}\)
Let \(B=\{\text{Individual 2 went to Yard Bar and was Definitely Sick}\}\)
Find \(P(B|A)\)
Does this make sense within our example? Why / Why not?
Does this make sense in real life? Why / Why not?