Day 13

Review


\[ \begin{array}{|c|c|c|c|} \hline \text{Complement} & \text{Union} & \text{Intersection} & \text{Conditional} \\ \hline P(A^c)=1-P(A) & P(A \text{ or } B) \equiv P(A\cup B) & P(A \text{ and } B) \equiv P(A \cap B) & {P(A\cap B) \over P(B)} \equiv P(A|B)\\ \hline \end{array} \]


Addition rule for mutual exclusivity


Two events \(A\) and \(B\) are mutually exclusive if they do not share any common outcomes


In general:

\[P(A\cup B)=P(A)+P(B)-P(A\cap B)\]


Given:

\[A \text{ and } B \Rightarrow \text{ Mutually Exclusive}\]


\[P(A \text{ or } B) = P(A) + P(B)\]


\[P(A \cup B) = P(A) + P(B)\]
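
As an optional numerical check, here is a minimal Python sketch (the die-roll events \(A\) and \(B\) are hypothetical, not from the notes) that verifies the general addition rule by enumerating equally likely outcomes:

```python
from fractions import Fraction

S = range(1, 7)                      # sample space of one fair die roll
A = {s for s in S if s % 2 == 0}     # A = roll is even
B = {s for s in S if s >= 5}         # B = roll is a 5 or 6

def prob(event):
    return Fraction(len(event), 6)   # equally likely outcomes

lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(lhs, rhs)                      # both 2/3

# If A and B were mutually exclusive (A & B empty), P(A and B) = 0 and the
# rule reduces to P(A or B) = P(A) + P(B).
```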



Multiplication Rules

Def:

\[ P(A|B) = \frac{P(A \cap B)}{P(B)}, \]


Thus:

\[P(A \cap B) = P(A|B) P(B)\]
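
A minimal sketch of the same identity, again with a hypothetical fair-die example: compute \(P(A|B)\) from its definition and confirm that \(P(A|B)P(B)\) recovers \(P(A\cap B)\).

```python
from fractions import Fraction

S = range(1, 7)
A = {s for s in S if s % 2 == 0}           # A = roll is even
B = {s for s in S if s > 3}                # B = roll is greater than 3

def prob(event):
    return Fraction(len(event), 6)

p_A_given_B = prob(A & B) / prob(B)        # definition of conditional probability
print(prob(A & B), p_A_given_B * prob(B))  # both 1/3
```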



Events \(A\) and \(B\) are independent if the outcome of \(A\) does not affect the outcome of \(B\) and vice versa

\(A\) and \(B\) are independent if any one of the following equivalent conditions holds:


\[P(A|B) = P(A)\]


\[P(B|A) = P(B)\]


\[P(A \cap B) = P(A)P(B)\]


Given:

\[A \text{ and } B \Rightarrow \text{ Independent}\]


\[P(A \cap B) = P(A)P(B)\]
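
As a quick illustration (a hypothetical two-coin example, not from the notes), independence can be checked by comparing \(P(A\cap B)\) with \(P(A)P(B)\):

```python
from fractions import Fraction
from itertools import product

S = set(product("HT", repeat=2))             # {HH, HT, TH, TT}, equally likely
A = {s for s in S if s[0] == "H"}            # A = first flip is heads
B = {s for s in S if s[1] == "H"}            # B = second flip is heads

def prob(event):
    return Fraction(len(event), len(S))

print(prob(A & B), prob(A) * prob(B))        # both 1/4, so A and B are independent
```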




Questions?




Random Variables


Random Variable Fundamentals

Random variable (shorthand: r.v.):


  • A rule for assigning a numerical value to each outcome of a random experiment

  • General convention is to use a capital letter toward the end of the alphabet to notate these

    • i.e. \(X,Y,Z\)



Flip a fair coin \(3\) independent times


Let r.v. \(X=\{\text{the number of tails observed}\}\)


\[S=\{HHH,HHT,HTH,THH,HTT,THT,TTH,TTT\}\]


\[ \begin{array}{|c|c|c|} \hline \textbf{Flip 1} & \textbf{Flip 2} & \textbf{Flip 3} \\ \hline \text{H} & \text{H} & \text{H} \\ \hline \text{H} & \text{H} & \text{T} \\ \hline \text{H} & \text{T} & \text{H} \\ \hline \text{T} & \text{H} & \text{H} \\ \hline \text{H} & \text{T} & \text{T} \\ \hline \text{T} & \text{H} & \text{T} \\ \hline \text{T} & \text{T} & \text{H} \\ \hline \text{T} & \text{T} & \text{T} \\ \hline \end{array} \]
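
A minimal Python sketch (illustrative only, not part of the original example) that enumerates these eight equally likely outcomes and records the value of \(X\) for each:

```python
from itertools import product

outcomes = list(product("HT", repeat=3))          # HHH, HHT, ..., TTT
x_values = [flips.count("T") for flips in outcomes]

support = sorted(set(x_values))
print(support)                                    # [0, 1, 2, 3]

# With a fair coin the eight outcomes are equally likely, so:
for x in support:
    print(x, x_values.count(x) / len(outcomes))   # P(X=0)=1/8, P(X=1)=3/8, ...
```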



The r.v. has its own sample space or support


Support: The set of possible values a r.v. can be


  • The support of \(X\) is \(S_X=\{0,1,2,3\}\)


Notation clarification: \(S\) generally refers to a sample space, while \(S_N\), where \(N\) is some random variable, refers to the support of \(N\)



Use a lowercase letter to represent the observed value of the r.v. so that:


\[x=\text{the value after the experiment has been performed (not random)}\]


\[X=\text{the value before the experiment has been performed (still random)}\]



Why does this notation distinction matter?


Statistics is a complex and unique language


  • A troublesome language for first-time learners


  • As you come to understand the language better, the rationale for why we use it begins to make more and more sense


In this case we want to make the following expressions make sense:


\[P(X=x)\quad \text{means the probability that r.v.} \ X \ \text{is equal to possible value} \ x\]


\[P(X>x) \quad \text{means the probability that r.v.} \ X \ \text{is greater than possible value} \ x\]


Random Variable Types

Random variables are generally separated into two distinct types: Discrete and Continuous


Discrete: The number of possible values in the support is finite or countably infinite


Finite or countably infinite means the possible values can be listed out, like integers or whole numbers



Let \(X=\{\text{The number of students missing from the class roster today}\}\)


  • The support of \(X\), \(S_X=\{0,1,2,...,40\}\) (assuming my class roster is correct)

  • \(X\) can’t take on partial values between \(1\) and \(2\)

  • Hypothetically I can’t have more than \(40\) students missing on any given day

  • This is a finite, discrete random variable



Let \(Y=\{\text{The number of fish in a pond}\}\)


  • The support of \(Y\), \(S_Y=\{0,1,2,...\}\)

  • Can I have half a fish? Technically yes. But half a fish isn’t biologically important to count.

  • If partial counts feel arbitrary then you’re more than likely working with a discrete variable

  • I can hypothetically have infinite fish, but the realized value will always be a whole number



Continuous: The support consists of all numbers in an interval of the real number line


This can be any interval or the entire line

  • There are too many numbers to count (hence: uncountably infinite)



Let \(Z=\{\text{The change in median housing prices from one year to another}\}\)


  • The possible values of \(Z\) are: \(-\infty < Z < \infty\)

  • We all know this is strictly positive and ever increasing, but let’s pretend

  • Prices can go down, they can go up, they could just not change

  • And that change can be any value, partial or whole, negative or positive



Let \(W=\{\text{The proportion of couples receptive to couples therapy}\}\)

  • The possible values of \(W\) are: \(0 \le W \le 1\)

  • But \(W\) can be anything in between \(0\) and \(1\)

  • Say, \(w=0.012843199\) or maybe \(w=0.99999991\)




For each random variable, define whether it is Discrete or Continuous (Bonus: Define the support or possible values of the random variable)


\[X=\{\text{The number that comes up on a die}\}\]


\[Y=\{\text{The height of a randomly chosen college student}\}\]


\[Z=\{\text{The amount of electricity used to light a randomly chosen classroom}\}\]


\[W=\{\text{The number of siblings a randomly chosen person has}\}\]


\[T=\{\text{The length of time it takes to travel from a random classroom to Calvin hall}\}\]




Probability Distributions


The form of a r.v.’s probability distribution depends on whether it is continuous or discrete


For a discrete random variable the probability distribution is often a list of all possible values the r.v. can take and their corresponding probabilities of occurrence


Discrete probability distributions satisfy the following two properties:


\[i. \quad 0 \le P(X=x) \le 1\]

\[ii. \quad \sum_x P(X=x)=1\]
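
A minimal sketch of checking these two properties in Python, using a small made-up distribution stored as a dictionary (the values are hypothetical, chosen only for illustration):

```python
dist = {0: 0.5, 1: 0.3, 2: 0.2}                      # hypothetical distribution

in_range = all(0 <= p <= 1 for p in dist.values())   # property i.
sums_to_one = abs(sum(dist.values()) - 1) < 1e-9     # property ii. (allow float error)
print(in_range and sums_to_one)                      # True
```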




Let \(X = \{\text{the number of customers in a line at a supermarket express checkout counter}\}\). The probability distribution of \(X\) is given as follows:


\[ \begin{array}{|c|c|c|c|c|c|c|} \hline x & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline P(X = x) & 0.4 & 0.2 & 0.15 & 0.1 & 0.1 & 0.05 \\ \hline \end{array} \]


  1. Is this a legitimate probability distribution?



  2. Find the probability that there is no customer in a line at the express checkout.



  3. Find the probability that there is at least one customer in a line at the express checkout.



  4. \(P(2 < X \le 5)\)





For a continuous random variable probabilities are determined via a probability density function or pdf


  • To actually dig into this discussion properly, we need to use Calculus

  • The remainder of today will just focus on discrete concepts



We can draw a histogram so that the area of each bar above a given possible value of a r.v. is equal to its probability of occurrence
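
A minimal plotting sketch (assuming matplotlib is available; the distribution is the express-checkout example from earlier): with bar width \(1\), each bar’s area equals its height, which is \(P(X=x)\).

```python
import matplotlib.pyplot as plt

dist = {0: 0.4, 1: 0.2, 2: 0.15, 3: 0.1, 4: 0.1, 5: 0.05}    # express-checkout example

plt.bar(list(dist.keys()), list(dist.values()), width=1, edgecolor="black")
plt.xlabel("x")
plt.ylabel("P(X = x)")
plt.title("Probability histogram of X")
plt.show()
```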



As with datasets, probability distributions can have shape and measures of center as well as spread


  • What is the shape of this probability distribution?




Discrete r.v. Mean (Expectation)


Mathematicians (and Statisticians) prefer deriving values and writing equations in more general forms

  • Not because we like to confuse people, but because we’re lazy


We want to develop a general form for a measure of center and variability for a discrete probability distribution


Consider the probability distribution of \(X\) to be a population

  • Numerical values that we calculate from a probability distribution are called parameters

  • We would notate the mean of a population with \(\mu\)

    • We can use the same notation for the mean of a probability distribution


Given several random variables to consider, we can denote the mean of r.v. \(X\) as:


\[\mu_X \quad \textbf{or} \quad \mathbb{E}(X)\]


For a discrete probability distribution of r.v. \(X\), the mean is given by:


\[\mathbb{E}(X)=\mu=\sum_x xP(X=x)\]


  • We call this “the sum of the possible values \(x\), each weighted by its probability of occurrence”




\(X=\{\text{number of customers in a line at the express checkout counter}\}\)


\[ \begin{array}{|c|c|c|c|c|c|c|} \hline x & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline P(X = x) & 0.4 & 0.2 & 0.15 & 0.1 & 0.1 & 0.05 \\ \hline \end{array} \]


Recalling our definition of the mean of a probability distribution for a discrete r.v.


\[\mathbb{E}(X)=\mu=\sum_x xP(X=x)\]


We get:


\[\mu_X=\sum_x xP(X=x)\] \[=0(0.4)+1(0.2)+2(0.15)+3(0.1)+4(0.1)+5(0.05)=1.45\]


So we would say: “over time we expect to have \(1.45\) customers in a line at the express checkout counter”
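
The same weighted sum can be computed directly from the table; a minimal Python sketch (illustrative only):

```python
dist = {0: 0.4, 1: 0.2, 2: 0.15, 3: 0.1, 4: 0.1, 5: 0.05}

mu = sum(x * p for x, p in dist.items())   # 0(0.4) + 1(0.2) + ... + 5(0.05)
print(round(mu, 2))                        # 1.45
```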




Interpretation of the Mean of a r.v.

We’ve previously discussed how the mean of a dataset is its balance point (fulcrum)

  • Shockingly, the exact same interpretation applies


You can also consider \(\mu\) to be the “average in the long run”


Using the Law of Large Numbers:

  • As we produce more observations and take their averages (\(\bar{x}\))

  • We should approach/converge on the value of \(\mu\) (see the simulation sketch below)
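
A minimal simulation sketch of this convergence (assuming Python’s built-in random module; the sample sizes are arbitrary), drawing repeatedly from the express-checkout distribution whose mean is \(1.45\):

```python
import random

values = [0, 1, 2, 3, 4, 5]
probs = [0.4, 0.2, 0.15, 0.1, 0.1, 0.05]

random.seed(1)
for n in (100, 10_000, 1_000_000):
    draws = random.choices(values, weights=probs, k=n)
    print(n, sum(draws) / n)   # the sample mean x-bar drifts toward mu = 1.45 as n grows
```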




Variance of a Discrete Probability Distribution

As discussed, probability distributions have spread, spread we can measure


Denoting the variance of discrete r.v. \(X\):


\[\sigma^2_X \quad \text{or} \quad \text{Var}(X) \quad \text{or} \quad \mathbb{V}X\]


The general formula for the variance of r.v. \(X\):


\[\sigma^2=\sum_x (x-\mu)^2 P(X=x)\]




\(X=\{\text{number of customers in a line at the express checkout counter}\}\)


\[ \begin{array}{|c|c|c|c|c|c|c|} \hline x & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline P(X = x) & 0.4 & 0.2 & 0.15 & 0.1 & 0.1 & 0.05 \\ \hline \end{array} \]


Recalling our definition of variance:


\[\sigma^2=\sum_x (x-\mu)^2 P(X=x)\]


We get:


\[\sigma^2_X=\sum_x (x-\mu)^2 P(X=x), \ \ \ \text{recall} \ \mu=1.45\]

\[=(0-1.45)^2(0.4)+(1-1.45)^2(0.2)+(2-1.45)^2(0.15)+(3-1.45)^2(0.1)+(4-1.45)^2(0.1)+(5-1.45)^2(0.05)\]

\[=2.4475\]


For standard deviation, defined loosely as the “average” distance from \(\mu\) in the probability distribution:


\[\sigma=\sqrt{\sigma^2}\]


So:

\[\sigma_X=\sqrt{2.4475}\approx 1.564\]
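
A minimal Python sketch computing the same variance and standard deviation from the table (illustrative only):

```python
dist = {0: 0.4, 1: 0.2, 2: 0.15, 3: 0.1, 4: 0.1, 5: 0.05}

mu = sum(x * p for x, p in dist.items())                   # 1.45
var = sum((x - mu) ** 2 * p for x, p in dist.items())      # 2.4475
sd = var ** 0.5                                            # about 1.564
print(round(var, 4), round(sd, 3))
```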




Given the following probability distribution of discrete r.v. \(X\):


\[ \begin{array}{|c|c|c|c|c|} \hline x & 0 & 1 & 2 & 3 \\ \hline P(X = x) & 0.3 & 0.4 & 0.2 & 0.1 \\ \hline \end{array} \]


  1. Is this a proper probability distribution? Why or why not?



  2. Find \(\mathbb{E}(X)\)



  3. Find \(\text{Var}(X)\)



  4. Find \(\sigma_X\)





Given the probability histogram for r.v. \(X\):

  • What is its shape?




Attendance QOTD


Go away