# Lecture Notes

math121b:04-15

## Concepts

There is nothing you cannot illustrate by drawing two circles.

Sample Space : just a set, usually denoted $\Omega$. Elements of $\Omega$ are called outcomes. Certain nice subsets of $\Omega$ are called events. (In practice, all subsets that you can cook up are 'nice'.)

Probability : given an event $A \subseteq \Omega$, $\P(A)$ is a real number, called the 'probability' that $A$ happens. It needs to satisfy the following conditions:

• $\P(A) \in [0, 1]$, and $\P(\Omega) = 1, \P(\emptyset) = 0$
• If $A_1, A_2$ are events, that are mutually exclusive, i.e. $A_1 \cap A_2 = \emptyset$, then $\P(A_1 \cup A_2) = \P(A_1) + \P(A_2)$. More generally, if you are given countably many mutually exclusive events $A_1, A_2, \cdots$, then $\P(\cup_i A_i ) = \sum_i \P(A_i)$.

I will call $\P$ a probability measure on $\Omega$.
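As a minimal sketch of these axioms, here is a finite probability space in code: a fair die with the uniform measure (the die example and all event names are assumptions for illustration).

```python
# A finite probability space: Omega = outcomes of one fair die roll,
# with the uniform probability measure P(A) = |A| / |Omega|.
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}          # sample space

def P(event):
    """Uniform probability measure: each outcome has weight 1/6."""
    return Fraction(len(event), len(omega))

evens = {2, 4, 6}

print(P(omega))                      # P(Omega) = 1
print(P(set()))                      # P(empty set) = 0
# Additivity for mutually exclusive events: evens and {1} are disjoint
print(P(evens | {1}) == P(evens) + P({1}))  # True
```

Using `Fraction` keeps the probabilities exact, so the additivity check is an honest equality rather than a floating-point comparison.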

Independence : we say two events $A, B$ are independent if $\P(A \cap B) = \P(A) \P(B)$.
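The definition can be checked directly on a small example, assuming two fair coin flips as the sample space:

```python
# Checking the independence definition P(A ∩ B) = P(A) P(B)
# on the sample space of two fair coin flips.
from fractions import Fraction
from itertools import product

omega = set(product("HT", repeat=2))   # {('H','H'), ('H','T'), ...}

def P(event):
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == "H"}  # first flip is heads
B = {w for w in omega if w[1] == "H"}  # second flip is heads

print(P(A & B) == P(A) * P(B))         # True: the two flips are independent
```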

Conditional Probability : suppose we want to know “given that $B$ happens, what is the probability $A$ will happen?” We define $$\P(A | B) := \frac{ \P(A \cap B) } {\P(B) }$$

The product rule $$\P(A \cap B) = \P(A | B) \P(B) = \P(B | A) \P(A)$$

Bayes' Formula : suppose you know $\P(A)$, $\P(B)$ and $\P(B | A)$; then you can compute $\P(A | B)$: $$\P(A | B) = \frac{\P(B | A) \P(A)}{\P(B)}.$$
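Here is Bayes' formula in code, with made-up numbers for a diagnostic test (the prevalence and error rates below are assumptions, not data): $A$ = "has the disease", $B$ = "test is positive".

```python
# Bayes' formula with hypothetical numbers for a diagnostic test.
P_A = 0.01             # prior P(A): 1% prevalence (assumed)
P_B_given_A = 0.95     # sensitivity P(B|A) (assumed)
P_B_given_notA = 0.05  # false-positive rate P(B|not A) (assumed)

# Total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
P_B = P_B_given_A * P_A + P_B_given_notA * (1 - P_A)

# Bayes: P(A|B) = P(B|A) P(A) / P(B)
P_A_given_B = P_B_given_A * P_A / P_B
print(P_A_given_B)     # about 0.16 — small, despite the accurate test
```

The point of the numbers: even with a 95%-accurate test, a positive result only raises the probability of disease to about 16%, because the prior $\P(A)$ is small.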

Random Variable : a random variable is a (measurable) function $X: \Omega \to \R$.

The distribution of a random variable is a probability measure on $\R$, such that given an interval $(a,b) \subseteq \R$, we define $$\P_X( (a,b)) = \P( \{ \omega \in \Omega | X(\omega) \in (a,b) \}).$$ This is like we are pushing forward the probability measure on $\Omega$ to $\R$.
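The pushforward can be made concrete on the die example (the choice $X(\omega) = \omega \bmod 2$ is just an assumed illustration): the distribution of $X$ assigns to a set of values the measure of its preimage.

```python
# Pushforward of a probability measure: the distribution of X is
# P_X(V) = P(X^{-1}(V)), computed here for X(w) = w mod 2 on a fair die.
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event), len(omega))

def X(w):
    return w % 2                    # a random variable Omega -> {0, 1}

def P_X(values):
    """Distribution of X: the measure of the preimage X^{-1}(values)."""
    return P({w for w in omega if X(w) in values})

print(P_X({1}))                     # 1/2, since the preimage is {1, 3, 5}
```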

Probability Density : suppose $\P$ is a probability measure on $\R$; then sometimes we can find a function $\rho(x)$, such that $$\P( (a,b) ) = \int_a^b \rho(x) dx$$ In this case, we call $\rho(x)$ the density of $\P$ (with respect to $dx$).
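A quick numerical sanity check of this definition, assuming the standard exponential density $\rho(x) = e^{-x}$ on $[0, \infty)$ as the example:

```python
# For rho(x) = e^{-x}, P((a,b)) = ∫_a^b rho(x) dx; here we approximate
# the integral with the midpoint rule and compare to the exact answer.
import math

def rho(x):
    return math.exp(-x)

def P_interval(a, b, n=100_000):
    """Midpoint-rule approximation of the integral of rho over (a, b)."""
    h = (b - a) / n
    return sum(rho(a + (i + 0.5) * h) for i in range(n)) * h

print(P_interval(0.0, 1.0))   # ≈ 1 - e^{-1} ≈ 0.632
```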

This generalizes to more than one variable.

Joint density, conditional density : suppose we have a probability density on $\R^2$, denoted $\rho_{XY}(x,y)$, with the meaning $$\P( X \in (a,b), Y \in (c,d)) = \int_{x=a}^{b} \int_{y=c}^{d} \rho_{XY}(x,y)\, dy\, dx.$$ Suppose we know that $Y$ is near $y_0$; given this information, how is $X$ distributed? We have $$\rho_{X|Y}(x|y=y_0) \propto \rho_{XY}(x, y_0).$$ This is a function of $x$; we just need to 'renormalize' it so that its integral over $x \in \R$ is $1$. This gives $$\rho_{X|Y}(x|y=y_0) = \frac{ \rho_{XY}(x, y_0)}{ \int_\R \rho_{XY}(x', y_0)\, dx' }.$$
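The slice-and-renormalize step can be sketched on a grid; the joint density below (two independent standard Gaussians) is an assumed example, not anything special to the construction:

```python
# Conditioning a joint density numerically: discretize x, slice the joint
# density rho_XY at y = y0, and renormalize so it integrates to 1 in x.
import math

def rho_XY(x, y):
    # assumed example: two independent standard Gaussians
    return math.exp(-(x * x + y * y) / 2) / (2 * math.pi)

dx = 0.01
xs = [i * dx for i in range(-500, 501)]     # grid on [-5, 5]
y0 = 1.0

slice_vals = [rho_XY(x, y0) for x in xs]    # rho_XY(x, y0) as a function of x
Z = sum(slice_vals) * dx                    # normalizing integral over x

cond = [v / Z for v in slice_vals]          # rho_{X|Y}(x | y = y0)
print(sum(cond) * dx)                       # ≈ 1 after renormalization
```

Since $X$ and $Y$ are independent in this example, the conditional density comes out as the standard Gaussian in $x$ again, whatever $y_0$ is.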

## How to play the probability game?

First things first: find out what $\Omega$ and $\P$ are.

Be aware when someone says: “let me randomly choose ….”

Here is an interesting example: the Bertrand Paradox.

Another hard example: let $M$ be a random symmetric matrix of size $N \times N$, whose entries are i.i.d. Gaussian. Question: how are the eigenvalues of this matrix distributed?
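One can get an empirical feel for this question by sampling such a matrix (this sketch assumes `numpy` is available; the symmetrization convention below is one common choice). The classical answer is Wigner's semicircle law: the eigenvalues of $M/\sqrt{N}$ fill out a semicircular density supported on $[-2, 2]$ as $N$ grows.

```python
# Sample a random symmetric matrix with Gaussian entries and inspect
# the eigenvalues of M / sqrt(N); they concentrate on [-2, 2].
import numpy as np

rng = np.random.default_rng(0)
N = 200

A = rng.standard_normal((N, N))
M = (A + A.T) / np.sqrt(2)          # symmetrize (diagonal gets variance 2)
eigs = np.linalg.eigvalsh(M / np.sqrt(N))

print(eigs.min(), eigs.max())       # roughly -2 and 2
```

A histogram of `eigs` for large $N$ traces out the semicircle; the extreme eigenvalues stick close to the edges $\pm 2$.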