Discrete Math Problems: IX Probability Mass Functions
Probability Mass Functions
- A box contains $N$ balls labeled $1,2,\dots,N$. One ball is selected uniformly at random. Let $X$ denote the label of the selected ball. Find the PMF, CDF, expected value, and variance of $X$.
Since the total sum of all probabilities must equal \(1\) and each label is equally likely to be selected, the uniform PMF of \(X\) written \(P:\{1,\ldots,N\}\to[0,1]\) is defined by: $ P(x)=\frac1N $ for all \(x\in\{1,\ldots,N\}\).
The CDF of \(X\) is \(F_X:\mathbb{R}\to[0,1]\) defined by $ F_X(x)=P(X\le x) $ for all \(x\in\mathbb{R}\). If \(1\le x < N\), then exactly \(\lfloor x \rfloor\) labels satisfy \(X\le x\). Since each label has probability \(\frac1N\), $ F_X(x)=\frac{\lfloor x \rfloor}{N}. $ If \(x<1\), then no labels satisfy \(X\le x\), so $ F_X(x)=0. $ If \(x\ge N\), then all \(N\) labels satisfy \(X\le x\), so $ F_X(x)=1. $ Therefore, the uniform CDF of \(X\) is defined as:
\[ F_X(x)= \begin{cases} 0 & \text{if } x < 1\\[6pt] \dfrac{\lfloor x \rfloor}{N} & \text{if } 1 \le x < N\\[6pt] 1 & \text{if } x \ge N \end{cases} \]
The expected value of \(X\) is $ E[X] = \sum_{x=1}^{N} xP(x). $ Since $ P(x)=\dfrac1N $ for all \(x\in\{1,\dots,N\}\), $ E[X] = \dfrac1N\sum_{x=1}^{N} x. $ Using the triangle number summation $ 1+2+\cdots+N=\dfrac{N(N+1)}{2}, $ one obtains $ E[X] = \dfrac1N\cdot\dfrac{N(N+1)}{2} = \dfrac{N+1}{2}. $
The variance of \(X\) is defined by $ \operatorname{Var}(X) = E[X^2]-(E[X])^2. $ First compute $ E[X^2] = \sum_{x=1}^{N} x^2P(x). $ Since $ P(x)=\dfrac1N $ for all \(x\in\{1,\dots,N\}\), $ E[X^2] = \dfrac1N\sum_{x=1}^{N} x^2. $ Using the square number summation $ \sum_{x=1}^{N} x^2 = \dfrac{N(N+1)(2N+1)}{6}, $ one obtains $ E[X^2] = \dfrac1N\cdot\dfrac{N(N+1)(2N+1)}{6} = \dfrac{(N+1)(2N+1)}{6}. $ Since $ E[X]=\dfrac{N+1}{2}, $ the variance is $ \operatorname{Var}(X) = \dfrac{(N+1)(2N+1)}{6} - \left(\dfrac{N+1}{2}\right)^2. $ Simplifying, $ \operatorname{Var}(X) = \dfrac{(N+1)(2N+1)}{6} - \dfrac{(N+1)^2}{4} = \dfrac{N^2-1}{12}. $
-
A box contains $N$ balls, of which $K$ are red. One ball is selected uniformly at random. Let $X$ be the number of red balls selected. Find the PMF, CDF, expected value, and variance of $X$.
Since exactly one ball is selected, the random variable \(X\) can only take the values \(0\) and \(1\):
$$ X= \begin{cases} 1 & \text{if the selected ball is red}\\[6pt] 0 & \text{otherwise} \end{cases} $$
Since \(K\) of the \(N\) balls are red and every ball is equally likely to be selected, the Bernoulli PMF of \(X\) written \(P:\{0,1\}\to[0,1]\) is defined by: $ P(1)=\dfrac{K}{N} $ and $ P(0)=\dfrac{N-K}{N}. $ Equivalently,
\[ P(x)= \begin{cases} \dfrac{N-K}{N} & \text{if } x=0\\[6pt] \dfrac{K}{N} & \text{if } x=1 \end{cases} \]
The cumulative distribution function (CDF) of \(X\) is \(F_X:\mathbb{R}\to[0,1]\) defined by $ F_X(x)=P(X\le x) $ for all \(x\in\mathbb{R}\). If \(x<0\), then no possible values satisfy \(X\le x\), so $ F_X(x)=0. $ If \(0\le x<1\), then only the value \(X=0\) satisfies \(X\le x\). Since $ P(X=0)=\dfrac{N-K}{N}, $ one has $ F_X(x)=\dfrac{N-K}{N}. $ If \(x\ge1\), then both possible values \(0\) and \(1\) satisfy \(X\le x\), so $ F_X(x)=1. $ Therefore, the Bernoulli CDF of \(X\) is defined as:
\[ F_X(x)= \begin{cases} 0 & \text{if } x < 0\\[6pt] \dfrac{N-K}{N} & \text{if } 0 \le x < 1\\[6pt] 1 & \text{if } x \ge 1 \end{cases} \]
The expected value of \(X\) is $ E[X] = \sum_{x=0}^{1} xP(x). $ Since $ P(0)=\dfrac{N-K}{N} $ and $ P(1)=\dfrac{K}{N}, $ one obtains $ E[X] = 0\cdot\dfrac{N-K}{N} + 1\cdot\dfrac{K}{N} = \dfrac{K}{N}. $
The variance of \(X\) is defined by $ \operatorname{Var}(X) = E[X^2]-(E[X])^2. $ First compute $ E[X^2] = \sum_{x=0}^{1} x^2P(x). $ Since $ 0^2=0 $ and $ 1^2=1, $ one obtains $ E[X^2] = 0^2\cdot\dfrac{N-K}{N} + 1^2\cdot\dfrac{K}{N} = \dfrac{K}{N}. $ Since $ E[X]=\dfrac{K}{N}, $ the variance is $ \operatorname{Var}(X) = \dfrac{K}{N} - \left(\dfrac{K}{N}\right)^2. $ Simplifying, $ \operatorname{Var}(X) = \dfrac{K(N-K)}{N^2}. $
-
A box contains $N$ balls, of which $K$ are red. A ball is selected independently with replacement $n$ times. Let $X$ be the number of red balls selected. Find the PMF, CDF, expected value, and variance of $X$.
Since each selection is performed with replacement, the probability of selecting a red ball remains constant on every trial: $ p=\dfrac{K}{N}. $ Thus, each trial has two possible outcomes:
$$ \begin{cases} \text{red} & \text{with probability } \dfrac{K}{N}\\[6pt] \text{not red} & \text{with probability } \dfrac{N-K}{N} \end{cases} $$
Since the trials are independent and \(X\) counts the number of red balls selected in \(n\) trials, the random variable \(X\) can take the values $ \{0,1,\dots,n\}. $ To obtain exactly \(x\) red balls, one must choose which \(x\) of the \(n\) trials result in red selections. The number of such choices is $ \binom{n}{x}. $ Each such outcome has probability $ \left(\dfrac{K}{N}\right)^x \left(\dfrac{N-K}{N}\right)^{n-x}, $ since the trials are independent. Therefore, the Binomial PMF of \(X\), written \(P:\{0,\dots,n\}\to[0,1]\), is defined by:
\[ P(x)= \begin{cases} \displaystyle \binom{n}{x} \left(\dfrac{K}{N}\right)^x \left(\dfrac{N-K}{N}\right)^{n-x} & \text{if } x\in\{0,1,\dots,n\}\\[12pt] 0 & \text{otherwise} \end{cases} \]
The cumulative distribution function (CDF) of \(X\) is \(F_X:\mathbb{R}\to[0,1]\) defined by $ F_X(x)=P(X\le x) $ for all \(x\in\mathbb{R}\). If \(x<0\), then no possible values satisfy \(X\le x\), so $ F_X(x)=0. $ If \(0\le x<n\), then exactly the integers $ 0,1,\dots,\lfloor x\rfloor $ satisfy \(X\le x\). Therefore, $ F_X(x) = \sum_{k=0}^{\lfloor x\rfloor} \displaystyle\binom{n}{k} \left(\dfrac{K}{N}\right)^k \left(\dfrac{N-K}{N}\right)^{n-k}. $ If \(x\ge n\), then all possible values $ 0,1,\dots,n $ satisfy \(X\le x\), so $ F_X(x)=1. $ Therefore, the Binomial CDF of \(X\) is defined as:
\[ F_X(x)= \begin{cases} 0 & \text{if } x < 0\\[8pt] \displaystyle \sum_{k=0}^{\lfloor x\rfloor} \binom{n}{k} \left(\dfrac{K}{N}\right)^k \left(\dfrac{N-K}{N}\right)^{n-k} & \text{if } 0\le x < n\\[14pt] 1 & \text{if } x\ge n \end{cases} \]
The expected value of \(X\) is $ E[X] = \sum_{x=0}^{n} xP(x). $ Directly evaluating this summation is possible but cumbersome. Instead, observe that each selection is an independent Bernoulli trial with probability $ p=\dfrac{K}{N} $ of selecting a red ball. For each trial \(i\in\{1,\dots,n\}\), define the indicator random variable:
$$ X_i= \begin{cases} 1 & \text{if the \(i\)-th selected ball is red}\\[6pt] 0 & \text{otherwise} \end{cases} $$
Then the total number of red balls selected is $ X=X_1+X_2+\cdots+X_n. $ Since each \(X_i\) is a Bernoulli random variable with success probability $ p=\dfrac{K}{N}, $ one has $ E[X_i]=p=\dfrac{K}{N} $ for every \(i\). Using linearity of expectation, $ E[X] = E[X_1+\cdots+X_n] = E[X_1]+\cdots+E[X_n]. $ Therefore, $ E[X] = \underbrace{\dfrac{K}{N}+\cdots+\dfrac{K}{N}}_{n\text{ times}} = n\cdot\dfrac{K}{N} = \dfrac{nK}{N}. $
The variance of \(X\) is defined by $ \operatorname{Var}(X) = E[X^2]-(E[X])^2. $ Rather than computing \(E[X^2]\) directly from the PMF, it is easier to use the indicator variable representation $ X=X_1+X_2+\cdots+X_n $ for each \(i\in\{1,\dots,n\}\). Each \(X_i\) is a Bernoulli random variable with success probability $ p=\dfrac{K}{N}. $ Since $ P(X_i=1)=p $ and $ P(X_i=0)=1-p, $ one has $ E[X_i]=p. $ Now compute the variance of a single trial: $ \operatorname{Var}(X_i) = E[X_i^2]-(E[X_i])^2. $ Because $ X_i\in\{0,1\}, $ one has $ X_i^2=X_i. $ Therefore, $ E[X_i^2] = E[X_i] = p. $ Substituting into the variance formula gives $ \operatorname{Var}(X_i) = p-p^2 = p(1-p). $
Since the trials are independent, $ \operatorname{Var}(X) = \operatorname{Var}(X_1+\cdots+X_n) = \operatorname{Var}(X_1)+\cdots+\operatorname{Var}(X_n). $ Thus, $ \operatorname{Var}(X) = \underbrace{p(1-p)+\cdots+p(1-p)}_{n\text{ times}} = np(1-p). $ Substituting $ p=\dfrac{K}{N}, $ one obtains $ \operatorname{Var}(X) = n\cdot\dfrac{K}{N}\left(1-\dfrac{K}{N}\right). $ Simplifying, $ \operatorname{Var}(X) = \dfrac{nK(N-K)}{N^2}. $
-
A box contains $N$ balls, of which $K$ are red. Balls are selected independently with replacement until the first red ball appears. Let $X$ be the number of selections required. Find the PMF, CDF, expected value, and variance of $X$.
Since the selections are performed with replacement, the probability of selecting a red ball remains constant on every trial: $ p=\dfrac{K}{N}. $ Thus, each trial has two possible outcomes:
$$ \begin{cases} \text{red} & \text{with probability } \dfrac{K}{N}\\[6pt] \text{not red} & \text{with probability } \dfrac{N-K}{N} \end{cases} $$
The random variable \(X\) denotes the number of selections required until the first red ball appears. Therefore, $ X\in\{1,2,3,\dots\}. $ To obtain $ X=x, $ the first $ x-1 $ selections must all be non-red, and the \(x\)-th selection must be red.
Since the selections are independent, the probability of this event is $ \left(\dfrac{N-K}{N}\right)^{x-1} \left(\dfrac{K}{N}\right). $ Therefore, the geometric PMF of \(X\), written \(P:\mathbb{N}\to[0,1]\), is defined by:
\[ P(x)= \begin{cases} \displaystyle \left(\dfrac{N-K}{N}\right)^{x-1} \left(\dfrac{K}{N}\right) & \text{if } x\in\mathbb{N}\\[12pt] 0 & \text{otherwise} \end{cases} \]
The cumulative distribution function (CDF) of \(X\) is \(F_X:\mathbb{R}\to[0,1]\) defined by $ F_X(x)=P(X\le x) $ for all \(x\in\mathbb{R}\). If \(x<1\), then no possible values satisfy \(X\le x\), so $ F_X(x)=0. $ If \(x\ge1\), then the event $ X\le x $ means that at least one red ball appears within the first $ \lfloor x\rfloor $ selections.
It is easier to compute the complement event that no red balls appear in the first $ \lfloor x\rfloor $ selections. Since each selection is non-red with probability $ \dfrac{N-K}{N}, $ the probability that all first $ \lfloor x\rfloor $ selections are non-red is $ \left(\dfrac{N-K}{N}\right)^{\lfloor x\rfloor}. $
Therefore, $ F_X(x) = 1- \left(\dfrac{N-K}{N}\right)^{\lfloor x\rfloor} $ for all $ x\ge1. $ Thus, the geometric CDF of \(X\) is defined as:
\[ F_X(x)= \begin{cases} 0 & \text{if } x<1\\[10pt] \displaystyle 1- \left(\dfrac{N-K}{N}\right)^{\lfloor x\rfloor} & \text{if } x\ge1 \end{cases} \]
The expected value of \(X\) is $ E[X] = \sum_{x=1}^{\infty} xP(x). $ Since $ P(x) = \left(\dfrac{N-K}{N}\right)^{x-1} \left(\dfrac{K}{N}\right), $ one obtains $ E[X] = \dfrac{K}{N} \sum_{x=1}^{\infty} x \left(\dfrac{N-K}{N}\right)^{x-1}. $ Using the geometric-series identity $ \sum_{x=1}^{\infty} xr^{x-1} = \dfrac{1}{(1-r)^2}$ for $|r|<1$, with $ r=\dfrac{N-K}{N}, $ one obtains $ E[X] = \dfrac{K}{N} \cdot \dfrac{1}{ \left( 1-\dfrac{N-K}{N} \right)^2 }. $ Simplifying, $ 1-\dfrac{N-K}{N} = \dfrac{K}{N}, $ so $ E[X] = \dfrac{K}{N} \cdot \dfrac{1}{ \left(\dfrac{K}{N}\right)^2 }. $ Therefore, $ E[X] = \dfrac{K}{N} \cdot \dfrac{N^2}{K^2} = \dfrac{N}{K}. $
The variance of \(X\) is defined by $ \operatorname{Var}(X) = E[X^2]-(E[X])^2. $ First compute $ E[X^2] = \sum_{x=1}^{\infty} x^2 P(x). $ Since $ P(x)=\left(\frac{N-K}{N}\right)^{x-1}\left(\frac{K}{N}\right), $ we obtain $ E[X^2] = \frac{K}{N} \sum_{x=1}^{\infty} x^2\left(\frac{N-K}{N}\right)^{x-1}. $ Using the geometric-series identity $ \sum_{x=1}^{\infty} x^2 r^{x-1} = \frac{1+r}{(1-r)^3} \quad \text{for } |r|<1, $ with $ r=\frac{N-K}{N}, $ we get $ E[X^2] = \frac{K}{N} \cdot \frac{1+\frac{N-K}{N}}{\left(1-\frac{N-K}{N}\right)^3}. $ Simplifying step by step: $ 1+\frac{N-K}{N} = \frac{2N-K}{N}$ and $1-\frac{N-K}{N} = \frac{K}{N}. $ So $ E[X^2] = \frac{K}{N} \cdot \frac{\frac{2N-K}{N}}{\left(\frac{K}{N}\right)^3}. $ Now simplify: $ E[X^2] = \frac{K}{N} \cdot \frac{2N-K}{N} \cdot \frac{N^3}{K^3} = \frac{(2N-K)N}{K^2}. $ Now compute the variance: $ \operatorname{Var}(X) = E[X^2]-(E[X])^2 = \frac{(2N-K)N}{K^2} - \frac{N^2}{K^2}. $ Factor: $ \operatorname{Var}(X) = \frac{N}{K^2}\bigl((2N-K)-N\bigr). $ Simplify inside: $ (2N-K)-N = N-K. $ Thus $ \operatorname{Var}(X) = \frac{N(N-K)}{K^2}. $
-
A box contains $N$ balls, of which $K$ are red. A sample of $n$ balls is selected without replacement. Let $X$ be the number of red balls selected. Find the PMF, CDF, expectation, and variance of $X$.
Since the sampling is performed without replacement, the trials are dependent and the probability of success changes after each draw. The random variable \(X\) counts the number of red balls in a sample of size \(n\), so $ X \in \{0,1,2,\dots,n\}. $ To obtain \(X=x\), we must choose exactly: \(x\) red balls from the \(K\) red balls and \(n-x\) non-red balls from the \(N-K\) non-red balls. The number of favorable outcomes is $ \binom{K}{x}\binom{N-K}{n-x}, $ and the total number of outcomes is $ \binom{N}{n}. $ Thus, the PMF is $ P(X=x)=\frac{\binom{K}{x}\binom{N-K}{n-x}}{\binom{N}{n}}. $ Hence, the hypergeometric PMF is:
\[ P(x)= \begin{cases} \displaystyle \frac{\binom{K}{x}\binom{N-K}{n-x}}{\binom{N}{n}} & \text{if } x \in \mathbb{Z},\ \max(0,n-(N-K)) \le x \le \min(n,K)\\[12pt] 0 & \text{otherwise} \end{cases} \]
If \(n > N-K\), then all \(N-K\) non-red balls are used up and the remaining \(n-(N-K)\) selections must be red, while at the same time \(X\) cannot exceed \(n\) (since only \(n\) balls are drawn) or \(K\) (since only \(K\) red balls exist), so $ X \in \bigl[\max(0,\,n-(N-K)),\,\min(n,K)\bigr]. $
The cumulative distribution function (CDF) of \(X\) is \(F_X:\mathbb{R}\to[0,1]\) defined by $ F_X(x)=P(X\le x) $ for all \(x\in\mathbb{R}\). Since \(X\) counts the number of red balls in a sample of size \(n\) drawn without replacement, it follows a hypergeometric distribution with finite support. If \(x<0\), then no possible values satisfy \(X\le x\), so $ F_X(x)=0. $ If \(x\ge n\), then all possible outcomes satisfy \(X\le x\), so $ F_X(x)=1. $ For \(0 \le x < n\), the event \(X \le x\) means that at most \(\lfloor x \rfloor\) red balls are selected. Therefore we sum the PMF over all feasible values up to \(\lfloor x \rfloor\). Thus, $ F_X(x) = \sum_{k=0}^{\lfloor x\rfloor} \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}. $ Therefore, the hypergeometric CDF of \(X\) is defined as:
\[ F_X(x)= \begin{cases} 0 & \text{if } x<0\\[10pt] \displaystyle \sum_{k=0}^{\lfloor x\rfloor} \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}} & \text{if } 0 \le x < n\\[12pt] 1 & \text{if } x\ge n \end{cases} \]
Directly evaluating the definition of expectation $ E[X]=\sum_{x=0}^{n} xP(x) $ is possible but inefficient. Instead, we use indicator random variables. For each trial \(i\in\{1,\dots,n\}\), define
$$ X_i= \begin{cases} 1 & \text{if the \(i\)-th selected ball is red}\\[6pt] 0 & \text{otherwise} \end{cases} $$
Then the total number of red balls selected is $ X=X_1+X_2+\cdots+X_n. $ Each \(X_i\) is Bernoulli with success probability $ p=\dfrac{K}{N}, $ so $ E[X_i]=p=\dfrac{K}{N}. $ Using linearity of expectation, $ E[X] = E[X_1+\cdots+X_n] = E[X_1]+\cdots+E[X_n]. $ Therefore, $ E[X] = \frac{nK}{N}. $
The variance of \(X\) is defined by $ \operatorname{Var}(X)=E[X^2]-(E[X])^2. $ We use the identity $ X=X_1+\cdots+X_n. $ Each indicator satisfies \(X_i\in\{0,1\}\), so \(X_i^2=X_i\). Hence $ \operatorname{Var}(X_i)=p(1-p), \quad p=\frac{K}{N}. $ Now we expand the variance of a sum:
$$ \operatorname{Var}(X_1+\cdots+X_n) = \sum_{i=1}^n \operatorname{Var}(X_i) + 2\sum_{1\le i< j\le n}\operatorname{Cov}(X_i,X_j) $$
Here we use the definition of covariance: $ \operatorname{Cov}(X_i,X_j) = E[X_iX_j]-E[X_i]E[X_j]. $ Since sampling is without replacement, $ P(X_i=1,X_j=1) = \frac{K}{N}\cdot\frac{K-1}{N-1}, $ so $ E[X_iX_j] = \frac{K}{N}\cdot\frac{K-1}{N-1}. $ Also, $ E[X_i]E[X_j] = \left(\frac{K}{N}\right)^2. $ Thus, $ \operatorname{Cov}(X_i,X_j) = \frac{K}{N}\frac{K-1}{N-1} - \left(\frac{K}{N}\right)^2 = -\frac{K(N-K)}{N^2(N-1)}. $ There are \(\binom{n}{2} = \frac{n(n-1)}{2}\) covariance terms, so $ 2\sum_{1\le i < j\le n}\operatorname{Cov}(X_i,X_j) = n(n-1)\left(-\frac{K(N-K)}{N^2(N-1)}\right). $ Now combine terms: $ \operatorname{Var}(X) = n\frac{K}{N}\left(1-\frac{K}{N}\right) - n(n-1)\frac{K(N-K)}{N^2(N-1)}. $ Factor: $ \operatorname{Var}(X) = n\frac{K}{N}\frac{N-K}{N} \left(1-\frac{n-1}{N-1}\right). $ Simplify: $ 1-\frac{n-1}{N-1} = \frac{N-n}{N-1}. $ Therefore, $ \operatorname{Var}(X) = n\frac{K}{N}\frac{N-K}{N}\frac{N-n}{N-1}. $
Comments
Post a Comment