In the previous chapter we learnt what a probability function is and looked at some examples. In this chapter we focus on numerical quantities associated with the outcomes of experiments.
Random Variable
Let's start with an example of a board game in which we throw two dice and the player moves forward by as many steps as the sum of the numbers on the dice. That is, we are not much interested in the direct outcome of the experiment that led to the sum, but rather in the sum itself.
What is the probability that the sum will be 10?
Let's map the outcomes and check.
import itertools
dice_values = list(range(1, 7))  # All possible values for each die, i.e. 1, 2, ..., 6
result_dict = {}
# Sample space: all ordered pairs {(1, 1), (1, 2), ..., (6, 5), (6, 6)}
all_throws = set(itertools.product(dice_values, repeat=2))
# Pair each outcome with the value of the random variable (the sum of the two dice)
outcomes = {(throw, throw[0] + throw[1]) for throw in all_throws}
# Group the outcomes by their sum
for out in outcomes:
  if out[1] in result_dict:
    result_dict[out[1]].append(out[0])
  else:
    result_dict[out[1]] = [out[0]]
result_dict = dict(sorted(result_dict.items()))
print("Mapping Outcomes to \u211D", '\n')
print(18*'-', '\u03A9', (41-len('\u03A9'))*'-','\u211D')
for key, value in result_dict.items():
  print(sorted(value), (60-len(str(value)))*"-", key)
Mapping Outcomes to ℝ

------------------ Ω ---------------------------------------- ℝ
[(1, 1)] ---------------------------------------------------- 2
[(1, 2), (2, 1)] -------------------------------------------- 3
[(1, 3), (2, 2), (3, 1)] ------------------------------------ 4
[(1, 4), (2, 3), (3, 2), (4, 1)] ---------------------------- 5
[(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)] -------------------- 6
[(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)] ------------ 7
[(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)] -------------------- 8
[(3, 6), (4, 5), (5, 4), (6, 3)] ---------------------------- 9
[(4, 6), (5, 5), (6, 4)] ------------------------------------ 10
[(5, 6), (6, 5)] -------------------------------------------- 11
[(6, 6)] ---------------------------------------------------- 12

So now our focus is on the probability of this numerical quantity (the sum of the dice) rather than on the individual outcomes of the sample space.
Some points to remember :
- Every outcome is mapped to some real number.
- No outcome is mapped to more than one real number.
- Several outcomes can be mapped to the same real number.
Thus we have mapped the outcomes to a numerical quantity. $$\Omega \to \mathbb{R}$$
Our original question asks : What is the probability that the sum will be 10?
That is, we are not asking about the outcomes themselves but about the numerical quantity associated with them (sum = 10).
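The question can be answered by counting, assuming both dice are fair so that all 36 outcomes are equally likely. The following is a minimal sketch under that assumption; the helper name `prob_sum_equals` is illustrative and not part of the original notebook.

```python
import itertools
from fractions import Fraction

def prob_sum_equals(target, faces=6):
    """P(sum of two fair dice == target), by counting equally likely outcomes."""
    throws = list(itertools.product(range(1, faces + 1), repeat=2))
    favourable = [t for t in throws if sum(t) == target]
    return Fraction(len(favourable), len(throws))

# Three outcomes give a sum of 10: (4, 6), (5, 5), (6, 4)
print(prob_sum_equals(10))  # prints 1/12, i.e. 3/36
```

`Fraction` keeps the result exact, which makes it easy to compare against the counts in the table above.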
Let's take another example :
Experiment : Randomly select an employee
$\Omega$ : All employees of the organisation
$\mathbb{R}$ : Number of years of experience, number of projects completed, salary etc.
Question of Interest :
What is the probability that an employee has salary more than 50k?
What is the probability that an employee has completed 10 projects?
A random variable is a function from a set of possible outcomes to the set of real numbers.
$X_{\text{function (R.V.)}} : \Omega_{\text{domain}} \to \mathbb{R}_{\text{range (subset of real numbers)}}$
Multiple functions (random variables) are possible for a given domain (sample space).
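To make the last point concrete, here is a small sketch that defines three different random variables on the same two-dice sample space. The function names (`dice_sum`, `dice_max`, `is_double`) are illustrative choices, not from the original text.

```python
import itertools

# Sample space: all ordered pairs from two six-sided dice.
omega = list(itertools.product(range(1, 7), repeat=2))

# Three different random variables on the same sample space.
def dice_sum(outcome):
    """Sum of the two dice."""
    return outcome[0] + outcome[1]

def dice_max(outcome):
    """Larger of the two dice."""
    return max(outcome)

def is_double(outcome):
    """1 if both dice show the same number, otherwise 0."""
    return 1 if outcome[0] == outcome[1] else 0

# Each outcome is mapped to exactly one number by each random variable.
for outcome in [(1, 1), (2, 5), (6, 4)]:
    print(outcome, dice_sum(outcome), dice_max(outcome), is_double(outcome))
```

Each function takes an outcome from the same $\Omega$ and returns a single real number, so each one is a random variable in the sense defined above.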
Notation :
Unlike ordinary functions, e.g. $X(n_1, n_2)$, we don't write brackets and arguments when denoting a random variable; we simply write $X$.
What are the values that a random variable can take?
Types of Random Variables:
- Discrete (finite or countably infinite)
  - The sum of the numbers on two dice
  - The outcome of a single die
  - The number of children that an employee has
  - The number of cars in an image
- Continuous
  - The amount of rainfall in Pune
  - The temperature of a surface
  - The density of a liquid
  - The haemoglobin level of a patient
Probability Distribution
What are the probabilities of the values that the random variable can take?
Let's return to our previous example of throwing two dice. The values that the random variable can take are the possible sums of the dice, i.e. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
$X_{\text{function (R.V.)}} : \Omega_{\text{domain}} \to \{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12\} \subset \mathbb{R}$
Our question of interest: What is the probability that the value of the random variable will be $x$ (let's say 6)?
$P(X = x) \in [0, 1]$, here with $x = 6$
An assignment of probabilities to all possible values that a discrete random variable $X$ can take is called the distribution of the discrete random variable.
import itertools
from fractions import Fraction
dice_values = list(range(1, 7))  # All possible values for each die, i.e. 1, 2, ..., 6
result_dict = {}
# Sample space: all ordered pairs {(1, 1), (1, 2), ..., (6, 5), (6, 6)}
all_throws = set(itertools.product(dice_values, repeat=2))
total_outcome = len(all_throws)
# Pair each outcome with the value of the random variable (the sum of the two dice)
outcomes = {(throw, throw[0] + throw[1]) for throw in all_throws}
# Group the outcomes by their sum; each group is the event X = x
for out in outcomes:
  if out[1] in result_dict:
    result_dict[out[1]].append(out[0])
  else:
    result_dict[out[1]] = [out[0]]
result_dict = dict(sorted(result_dict.items()))
print("Assigning Probabilities", '\n')
print(18*'-', 'Event : X = x', (37-len('Event : X = x'))*'-','x', '-', 'P(X = x)')
for key, value in result_dict.items():
  print(sorted(value), (54-len(str(value)))*"-","|", key, ' |', len(value), '/', total_outcome)
Assigning Probabilities

------------------ Event : X = x ------------------------ x - P(X = x)
[(1, 1)] ---------------------------------------------- | 2 | 1 / 36
[(1, 2), (2, 1)] -------------------------------------- | 3 | 2 / 36
[(1, 3), (2, 2), (3, 1)] ------------------------------ | 4 | 3 / 36
[(1, 4), (2, 3), (3, 2), (4, 1)] ---------------------- | 5 | 4 / 36
[(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)] -------------- | 6 | 5 / 36
[(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)] ------ | 7 | 6 / 36
[(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)] -------------- | 8 | 5 / 36
[(3, 6), (4, 5), (5, 4), (6, 3)] ---------------------- | 9 | 4 / 36
[(4, 6), (5, 5), (6, 4)] ------------------------------ | 10 | 3 / 36
[(5, 6), (6, 5)] -------------------------------------- | 11 | 2 / 36
[(6, 6)] ---------------------------------------------- | 12 | 1 / 36

Thus the probability that the random variable $X$ takes the value $x = 6$ is 5/36, since five of the 36 equally likely outcomes sum to 6.
We can also write this as:
$p_{X}(x) = P(X = x) = P(\{\omega \in \Omega : X(\omega) = x\})$
Here $p_{X}$ is the probability distribution of the R.V. $X$, and $\{\omega \in \Omega : X(\omega) = x\}$ is the collection of all outcomes for which the R.V. gives the output $x$.
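The definition $p_X(x) = P(\{\omega \in \Omega : X(\omega) = x\})$ translates almost directly into code. The sketch below assumes every outcome in the sample space is equally likely (true for fair dice, not in general); the helper name `pmf_from_random_variable` is illustrative.

```python
import itertools
from collections import Counter
from fractions import Fraction

def pmf_from_random_variable(omega, X):
    """Return {x: P(X = x)}, assuming every outcome in omega is equally likely."""
    # Count how many outcomes w satisfy X(w) = x, for each value x.
    counts = Counter(X(w) for w in omega)
    return {x: Fraction(n, len(omega)) for x, n in sorted(counts.items())}

omega = list(itertools.product(range(1, 7), repeat=2))
p_X = pmf_from_random_variable(omega, lambda w: w[0] + w[1])

print(p_X[6])  # prints 5/36
```

Passing a different function for `X` gives the distribution of a different random variable on the same sample space.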
Probability Mass Function
Thus $p_{X}(x)$ is the probability mass function of the random variable $X$.
It goes by several names:
- Probability Mass Function (PMF)
- Probability Distribution
- Distribution
import matplotlib.pyplot as plt
import numpy as np
def pmf_two_dice():
  """Calculates the PMF of the sum of two dice rolls."""
  outcomes = np.array([(i, j) for i in range(1, 7) for j in range(1, 7)])
  sums = outcomes.sum(axis=1)
  pmf = np.unique(sums, return_counts=True)[1] / 36  # Normalize by total outcomes
  return pmf
# Get the PMF and possible sums
pmf = pmf_two_dice()
sums = np.arange(2, 13)
# Create the figure; adjust figsize for the desired width and height
fig, ax = plt.subplots(figsize=(8, 5))
# Draw one bar per possible sum
ax.bar(sums, pmf, color='blue', alpha=0.7)
# Add probability values above bars as fractions
for x, p in zip(sums, pmf):
    y_pos = p + 0.001  # Position the label slightly above the bar
    fraction = f"{int(round(p * 36))}/36"  # round() guards against floating-point error
    ax.text(x, y_pos, fraction, ha='center', va='bottom')
# Labels and formatting
ax.set_xlabel("Sum of Dice Rolls (x)")
ax.set_ylabel("Probability P(X = x)")
ax.set_title("PMF of the Sum of Two Dice Rolls")
ax.grid(True)
ax.set_xticks(sums)  # Ensure all possible sums are shown on the x-axis
# Show the plot
plt.show()
Properties of a PMF
What are the requirements for a discrete probability distribution?
A probability mass function (PMF) has several key properties that define its validity and usefulness in representing a discrete probability distribution. Let's look into these properties:
- Non-negativity: Each probability must be non-negative (P(X = x) ≥ 0 for every value x that X can take). The probability of any event can't be less than zero.
- Summation to 1: The probabilities of all possible values must sum to 1. Mathematically, ΣP(X = x) = 1, where the sum runs over all values x that X can take. This ensures that all possibilities are covered and the probabilities add up to a certainty. In other words, the events {X = x} are disjoint and together partition Ω, so their probabilities sum to 1.
- Specificity: For each individual outcome, the probability must be clearly defined. There's no ambiguity about the probability associated with any specific event in the sample space. 
- Completeness: For every possible outcome in the sample space, a corresponding probability is defined. No outcome is left unassigned or unaccounted for within the PMF. 
- Uniqueness: Each value x is assigned exactly one probability, because the PMF is a function. Different values may still share the same probability; for the sum of two dice, P(X = 2) = P(X = 12) = 1/36.
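The first two properties are easy to check numerically. The sketch below is one possible check, not a standard library routine: the function name `is_valid_pmf` is illustrative, and it assumes the PMF is given as a dictionary of exact `Fraction` values so that the sum can be compared to 1 without floating-point tolerance.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Check non-negativity and that the probabilities sum to exactly 1."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return non_negative and sums_to_one

# PMF of the sum of two fair dice: P(X = x) = (6 - |x - 7|) / 36
two_dice = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

print(is_valid_pmf(two_dice))                                 # True
print(is_valid_pmf({0: Fraction(1, 2), 1: Fraction(1, 3)}))   # False: sums to 5/6
```

With floating-point probabilities, the equality test would need to be replaced by a tolerance check such as `math.isclose`.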
Is there a more compact way to write a PMF than a table of probabilities?
One way is to use a function such as the one below:
$p_{X}(x) = (1 - p)^{x - 1} \, p$
We will learn more about it in the next chapter. For now the key point is that it is desirable to have the entire distribution specified by one or a few parameters.
In ML, we deal with functions whose parameters are learned from data, and our goal is that, given an input, such a function returns a probability. We want a compact function of this kind, where the entire distribution is specified by a few parameters.
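As a sketch of what "specified by one parameter" means, the function $p_X(x) = (1 - p)^{x - 1} \, p$ can be coded directly. This formula has the form of the geometric distribution's PMF. The code assumes $x$ ranges over the positive integers $1, 2, 3, \dots$ and $0 < p \le 1$, which the text above does not state explicitly, and the value `p = 0.25` is a hypothetical choice for illustration.

```python
def geometric_pmf(x, p):
    """p_X(x) = (1 - p)**(x - 1) * p, assumed to apply for x = 1, 2, 3, ..."""
    if x < 1:
        return 0.0  # values outside the assumed support get probability 0
    return (1 - p) ** (x - 1) * p

p = 0.25  # hypothetical parameter value

# The first few probabilities, each fully determined by the single parameter p
print([round(geometric_pmf(x, p), 4) for x in range(1, 5)])

# A long partial sum of the probabilities is numerically close to 1
print(sum(geometric_pmf(x, p) for x in range(1, 200)))
```

Changing `p` changes every probability at once, so no table needs to be stored.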
Summary
Random Variables
$X_{\text{function (R.V.)}} : \Omega_{\text{domain}} \to \mathbb{R}_{\text{range (subset of real numbers)}}$
Distribution of a Random Variable
An assignment of probabilities to all possible values that a discrete R.V. can take.