Introduction to Probability - Mathematics for Machine Learning

Essentials of probability, various types of events, theoretical vs. empirical probability, and their applications in Machine Learning.


Oct 22, 2024

You are likely here to get a handle on the basics of probability before taking statistics and Machine Learning further. Probability forms the backbone of numerous Machine Learning algorithms.

In today's post, we'll lay the foundation for probability by covering essential topics, starting with the basics and then progressing to more in-depth concepts. We'll explore how to measure the likelihood of different events and how to differentiate between various types of events.

We shall look at both theoretical and empirical approaches to probability, learning how to calculate probabilities from reasoning or from actual data. These fundamental ideas will set us up for more advanced topics like `probability distributions`, `Bayes' Theorem`, and the `Central Limit Theorem`.

Let’s get into it!

Basic Probability Concepts

Definition - Probability is a way of quantifying how likely an event is to occur. The same definition applies in mathematics and in programming.

Mathematically, it ranges from 0 to 1, where:

0 means the event will not occur, and
1 means the event will certainly occur.

Here is a familiar example: when a fair coin is flipped, the probability of getting Heads (H) is

\(P(H) = \frac{\text{Number of favorable outcomes}}{\text{Total outcomes}} = \frac{1}{2} = 0.5\)

Since we are now familiar with Python, let's see this as a Python implementation.

# Probability of getting heads when flipping a fair coin

favorable_outcomes = 1
total_outcomes = 2
probability_heads = favorable_outcomes / total_outcomes
probability_heads  # Output: 0.5

Events and Their Types

Independent Events - Two events are independent if the occurrence of one does not affect the occurrence of the other.

Take this example: flipping a coin and rolling a die.
The outcome of the coin flip does not change the outcome of the die roll.

For independent events A and B, the probability of both events occurring is

\( P(A \text{ and } B) = P(A) \times P(B)\)

Now, let's see a coded version of our example in Python.

# Example of independent events

P_A = 0.5  # Probability of flipping heads
P_B = 1/6  # Probability of rolling a 3 on a die
P_A_and_B = P_A * P_B  # Probability of both events occurring
P_A_and_B  # Output: 0.083333...

Since these events are independent (the outcome of one does not affect the other), we find the probability of both occurring by multiplying their individual probabilities:

\(P(A \text{ and } B) = P(A) \times P(B) = 0.5 \times \frac{1}{6} = 0.083333\ldots\)

Thus, there is about an 8.33% chance of both flipping heads and rolling a 3.
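One way to sanity-check the multiplication rule is to list every combined outcome of the coin and the die and count directly. The sketch below is only an illustration, assuming a fair coin and a fair six-sided die; the names `sample_space` and `favorable` are made up for this example.

```python
from fractions import Fraction
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Every (coin, die) pair is equally likely for a fair coin and a fair die
sample_space = list(product(coin, die))

# Outcomes where the coin shows heads AND the die shows 3
favorable = [(c, d) for c, d in sample_space if c == "H" and d == 3]

p_joint = Fraction(len(favorable), len(sample_space))
p_heads = Fraction(1, 2)
p_three = Fraction(1, 6)

print(p_joint)                       # 1/12
print(p_joint == p_heads * p_three)  # True
```

Counting gives 1 favorable outcome out of 12, which matches the product of the two individual probabilities.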

Dependent Events - Events are dependent if the occurrence of one affects the probability of the other.

Here is a classic example: drawing a card from a deck and not putting it back changes the possibilities for the next draw.

The formula for the probability of dependent events is

\(P(A \text{ and } B) = P(A) \times P(B \mid A)\)

# Example of dependent events: drawing two Aces in a row without replacement

P_A = 4/52  # Probability of drawing an Ace first (4 Aces in 52 cards)
P_B_given_A = 3/51  # Probability of drawing a second Ace, given the first card was an Ace

P_A_and_B = P_A * P_B_given_A  # Probability of both events occurring

P_A_and_B  # Output: 0.004524...

Notice that the second factor is 3/51 (about 0.059) rather than 4/52 (about 0.077). Removing an Ace on the first draw reduces the chance of drawing an Ace on the second.
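The rule \(P(A \text{ and } B) = P(A) \times P(B \mid A)\) can also be checked by brute force, by enumerating every ordered pair of cards drawn without replacement from a standard 52-card deck. This is a rough sketch rather than an efficient method, and the helper `p_first_then_second` is a hypothetical name introduced here.

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

# All ordered pairs of distinct cards: 52 * 51 = 2652 equally likely draws
draws = list(permutations(deck, 2))

def p_first_then_second(first_rank, second_rank):
    """Probability that the first card has first_rank and the second has second_rank."""
    hits = sum(
        1 for first, second in draws
        if first[0] == first_rank and second[0] == second_rank
    )
    return Fraction(hits, len(draws))

print(p_first_then_second("A", "A"))  # 1/221, same as (4/52) * (3/51)
print(p_first_then_second("A", "K"))  # 4/663, same as (4/52) * (4/51)
```

In both cases the count agrees with multiplying the probability of the first draw by the conditional probability of the second.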

Mutually Exclusive Events - Events are mutually exclusive if they cannot happen at the same time.

An example is rolling a die once: both 3 and 5 (or any two faces) cannot come up on the same roll.

For mutually exclusive events A and B:

\(P(A \text{ or } B) = P(A) + P(B)\)

# Example of mutually exclusive events

P_A = 1/6  # Probability of rolling a 3
P_B = 1/6  # Probability of rolling a 5

P_A_or_B = P_A + P_B  # Probability of rolling a 3 or a 5

P_A_or_B  # Output: 0.333333...
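To see why the simple addition rule needs the events to be mutually exclusive, it can help to treat events as sets of die faces. The sketch below assumes a fair six-sided die; the second pair of events (`even` and `high`) is an extra contrast added for illustration and is not part of the example above.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of faces) on a fair six-sided die."""
    return Fraction(len(event & die), len(die))

# Mutually exclusive: a single roll cannot be both 3 and 5
A, B = {3}, {5}
print(A & B == set())                    # True
print(prob(A | B) == prob(A) + prob(B))  # True

# Not mutually exclusive: 4 and 6 are both even and greater than 3
even, high = {2, 4, 6}, {4, 5, 6}
print(prob(even | high))        # 2/3
print(prob(even) + prob(high))  # 1, which over-counts the overlap
```

When two events share outcomes, adding their probabilities counts the shared outcomes twice, so the plain sum only works for mutually exclusive events.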

Theoretical Probability

This is no different from what we have discussed so far. Theoretical probability is based on reasoning and mathematical calculation rather than on experiments, and it assumes all outcomes are equally likely.

# Theoretical probability of rolling a specific number on a die

total_outcomes = 6
favorable_outcomes = 1
theoretical_probability = favorable_outcomes / total_outcomes

theoretical_probability  # Output: 0.166666...
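The same counting idea extends to any event over equally likely outcomes. Here is a small sketch, assuming fair dice; the function name `equally_likely_probability` is hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import product

def equally_likely_probability(event, sample_space):
    """Favorable outcomes divided by total outcomes, assuming each outcome is equally likely."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

one_die = list(range(1, 7))
two_dice = list(product(one_die, repeat=2))  # 36 ordered pairs

# Rolling an even number on one die
print(equally_likely_probability(lambda face: face % 2 == 0, one_die))    # 1/2

# Rolling a total of 7 with two dice
print(equally_likely_probability(lambda roll: sum(roll) == 7, two_dice))  # 1/6
```

Using `Fraction` keeps the results exact, which makes it easier to compare them with hand calculations.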

Empirical Probability

Empirical probability takes a different approach. It bases probability on observed data from experiments or historical records, reflecting how often an event actually happens in practice.

If a coin is flipped 100 times and heads is recorded 55 times, the empirical probability of getting heads becomes

\(P(H) = \frac{\text{Number of heads}}{\text{Total flips}} = \frac{55}{100} = 0.55\)

# Empirical probability from observed data

number_of_heads = 55
total_flips = 100

empirical_probability_heads = number_of_heads / total_flips

empirical_probability_heads  # Output: 0.55

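Theoretically, we would have 1/2, a 50% chance of flipping heads. Empirically, since heads was observed 55 times out of 100, our estimate from this particular experiment is 55%. The difference does not mean the coin changed. A sample of 100 flips will rarely land on exactly 50 heads, and for a fair coin the empirical estimate tends to move closer to the theoretical value as the number of flips grows.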
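A quick simulation can show how an empirical estimate behaves as the number of flips grows. This is a minimal sketch that assumes a fair coin modeled with Python's `random` module; the function name `simulate_heads_fraction` and the seed value are arbitrary choices for illustration, and the exact numbers printed depend on the seed.

```python
import random

def simulate_heads_fraction(num_flips, seed=0):
    """Flip a simulated fair coin num_flips times and return the fraction of heads."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

# Larger samples usually land closer to the theoretical value of 0.5
for n in (100, 10_000, 100_000):
    print(n, simulate_heads_fraction(n))
```

Small samples can drift noticeably from 0.5, much like the 55 heads in 100 flips above, while larger samples typically stay closer to it.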

Let's recap:

Basic Probability - Measures the likelihood of an event, ranging from 0 to 1.
Types of Events - Independent, dependent, and mutually exclusive events define how different events relate to each other.
Theoretical Probability - Based on reasoning about equally likely outcomes.
Empirical Probability - Based on observed data from experiments.

This has been fun. Don't worry just yet about direct applications of probability in Machine Learning; we'll get there in the coming days. Keep learning!

A guest post by

Michael Patrick

Data Science || Machine Learning || Robotic Process Automation

Raven-R’s Substack
