Making Visualizations with Matplotlib and Seaborn
Matplotlib and Seaborn, powerful Python libraries for data visualization
Today, on Day 10 of our AI/ML Journey, we are wrapping up our research on some essential Python libraries… From `Working with data`, to `Scientific Computing`, and now, `Visualization`.
We shall explore visualization with a focus on Matplotlib and Seaborn. These are powerful libraries for data visualization, which is a strong foundation for any work in data science and machine learning.
What is Matplotlib?
Matplotlib is a Python library used to create static, interactive, and animated visualizations. It’s often used to create plots like line charts, bar charts, histograms, and scatter plots. The library is very flexible, allowing you to customize almost every aspect of a figure, from colors to fonts, axes, and labels.
Why Use Matplotlib?
Versatile: We can create a wide range of plots (line charts, scatter plots, histograms, etc.).
Customizable: We can change nearly every detail of our plots.
Integrates well: Works seamlessly with libraries like NumPy and Pandas.
Easy to export: We can save our visualizations in various formats (PNG, PDF, etc.).
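To make the NumPy integration point concrete, here is a minimal sketch. The sine-wave data is our own illustrative choice, not something from a real dataset. It shows that plt.plot() accepts NumPy arrays directly, with no conversion to lists.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative data (an assumption for this sketch): 100 evenly spaced
# points between 0 and 2*pi, and their sine values
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

plt.figure()
# plt.plot() returns a list of Line2D objects, one per plotted line
lines = plt.plot(x, y, label="sin(x)")
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.legend()
```

Calling plt.show() afterwards displays the figure.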
Installing Matplotlib
To get started, you need to install Matplotlib. Use pip to install the library:
pip install matplotlib
Once installed, you can import it into your Python script like this:
import matplotlib.pyplot as plt
The conventional way to import Matplotlib is to use the alias plt.
Creating a Simple Line Plot
Let’s start with the most basic plot - a line plot.
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]
# Create a line plot
plt.plot(x, y)
# Display the plot
plt.show()
plt.plot(x, y): Creates a line plot with x and y as the data points.
plt.show(): Displays the plot.
Plot Customization
You can customize the plot with labels, titles, colors, and more.
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]
# Create a line plot with customizations
plt.plot(x, y, color='green', linestyle='--', marker='o')
# Add title and labels
plt.title('Sample Line Plot')
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
# Display the plot
plt.show()
color='green': Changes the line color to green.
linestyle='--': Makes the line dashed.
marker='o': Adds circles at the data points.
plt.title(), plt.xlabel(), plt.ylabel(): Add a title and axis labels.
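The same options become more useful once a figure holds more than one line. The sketch below is our own example: the second series and the label names are assumptions for illustration. It gives each line its own style and adds a legend and a grid.

```python
import matplotlib.pyplot as plt

# Sample data; the second series is made up for this illustration
x = [1, 2, 3, 4, 5]
y1 = [10, 20, 25, 30, 40]
y2 = [5, 15, 20, 25, 30]

plt.figure()
# Each call to plt.plot() adds another line to the same axes
plt.plot(x, y1, color='green', linestyle='--', marker='o', label='Series 1')
plt.plot(x, y2, color='blue', linestyle=':', marker='s', label='Series 2')

plt.title('Two Styled Lines')
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
plt.legend()    # builds the legend from the label= arguments
plt.grid(True)  # adds a background grid

ax = plt.gca()  # the axes we just drew on
```

Here label= only takes effect once plt.legend() is called. Call plt.show() to display the result.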
Other Types of Plots
Bar Chart
import matplotlib.pyplot as plt
# Sample data
categories = ['A', 'B', 'C', 'D']
values = [4, 7, 1, 8]
# Create a bar chart
plt.bar(categories, values)
# Add title and labels
plt.title('Bar Chart Example')
plt.xlabel('Categories')
plt.ylabel('Values')
# Display the plot
plt.show()
plt.bar(): Creates a bar chart with the categories and their values.
Scatter Plot
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [5, 10, 15, 20, 25]
# Create a scatter plot
plt.scatter(x, y)
# Add title and labels
plt.title('Scatter Plot Example')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
# Display the plot
plt.show()
plt.scatter(): Creates a scatter plot with x and y values.
Histogram
import matplotlib.pyplot as plt
# Sample data
data = [1, 1, 2, 3, 3, 3, 4, 4, 5, 6, 7]
# Create a histogram
plt.hist(data, bins=5)
# Add title and labels
plt.title('Histogram Example')
plt.xlabel('Data Values')
plt.ylabel('Frequency')
# Display the plot
plt.show()
plt.hist(): Creates a histogram. The bins argument specifies how the data is grouped.
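To see what the bins argument actually does, we can inspect the values that plt.hist() returns. This is a small sketch using the same sample data. When bins is an integer, the data range is split into that many equal-width intervals.

```python
import matplotlib.pyplot as plt

# Same sample data as the histogram example
data = [1, 1, 2, 3, 3, 3, 4, 4, 5, 6, 7]

plt.figure()
# plt.hist() returns the count per bin, the bin edges, and the drawn bars
counts, edges, _ = plt.hist(data, bins=5)

print(counts)  # how many values landed in each of the 5 bins
print(edges)   # the 6 edges that delimit those bins
```

With this data, the range 1 to 7 is divided into five bins of width 1.2, so the counts come out as 3, 3, 2, 1 and 2. Changing bins changes those groupings, which is why the same data can look quite different at different bin counts.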
Subplots
You can create multiple plots within a single figure using subplots.
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y1 = [10, 20, 25, 30, 40]
y2 = [5, 15, 20, 25, 30]
# Create subplots (2 rows, 1 column)
plt.subplot(2, 1, 1)
plt.plot(x, y1)
plt.title('Plot 1')
plt.subplot(2, 1, 2)
plt.plot(x, y2)
plt.title('Plot 2')
# Display the plots
plt.tight_layout()
plt.show()
plt.subplot(2, 1, 1): Specifies the first subplot in a figure with 2 rows and 1 column.
plt.tight_layout(): Ensures that the subplots don’t overlap.
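An alternative worth knowing is plt.subplots() (note the trailing "s"), which creates the figure and all of its axes in one call. The sketch below rebuilds the same two plots that way. The sharex=True option is our own addition to show how the panels can share an x-axis.

```python
import matplotlib.pyplot as plt

# Same sample data as the subplot example
x = [1, 2, 3, 4, 5]
y1 = [10, 20, 25, 30, 40]
y2 = [5, 15, 20, 25, 30]

# One call returns the figure and an array of 2 axes (2 rows, 1 column)
fig, axes = plt.subplots(2, 1, sharex=True)

# Draw on each axes object explicitly instead of the "current" subplot
axes[0].plot(x, y1)
axes[0].set_title('Plot 1')
axes[1].plot(x, y2)
axes[1].set_title('Plot 2')

fig.tight_layout()
```

Because each panel is addressed by name, this style tends to be easier to follow when a figure has many subplots. Call plt.show() to display it.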
Saving Your Plot
You can save the plot to a file instead of displaying it on the screen.
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]
# Create a plot
plt.plot(x, y)
# Save the plot as a PNG file
plt.savefig('my_plot.png')
# Show the plot (optional)
plt.show()
plt.savefig('my_plot.png'): Saves the plot as a PNG file. You can also save it in formats like PDF, SVG, etc.
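Matplotlib picks the output format from the file extension, and savefig() takes a few options that help with export quality. The sketch below makes some assumptions for illustration: it writes to a temporary folder, and the dpi value is just an example.

```python
import os
import tempfile
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]

plt.figure()
plt.plot(x, y)
plt.title('Saved Plot')

# Write into a temporary folder so the example does not clutter the project
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, 'my_plot.png')
pdf_path = os.path.join(out_dir, 'my_plot.pdf')

# dpi controls the resolution; bbox_inches='tight' trims extra whitespace
plt.savefig(png_path, dpi=300, bbox_inches='tight')
# The .pdf extension selects a vector format
plt.savefig(pdf_path)

plt.close()  # free the figure once it has been saved
```

One detail to watch: call plt.savefig() before plt.show(). Once the displayed window is closed, the figure is usually gone, and saving afterwards can produce an empty image.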
Key Takeaways
Matplotlib is perfect for creating basic visualizations like line plots, bar charts, scatter plots, and histograms.
You can customize every aspect of the plot (titles, labels, colors, etc.).
It integrates well with other Python libraries (NumPy, Pandas) and is highly flexible for data visualization.
Matplotlib is a fundamental tool for data visualization in Python, helping you gain insights and communicate your data clearly.
Seaborn
Seaborn is built on top of Matplotlib and makes it easier to create more visually appealing and informative plots. It comes with built-in themes and functions for more advanced plots like heatmaps, violin plots, and pair plots.
Installing Seaborn
To install Seaborn, use the following command:
pip install seaborn
Importing Seaborn
You usually import Seaborn with the alias sns:
import seaborn as sns
Creating a Simple Plot in Seaborn
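Let’s start with a simple line plot. Seaborn takes care of much of the styling for us, and its themes (applied with sns.set_theme()) give plots a more polished look.
import matplotlib.pyplot as plt
import seaborn as sns
# Using the same data as in the Matplotlib examples
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]
sns.lineplot(x=x, y=y)
plt.title("Line Plot with Seaborn")
plt.show()
We’ve already seen what a line plot looks like, so there is no need to show the output again.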
Seaborn’s Built-in Datasets
Seaborn comes with built-in datasets, making it easy to practice plotting without having to load your own data.
# Load an example dataset
tips = sns.load_dataset("tips")
# Display the first few rows
print(tips.head())
Visualizing Data with Seaborn
Seaborn provides many easy-to-use plot types:
Scatter Plot
sns.scatterplot(x="total_bill", y="tip", data=tips)
plt.title("Scatter Plot with Seaborn")
plt.show()
Bar Plot
sns.barplot(x="day", y="total_bill", data=tips)
plt.title("Bar Plot with Seaborn")
plt.show()
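Seaborn plots work on any Pandas DataFrame, not only on the built-in datasets. The sketch below uses a small hand-built table instead of "tips". Its values are made up for illustration and are not taken from the real dataset. It also adds the hue parameter, which colors points by a category column.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical values for illustration only (NOT the real "tips" data);
# the column names mirror the ones used in the examples above
df = pd.DataFrame({
    "total_bill": [10.5, 23.0, 15.2, 30.1, 18.7, 25.4],
    "tip": [1.5, 3.5, 2.0, 5.0, 3.0, 4.0],
    "day": ["Thur", "Fri", "Sat", "Sun", "Sat", "Sun"],
})

plt.figure()
# hue="day" gives each day its own color and adds a legend automatically
ax = sns.scatterplot(x="total_bill", y="tip", hue="day", data=df)
ax.set_title("Tip vs. Total Bill by Day")
```

Seaborn functions return the Matplotlib axes they drew on, so the usual Matplotlib methods (set_title, set_xlabel, and so on) still apply. Call plt.show() to display the plot.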
Seaborn Heatmap
A heatmap is a popular visualization for showing correlations or patterns between different variables.
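# Create a correlation matrix from the numeric columns only
# (tips also has text/category columns; numeric_only needs pandas 1.5 or later)
corr = tips.corr(numeric_only=True)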
# Plot the heatmap
sns.heatmap(corr, annot=True, cmap="coolwarm")
plt.title("Heatmap with Seaborn")
plt.show()
annot=True: Adds numerical values to each cell.
cmap="coolwarm": Sets the color map.
Example of a heatmap: it shows the strength of the correlation between multiple features (columns).
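A correlation matrix can only be computed over numeric columns, so it helps to select those explicitly before plotting. The sketch below does this with select_dtypes() on a hand-built DataFrame. The values are made up for illustration and are not the real "tips" data. Fixing vmin and vmax is our own suggestion so that the color scale always spans the full correlation range.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical values for illustration only (NOT the real "tips" data)
df = pd.DataFrame({
    "total_bill": [10.5, 23.0, 15.2, 30.1, 18.7, 25.4],
    "tip": [1.5, 3.5, 2.0, 5.0, 3.0, 4.0],
    "size": [2, 3, 2, 4, 2, 3],
    "day": ["Thur", "Fri", "Sat", "Sun", "Sat", "Sun"],  # non-numeric
})

# Keep only the numeric columns, then compute pairwise correlations
numeric = df.select_dtypes(include="number")
corr = numeric.corr()

plt.figure()
# vmin/vmax pin the color scale to the full correlation range [-1, 1]
ax = sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)
ax.set_title("Correlation Heatmap")
```

The "day" column is dropped before the correlation step, which avoids errors from trying to correlate text values. Call plt.show() to display the heatmap.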
Customizing Seaborn Plots
Seaborn offers themes to make your plots more visually appealing:
# Set a theme
sns.set_theme(style="darkgrid")
# Create a plot with the new theme
sns.lineplot(x=x, y=y)
plt.title("Seaborn with Darkgrid Theme")
plt.show()
Comparison between Matplotlib and Seaborn
Matplotlib is more flexible and allows full control over your plots, but it requires more code to achieve the same level of visual appeal. It is a great fit when you need fully customizable, precise control over your visualizations.
Seaborn is built on top of Matplotlib, simplifying many types of visualizations and adding aesthetic improvements by default. Seaborn makes it easier to create attractive, high-level statistical graphics.
We’ve made some good progress over the past couple of weeks, covering the basics of most of the essentials, from Python programming to multiple tools and some essential libraries.
From here on, we’ll delve even deeper… Join us next, as we get started on Mathematics for Machine Learning.
If you're interested in our roadmap, or our upcoming coding sessions, you should join our Discord channel and stay tuned to Raven-R for more updates and cool stuff like this.