Seaborn Histogram Plot using histplot() – Tutorial for Beginners

Introduction

In this article, we walk through the Seaborn histogram plot, which helps you visualize data distributions in your data science and machine learning projects. We cover several examples of creating different types of histograms with the Seaborn histplot() function and explain the significance of its main parameters. So let’s start this tutorial.

Seaborn Histogram Plot Tutorial

A histogram visualizes the distribution of one or more variables. It divides the range of the data into intervals called bins and counts the number of observations that fall into each bin.
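To make the idea of binning concrete, here is a minimal pure-Python sketch of the counting step that happens before anything is drawn. The data values and the helper name count_into_bins are made up for illustration, and this is not how Seaborn implements it internally.

```python
def count_into_bins(values, start, width, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    stop = start + width * n_bins
    for v in values:
        if v == stop:
            # A value sitting exactly on the last edge goes into the last bin.
            index = n_bins - 1
        else:
            index = int((v - start) // width)
        # Values outside [start, stop] are ignored.
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical restaurant bills, grouped into 4 bins of width 10 starting at 0.
bills = [3.5, 8.0, 12.5, 14.0, 17.2, 21.0, 25.5, 38.9]
counts = count_into_bins(bills, start=0, width=10, n_bins=4)
print(counts)  # [2, 3, 2, 1]
```

Each number in the result is the height of one bar in the corresponding histogram.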

Syntax of Histogram Function in Seaborn

The following section shows the syntax and parameters of the Seaborn histogram function i.e. histplot() –

seaborn.histplot(data=None, *, x=None, y=None, hue=None, weights=None, stat='count', bins='auto', binwidth=None, binrange=None, discrete=None, cumulative=False, common_bins=True, common_norm=True, multiple='layer', element='bars', fill=True, shrink=1, kde=False, kde_kws=None, line_kws=None, thresh=0, pthresh=None, pmax=None, cbar=False, cbar_ax=None, cbar_kws=None, palette=None, hue_order=None, hue_norm=None, color=None, log_scale=None, legend=True, ax=None, **kwargs)

Parameters Information

  • data : pandas.DataFrame, numpy.ndarray, mapping, or sequence – The input data for the visualization.
  • x, y : vectors or keys in data – The variables that specify the positions on the x and y axes.
  • hue : vector or key in data – The variable that is mapped to the color of the plot elements.
  • weights : vector or key in data – If provided, each data point contributes its weight, rather than 1, to the count of the bin it falls into.
  • stat : {"count", "frequency", "density", "probability"} – The statistic that is computed for each bin.
  • bins : str, number, vector, or a pair of such values – The number of bins, the name of a rule for choosing them, or the bin edges themselves.
  • binwidth : number or pair of numbers – The width of each bin.
  • binrange : pair of numbers or a pair of pairs – The lowest and highest values for the bin edges.
  • palette : string, list, dict, or matplotlib.colors.Colormap – The colors used for mapping the hue semantic.
  • color : matplotlib color – A single color to use when there is no hue mapping.
  • kwargs – Additional keyword arguments, which are passed on to the underlying matplotlib plotting functions.
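The four stat options differ only in how the raw bin counts are rescaled. The following is a small sketch of that arithmetic in plain Python, using made-up counts and an assumed constant bin width; it illustrates the definitions and is not Seaborn's own code.

```python
counts = [2, 3, 2, 1]  # observations per bin (made-up numbers)
bin_width = 10         # every bin spans 10 units (an assumption)
n = sum(counts)        # total number of observations

stats = {
    # Raw number of observations in each bin.
    "count": counts,
    # Number of observations divided by the bin width.
    "frequency": [c / bin_width for c in counts],
    # Bar heights sum to 1.
    "probability": [c / n for c in counts],
    # Total area of the bars (height times width) equals 1.
    "density": [c / (n * bin_width) for c in counts],
}

for name, heights in stats.items():
    print(name, heights)
```

Use "probability" or "density" when you want to compare groups of different sizes, since the bar heights then no longer depend on the number of observations.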

The histplot() function returns the matplotlib Axes object that contains the plot.

Importing the Library

Now we will import the Seaborn library.

In [1]:
import seaborn as sns

Univariate Distribution Histogram in Seaborn

In this type of histogram, we assign a variable to x to plot a univariate distribution along the x-axis.

We will use the built-in datasets of Seaborn for the visualizations. So let’s look at different examples of histograms.

Example 1: Simple Seaborn Histogram Plot (Vertical)

The vertical histogram is the simplest and most common type of histogram you will come across in regular use.

We load the tips dataset using Seaborn’s load_dataset() function. After looking at the first few rows with the head() function, we will plot a simple histogram.

In [2]:
tips = sns.load_dataset("tips")

tips.head()
Output:
   total_bill   tip     sex  smoker  day    time  size
0       16.99  1.01  Female      No  Sun  Dinner     2
1       10.34  1.66    Male      No  Sun  Dinner     3
2       21.01  3.50    Male      No  Sun  Dinner     3
3       23.68  3.31    Male      No  Sun  Dinner     2
4       24.59  3.61  Female      No  Sun  Dinner     4

Apart from the data and x parameters, we use the color parameter to specify the color of the histogram.

In [3]:
sns.histplot(data=tips, x="total_bill", color="lime")
Output:
Seaborn Histogram Plot - Example 1

Example 2: Horizontal Histogram

This example shows how we can plot a horizontal histogram using the histplot() function of Seaborn. Note that we pass the variable to the y parameter instead of x, which makes the histogram horizontal.

In [3]:
sns.histplot(data=tips, y="size", color="green")
Output:
Seaborn Histogram Plot - Example 2

Different Usages of bin

The bin width strongly affects how a histogram reads: bins that are too wide hide detail, while bins that are too narrow make the plot noisy. In the following examples, we will experiment with the binwidth and bins parameters of the Seaborn histplot() function.

Example 3: Using binwidth parameter of Seaborn histplot()

In this example, we specify the bin width, which gives us more control over how the values are grouped in the histogram. Here, binwidth is set to 5.

In [4]:
sns.histplot(data=tips, x="total_bill", binwidth=5, color="purple")
Output:
Seaborn Histogram Plot - Example 3

Example 4: Using bins values in Seaborn histplot()

In the second example of this category, we specify the number of bins into which the data is placed.

Here we use the penguins dataset. Let’s load the data and then visualize it.

In [5]:
penguins = sns.load_dataset("penguins")

penguins.head()
Output:
  species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
0  Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
1  Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
2  Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
3  Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
4  Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
In [6]:
sns.histplot(data=penguins, x="flipper_length_mm", bins=50, color="red")
Output:
Seaborn Histogram - Example 4

Categorizing the bins

The third kind of histogram shows how we can split the bins into categories based on another variable in the data. For this purpose, we’ll use the hue parameter of the histplot() function.

This example uses another dataset, named ‘mpg’.

In [7]:
mpg = sns.load_dataset("mpg")

mpg.head()
Output:
    mpg  cylinders  displacement  horsepower  weight  acceleration  model_year  origin                       name
0  18.0          8         307.0       130.0    3504          12.0          70     usa  chevrolet chevelle malibu
1  15.0          8         350.0       165.0    3693          11.5          70     usa          buick skylark 320
2  18.0          8         318.0       150.0    3436          11.0          70     usa         plymouth satellite
3  16.0          8         304.0       150.0    3433          12.0          70     usa              amc rebel sst
4  17.0          8         302.0       140.0    3449          10.5          70     usa                ford torino

Example 5: Layered Categorization of Histogram Bins using Hue in Seaborn

Here the histograms of the individual categories are drawn as layers on top of one another. The categorization is done using the “cylinders” column of the dataset, which is passed to the hue parameter.

In [8]:
sns.histplot(data=mpg, x="mpg", hue="cylinders")
Output:
Seaborn histplot() - Example 5

Example 6: Categorization of Histogram Bins using Hue and Stack

In this example, we stack the categories on top of each other instead of overlapping them, which makes it easier to read the total count in each bin. To stack the bars, we pass “stack” to the multiple parameter of the histplot() function.

In [9]:
sns.histplot(data=mpg, x="mpg", hue="cylinders", multiple="stack")
Output:
Seaborn histplot() - Example 6

Example 7: Categorization of Histogram Bins using Hue and Step

In this example, we will create the histogram in step form. For this, we use the element parameter of the Seaborn histplot() function and pass it the value “step”.

In [10]:
sns.histplot(mpg, x="mpg", hue="cylinders", element="step")
Output:
Seaborn histplot() - Example 7

Example 8: Polygon Shaped Histogram in Seaborn

In this kind of histogram, the outline is drawn as a polygon whose vertices lie at the center of each bin. We get it by passing “poly” to the element parameter.

In [11]:
sns.histplot(mpg, x="mpg", hue="cylinders", element="poly")
Output:
Seaborn histplot() - Example 8

Example 9: Seaborn Histogram for Comparison

The previous examples showed how we can visualize the distribution of continuous or discrete numeric values. In this example, we’ll look at how categorical values can be visualized in a histogram.

For this example, we pass “dodge” to the multiple parameter so that the bars of the different categories are placed side by side. The shrink parameter scales the width of each bar relative to the bin width; values between 0 and 1 are typical.

Lower values produce thinner bars, while values close to 1 produce bars that fill the whole bin.

In [12]:
sns.histplot(data=tips, x="day", hue="sex", multiple="dodge", shrink=.9)
Output:
Histogram in Seaborn - Example - 9

Bivariate Histogram

This is the second type of histogram that we can build. A bivariate histogram bins the data along two variables, one on the x-axis and one on the y-axis, and shows the count in each rectangular cell through its color.

Example 10: Simple Bivariate Histogram in Seaborn

In [13]:
sns.histplot(tips, x="total_bill", y="tip", color="yellow")
Output:
Histogram in Seaborn - Example - 10

Example 11: Bivariate Histogram with Colorbar

This example shows a bivariate histogram that also contains a color bar, which maps the colors of the cells to the bin values.

In the code below, we use the planets dataset. We specify the x and y variables along with the bins, discrete, and log_scale parameters. We also set the cbar parameter to attach the color bar to the plot.

The discrete parameter tells histplot() to treat a variable as discrete and to center each bin on a single value (here, the year), which avoids the gaps that may otherwise appear in the histogram. The log_scale parameter puts the corresponding axis on a logarithmic scale (here, the distance axis). Both are passed as pairs, with the first value applying to x and the second to y.

In [14]:
planets = sns.load_dataset("planets")
In [15]:
sns.histplot(
    planets, x="year", y="distance",
    bins=30, discrete=(True, False), log_scale=(False, True),
    cbar=True, cbar_kws=dict(shrink=.75), color="pink"
)
Output:
Histogram in Seaborn - Example - 11

Conclusion

In this article, we went through the Seaborn histogram plot using the histplot() function. We saw various examples of creating histograms for univariate and bivariate data, along with different ways of controlling the bins.

Reference: https://seaborn.pydata.org/

 

  • Palash Sharma

    I am Palash Sharma, an undergraduate student who loves to explore and garner in-depth knowledge in the fields like Artificial Intelligence and Machine Learning. I am captivated by the wonders these fields have produced with their novel implementations. With this, I have a desire to share my knowledge with others in all my capacity.
