Pandas Tutorial – Stack(), Unstack() and Melt()

Introduction

In machine learning and data science projects, we often need to reshape pandas DataFrames to explore the data from different angles. In this tutorial we will learn about pandas stack(), unstack() and melt(). Each function is explained with its syntax and examples. So let us begin.

Importing Pandas Library

First, we import the pandas library, along with NumPy, which is used later to build sample data.

In [1]:
import pandas as pd
import numpy as np

Pandas Stack : stack()

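The pandas stack() function moves one or more column levels into the index. The stacked labels become the innermost index level, so the result is a Series or DataFrame with a multi-level index.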

Syntax

pandas.DataFrame.stack(level,dropna)

level : int, str, list, default -1 – The column level(s) to stack onto the index.

dropna : bool, default True – This parameter is used to decide whether null values should be dropped or not.

Example 1: Using Pandas Stack on single level column

In this example, we apply the stack() function to a DataFrame with single-level columns. We first create the DataFrame and then call stack() on it.

In [2]:
df_single_level_cols = pd.DataFrame([[7, 9], [11, 13]],
                                    index=['Bugatti', 'Mclaren'],
                                    columns=['weight', 'speed'])
In [3]:
df_single_level_cols
Out[3]:
         weight  speed
Bugatti       7      9
Mclaren      11     13
In [4]:
df_single_level_cols.stack()
Out[4]:
Bugatti  weight     7
         speed      9
Mclaren  weight    11
         speed     13
dtype: int64
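
To make the shape of this result concrete, the short sketch below rebuilds the same DataFrame and inspects what stack() returns. It uses only the data shown above; the names stacked and restored are illustrative and not part of the original example.

```python
import pandas as pd

df_single_level_cols = pd.DataFrame([[7, 9], [11, 13]],
                                    index=['Bugatti', 'Mclaren'],
                                    columns=['weight', 'speed'])

# Stacking single-level columns yields a Series with a two-level index:
# the original row labels (outer) and the former column labels (inner).
stacked = df_single_level_cols.stack()
print(type(stacked).__name__)
print(stacked.index.nlevels)

# Individual values are addressed with a (row label, column label) tuple.
print(stacked[('Mclaren', 'weight')])

# unstack() moves the inner index level back to the columns.
restored = stacked.unstack()
print(restored.loc['Bugatti', 'speed'])
```

Calling unstack() on the stacked result brings the data back to the wide layout, which is why the two functions are usually described as inverses of each other.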

Example 2: Using multi-level columns with pandas stack()

Here a DataFrame with MultiIndex columns is created and then the stack function is applied. By default, stack() moves the innermost column level (kmh, mph) to the index.

In [5]:
multicol1 = pd.MultiIndex.from_tuples([('speed', 'kmh'),
                                       ('speed', 'mph')])
In [6]:
df_multi_level_cols1 = pd.DataFrame([[7, 9], [18, 25]],
                                     index=['Aston Martin', 'Bentley'],
                                     columns=multicol1)
In [7]:
df_multi_level_cols1
Out[7]:
             speed
               kmh mph
Aston Martin     7   9
Bentley         18  25
In [8]:
df_multi_level_cols1.stack()
Out[8]:
                  speed
Aston Martin kmh      7
             mph      9
Bentley      kmh     18
             mph     25
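
The level parameter controls which column level is moved. As a small sketch on the same data, passing level=0 stacks the outer level instead of the inner one. The name stacked_outer is an illustrative choice.

```python
import pandas as pd

multicol1 = pd.MultiIndex.from_tuples([('speed', 'kmh'),
                                       ('speed', 'mph')])
df_multi_level_cols1 = pd.DataFrame([[7, 9], [18, 25]],
                                    index=['Aston Martin', 'Bentley'],
                                    columns=multicol1)

# level=0 stacks the outer column level ('speed') instead of the inner one,
# so the units stay as columns and 'speed' becomes part of the row index.
stacked_outer = df_multi_level_cols1.stack(level=0)
print(stacked_outer)
```

Compared with the default call above, the columns here are kmh and mph, and each row is labelled by a (car, 'speed') pair.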

Pandas Unstack : unstack()

The pandas unstack() function pivots a level of hierarchical index labels.

Syntax

pandas.DataFrame.unstack(level,fill_value)

level : int, str, or list of these, default -1 (last level) – Here the levels of index which are to be unstacked are passed in this parameter.

fill_value : int, str or dict – The value used to replace missing values produced by the unstack operation.
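
The effect of fill_value is easiest to see when one index combination is absent. The sketch below uses a small hypothetical Series (s_gap, not part of the examples in this tutorial) in which the ('blue', 'q') entry is deliberately missing.

```python
import pandas as pd

# Hypothetical data: the ('blue', 'q') combination is deliberately missing.
index = pd.MultiIndex.from_tuples([('red', 'p'), ('red', 'q'), ('blue', 'p')])
s_gap = pd.Series([1, 2, 3], index=index)

# Without fill_value, the missing cell becomes NaN.
wide_nan = s_gap.unstack()

# With fill_value, the missing cell receives the supplied value instead.
wide_filled = s_gap.unstack(fill_value=0)

print(wide_nan)
print(wide_filled)
```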

Example 1: Using series data with pandas unstack function

In this example, series data is used for applying the unstacking operation.

In [9]:
index = pd.MultiIndex.from_tuples([('red', 'p'), ('red', 'q'),
                                   ('blue', 'p'), ('blue', 'q')])
In [10]:
s = pd.Series(np.arange(6.0,10.0), index=index)

The level parameter selects which index level is moved to the columns. With level=-1 (the default), the innermost level (p, q) becomes the columns; with level=0, the outermost level (red, blue) becomes the columns.

In [11]:
s.unstack(level=-1)
Out[11]:
        p    q
blue  8.0  9.0
red   6.0  7.0
In [12]:
s.unstack(level=0)
Out[12]:
   blue  red
p   8.0  6.0
q   9.0  7.0
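
Since level also accepts a string, a level can be selected by name when the index levels are named. The following is a minimal sketch; the level names 'color' and 'letter' are assumptions added for illustration and do not appear in the original data.

```python
import numpy as np
import pandas as pd

# Same values as above, but with hypothetical names for the index levels.
index = pd.MultiIndex.from_tuples([('red', 'p'), ('red', 'q'),
                                   ('blue', 'p'), ('blue', 'q')],
                                  names=['color', 'letter'])
s_named = pd.Series(np.arange(6.0, 10.0), index=index)

# Selecting the level by name or by position gives the same table.
by_name = s_named.unstack(level='color')
by_position = s_named.unstack(level=0)
print(by_name.equals(by_position))
```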

Example 2: Applying unstack() function on dataframe

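In this example, the pandas unstack() function is applied to a DataFrame, using the default level. Because the DataFrame's index has only one level, the result is a Series whose index combines the column labels (outer level) with the row labels (inner level).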

In [13]:
df = s.unstack(level=0)
In [14]:
df.unstack()
Out[14]:
blue  p    8.0
      q    9.0
red   p    6.0
      q    7.0
dtype: float64
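
On a DataFrame with a single-level index, stack() and unstack() both return a Series and differ only in how the index levels are ordered. The sketch below compares the two on the same df; the names by_unstack and by_stack are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([('red', 'p'), ('red', 'q'),
                                   ('blue', 'p'), ('blue', 'q')])
s = pd.Series(np.arange(6.0, 10.0), index=index)
df = s.unstack(level=0)

# unstack() puts the column labels on the outer index level ...
by_unstack = df.unstack()
# ... while stack() puts them on the inner index level.
by_stack = df.stack()

print(by_unstack.index.tolist())
print(by_stack.index.tolist())
```

The values are the same in both results; only the key order changes, for example ('blue', 'p') versus ('p', 'blue').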

Pandas Melt : melt()

Pandas melt() function is used for unpivoting a DataFrame from wide to long format.

Syntax

pandas.DataFrame.melt(id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)

id_vars : tuple, list, or ndarray, optional – Here the columns are passed that will be used as identifier values.

value_vars : tuple, list, or ndarray, optional – In this parameter, columns that we want to melt are selected.

var_name : scalar – This is the name used for the variable column.

value_name : scalar, default 'value' – This is the name used for the value column.

col_level : int or str, optional – If columns are a MultiIndex then use this level to melt.
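
As a brief illustration of var_name and value_name, the sketch below melts a small DataFrame and renames the two generated columns. The names 'measure' and 'reading' are arbitrary choices made for this sketch.

```python
import pandas as pd

df = pd.DataFrame({'A': ['p', 'q', 'r'],
                   'B': [2, 4, 6],
                   'C': [5, 7, 9]})

# var_name replaces the default 'variable' header,
# value_name replaces the default 'value' header.
long_df = pd.melt(df, id_vars=['A'], value_vars=['B', 'C'],
                  var_name='measure', value_name='reading')
print(long_df)
```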

Example 1: Using single level dataframes for pandas melt() function

In this example, the pandas melt() function is applied to a DataFrame with single-level columns.

In [15]:
df = pd.DataFrame({'A': {0: 'p', 1: 'q', 2: 'r'},
                    'B': {0: 2, 1: 4, 2: 6},
                    'C': {0: 5, 1: 7, 2: 9}})
In [16]:
df
Out[16]:
   A  B  C
0  p  2  5
1  q  4  7
2  r  6  9

In the calls below, we reshape this DataFrame with the pandas melt function. Column A is kept as the identifier in both calls; the first call melts only column B, and the second melts both B and C.

In [17]:
pd.melt(df, id_vars=['A'], value_vars=['B'])
Out[17]:
   A variable  value
0  p        B      2
1  q        B      4
2  r        B      6
In [18]:
pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])
Out[18]:
   A variable  value
0  p        B      2
1  q        B      4
2  r        B      6
3  p        C      5
4  q        C      7
5  r        C      9
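
Two further details can be checked with a short sketch on the same data: when value_vars is omitted, every column not listed in id_vars is melted, and the DataFrame.melt method shown in the syntax gives the same result as pd.melt. The variable names below are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': {0: 'p', 1: 'q', 2: 'r'},
                   'B': {0: 2, 1: 4, 2: 6},
                   'C': {0: 5, 1: 7, 2: 9}})

# Omitting value_vars melts all columns that are not identifiers.
melted_default = pd.melt(df, id_vars=['A'])
melted_explicit = pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])

# The method form is equivalent to the top-level function.
melted_method = df.melt(id_vars=['A'])

print(melted_default.equals(melted_explicit))
print(melted_default.equals(melted_method))
```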

Example 2: Using multi-level dataframes for pandas melt function

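In this example, the pandas melt() function is applied to a DataFrame with multi-level columns. The col_level parameter selects which column level to melt; alternatively, columns can be identified by their full tuples.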

In [19]:
df.columns = [list('PQR'), list('XYZ')]
In [20]:
df
Out[20]:
   P  Q  R
   X  Y  Z
0  p  2  5
1  q  4  7
2  r  6  9
In [21]:
pd.melt(df, col_level=0, id_vars=['P'], value_vars=['Q'])
Out[21]:
   P variable  value
0  p        Q      2
1  q        Q      4
2  r        Q      6
In [22]:
pd.melt(df, id_vars=[('P', 'X')], value_vars=[('Q', 'Y')])
Out[22]:
  (P, X) variable_0 variable_1  value
0      p          Q          Y      2
1      q          Q          Y      4
2      r          Q          Y      6

Conclusion

We have reached the end of this article. In it, we learned about pandas functions that change the shape of a DataFrame. We covered stack(), unstack() and melt(), along with their syntax and examples.

Reference – https://pandas.pydata.org/docs/
