Introduction
In machine learning and data science projects, we often have to reshape pandas dataframes to extract more information from them. For this purpose, we will learn about pandas stack(), unstack() and melt() in this tutorial. Each function is explained with its syntax and examples. So let us begin the article.
Importing Pandas Library
First, we load the pandas library, along with NumPy, which is used to build the sample data later on.
import pandas as pd
import numpy as np
Pandas Stack : stack()
The pandas stack() function moves one or more column levels into the row index, which makes the result longer and narrower.
Syntax
pandas.DataFrame.stack(level=-1, dropna=True)
level : int, str, list, default -1 – The column level(s) to stack onto the index; -1 selects the innermost level.
dropna : bool, default True – Whether rows with missing values in the result should be dropped.
Example 1: Using Pandas Stack on single level column
In this example, we apply the stack() function to a dataframe with single-level columns. We first create the dataframe and then call stack() on it.
df_single_level_cols = pd.DataFrame([[7, 9], [11, 13]],
                                    index=['Bugatti', 'Mclaren'],
                                    columns=['weight', 'speed'])
df_single_level_cols
| | weight | speed |
|---|---|---|
| Bugatti | 7 | 9 |
| Mclaren | 11 | 13 |
df_single_level_cols.stack()
Bugatti  weight     7
         speed      9
Mclaren  weight    11
         speed     13
dtype: int64
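The stacked output above is a Series whose index combines the original row and column labels. The following is a minimal sketch, assuming the same sample data as Example 1, of how that result can be inspected and how one value can be looked up; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[7, 9], [11, 13]],
                  index=['Bugatti', 'Mclaren'],
                  columns=['weight', 'speed'])

stacked = df.stack()

# The result is a Series whose index has two levels:
# the original row label and the original column label.
print(type(stacked).__name__)
print(stacked.index.nlevels)

# A single value is addressed by a (row label, column label) pair.
bugatti_speed = stacked.loc[('Bugatti', 'speed')]
print(bugatti_speed)
```

Looking values up by a pair of labels is often the main reason to stack a small table like this one.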
Example 2: Using multi-level columns with pandas stack()
Here a dataframe with MultiIndex (multi-level) columns is created, and then the stack function is applied to it.
multicol1 = pd.MultiIndex.from_tuples([('speed', 'kmh'),
                                       ('speed', 'mph')])
df_multi_level_cols1 = pd.DataFrame([[7, 9], [18, 25]],
                                    index=['Aston Martin', 'Bentley'],
                                    columns=multicol1)
df_multi_level_cols1
| | speed | |
|---|---|---|
| | kmh | mph |
| Aston Martin | 7 | 9 |
| Bentley | 18 | 25 |
df_multi_level_cols1.stack()
| | | speed |
|---|---|---|
| Aston Martin | kmh | 7 |
| | mph | 9 |
| Bentley | kmh | 18 |
| | mph | 25 |
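By default, stack() moves the innermost column level, which is why the units ended up in the index above. As a rough sketch, assuming the same dataframe as Example 2, the level parameter can be used to stack the outer level instead; the variable names here are illustrative.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples([('speed', 'kmh'), ('speed', 'mph')])
df = pd.DataFrame([[7, 9], [18, 25]],
                  index=['Aston Martin', 'Bentley'],
                  columns=columns)

# level=0 stacks the outer column level ('speed') instead of the
# default innermost level ('kmh' / 'mph').
outer_stacked = df.stack(level=0)

# 'speed' is now part of the row index; the units remain as columns.
print(outer_stacked)
```

Here the row index becomes (car, 'speed') pairs, while kmh and mph stay as ordinary columns.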
Pandas Unstack : unstack()
The pandas unstack() function pivots a level of hierarchical index labels.
Syntax
pandas.DataFrame.unstack(level=-1, fill_value=None)
level : int, str, or list of these, default -1 (last level) – The index level(s) to unstack.
fill_value : int, str or dict – The value used to replace the missing values (NaN) that unstacking produces.
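To see what fill_value does, it helps to unstack data in which one combination of index labels is absent. The sketch below uses a small made-up Series (not one from this article) where 'blue' has no 'q' entry.

```python
import pandas as pd

# 'blue' has no 'q' entry, so unstacking leaves a gap for that cell.
index = pd.MultiIndex.from_tuples([('red', 'p'), ('red', 'q'), ('blue', 'p')])
s = pd.Series([6.0, 7.0, 8.0], index=index)

with_gap = s.unstack()             # the missing cell becomes NaN
filled = s.unstack(fill_value=0)   # the missing cell becomes 0

print(with_gap)
print(filled)
```

Only cells created by the reshaping are filled; values that were present in the original Series are left unchanged.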
Example 1: Using series data with pandas unstack function
In this example, series data is used for applying the unstacking operation.
index = pd.MultiIndex.from_tuples([('red', 'p'), ('red', 'q'),
                                   ('blue', 'p'), ('blue', 'q')])
s = pd.Series(np.arange(6.0, 10.0), index=index)
The level parameter selects which index level becomes the columns: level=-1 unstacks the innermost level (p, q), and level=0 unstacks the outermost level (red, blue).
s.unstack(level=-1)
| | p | q |
|---|---|---|
| blue | 8.0 | 9.0 |
| red | 6.0 | 7.0 |
s.unstack(level=0)
| | blue | red |
|---|---|---|
| p | 8.0 | 6.0 |
| q | 9.0 | 7.0 |
Example 2: Applying unstack() function on dataframe
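In this example, a dataframe is first built from the series with unstack(level=0), and unstack() is then applied to that dataframe. Because the dataframe has a single-level index, the result is a Series.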
df = s.unstack(level=0)
df.unstack()
blue  p    8.0
      q    9.0
red   p    6.0
      q    7.0
dtype: float64
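Since stack() and unstack() move labels in opposite directions, applying one after the other should bring the data back. The following is a small sketch of that round trip, assuming the same series as in Example 1; the values are compared as dictionaries so that row order does not matter.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([('red', 'p'), ('red', 'q'),
                                   ('blue', 'p'), ('blue', 'q')])
s = pd.Series(np.arange(6.0, 10.0), index=index)

wide = s.unstack()          # inner level ('p', 'q') becomes the columns
long_again = wide.stack()   # the columns move back into the index

# Compare label-to-value mappings so that row order is ignored.
same_values = long_again.to_dict() == s.to_dict()
print(same_values)
```

The rows may come back in a different order than in the original series, because unstack() sorts the labels.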
Pandas Melt : melt()
Pandas melt() function is used for unpivoting a DataFrame from wide to long format.
Syntax
pandas.DataFrame.melt(id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)
id_vars : tuple, list, or ndarray, optional – The columns to use as identifier variables.
value_vars : tuple, list, or ndarray, optional – The columns to melt. If not given, all columns that are not in id_vars are used.
var_name : scalar, default None – The name used for the variable column. If None, the column is named 'variable' (or takes the name of the columns index, if it has one).
value_name : scalar, default 'value' – The name used for the value column.
col_level : int or str, optional – If columns are a MultiIndex then use this level to melt.
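The var_name and value_name parameters only change the labels of the two generated columns. Here is a minimal sketch with a small sample dataframe; the names 'metric' and 'score' are chosen purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['p', 'q', 'r'],
                   'B': [2, 4, 6],
                   'C': [5, 7, 9]})

# 'metric' and 'score' are illustrative names for the generated columns.
long_df = pd.melt(df, id_vars=['A'], value_vars=['B', 'C'],
                  var_name='metric', value_name='score')

print(long_df.columns.tolist())
print(long_df)
```

Without these two arguments, the same columns would be called variable and value.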
Example 1: Using single level dataframes for pandas melt() function
In this example, the melt function of pandas is applied on single level dataframes.
df = pd.DataFrame({'A': {0: 'p', 1: 'q', 2: 'r'},
                   'B': {0: 2, 1: 4, 2: 6},
                   'C': {0: 5, 1: 7, 2: 9}})
df
| | A | B | C |
|---|---|---|---|
| 0 | p | 2 | 5 |
| 1 | q | 4 | 7 |
| 2 | r | 6 | 9 |
In the examples below, melt() reshapes the dataframe: column A is kept as the identifier, and each selected value column is unpivoted into rows of variable and value.
pd.melt(df, id_vars=['A'], value_vars=['B'])
| | A | variable | value |
|---|---|---|---|
| 0 | p | B | 2 |
| 1 | q | B | 4 |
| 2 | r | B | 6 |
pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])
| | A | variable | value |
|---|---|---|---|
| 0 | p | B | 2 |
| 1 | q | B | 4 |
| 2 | r | B | 6 |
| 3 | p | C | 5 |
| 4 | q | C | 7 |
| 5 | r | C | 9 |
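When every non-identifier column should be melted, value_vars can be left out. The sketch below, assuming the same dataframe as in this example, checks that omitting value_vars gives the same result as listing B and C explicitly.

```python
import pandas as pd

df = pd.DataFrame({'A': ['p', 'q', 'r'],
                   'B': [2, 4, 6],
                   'C': [5, 7, 9]})

explicit = pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])

# With value_vars omitted, every column not listed in id_vars is melted.
implicit = pd.melt(df, id_vars=['A'])

print(implicit.equals(explicit))
```

Listing value_vars explicitly is still useful when only some of the remaining columns are wanted, as in the first call of this example.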
Example 2: Using multi-level dataframes for pandas melt function
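In this example, melt() is applied to a dataframe with multi-level columns. With col_level=0, columns are referred to by their top-level names; without col_level, each column is identified by its full tuple of labels.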
df.columns = [list('PQR'), list('XYZ')]
df
| | P | Q | R |
|---|---|---|---|
| | X | Y | Z |
| 0 | p | 2 | 5 |
| 1 | q | 4 | 7 |
| 2 | r | 6 | 9 |
pd.melt(df, col_level=0, id_vars=['P'], value_vars=['Q'])
| | P | variable | value |
|---|---|---|---|
| 0 | p | Q | 2 |
| 1 | q | Q | 4 |
| 2 | r | Q | 6 |
pd.melt(df, id_vars=[('P', 'X')], value_vars=[('Q', 'Y')])
| | (P, X) | variable_0 | variable_1 | value |
|---|---|---|---|---|
| 0 | p | Q | Y | 2 |
| 1 | q | Q | Y | 4 |
| 2 | r | Q | Y | 6 |
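The col_level parameter can point at the second column level as well. As a sketch under the same setup as this example, col_level=1 lets the columns be referenced by their inner names X, Y and Z; the variable name inner is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': ['p', 'q', 'r'],
                   'B': [2, 4, 6],
                   'C': [5, 7, 9]})
df.columns = [list('PQR'), list('XYZ')]

# col_level=1 melts using the inner column level (X, Y, Z).
inner = pd.melt(df, col_level=1, id_vars=['X'], value_vars=['Y', 'Z'])

print(inner)
```

The outer labels P, Q and R do not appear in this result, because only the chosen level is used.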
Conclusion
We have reached the end of this article. We learned about pandas functions that change the shape of a dataframe: stack(), unstack() and melt(), each covered with its syntax and examples.
Reference – https://pandas.pydata.org/docs/
I am Palash Sharma, an undergraduate student who loves to explore and garner in-depth knowledge in the fields like Artificial Intelligence and Machine Learning. I am captivated by the wonders these fields have produced with their novel implementations. With this, I have a desire to share my knowledge with others in all my capacity.