7 Ways to Drop a Column in a Pandas DataFrame

Introduction

As with most tasks in Pandas, there are several ways to drop a column from a DataFrame. In this tutorial we will explore the following methods:

  • df.drop()
  • del statement
  • df.pop()
  • DataFrame Slicing
  • List Comprehension
  • df.dropna()
  • Iterative Method

 

1. Using df.drop() to Drop a Column in Pandas

In this example, we use the df.drop() method to drop a column. We create a sample DataFrame and then call df.drop() to remove column ‘B’. The axis=1 argument tells Pandas that ‘B’ is a column label rather than a row label. Note that df.drop() returns a new DataFrame and leaves the original unchanged, so we assign the result back to df.

 

import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

print("Original DataFrame")
print(df)

# Drop a single column
df = df.drop('B', axis=1)


print("\nAfter dropping column")
print(df)

Output

Original DataFrame
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

After dropping column
   A  C
0  1  7
1  2  8
2  3  9
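
As a small variation on the example above, df.drop() also accepts a columns= keyword, which avoids the need for axis=1, and an errors='ignore' option that skips labels that are not present. The sketch below assumes the same sample data; the column name 'Z' is hypothetical and only illustrates a missing label.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# columns='B' is equivalent to passing 'B' together with axis=1
without_b = df.drop(columns='B')

# errors='ignore' skips labels that do not exist instead of raising KeyError
# ('Z' is a hypothetical column that is not in df)
safe = df.drop(columns=['B', 'Z'], errors='ignore')

print(without_b.columns.tolist())
print(safe.columns.tolist())
```

Without errors='ignore', the second call would raise a KeyError because 'Z' is not a column of df.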

 

2. Using the del Statement to Delete a Pandas Column

In this example, we use the del statement to delete a column. As before, we create a sample DataFrame and then delete column B with del, which modifies the DataFrame in place.

 

import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

print("Original DataFrame")
print(df)

# Delete a single column
del df['B']

print("\nAfter dropping column")
print(df)

Output

Original DataFrame
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

After dropping column
   A  C
0  1  7
1  2  8
2  3  9
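
Because del raises a KeyError when the column does not exist, it can help to check for the column first. The following is a minimal sketch under that assumption; the column name 'Z' is hypothetical and stands in for a label that may be absent.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# del changes df in place and fails on a missing column,
# so test membership before deleting
for col in ['B', 'Z']:  # 'Z' is a hypothetical, non-existent column
    if col in df.columns:
        del df[col]

print(df.columns.tolist())
```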

3. Using df.pop() to Drop a Pandas Column

In this example, we drop a column with df.pop(). We first create a sample DataFrame, then pop column B with df.pop() and save the returned Series in another variable, del_column. In the output, we print both the modified DataFrame and the column that was removed.

 

import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

print("Original DataFrame")
print(df)

# Pop a single column
del_column = df.pop('B')

print("\nAfter dropping column")
print(df)

print("\nDeleted column")
print(del_column)

Output

Original DataFrame
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

After dropping column
   A  C
0  1  7
1  2  8
2  3  9

Deleted column
0    4
1    5
2    6
Name: B, dtype: int64
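
Since df.pop() returns the removed column, one possible use is to reposition a column. The sketch below, which assumes the same sample data, pops column B and assigns it straight back, which appends it as the last column.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# pop() removes 'B' and returns it as a Series;
# assigning the Series back adds 'B' again at the end
df['B'] = df.pop('B')

print(df.columns.tolist())
```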

 

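4. Using DataFrame Slicing to Delete a Pandas Column

In this example, we drop a column by selecting every column except B. The call df.columns.difference(['B']) returns the column labels without B, and indexing the DataFrame with those labels keeps only the remaining columns. Note that difference() returns the labels in sorted order by default, so the original column order may change.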

 

import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

print("Original DataFrame")
print(df)

# Drop a single column using slicing
df = df[df.columns.difference(['B'])]

print("\nAfter dropping column")
print(df)

Output

Original DataFrame
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

After dropping column
   A  C
0  1  7
1  2  8
2  3  9
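
The sorting behaviour of difference() is not visible in the example above because the columns are already in alphabetical order. The sketch below uses a DataFrame whose columns are deliberately out of order (an assumption made for illustration) and compares it with a boolean mask on df.columns, which keeps the original order.

```python
import pandas as pd

# Columns are intentionally not in alphabetical order
df = pd.DataFrame({'C': [7, 8, 9], 'B': [4, 5, 6], 'A': [1, 2, 3]})

# difference() sorts the remaining labels by default
sorted_cols = df[df.columns.difference(['B'])]

# A boolean mask over the columns preserves the original order
kept_order = df.loc[:, df.columns != 'B']

print(sorted_cols.columns.tolist())
print(kept_order.columns.tolist())
```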

5. Using List Comprehension

A list comprehension can be used to build the list of columns you want to keep, effectively dropping the rest. In this example, the list comprehension collects every column name except B, and indexing the DataFrame with that list removes column B.

 

import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

print("Original DataFrame")
print(df)

# Keep every column except 'B'
columns_to_keep = [col for col in df.columns if col != 'B']
df = df[columns_to_keep]

print("\nAfter dropping column")
print(df)

Output

Original DataFrame
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

After dropping column
   A  C
0  1  7
1  2  8
2  3  9
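
A list comprehension becomes more useful when the columns to drop follow a pattern. The following sketch assumes a hypothetical DataFrame in which temporary columns share the prefix 'tmp_'; both the column names and the prefix are made up for illustration.

```python
import pandas as pd

# Hypothetical data: columns starting with 'tmp_' are to be removed
df = pd.DataFrame({
    'id': [1, 2],
    'tmp_a': [0, 0],
    'value': [10, 20],
    'tmp_b': [1, 1],
})

# Keep only the columns whose name does not start with the prefix
columns_to_keep = [col for col in df.columns if not col.startswith('tmp_')]
df = df[columns_to_keep]

print(df.columns.tolist())
```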

6. Using df.dropna() Method

The df.dropna() method can be used to delete columns that contain missing values (NaN or None). In this example, we call df.dropna() with axis=1 to remove every column that contains at least one NaN value.

 

import pandas as pd
import numpy as np

# Create a sample DataFrame with NaN values
data = {'A': [1, 2, 3], 'B': [4, np.nan, 6], 'C': [7, 8, np.nan]}
df = pd.DataFrame(data)

print("Original DataFrame")
print(df)

# Drop columns with NaN values
df = df.dropna(axis=1)

print("\nAfter dropping column")
print(df)

Output

Original DataFrame
   A    B    C
0  1  4.0  7.0
1  2  NaN  8.0
2  3  6.0  NaN

After dropping column
   A
0  1
1  2
2  3
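
By default, df.dropna() drops a column as soon as it contains a single missing value. The how and thresh parameters offer finer control. The sketch below assumes a slightly different sample DataFrame in which column C is entirely NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, np.nan, 6],
    'C': [np.nan, np.nan, np.nan],
})

# how='all' drops a column only if every value in it is missing
all_nan_dropped = df.dropna(axis=1, how='all')

# thresh=3 keeps only columns with at least 3 non-missing values
at_least_three = df.dropna(axis=1, thresh=3)

print(all_nan_dropped.columns.tolist())
print(at_least_three.columns.tolist())
```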

 

7. Using Iterative Method

In this approach, we drop multiple columns by looping over a list of column names and passing them one at a time to df.drop(). In the example below, we drop columns B and A this way.

 

import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

print("Original DataFrame")
print(df)

# Define columns to drop
columns_to_drop = ['B','A']

# Iteratively drop columns
for col in columns_to_drop:
    df = df.drop(col, axis=1)

print("\nAfter dropping column")
print(df)

Output

Original DataFrame
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

After dropping column
   C
0  7
1  8
2  9
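
The loop is not strictly required, because df.drop() also accepts a list of labels. The sketch below, which assumes the same sample data, compares the loop with a single call and checks that both give the same result.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
columns_to_drop = ['B', 'A']

# Iterative version: one df.drop() call per column
looped = df.copy()
for col in columns_to_drop:
    looped = looped.drop(col, axis=1)

# Single call: pass the whole list at once
single_call = df.drop(columns=columns_to_drop)

print(single_call.equals(looped))
```

The loop is still handy when each column needs its own check or logging step before it is removed.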

 

Choosing the Right Approach to Drop a Column in Pandas

In this tutorial, we covered several approaches to deleting columns from a Pandas DataFrame. Which one should you use? The choice depends on your use case and coding style, and on how readable and maintainable the resulting code needs to be.

The following guidelines can help you choose:

  1. Using df.drop() Method: This method is versatile and lets you drop one or more columns by label. It is the most common and widely used approach.
  2. Using del Statement: The del statement is a concise way to drop a single column by label, and it modifies the DataFrame in place.
  3. Using df.pop() Method: This method is suitable when you want to remove a column and retrieve its values in one step, for example when you need to work with the dropped column separately.
  4. Using DataFrame Slicing: Selecting every column except the ones you want to drop works well when it is easier to describe the columns to exclude. Keep in mind that df.columns.difference() may reorder the columns.
  5. Using List Comprehension: A list comprehension is useful when you want to keep specific columns and drop the rest, especially when the selection follows a condition on the column names.
  6. Using df.dropna() Method: This method is useful when you want to drop columns that contain NaN or None values.
  7. Using Iterative Method: This method is useful for deleting multiple columns one at a time, for example when each column needs separate handling.

 

Reference: Pandas Documentation
